Looking for a few easy scripts that will do this for me

alc

New member
Mar 3, 2007
Anyone got an easy script that will go through hundreds of URLs, check for repeats, and delete them? Also wondering if there is one that will click through to check if the sites are active and delete any with error pages?
 


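mattseh's dedupe script isn't quoted in this thread, but the "check for repeats and delete them" half of the question is a one-liner with coreutils. A minimal sketch, with the filenames just as examples:
Code:
# keep one copy of each URL, drop the repeats
sort -u urls.txt > unique-urls.txt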
one-upsmanship -- after using mattseh's script to get the unique addresses, this will load each page with curl, print its HTTP status, and append the ones that come back HTTP 200 (OK) to the status-ok.txt file
Code:
# sort -u strips the duplicate URLs from the list
for i in $(sort -u bigfuckingfile.txt); do
  # -i includes the response headers, -s hides curl's progress output
  page=$(curl -is "$i")
  # first header line looks like "HTTP/1.1 200 OK"; field 2 is the status code
  out=$(echo "$page" | head -n1 | cut -d' ' -f2)
  echo -e "Status:\t$out \t$i"
  # append the ones that answered 200 OK to status-ok.txt
  [[ "$out" == "200" ]] && echo "$i" >> status-ok.txt
done
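If you're pointing this at hundreds of URLs, one dead host can stall the whole loop, so it may be worth giving curl a timeout. A small variation on the same loop (the 10-second limits are just a guess, tune to taste):
Code:
# same loop, but give up on any URL that hasn't answered within 10 seconds
for i in $(sort -u bigfuckingfile.txt); do
  out=$(curl -is --connect-timeout 10 -m 10 "$i" | head -n1 | cut -d' ' -f2)
  echo -e "Status:\t$out \t$i"
  [[ "$out" == "200" ]] && echo "$i" >> status-ok.txt
done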
suck my balls, matt
 
just for the sake of completeness --
Code:
# example input: bigfuckingfile.txt
http://www.google.com
http://www.google.com
http://google.com
http://google12345.com

# script output (blank status = curl got no response for that URL):
Status:	 	http://google12345.com
Status:	301 	http://google.com
Status:	200 	http://www.google.com

# example output file: status-ok.txt
http://www.google.com
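One thing the output above shows: http://google.com answers with a 301 and gets dropped even though the site is live. Not what the original script does, but if you want redirected sites counted too, a sketch like this lets curl follow the redirect (-L) and report the final status code itself via -w '%{http_code}':
Code:
# follow redirects so a 301 that lands on a working page still counts as 200
for i in $(sort -u bigfuckingfile.txt); do
  out=$(curl -o /dev/null -sL -w '%{http_code}' "$i")
  echo -e "Status:\t$out \t$i"
  [[ "$out" == "200" ]] && echo "$i" >> status-ok.txt
done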