URL extractors?



I've got a custom one coded in C#/.NET: just pass it a regex and it will spider the web for you. Hit me up if you need it.
 
How many text docs do you want to extract from? I'm assuming you've saved the webpage source with some sort of scraper, correct? I've used a great tool called TextPipe, but it's pricey. The demo will handle up to 100 docs, though.
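If you'd rather not pay for a tool, a few lines of JavaScript can pull URLs out of saved source. This is only a rough sketch: `extractUrls` and `URL_PATTERN` are names I made up for illustration, and the regex is a simple assumption that catches plain http/https links, not every URL form that exists.

```javascript
// Sketch: pull http(s) URLs out of saved page source or any plain text.
// extractUrls and URL_PATTERN are illustrative names, not from any library.
const URL_PATTERN = /https?:\/\/[^\s"'<>]+/gi;

function extractUrls(text) {
  const matches = text.match(URL_PATTERN) || [];
  // Trim trailing punctuation that usually belongs to the sentence, then dedupe.
  const cleaned = matches.map((u) => u.replace(/[.,;:!?)]+$/, ''));
  return [...new Set(cleaned)];
}

const sample =
  'See <a href="http://example.com/a">a</a> and https://example.org/b?x=1, ' +
  'then http://example.com/a again.';
console.log(extractUrls(sample));
```

To run it over saved files in Node, read each one with `fs.readFileSync(path, 'utf8')` and pass the string to `extractUrls`.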
 
If you're trying to extract links from a webpage, create a bookmark with this code as its URL, then click it while viewing the page:

Code:
javascript:(function(){var x,n,nD,z,i; function htmlEscape(s){s=s.replace(/&/g,'&amp;');s=s.replace(/>/g,'&gt;');s=s.replace(/</g,'&lt;');return s;} function attrQuoteEscape(s){s=s.replace(/&/g,'&amp;'); s=s.replace(/"/g, '&quot;');return s;} x=prompt("Show links with this word/phrase in link text or target url (leave blank to list all links):", ""); n=0; if(x!=null) { x=x.toLowerCase(); nD = window.open().document; nD.writeln('<html><head><title>Links containing "'+htmlEscape(x)+'"</title><base target="_blank"></head><body>'); nD.writeln('Links on <a href="'+attrQuoteEscape(location.href)+'">'+htmlEscape(location.href)+'</a><br> with link text or target url containing "' + htmlEscape(x) + '"<br><hr>'); z = document.links; for (i = 0; i < z.length; ++i) { if ((z[i].innerHTML && z[i].innerHTML.toLowerCase().indexOf(x) != -1) || z[i].href.toLowerCase().indexOf(x) != -1 ) { nD.writeln(attrQuoteEscape(z[i].href) + '<br>'); } } nD.writeln('<hr></body></html>'); nD.close(); } })();
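
In case the one-liner is hard to follow, here is roughly what its matching rule does, pulled out as a plain function. Treat it as a sketch: `filterLinks` and the `{ href, text }` shape are my own illustrative choices, not part of the bookmarklet.

```javascript
// Sketch of the bookmarklet's matching rule as a plain function.
// filterLinks is an illustrative name; links is an array of { href, text }.
function filterLinks(links, phrase) {
  const needle = phrase.toLowerCase();
  return links
    .filter(
      (l) =>
        (l.text || '').toLowerCase().includes(needle) ||
        l.href.toLowerCase().includes(needle)
    )
    .map((l) => l.href);
}

// In a browser console the input could be built from the live page:
// const links = Array.from(document.links, (a) => ({ href: a.href, text: a.innerHTML }));
const links = [
  { href: 'http://example.com/blog/post-1', text: 'First post' },
  { href: 'http://example.com/about', text: 'About the Blog' },
  { href: 'http://example.com/contact', text: 'Contact' },
];
console.log(filterLinks(links, 'blog'));
```

As with the bookmarklet, the match is case-insensitive, and an empty phrase lists every link.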
 