Info Scrapers / Spiders - What to gather?

Status
Not open for further replies.

Bofu2U

Automation Specialist
May 18, 2007
Vegas, Baby.
www.offerstreamdigital.com
I've taken on a personal project to write a scraper / spider, but I ran into a small problem: what to scrape / spider? :)

I can gather almost anything.. but I don't know what to test it with. What kind of information should I collect?

OK, in all honesty: what would be beneficial for me to gather, besides, like, blog posts? No email addresses etc., I already wrote one for that.

(New to WF, if this is in the wrong section I apologize)
 


What is it that you want to use the info for? Is it to gather content, or is it to gather links so you can spam them? Or do you have more legit reasons to build a scraper (strange thing is that I can't think of one :D)
 
See, that's the thing. I enjoy writing scrapers and filters to gather info, but no ideas for using them. I highly modified a search engine before, just looking for something that may get me some money rollin' heh.

No, no black hat or spamming etc. All legit.
 
Links to tutorials. Make a bigass tutorial index all with your crawler. Then tell me when you do it. :)
 
I can do that.. but the problem with that is, I would still have to weed out the good from the bad etc. Heck, if I wanted that sort of thing, just crawl every single affiliate marketing blog on BumpZee. That'll give me a lot. :)
 
Just do what chatmasta said, but just index all tutorials on like pixel2life and good-tutorials etc. then look for duplicates and get rid of them. Then launch it as the biggest tutorial engine known to man.

And post a few tutorials on some other sites to get the traffic flowing
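
For the "look for duplicates and get rid of them" step, here is a rough sketch of one way it could go, assuming each scraped tutorial is just a (title, url) pair and that two entries count as duplicates when their URLs match after some light normalization. The names `normalize_url` and `dedupe_tutorials` and the example.com URLs are made up for illustration; real index sites may need smarter matching (redirects, tracking parameters, same tutorial on different hosts).

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Reduce a URL to a canonical form so trivial variants compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()
    # Assumption: "www.example.com" and "example.com" serve the same page.
    if host.startswith("www."):
        host = host[4:]
    # Ignore a trailing slash, but keep a bare "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment; keep the query, since it may select the tutorial.
    return urlunsplit((scheme, host, path, parts.query, ""))


def dedupe_tutorials(entries):
    """Keep the first entry seen for each normalized URL.

    `entries` is an iterable of (title, url) pairs (a hypothetical shape).
    """
    seen = set()
    unique = []
    for title, url in entries:
        key = normalize_url(url)
        if key in seen:
            continue
        seen.add(key)
        unique.append((title, url))
    return unique


if __name__ == "__main__":
    scraped = [
        ("Photoshop glow", "http://www.example.com/tut/1/"),
        ("Photoshop glow (copy)", "http://example.com/tut/1"),
        ("PHP login form", "http://example.com/tut/2#top"),
    ]
    for title, url in dedupe_tutorials(scraped):
        print(title, url)
```

Keeping the first occurrence means whichever index gets crawled first "wins" the listing; that is an arbitrary choice for the sketch.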
 
Need a few more ideas.. Scraping emails and all of that just is too easy, I want something else that will actually be worthwhile. Someone PM'd me asking me to scrape PDF articles.. Which is a possibility, but once again, where's the profit potential... heh.

Someone else has to have some sort of idea...
 
Just do what chatmasta said, but just index all tutorials on like pixel2life and good-tutorials etc. then look for duplicates and get rid of them. Then launch it as the biggest tutorial engine known to man.

And post a few tutorials on some other sites to get the traffic flowing

I scraped those sites just to get tutorial site URLs. But I didn't grab the actual links....that's a bit dodgy to me and you will probably get in trouble for it. They specifically prohibit it.
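
On the "they prohibit it" point: a prohibition written into a site's terms can't be checked by code, but a spider can at least honor robots.txt before fetching anything. A minimal sketch using Python's standard `urllib.robotparser`, assuming the robots.txt text has already been downloaded; the sample rules, the `MySpider` user agent, and the example.com URLs are invented for illustration and are not taken from any of the sites mentioned here.

```python
from urllib.robotparser import RobotFileParser


def allowed_to_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /tutorials/
"""

if __name__ == "__main__":
    for url in ("http://example.com/tutorials/1", "http://example.com/about"):
        print(url, allowed_to_fetch(SAMPLE_ROBOTS, "MySpider", url))
```

Passing robots.txt only covers what the site publishes for crawlers; it says nothing about what the terms of service allow.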
 
If you're so uncreative that you have to ask what to scrape, you need to reconsider what you're doing. For real.
 
He is right, Bofu2U. Our ideas will only get you so far; then you need to think for yourself.

chatmasta: the index sites don't know that the tutorial sites submitted the tutorials themselves to your site
 