Scraper Sites

revlimiter · Aug 7, 2009

Hi all,
Just wondering if there is a (legal) way for scraper sites to get away with republishing content pulled from RSS feeds? Is citing the author, providing a linkback, and putting a "copyright [rssfeedwebsite]" notice in the footer enough?

And if AdSense gets suspicious of the domain in use due to too many reports, is there another ad agency to choose from?

Just curious!
Thanks!

hirop · Aug 7, 2009

If you're hosting in the US, just be sure to honor DMCA complaints. Since it's RSS, it's not really scraping: the publisher is providing machine-readable data, so you're not trying to extract semantic data from marked-up pages. There's also a lot of tolerance for aggregators.

At minimum, the bits you listed are good. A lot of bloggers will bitch if you nofollow the linkbacks while still letting your own pages get indexed, but aggregators like Technorati get away with it, and people even opt in since those sites provide value.
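For what it's worth, the attribution bits above (author, linkback, copyright line) can be pulled straight out of the feed. This is only a rough sketch, assuming an RSS 2.0 feed that uses the common `dc:creator` element for the author. The sample feed, `attributed_items`, and `render` are made up for illustration, and none of this is legal advice.

```python
import xml.etree.ElementTree as ET
from html import escape

# Dublin Core namespace, commonly used for <dc:creator> in RSS 2.0 feeds.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical sample feed, used only to keep the sketch self-contained.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <item>
      <title>First post</title>
      <link>http://example.com/first-post</link>
      <dc:creator>Jane Doe</dc:creator>
      <description>Short excerpt of the post.</description>
    </item>
  </channel>
</rss>"""


def attributed_items(feed_xml):
    """Yield one dict per feed item with the fields needed for attribution."""
    channel = ET.fromstring(feed_xml).find("channel")
    site_title = channel.findtext("title", default="")
    for item in channel.findall("item"):
        # Fall back from <dc:creator> to <author> to the site title.
        author = (item.findtext(DC + "creator")
                  or item.findtext("author")
                  or site_title)
        yield {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "author": author,
            "excerpt": item.findtext("description", default=""),
            "source": site_title,
        }


def render(entry):
    """Render an excerpt with author credit, a followed linkback, and a copyright line."""
    return (
        f'<h2><a href="{escape(entry["link"])}">{escape(entry["title"])}</a></h2>\n'
        f'<p>{escape(entry["excerpt"])}</p>\n'
        f'<p>By {escape(entry["author"])}. Copyright {escape(entry["source"])}. '
        f'<a href="{escape(entry["link"])}">Read the original</a></p>'
    )


if __name__ == "__main__":
    for entry in attributed_items(SAMPLE_FEED):
        print(render(entry))
```

Note the linkback is a plain followed link (no `rel="nofollow"`), which is the part bloggers tend to care about.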

If you want to be totally in the clear copyright-wise, go to Creative Commons Search and search Google with filetype:xml inurl:rss, or inurl:/feed/, or whatever. Make sure "allow commercial" and "allow remixes" are checked. Naturally, add your niche keywords. Scrape the results for new blogs to add. Some nerd niches are decently represented, but usually the selection is pretty crap.
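If you're running a lot of niches, the query strings are easy to generate. A tiny sketch, assuming you paste the results into the search box by hand with the license boxes ticked; `feed_queries` is just a made-up helper name, and the operators are the ones mentioned above.

```python
def feed_queries(keywords):
    """Combine niche keywords with the feed-finding operators from the post."""
    terms = " ".join(keywords)
    operators = ["filetype:xml inurl:rss", "inurl:/feed/"]
    return [f"{terms} {op}" for op in operators]


if __name__ == "__main__":
    for query in feed_queries(["vintage", "synth"]):
        print(query)
```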

