Scraping Data - Sources?

eliquid

Serpwoo.com
May 10, 2007
Hey, trying to use the knowledge of WF to help me out on a project.

I am looking for places to scrape data on local companies.

For example, say I needed all the laser hair removal businesses in Lexington, Kentucky... OR just Kentucky... OR all of the United States.

I know I can hit up places like YellowPages, CitySearch, etc. and scrape their content at the city level, but what other big data sources can I hit?

I am looking for free and public access, but I am not 100% opposed to paid sources if the data is good. Primarily though, I wanna hit the free sources first if possible but please list the paid ones if you know them.

I have thought about scraping off Google too, but would like a couple of consolidated sites if I could.
 


^^ Thanks Nick, but I've got a few of the easy ones like YellowPages and CitySearch (and similar clones). I'm looking for more, if possible, that I might not know of, since I don't play in the data space much.
 
Company Profiles & Company Information on Manta

Why are you opposed to just scraping Google Places though?

I am trying to get the most complete dataset I can. Not saying that scraping Google won't give me that, but I'm trying to find all the sources I can up front, so I can pick the starting points that give me the best scrape right off the bat.
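Since the goal is one complete dataset built from several sources, the overlapping listings will need to be merged at some point. Here is a minimal Python sketch of one way to do that, not a definitive approach. It assumes each scrape has already been turned into a list of dicts; the field names (name, city, phone, website) and the helper names are made up for illustration and are not from any of the sites mentioned.

```python
import re


def normalize_phone(phone):
    """Keep digits only and drop a leading US country code."""
    digits = re.sub(r"\D", "", phone or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def normalize_name(name):
    """Lowercase, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", (name or "").lower())
    return " ".join(cleaned.split())


def record_key(record):
    """Match on phone number when present, else on name plus city."""
    phone = normalize_phone(record.get("phone"))
    if phone:
        return ("phone", phone)
    return (
        "name_city",
        normalize_name(record.get("name")),
        normalize_name(record.get("city")),
    )


def merge_sources(*sources):
    """Merge record lists; the first non-empty value for a field wins."""
    merged = {}
    for records in sources:
        for record in records:
            entry = merged.setdefault(record_key(record), {})
            for field, value in record.items():
                if value and not entry.get(field):
                    entry[field] = value
    return list(merged.values())
```

Calling merge_sources(yellowpages_rows, citysearch_rows) would return one dict per business, with gaps in one source filled from the other. Matching on phone number is a simplifying assumption: businesses that share a phone line would get merged, and fuzzy name matching is left out entirely.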

Might as well hit Yelp's API while you're at it.

noted, thanks

Yahoo Local is pretty good for some categories.

Thanks
 
I don't know how doable this is, but eHow / Ezine articles, for example, are often used as references to local businesses. You'll often find the local company data in the footer of the article. I don't know if it's worth the trouble, just a suggestion.

Otherwise, YP, or maybe buy some software like digital phonebooks and scrape directly from that.
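For the article-footer idea, a rough Python sketch of pulling US-style phone numbers out of a block of text might look like the following. It assumes the footer text has already been fetched and stripped of HTML; the regex only covers common 10-digit formats, and extract_phones is a made-up helper name.

```python
import re

# Matches forms like (859) 555-0100, 859-555-0100, 859.555.0100, 8595550100.
PHONE_RE = re.compile(r"\(?\b(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})\b")


def extract_phones(text):
    """Return unique 10-digit phone numbers in the order they appear."""
    seen = []
    for match in PHONE_RE.finditer(text):
        digits = "".join(match.groups())
        if digits not in seen:
            seen.append(digits)
    return seen
```

Names and addresses are much harder to pull out reliably than phone numbers, so this would only give you a key to look the business up elsewhere.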

Maybe try scraping state incorporation sites?

Noted, and thanks for all of your suggestions.