I have updated my yellow pages ripper (see the original post).
I have emailed it to everyone who got the original script. If you never got the script and you meet the requirements in the original post, PM me with your EMAIL ADDRESS and I will get you a copy.
There are two changes.
The first is a fix for a problem I discovered that could cause a website URL to show up multiple times, on listings other than the correct one. As far as I can tell this has been corrected.
The second change is an addition to the script. When you perform a search you can now have the script pause between retrieving each page from yp.com. I added this in case anyone was concerned about yp.com flagging you for abusive behavior and banning your IP.
On the search form there are two entries for the option "Delay": min and max. The default value for each is 0, meaning no delay. To use this feature you must enter a value for max and, optionally, a value for min (min can always be 0). Don't make min greater than max; it just wasn't designed to work that way.
Before each page fetch the script picks a random time between min and max and pauses for that long. For example, enter min=2 and max=10 and there will be a delay of between 2 and 10 seconds between page fetches. Once again, leaving them both at 0 results in no delay at all.
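For anyone curious how a delay like this works, here is a rough sketch in Python. It is not the code from my script. The names pick_delay and fetch_pages are made up for illustration, the delay here is a fractional number of seconds (an assumption), and raising an error when min is greater than max is just one way to handle that case.

```python
import random
import time


def pick_delay(min_s, max_s):
    """Return a random pause in seconds between min_s and max_s.

    Assumption: min greater than max is rejected outright.
    """
    if min_s > max_s:
        raise ValueError("min must not be greater than max")
    if max_s <= 0:
        # Both values left at 0 means no delay at all.
        return 0.0
    return random.uniform(min_s, max_s)


def fetch_pages(urls, fetch, min_s=0, max_s=0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive fetches.

    fetch is any callable supplied by the caller (hypothetical here);
    sleep can be swapped out so the loop can be tried without waiting.
    """
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            # Pause only between pages, not before the first one.
            delay = pick_delay(min_s, max_s)
            if delay > 0:
                sleep(delay)
        results.append(fetch(url))
    return results
```

With min=2 and max=10, three pages would mean two pauses, each somewhere between 2 and 10 seconds.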
I also got a request to rip email addresses. After looking into this I have decided not to add the feature. I ran a search that returned 2,500 records and only 8 of them had email addresses. It looks like only premium listings have these.