Actually, I do need to do some scraping. Don't need it this minute, but I can shove it in a database for future use, so sign me up as a tester as well if wanted. I still don't quite understand the concept, so will hold my criticism until I get to play. Not sure how this is going to be quicker, better or more efficient than just writing a quick Perl script myself.
BTW... how much is this bad boy going to cost down the road?
I just want to emphasise that, as a person who scrapes, I find this interesting because it is a real browser.
Javascript.
CSS.
Images.
Modern scraping is growing beyond a simple GET in a lot of cases.
Ohhhh... now that could definitely come in handy. So this scraper will actually execute the Javascript? I don't care about CSS or images, because those can be easily scraped too.
However, I have gotten stuck a few times when writing bots due to Javascript dynamically changing form field values, and things of that sort. This will resolve that? If so, kick ass!
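To make the problem concrete, here is a minimal sketch in Python of what a plain GET-style scraper sees when a script fills in a form field. The page, the field names (`token`, `user`) and the helper `static_form_fields` are all made up for illustration; nothing here is this tool's API.

```python
from html.parser import HTMLParser

# Hypothetical page: the hidden "token" field is empty in the markup
# and only receives its value when the inline script runs in a browser.
PAGE = """
<html><body>
<form action="/login" method="post">
  <input type="hidden" name="token" value="">
  <input type="text" name="user" value="guest">
</form>
<script>
  document.forms[0].token.value = "abc123";
</script>
</body></html>
"""


class FormFieldParser(HTMLParser):
    """Collects name/value pairs of <input> tags from the raw markup only."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            attributes = dict(attrs)
            if "name" in attributes:
                # A missing or empty value attribute is stored as "".
                self.fields[attributes["name"]] = attributes.get("value") or ""


def static_form_fields(html):
    """Return the form fields as a script-less scraper would see them."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.fields


fields = static_form_fields(PAGE)
print(fields)  # "token" is still empty here because the script never ran
```

A scraper that drives a real browser would read the field after the script has run, which is the case described above.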
doyouevenscrapebro?
Please consider adding XPath support (with common uses like internal links, title, etc.) and regex (it's useful for some stuff chuckers)
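For anyone wondering what that request would look like in practice, here is a rough sketch in Python using only the standard library. It assumes a small, well-formed page (ElementTree only handles a limited XPath subset and needs valid XML), and the URL, page content and email pattern are invented for the example. It is not how this tool does or would do it.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urljoin, urlparse

# Hypothetical base URL and page, for illustration only.
BASE_URL = "http://example.com/index.html"
PAGE = """<html>
  <head><title>Example page</title></head>
  <body>
    <a href="/about">About</a>
    <a href="contact.html">Contact</a>
    <a href="http://other.example.org/">Elsewhere</a>
    <p>Mail us at info@example.com or sales@example.com.</p>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# XPath-style lookup for the page title.
title = root.findtext(".//title")

# XPath-style lookup for links, keeping only those on the same host.
base_host = urlparse(BASE_URL).netloc
internal_links = []
for anchor in root.findall(".//a[@href]"):
    url = urljoin(BASE_URL, anchor.get("href"))
    if urlparse(url).netloc == base_host:
        internal_links.append(url)

# Regex for data that has no markup of its own, such as email addresses.
text = "".join(root.itertext())
emails = re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)

print(title)
print(internal_links)
print(emails)
```

XPath covers anything that has structure in the markup (title, links), while regex picks up patterns buried in plain text.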
I get a 404 error upon clicking the e-mail activation link:
http://systemizer.net/confirm-email/sdgdsgsd
$ systemizer account --create