Need some alpha testers for my hosted browser automation/scraping system

affiliatearmy · Jan 4, 2013

I'll give it a go. Do scraping all the time

Kiopa_Matt · Jan 4, 2013

Actually, I do need to do some scraping. Don't need it this minute, but I can shove it in a database for future use, so sign me up as a tester as well if wanted. I still don't quite understand the concept, so will hold my criticism until I get to play. Not sure how this is going to be quicker, better or more efficient than just writing a quick Perl script myself.

BTW... how much is this bad boy going to cost down the road?

NathanRidley · Jan 4, 2013

Kiopa_Matt said:
Actually, I do need to do some scraping. Don't need it this minute, but I can shove it in a database for future use, so sign me up as a tester as well if wanted. I still don't quite understand the concept, so will hold my criticism until I get to play. Not sure how this is going to be quicker, better or more efficient than just writing a quick Perl script myself.

BTW... how much is this bad boy going to cost down the road?

Hey Matt, ok so... Systemizer is ultimately going to be a sort of universal automation Lego kit for the web, in a sense, but with extra love for internet marketing folks. The first component of the product is what we're testing here, and it's basically browser automation in a real Chromium-based browser, and operated via a REST API so there's no setup on your end other than scripting up the calls you want and getting back results. You can operate your browser session in real time, responding to command results as they are returned, and you can run multiple sessions in parallel if you want, with each session completely private and isolated from other sessions, so no cookie interference between sessions or anything like that. This first release is an alpha version, so there's plenty of room for responding to feedback and suggestions.

Regarding price, hell I have no idea. Initially I just want to get a sense of how heavily, and in what ways, people will use it. People who helped me test early on won't pay a whole lot, in any case.

mattseh · Jan 5, 2013

I just want to emphasise that, as a person who scrapes, this is interesting because it is a real browser.

Javascript.
CSS.
Images.

Modern scraping is growing beyond a simple GET in a lot of cases.

digga121 · Jan 5, 2013

I can try, I have high itrader, and have been around! I also currently automate tasks with ubot, and php but would love a more stable way of doing things. Feel free to message me whenever!

NathanRidley · Jan 7, 2013

I've now finished all the documentation I intended to write for the alpha version. Tonight after work I intend to try and finish up the context extraction engine, then I will deploy the docs and updated version and you can all have at it. Sorry for the delay! (When I started this thread, I might have underestimated how much I had left to do to make it testable by people other than myself...)

Kiopa_Matt · Jan 7, 2013

mattseh said:
I just want to emphasise that, as a person who scrapes, this is interesting because it is a real browser.

Javascript.
CSS.
Images.

Modern scraping is growing beyond a simple GET in a lot of cases.

Ohhhh... now that could definitely come in handy. So this scraper will actually execute the Javascript? I don't care about CSS or images, because those can be easily scraped too.

However, I have gotten stuck a few times when writing bots due to Javascript dynamically changing form field values, and things of that sort. This will resolve that? If so, kick ass!

NathanRidley · Jan 7, 2013

Kiopa_Matt said:
Ohhhh... now that could definitely come in handy. So this scraper will actually execute the Javascript? I don't care about CSS or images, because those can be easily scraped too.

However, I have gotten stuck a few times when writing bots due to Javascript dynamically changing form field values, and things of that sort. This will resolve that? If so, kick ass!

Yep, this is a fully cloud-scalable Chromium-based browser. You create your own private session then open windows using the REST API and automate away, with everything you'd expect a real Chrome browser to do to the pages within the windows you open.

Meatytreats · Jan 7, 2013

Sweet! Looking forward to giving it a go!

dchuk · Jan 7, 2013

Kiopa_Matt said:
Ohhhh... now that could definitely come in handy. So this scraper will actually execute the Javascript? I don't care about CSS or images, because those can be easily scraped too.

However, I have gotten stuck a few times when writing bots due to Javascript dynamically changing form field values, and things of that sort. This will resolve that? If so, kick ass!

doyouevenscrapebro?

mattseh · Jan 7, 2013

Please consider adding xpath support (and common usages like internal links, title etc) and regex (it's useful for some stuff chuckers)

Kiopa_Matt · Jan 7, 2013

dchuk said:
doyouevenscrapebro?

Not anymore, but used to loads back when I was doing contract work.

dchuk · Jan 7, 2013

mattseh said:
Please consider adding xpath support (and common usages like internal links, title etc) and regex (it's useful for some stuff chuckers)

lies, regex is the devil

NathanRidley · Jan 8, 2013

mattseh said:
Please consider adding xpath support (and common usages like internal links, title etc) and regex (it's useful for some stuff chuckers)

Where we're going, you don't need xpath support...

Regexes are supported though.
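
For anyone wondering what the regex route looks like for the two cases mentioned (page title and internal links), here is a minimal sketch in plain Python. It is not the Systemizer API: the function names `extract_title` and `internal_links` are made up for illustration, and it assumes you already have the rendered HTML as a string. Regexes are brittle on messy markup, so treat this as a quick-and-dirty approach rather than a parser.

```python
import re
from urllib.parse import urljoin, urlparse

# Hypothetical helpers for illustration only; not part of any Systemizer API.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)
HREF_RE = re.compile(r"""<a\s[^>]*?href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def extract_title(html):
    """Return the text of the first <title> element, or None if absent."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def internal_links(html, base_url):
    """Return absolute URLs of links that stay on the same host as base_url."""
    host = urlparse(base_url).netloc
    links = []
    for href in HREF_RE.findall(html):
        absolute = urljoin(base_url, href)  # resolve relative hrefs
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


if __name__ == "__main__":
    page = (
        "<html><head><title> Example page </title></head><body>"
        '<a href="/about">About</a>'
        '<a class="ext" href="http://other.example.org/x">Elsewhere</a>'
        "</body></html>"
    )
    print(extract_title(page))
    print(internal_links(page, "http://example.com/"))
```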

NathanRidley · Jan 8, 2013

We have liftoff!

I have just deployed the updated API, background workers and browser API documentation.

Go here: Api Documentation « Systemizer API

1. Read the front page!
2. Look at the Account API documentation for instructions on how to create an account
3. Look at the Session API documentation for instructions on creating a session
4. Read the Browser API section for instructions on how to automate your browser

Ask me any questions you like, preferably in this thread. Expect it to crash as soon as someone does something I didn't expect - remember, this is an ALPHA test. That's not even beta!

Kiopa_Matt · Jan 8, 2013

Get a 404 error upon clicking the e-mail activation link:

http://systemizer.net/confirm-email/sdgdsgsd

NathanRidley · Jan 8, 2013

Kiopa_Matt said:
Get a 404 error upon clicking the e-mail activation link:

http://systemizer.net/confirm-email/sdgdsgsd

I've emailed you. Your account should work now, though I inadvertently scrambled your password: MongoHQ's interface lets you edit records but then screws up binary data when you hit save. I'm about to fix the bug so I don't have to do anything manually from MongoHQ, but in the meantime, your account should be working. On the bright side, the password is reserved for future use anyway, so not having one at this stage is no problem. When it's required, the system will email you and ask you to set one, so it's nothing to worry about right now.

NathanRidley · Jan 8, 2013

Alright, signup issues are solved! Well, at least they appear to be. I've stepped through the whole process. So anyone else who wants to sign up, please feel free. Alpha tester accounts have a refill rate of 50 browser sessions per day, with a starting quota of 250 sessions just to get you started. Once you go below 50 sessions remaining, your quota won't refill past 50. These values can be tweaked per account, so let me know if you need yours adjusted and we can discuss that.
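
To make the quota arithmetic concrete, here is a rough Python sketch of one reading of the rule above: you start with 250 sessions, nothing is added while you still have 50 or more, and once you drop below 50 the daily refill brings you back up to at most 50. The function names and the once-per-day refill timing are assumptions for illustration; the real server-side logic may differ.

```python
# Values taken from the post; names and exact timing are assumptions.
STARTING_QUOTA = 250  # sessions granted at signup
DAILY_REFILL = 50     # sessions added per day
REFILL_CAP = 50       # refills never raise the balance above this


def refill(remaining):
    """Apply one daily refill to the remaining session balance."""
    if remaining >= REFILL_CAP:
        # Still living off the starting quota: the refill adds nothing.
        return remaining
    return min(remaining + DAILY_REFILL, REFILL_CAP)


def simulate(daily_usage, start=STARTING_QUOTA):
    """Return the balance after each day's usage followed by that day's refill."""
    remaining = start
    history = []
    for used in daily_usage:
        remaining = max(remaining - used, 0)  # cannot go below zero
        remaining = refill(remaining)
        history.append(remaining)
    return history


if __name__ == "__main__":
    # 120 sessions on day one, 100 on day two, then lighter use.
    print(simulate([120, 100, 10, 50]))
```

Under this reading, heavy early use burns through the 250 quickly, after which an account settles at 50 sessions per day.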

NathanRidley · Jan 10, 2013

Anyone have any interest in node.js support for this? My other project uses node.js so I've been building a client library for it, for my own needs.

Mahzkrieg · Jan 10, 2013

Almost all of my time is spent building web apps and I get very little practice doing anything else.

So, for fun, I hacked together the start of a Ruby CLI client so that dchuk can finally register.

Code:

$ systemizer account --create

Steps you through a little parameter wizard.

Unorganized source so far: https://gist.github.com/4507671

Cool stuff. Thought it'd be vaporware for sure.

If anyone wants to try it, you have to `gem install ___` the gems at the top of the file, then run it with `ruby filename.rb account --create`.
