the Google API is fucking crazy cheap, $0.25 per 1000 API calls.
Anyways as for caps on API usage, no idea.
To get started with the API you need a My Client Center (MCC) account, go here: Google AdWords - My Client Center
The API developer token takes 1-2 weeks to get approved according to Google. I, however, was approved in about 3-4 days.
5|1|54|https://adwords.google.com/o/Targeting/|0BC91872D430CA67A5237BB62909C258|_|invoke|3|15s|19y|15n|15o|g|TiAction (Search.SearchInput.KEYWORD_IDEAS.RelatedToKeyword)|ATbciIPQBuWx_WecIzve5Ihro3s:1276948076390|19s|10g|15v|15x|c|h|i|106|17t|110|en_US|ul|12g|11f|17k|11i|KEYWORD|COMPETITION|GLOBAL_MONTHLY_SEARCHES|AVERAGE_TARGETED_MONTHLY_SEARCHES|TARGETED_MONTHLY_SEARCHES|IDEA_TYPE|AD_SHARE|EXTRACTED_FROM_WEBPAGE|SEARCH_SHARE|KEYWORD_CATEGORY|NGRAM_GROUP|12x|19z|hh|US|13a|hu|en|139|17q|bm|13h|16x|17o|bl|banana man|1|2|3|4|1|5|5|6|2016479548|32818055562133504|6|7|1|8|0|9|0|0|10|2|1000000000|0|1000000000|0|11|12|13|1|14|15|16|-4|0|0|0|17|5|0|0|0|0|0|0|0|0|0|18|0|19|1|0|0|0|0|0|0|0|0|0|7|0|0|0|20|21|235|22|0|23|24|50|0|25|0|0|26|0|27|9|28|29|28|30|28|31|28|32|28|33|28|34|28|35|28|36|28|37|27|3|-22|28|38|28|39|27|4|40|41|1|42|43|0|44|41|1|45|46|47|48|49|2|50|51|52|53|-43|54|0|0|0|
Working on a similar project for a client. Here's the thing, the Google API is fucking crazy cheap, $0.25 per 1000 API calls. Why deal with scraping, using proxies and solving captchas when just dishing out for the API is probably way cheaper unless you have a great source of free proxies?
Actually I like this solution better than mine.
Do you know if they will rate limit you if you start banging the absolute crap out of the API? And by absolute crap I mean 1,000 requests a minute, although not over an extended period of time.
I assume they don't care as long as the bills are paid.
If you were going to scrape it, all you would do is watch the headers and data sent with the AJAX calls to see what data is passed to where, and what is returned. The returned data is probably just standard XHTML generated by a backend script. That's normally how these things work, it isn't fucking rocket science.
*facepalm* I can't believe I missed this the first time I looked at the code, but the sent data is GWT RPC (Google Web Toolkit) and the returned data is in GWT JSON with a JavaScript function call to 'OK'.
I am not 100% sure, but last time I looked at the API, you could only query one keyword at a time. While scraping you can do 100 at a time. Is that still true? If I send a query for 100 keywords, will I pay for each one or per request?
That's another thing, I know that in the past they did rate limit you.
Not sure what you mean here. Please elaborate?
If anyone is interested in purchasing a subscription to a script/tool I have please email Sales@BannerBlindness.com.
The tool will allow you to auto scrape the Google KW suggestion tool. Put in one root keyword and you will daisy chain out a sick list of KWs.
you can still use the old tool if you want
https://adwords.google.com/select/KeywordToolExternal?forceLegacy=true
Looking at the payment structure of the API requests, it turns out not to be a straight 1:1. In other words, they charge you not only for the request made (which costs you 5 API call credits) but also for the data returned (at a rate of 0.1 API credits per keyword).
It roughly works out to be $25 per 900,000 keywords returned. Not a lot, but it can start to add up if you're just browsing keywords or you're researching a lot of them.
The data being sent by the JavaScript is in a format called GWT RPC (remote procedure call), which is a serialization format that Google Web Toolkit uses. It was written in Java (not JavaScript) initially, but people have made ports to other languages.
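For anyone who wants to plug in their own numbers, here is a rough sketch of that arithmetic in Python. It uses only the rates quoted in this thread ($0.25 per 1,000 credits, 5 credits per request, 0.1 credits per keyword returned); the function name and the request counts in the example are made up for illustration, and the real billing rules may differ.

```python
# Rates as quoted in this thread; treat them as assumptions, not official pricing.
PRICE_PER_1000_CREDITS = 0.25   # dollars per 1,000 API credits
CREDITS_PER_REQUEST = 5         # flat cost of making one request
KEYWORDS_PER_CREDIT = 10        # i.e. 0.1 credits per keyword returned


def estimate_cost(requests: int, keywords_returned: int) -> float:
    """Return the estimated dollar cost for a batch of keyword requests."""
    credits = requests * CREDITS_PER_REQUEST + keywords_returned / KEYWORDS_PER_CREDIT
    return credits / 1000 * PRICE_PER_1000_CREDITS


if __name__ == "__main__":
    # Hypothetical workload: 1,000 requests returning 900 keywords each.
    print(f"${estimate_cost(1_000, 900_000):.2f}")
```

The per-request charge only matters when each request returns few keywords, so the total depends a lot on how many keywords come back per call.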
The data being returned is in GWT JSON format.
What this means is that if you want to retrieve the data easily, you have to find a GWT client that is capable of sending GWT RPC calls to a server and then interpreting the returned GWT JSON.
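To give an idea of what "interpreting" involves, here is a minimal Python sketch that only pulls apart the outer layers. It assumes the request is pipe-delimited as version, flags, string-table size, the strings themselves, then the numeric payload, and that a successful response starts with `//OK`. The request posted earlier in the thread appears to fit that layout (it starts `5|1|54|` and the 54th string is `banana man`). Both function names are made up, escaped pipes inside strings are not handled, and the meaning of the numeric payload is not decoded at all.

```python
def split_gwt_rpc_request(payload: str) -> dict:
    """Split a GWT RPC request into header, string table, and payload tokens."""
    tokens = payload.split("|")
    if tokens and tokens[-1] == "":
        tokens.pop()  # the request ends with a trailing pipe
    version, flags, table_size = (int(t) for t in tokens[:3])
    strings = tokens[3:3 + table_size]   # method names, type names, arguments
    data = tokens[3 + table_size:]       # indexes into the table and raw values
    return {"version": version, "flags": flags, "strings": strings, "data": data}


def response_body(raw: str) -> str:
    """Strip the '//OK' marker and return the array text that follows it."""
    if not raw.startswith("//OK"):
        raise ValueError("not a successful GWT RPC response")
    return raw[len("//OK"):]


if __name__ == "__main__":
    # Tiny made-up request, not real AdWords traffic.
    parts = split_gwt_rpc_request("5|1|2|invoke|banana man|1|2|0|")
    print(parts["strings"])
    print(response_body("//OK[1,2,3]"))
```

The hard part is the step after this: mapping the payload numbers back to the string table and to the Java types on the server, which is why a ready-made client matters.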
I looked at a few versions, and there is a PHP server API but not one for the client. So that means, if you use PHP, either you write an interpreter yourself or you use alternative methods.
Again, I would recommend using the login method if you want to bypass the API. But if you're trying to build this into a tool for mass use, then that won't work very well, unless you run a server as a go-between (a proxy, so to speak) which handles the login and data retrieval on behalf of the user.
I know I'm repeating myself, but seriously look at logging into Google as a solution. They will hand you the data nicely formatted in CSV.
It took me less than 10 minutes to write the code to do that, but it would take me about a day to reverse the JavaScript and have a working client.
For me, it's a waste of time to chase the JavaScript, but then I might be missing something crucial.
Is there a reason you just want to scrape the keywords without having an account?
I think it can't show all the data that SEMrush provides anyway, for example.
And there are a lot of tools better than the keyword tool, but that's only my opinion.
I'm just doing a lot of volume and I'm afraid they'll block my account if I login and scrape.
Where do all these sites get the data from? I always thought they just scraped Google as well.