The WF PHP Functions War Chest



OK, so I thought the functions found at http://www.wickedfire.com/sell-buy-trade/56203-google-suggest-scraper-free-php-script.html were great, but I could come up with a couple of quick improvements. I added a link to the Google results, as well as the number of results each query returns, so now you can quickly identify the low-competition keywords.

Usage is pretty straightforward. Just go to yourscript.php?yourquery

PHP:
<?php
function text_between($start,$end,$string) {
  if ($start != '') {$temp = explode($start,$string,2);} else {$temp = array('',$string);}
  if (!isset($temp[1])) {return '';} // $start was not found in $string
  $temp = explode($end,$temp[1],2);
  return $temp[0];
}
function gsscrape($keyword) {
  global $kw;
  // spaces become '+' so the keyword can go into the query string
  $keyword=str_replace(" ","+",$keyword);
  $data=file_get_contents('http://clients1.google.com/complete/search?hl=en&q='.$keyword);
  $data=explode('[',$data,3);
  if (!isset($data[2])) {return;} // nothing to parse, e.g. 0 suggestions
  $data=explode('],[',$data[2]);
  foreach($data as $temp) {
    $kw[]= text_between('"','"',$temp);
  }
}
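#simple to use, just use yourscriptname.php?keywords
$kw = array(); // gsscrape() appends its suggestions to this global
if ($_SERVER['QUERY_STRING']!='') {
  gsscrape($_SERVER['QUERY_STRING']);
  // second pass: run each first-level suggestion back through gsscrape
  foreach ($kw as $keyword) {
    gsscrape($keyword);
  }
}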

#all results are in array $kw...
foreach($kw as $keyword) {
    $url = 'http://www.google.com/search?q='.urlencode($keyword);
    
    $userAgent = 'Firefox (WindowsXP) - Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6';
    
    // make the cURL request to $url
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_URL,$url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    $html= curl_exec($ch);
    if (!$html) {
        echo "<br />cURL error number:" .curl_errno($ch);
        echo "<br />cURL error:" . curl_error($ch);
        exit;
    }
    
    $abc = str_replace('of about <b>', '', strstr($html, 'of about <b>'));
    
    echo '<a href="'.$url.'">'.$keyword.'</a> - '.substr($abc, 0, strpos($abc, '</b>')).' Results<br />';
}
?>

For example, Angelina Jolie gives:

angelina jolie pregnant - 3,420,000 Results
angelina jolie tattoos - 582,000 Results
angelina jolie twins - 3,250,000 Results
angelina jolie movies - 27,100,000 Results
angelina jolie pictures - 14,500,000 Results
angelina jolie breastfeeding - 527,000 Results
angelina jolie and brad pitt - 9,020,000 Results
angelina jolie quotes - 2,270,000 Results
angelina jolie w magazine - 3,210,000 Results
angelina jolie biography - 302,000 Results
angelina jolie pregnant again - 282,000 Results
angelina jolie pregnant november 2008 - 658,000 Results
angelina jolie pregnant with 7th child - 105,000 Results
angelina jolie pregnant december 2008 - 702,000 Results
angelina jolie pregnant with twins again - 391,000 Results
... and so on...
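
To put the low-competition keywords at the top, the counts have to be turned into numbers before sorting. A minimal sketch, assuming the counts arrive as comma-formatted strings like the ones above; parse_count and sort_by_competition are hypothetical helper names, not part of the script:

```php
<?php
// Hypothetical helper: turn a comma-formatted count such as "3,420,000" into an int.
function parse_count($text) {
    return (int) str_replace(',', '', trim($text));
}

// Hypothetical helper: take keyword => count-string pairs, return them lowest count first.
function sort_by_competition(array $results) {
    $numeric = array_map('parse_count', $results); // keys are preserved for a single array
    asort($numeric); // sorts by value and keeps each keyword attached to its count
    return $numeric;
}

// Sample rows taken from the output above.
$sample = array(
    'angelina jolie pregnant'  => '3,420,000',
    'angelina jolie biography' => '302,000',
    'angelina jolie tattoos'   => '582,000',
);

foreach (sort_by_competition($sample) as $keyword => $count) {
    echo $keyword . ' - ' . number_format($count) . " Results\n";
}
```

In the real script you would collect the keyword and count into an array inside the loop instead of echoing them straight away, then sort and print at the end.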

Brill - might I suggest you randomize the user agent each time?
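
As a rough sketch of that idea: keep a small list of user-agent strings and pick one at random for each request. The random_user_agent helper and the second and third strings in the list are illustrative assumptions, not part of the original script:

```php
<?php
// Hypothetical helper: return one user-agent string chosen at random.
function random_user_agent() {
    $agents = array(
        // the Mozilla part of the string used in the script above
        'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6',
        // illustrative examples only; swap in whichever browsers you want to mimic
        'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)',
        'Opera/9.63 (Windows NT 5.1; U; en) Presto/2.1.1',
    );
    return $agents[array_rand($agents)];
}

// Inside the loop, this would replace the fixed $userAgent assignment:
$userAgent = random_user_agent();
```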
 

I like where this is going.

If someone is working on this more, can we format the output so that the numeric results drop into the second column in Excel when copy-pasted? I tried a few things and everything keeps going into the first column.

Maybe I could drop it into a csv and use the hyphen as the separator.
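
A hyphen is a risky separator because a keyword can contain one itself. As a sketch of the CSV idea, PHP's built-in fputcsv can write one keyword/count pair per row and quotes fields where needed. The results_to_csv helper and the sample rows are assumptions for illustration:

```php
<?php
// Hypothetical helper: render keyword => count pairs as CSV text.
function results_to_csv(array $rows) {
    $out = fopen('php://temp', 'r+');
    fputcsv($out, array('Keyword', 'Results'), ',', '"', '\\');
    foreach ($rows as $keyword => $count) {
        // strip the thousands separators so Excel sees a plain number
        fputcsv($out, array($keyword, str_replace(',', '', $count)), ',', '"', '\\');
    }
    rewind($out);
    $csv = stream_get_contents($out);
    fclose($out);
    return $csv;
}

// Sample rows taken from the output above.
echo results_to_csv(array(
    'angelina jolie pregnant' => '3,420,000',
    'angelina jolie tattoos'  => '582,000',
));
```

Saved with a .csv extension, output like this should open in Excel with the keyword in the first column and the count in the second.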
 
OK.. I changed the scraper script a bit to circumvent an error that file_get_contents throws when you get 0 results.

So here are the changes:
- introduced a function getHttp which makes a cURL request to a URL
- replaced file_get_contents with the new function
- replaced the cURL code in the lower part of the script with a call to the new function
- added a table to the output
- added a check for empty keywords before output (no more empty lines)

Here ya go:
Code:
<?php
function text_between($start,$end,$string) {
  if ($start != '') {$temp = explode($start,$string,2);} else {$temp = array('',$string);}
  if (!isset($temp[1])) {return '';} // $start was not found in $string
  $temp = explode($end,$temp[1],2);
  return $temp[0];
}

function getHttp($url)
    { 
        $userAgent = 'Firefox (WindowsXP) - Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6';
        
        // make the cURL request to $url
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
        curl_setopt($ch, CURLOPT_URL,$url);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_AUTOREFERER, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);

        $html= curl_exec($ch);
        if (!$html) 
        {
            echo "<br />cURL error number:" .curl_errno($ch);
            echo "<br />cURL error:" . curl_error($ch);
            exit;
        }
        return $html;
    }

function gsscrape($keyword) {
    global $kw;
    // spaces become '+' so the keyword can go into the query string
    $keyword=str_replace(" ","+",$keyword);
    $url='http://clients1.google.com/complete/search?hl=en&q='.$keyword;
    $data = getHttp($url);
    $data=explode('[',$data,3);
    if (!isset($data[2])) {return;} // nothing to parse, e.g. 0 suggestions
    $data=explode('],[',$data[2]);
    foreach($data as $temp) {
        $kw[]= text_between('"','"',$temp);
    }
}

#simple to use, just use yourscriptname.php?keywords
$kw = array(); // gsscrape() appends its suggestions to this global
if ($_SERVER['QUERY_STRING']!='') {
  gsscrape($_SERVER['QUERY_STRING']);
  foreach ($kw as $keyword) {
    gsscrape($keyword);
  }
}
echo '<table>';
#all results are in array $kw...
foreach($kw as $keyword) {

    if ($keyword !='')
    { 
        $url = 'http://www.google.com/search?q='.urlencode($keyword);
        $html=getHttp($url);
        $abc = str_replace('of about <b>', '', strstr($html, 'of about <b>'));
        echo '<tr><td><a href="'.$url.'">'.$keyword.'</a></td><td>'.substr($abc, 0, strpos($abc, '</b>')).' Results<br /></td></tr>';
    }
}
echo'</table>';
?>

::emp::
 

Drop it into Excel and Click Data | Text to Columns and use the hyphen as the delimiter.
 
damn... excel

Here ya go, with ; as separator
Code:
#simple to use, just use yourscriptname.php?keywords
$kw = array(); // gsscrape() appends its suggestions to this global
if ($_SERVER['QUERY_STRING']!='') {
  gsscrape($_SERVER['QUERY_STRING']);
  foreach ($kw as $keyword) {
    gsscrape($keyword);
  }
}
echo 'Url;Results<br>';
#all results are in array $kw...
foreach($kw as $keyword) {
    if ($keyword !='')
    { 
        $url = 'http://www.google.com/search?q='.urlencode($keyword);
        $html=getHttp($url);
        $abc = str_replace('of about <b>', '', strstr($html, 'of about <b>'));
        echo '<a href="'.$url.'">'.$keyword.'</a>;'.substr($abc, 0, strpos($abc, '</b>')).'<br />';
    }
}
?>

Just replace everything from #simple ... to the end of the script.
Then copy the output, paste it into a .csv file, and import.

::emp::
 
Would it not be better to just have a single post linking to the relevant threads? This one is a bit of a mess now, and the original author loses some of the credit (me :D)
 

Feel free to leave a link if it's something that you've got on another thread or your blog. However, we're doing this for the good of WF ... not the e-cred.

I've got lots of stuff I'll be posting in the next few days ... empty house!!
 
WF is made up of its members, what's wrong with wanting credit :|

If you didn't notice, your post shows up next to your screen name and right above your sig. Feel free to self promote if that's what you're into. Again, drop a link if it's important to you ... but leave the code here

__________________________

How about a quick hack of emp's gsscrape script above. The original takes each result returned from the first gsscrape query and runs it back through gsscrape ... this produces a good subset of broad results.

The hack below (only a few lines different) takes the original query (angelina jolie) and appends ' a', ' b' .... this gets results for every letter of the alphabet
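
The letter expansion on its own can be sketched like this. expand_with_letters is a hypothetical helper name; it only builds the query strings and does not call Google:

```php
<?php
// Hypothetical helper: append each letter a-z to the seed query.
function expand_with_letters($seed) {
    $queries = array();
    foreach (range('a', 'z') as $letter) {
        $queries[] = $seed . ' ' . $letter;
    }
    return $queries;
}

$queries = expand_with_letters('angelina jolie');
echo count($queries) . "\n";   // 26
echo $queries[0] . "\n";       // angelina jolie a
```

Each of the 26 strings would then be passed to gsscrape in turn.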

Here's the change (line 7)

PHP:
#simple to use, just use yourscriptname.php?keywords
if ($_SERVER['QUERY_STRING']!='') {
  $alpha = range('a','z'); // the letters a through z
  foreach ($alpha as $letter) {
    gsscrape($_SERVER['QUERY_STRING'].'+'.$letter);
  }
}

Here are the results (250 total)

HTML:
angelina jolie and brad pitt    1,200,000 Results
angelina jolie age    2,430,000 Results
angelina jolie and jennifer aniston    7,600,000 Results
angelina jolie anne hathaway    2,200,000 Results
angelina jolie affair    2,320,000 Results
angelina jolie and twins    3,070,000 Results
angelina jolie adoption    1,620,000 Results
angelina jolie and brad pitt split    152,000 Results
angelina jolie awards    13,900,000 Results
angelina jolie and brad pitt married    2,490,000 Results
angelina jolie breastfeeding    516,000 Results
angelina jolie biography    303,000 Results
angelina jolie brad pitt    8,530,000 Results
angelina jolie breastfeeding twins    29,200 Results
angelina jolie brother    812,000 Results
angelina jolie birthday    3,980,000 Results
angelina jolie babies    12,300,000 Results
angelina jolie bodyguard    273,000 Results
angelina jolie backwards dress    86,300 Results
angelina jolie baby    8,730,000 Results
angelina jolie children    7,130,000 Results
angelina jolie changeling    1,300,000 Results
angelina jolie charity    2,260,000 Results
angelina jolie catwoman    269,000 Results
angelina jolie critic's choice awards    128,000 Results
angelina jolie critic's choice    133,000 Results
angelina jolie cheats on brad    384,000 Results
angelina jolie cheating    712,000 Results
angelina jolie crying    824,000 Results
angelina jolie childhood    892,000 Results
angelina jolie diet    1,970,000 Results
angelina jolie dress backwards    85,900 Results
angelina jolie dad    245,000 Results
angelina jolie dress up    1,240,000 Results
angelina jolie drugs    4,750,000 Results
angelina jolie diaper bag    101,000 Results
angelina jolie date of birth    226,000 Results
angelina jolie diet plan    201,000 Results
angelina jolie dresses    6,150,000 Results
angelina jolie diet and exercise    71,700 Results
angelina jolie ethnicity    1,150,000 Results
angelina jolie eye color    266,000 Results
angelina jolie eye makeup    216,000 Results
angelina jolie eyes    8,920,000 Results
angelina jolie eating disorder    202,000 Results
angelina jolie eyebrows    233,000 Results
angelina jolie email    7,520,000 Results
angelina jolie expecting again    61,700 Results
angelina jolie education    2,490,000 Results
angelina jolie elizabeth mitchell    219,000 Results
angelina jolie filmography    138,000 Results
angelina jolie father    874,000 Results
angelina jolie films    13,000,000 Results
angelina jolie fansite    410,000 Results
angelina jolie family    11,800,000 Results
angelina jolie fashion    11,100,000 Results
angelina jolie facts    6,950,000 Results
angelina jolie foundation    1,800,000 Results
angelina jolie fan club    604,000 Results
angelina jolie face shape    69,500 Results
angelina jolie golden globes    1,870,000 Results
angelina jolie gallery    1,710,000 Results
angelina jolie golden globes 2009    1,380,000 Results
angelina jolie gossip    6,100,000 Results
angelina jolie girl interrupted    203,000 Results
... 10,000 character limit exceeded
angelina jolie wiki    917,000 Results
angelina jolie wanted    6,210,000 Results
angelina jolie wallpaper    857,000 Results
angelina jolie weight    3,080,000 Results
angelina jolie w magazine photos    719,000 Results
angelina jolie without makeup    210,000 Results
angelina jolie wanted tattoos    763,000 Results
angelina jolie workout    1,090,000 Results
angelina jolie weight loss    940,000 Results
angelina jolie xiii tattoo    25,700 Results
angelina jolie xmas    4,370,000 Results
angelina jolie xxl    548,000 Results
angelina jolie xvideo    134,000 Results
angelina jolie young    8,440,000 Results
angelina jolie yellow dress    142,000 Results
angelina jolie yahoo    4,140,000 Results
angelina jolie younger years    932,000 Results
angelina jolie young pictures    1,340,000 Results
angelina jolie yoga    1,200,000 Results
angelina jolie yearbook    71,700 Results
angelina jolie yahoo answers    359,000 Results
angelina jolie young photos    1,270,000 Results
angelina jolie younger pics    1,450,000 Results
angelina jolie zodiac sign    840,000 Results
angelina jolie zahara    531,000 Results
angelina jolie zahara hair    60,400 Results
angelina jolie zahara hiv    24,800 Results
angelina jolie zahara aids    37,600 Results
angelina jolie zimbio    1,080,000 Results
angelina jolie znowu w ciąży    45,000 Results
angelina jolie zombie    1,040,000 Results
angelina jolie zone    2,230,000 Results
angelina jolie zdjecia    486,000 Results
full list of angelina jolie keywords
 
The problem being, you've credited the script in your post as emp's, and it isn't. I wrote it initially.

I don't think I'll bother in future if what I post gets credited to others in some other thread.

I don't think you speak for all of WF either, but I may be wrong.
 
^ Lol... that was exactly his point. The original script was not by me, it was his. I just changed it.

Still, I am for keeping it all in this thread. We kinda focused on this one, cool script, but I am sure there are more to come.
::emp::
 
@backbanana Still, whatcha have to gain by keeping everything? As far as I can see, ya got loads of rep, respect from members AND an overhauled script with some new ideas thrown in for free (the abc thing, error resistance, etc.)

As for me, I always won by sharing.

::emp::
 
Who said anything about keeping anything? I came here, I shared, it then got posted elsewhere to be developed with credit given to others, and my own credit lost. I'm really not asking for much.

I really don't see how a disorganised thread full of various versions of various PHP scripts is a better idea than a single post linking to various threads containing the development of a single script, which is far easier for those interested in a particular script to keep up to date with.

I won't post here any more. I've given my opinion on this thread and its organisation, and I'll leave it there.
 
The problem being, you've credited the script in your post as emp's, and it isn't. I wrote it initially.

I tried to edit my post but was >10 by the time I finished

edit: originally BackBanana's script

I don't think I'll bother in future if what I post gets credited to others in some other thread.

While I see what you are saying, that's the benefit of posting the actual code on this thread. People can't (and don't want to) follow code across the net. I pulled something that was above and edited it with code I've been using to scrape suggest for 2 years.

I was unaware you wrote the original and did not intentionally give credit to emp. I also got you confused with cliqz above. However, it's childish to threaten not sharing something in the future because you aren't glorified for the act. Give that one some thought and then go contribute to some open source software .. it really is a good thing for everyone.

I don't think you speak for all of WF either, but I may be wrong.

That's a job I don't want
 
While I see what you are saying, that's the benefit of posting the actual code on this thread. People can't (and don't want to) follow code across the net. I pulled something that was above and edited it with code I've been using to scrape suggest for 2 years.

follow code across the net? what are you talking about?

So now I'm childish? I'm willing to share, for free, but I do not subscribe to your conditions, and I certainly do not have to. There are plenty of people selling rubbish on here, but they aren't treated the way you are treating me.
 
dude... chill
Cool and interesting your script may be, there are at least 5 other scripts in this thread.

::emp::
 
Since I'm not mentally retarded I am able to follow links provided and realize who provided the original code.

But now that you've freaked out like this I am forced to credit fairy dust with the creation of this script.

Many thanks fairy dust, your check is in the mail.