The WF PHP Functions War Chest

Damn, I didn't realize this thread was here. I'll post this PHP awesomeness in here anyways. It's another way to get GEOIP info fast and easy.

Here's the class

PHP:
/* 
    Provided by http://www.todayisearched.com 
*/  
class GeoIP {  
  
    /* 
        Returns all possible Geo Ip information in an associative array.  The following pieces of information 
        are returned: 
        IP - IP ADDRESS LOOKED UP 
        CODE - COUNTRY CODE 
        COUNTRY - COUNTRY NAME 
        FLAG - PATH TO IMAGE OF THE COUNTRY'S FLAG 
        CITY - CITY NAME 
        REGION - STATE NAME 
        ISP - ISP NAME 
        LAT - LATITUDE COORDINATE 
        LNG - LONGITUDE COORDINATE 
 
        USAGE: 
        $ipinfo = GeoIP::getGeoArray('xxx.xxx.xxx.xxx'); 
        echo $ipinfo['CITY'] . ', ' . $ipinfo['REGION']; 
    */  
    public static function getGeoArray($ip) {  
        $file = "http://www.ipgp.net/api/xml/" . $ip;  

        $fp = @fopen($file, "r");  
        if (!$fp) { 
            return array(); // lookup failed (network error or allow_url_fopen disabled) 
        } 
        $data = ''; 
        while (!feof($fp)) { 
            $data .= fread($fp, 8192); // read the whole response, not just the first chunk 
        } 
        fclose($fp); 

        $xml_parser = xml_parser_create();  
        xml_parse_into_struct($xml_parser, $data, $vals);  
        xml_parser_free($xml_parser); 

        $iplookup = array();  
        foreach ($vals as $v) {  
            if (isset($v['tag']) && isset($v['value'])) { 
                $iplookup[$v['tag']] = $v['value']; 
            } 
        } 

        return $iplookup; 
    } 
 
    //shortcut to get the city name 
    public static function getCity($ip) { 
        $a = self::getGeoArray($ip); 
        return isset($a['CITY']) ? $a['CITY'] : ''; 
    } 

    //shortcut to get the state name 
    public static function getState($ip) { 
        $a = self::getGeoArray($ip); 
        return isset($a['REGION']) ? $a['REGION'] : ''; 
    } 
  
}

Here's the include
PHP:
include_once "geoip.php";  
$ipinfo = GeoIP::getGeoArray($_SERVER['REMOTE_ADDR']);

And here's what you put in your page..

PHP:
echo 'Hello, Your City And State Is ' . $ipinfo['CITY'] . ', ' . $ipinfo['REGION'];
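If the remote lookup fails (network error, allow_url_fopen disabled), $ipinfo comes back empty and that echo throws notices. A minimal guard, as a sketch; the geoGreeting name and fallback text are mine, only the CITY/REGION keys come from the class docs above:

```php
// Hypothetical wrapper: build the greeting only when the lookup returned both fields.
function geoGreeting(array $ipinfo) {
    if (empty($ipinfo['CITY']) || empty($ipinfo['REGION'])) {
        return 'Hello!'; // lookup failed or fields missing
    }
    return 'Hello, Your City And State Is ' . $ipinfo['CITY'] . ', ' . $ipinfo['REGION'];
}
```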

See, it's simple as fuck. I know I just saw something else floating around here but this script is awesome.

Today I Searched | PHP Geo IP Address Lookup

Enjoy it
 


I tried using your script but I couldn't get it to work. I'm not a programmer, but shouldn't it end with ?>

Once I get it to work it will be awesome. Thanks for sharing :)

Depending on your configuration, you may need to end PHP files with ?>. It won't hurt to add it. If you don't give me any indication of the errors you're getting, I can't really help you beyond that.
 

The error message I get is:

Parse error: syntax error, unexpected T_CONST, expecting T_OLD_FUNCTION or T_FUNCTION or T_VAR or '}' in /homepages/41/d179916135/htdocs/www.simonshomepage.co.uk/gs-scrape.php on line 31

Cheers
 
Takes a MySQL resource from a SELECT and turns it into an array.

PHP:
function mysql_to_array($resource) {
    $array = array(); // Initialize so an empty result set still returns an array
    $count = 0; // Count our rows
    
    while ($i = mysql_fetch_array($resource)) {
        foreach ($i as $k => $v) { // For each column in the row
            if (!is_numeric($k)) { // Keep only the named columns
                $array[$count][$k] = $v; // Build an array of rows
            }
        }
        $count++; // Move on to the next row
    }
    return $array; // Return the entire array
}
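For context, mysql_fetch_array() returns each row keyed both ways (numeric index and column name); the is_numeric($k) check above just drops the numeric duplicates. The same filtering on a hand-built row, as a sketch:

```php
// Simulated row in the shape mysql_fetch_array() returns: each value appears twice.
$row = array(0 => 'Alice', 'name' => 'Alice', 1 => 42, 'score' => 42);

$assoc = array();
foreach ($row as $k => $v) {
    if (!is_numeric($k)) {      // keep only the named columns
        $assoc[$k] = $v;
    }
}
// $assoc is array('name' => 'Alice', 'score' => 42)
```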
 


That's a great way to get rid of boilerplate, but why not something a little more flexible/simple?

[high=php]
function r2a($res, $method="mysql_fetch_assoc") {
    if ( !mysql_num_rows($res) ) {
        return array();
    }

    // Rewind the resource so that we get everything
    mysql_data_seek($res, 0);
    $result = array();

    while ( $row = call_user_func($method, $res) ) {
        $result[] = $row;
    }

    // Rewind the resource so that it can be easily used again
    mysql_data_seek($res, 0);

    return $result;
}
[/high]
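Worth noting: the $method argument works because call_user_func() invokes any function by name, which is what lets r2a() swap between mysql_fetch_assoc, mysql_fetch_row, and mysql_fetch_object without touching the loop. The mechanism in isolation, as a sketch:

```php
// call_user_func() calls a function whose name is held in a variable.
$fetch = 'strtoupper';                  // stand-in for a fetch-function name
$result = call_user_func($fetch, 'hello');
// $result is "HELLO"
```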
 
This PHP function adds seconds/minutes/hours/days/months/years to a Unix timestamp and returns the result in Y-d-m H:i:s format.

<?php
function utime_add($unixtime, $hr=0, $min=0, $sec=0, $mon=0, $day=0, $yr=0)
{
    $dt = localtime($unixtime, true);
    $unixnewtime = mktime(
        $dt['tm_hour']+$hr, $dt['tm_min']+$min, $dt['tm_sec']+$sec,
        $dt['tm_mon']+1+$mon, $dt['tm_mday']+$day, $dt['tm_year']+1900+$yr);

    return date("Y-d-m H:i:s", $unixnewtime);
}
?>
Usage :
$timeinunix = time(); // mktime() with no arguments is deprecated; time() gives the current timestamp
$hour = 1;
$minute = 30;
$second = 60;
$month = 1;
$day = 1;
$year = 1;
utime_add($timeinunix, $hour, $minute, $second, $month, $day, $year);

This will add 1 year, 1 month, 1 day, 1 hour, 30 minutes and 60 seconds to the current time and display it in "Y-d-m H:i:s" format.
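This works because mktime() normalizes out-of-range fields, so the 60 seconds above simply rolls over into an extra minute. A quick demonstration of that normalization with fixed values:

```php
date_default_timezone_set('UTC'); // both calls use the same zone, set explicitly to avoid warnings

// 60 seconds overflows into +1 minute: 10:30:60 and 10:31:00 are the same instant.
$a = mktime(10, 30, 60, 6, 15, 2020);
$b = mktime(10, 31, 0, 6, 15, 2020);
// $a == $b
```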
 
I'm not a PHP programmer, but I needed a simple script to fetch Shopping.com results and parse them based on a keyword associated with WP posts.

So I asked around and threw some shit together and came up with this ....

For it to work properly you have to create a custom field in your posts, or modify it to grab the post title...

Code:
<?php
$shoppingKey = get_post_meta($post->ID, 'SHOPPING KEYWORD', true);
$baseURL = curl_init('http://publisher.usb.api.shopping.com/publisher/3.0/rest/GeneralSearch?apiKey=wait4months4yourown&trackingId=xxxxxxxx&keyword=' . $shoppingKey);
curl_setopt_array($baseURL, array(
    CURLOPT_RETURNTRANSFER => true
));

$data = curl_exec($baseURL);
$response = new SimpleXMLElement($data);
foreach ($response->categories->category->items->offer as $product) {
    printf(
        '
        <ul class="compWrap">
            <li class="compImg"><img src="%5$s" alt=""></li>
            <li class="compDescript">%2$s</li>
            <li class="compCost">%4$s</li>
            <li class="compURL"><a href="%3$s" alt="%1$s">Go To Store ...</a></li>
        </ul>
        <br class="clear" />
        ',
        $product->store->name,
        $product->description,
        $product->offerURL,
        $product->basePrice,
        $product->imageList->image->sourceURL
    );
}
?>
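A side note on the printf above: the %1$s, %2$s placeholders are positional, so arguments can be reused or printed out of order, which is why the image URL (argument 5) can appear first in the markup. The feature in isolation, as a sketch:

```php
// sprintf positional placeholders: %N$s picks the Nth argument, in any order.
$out = sprintf('%2$s costs %1$s', '$9.99', 'Widget');
// $out is "Widget costs $9.99"
```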
 
Look, I've been a fan of simple_html_dom for a while now because it's easy to use and all that, but the trouble is that a) it's slow and b) it's a fucking memory hog (even with the memory-leak workaround). I realized this while parsing 30+ pages of Google SERPs with 100 results each. The solution is simple, if only slightly more verbose.

Code:
function parseSerp($data) {
    $result = array();
    $doc    = new DOMDocument();
    $rank   = 1;
    
    libxml_use_internal_errors(true);
    $doc->loadHTML($data);
    
    $resultNode = $doc->getElementById("res");
    foreach ( $resultNode->getElementsByTagName("li") as $entry ) {
        $title = $entry->getElementsByTagName("a")->item(0);
        $href  = $title->getAttribute("href");
        
        if (
            // It's valid (not a google property link)
            Serp_Google_Entry::isValidEntry($href) &&
            // it doesn't already exist (sometimes the same url can show up twice) 
            !isset($result[strtolower($href)])
        ) {
            $entry = DOM_getElementByClassName($entry, "s", 0);
            
            $result[ strtolower($href) ] = new Serp_Google_Entry(
                $title->nodeValue, $href, 
                $entry ? DOM_innerHTML($entry) : "", 
                $rank++);
        }
    }
    
    return $result;
}

// Helpers

// Gets an array of elements that have $className in their class attribute
function DOM_getElementByClassName($referenceNode, $className, $index=false) {
    $className = strtolower($className);
    $response  = array();
    
    foreach ( $referenceNode->getElementsByTagName("*") as $node ) {
        $nodeClass = strtolower($node->getAttribute("class"));
        
        if ( $nodeClass == $className || preg_match("/\b" . $className . "\b/", $nodeClass) ) {
            $response[] = $node;
        }
    }

    return $index === false ? $response : @$response[$index];
}

// Gets the innerHTML of a node
function DOM_innerHTML($node) {
    $doc = new DOMDocument();
    $doc->appendChild($doc->importNode($node, true));
    return trim($doc->saveHTML()); 
}

class Serp_Google_Entry {
    public $title;
    public $url;
    public $description;
    public $rank;
    
    public function __construct($title, $url, $description, $rank) {
        $this->title       = $this->clean($title);
        $this->url         = $url;
        $this->description = $this->clean($this->extractDescriptionText($description));
        $this->rank        = $rank;    
    }
    
    public function __toString() {
        return "{$this->title} :: {$this->url}\n {$this->description}\n\n";
    }
    
    public function extractDescriptionText($desc) {
        $desc = str_replace(array("<em>", "</em>", "<b>...</b>"), array("", "", "..."), $desc);
        return substr($desc, 0, strpos($desc, "<"));
    }
    
    private function clean($str) {
        return 
            preg_replace("`\s+`", " ",
            strip_tags(
            $str
            ));
    }
    
    static function isValidEntry($url) {
        $url = parse_url($url);
        return isset($url['host'])  && 
            !in_array($url['host'], array("www.google.com", "maps.google.com"));
    }
}
How much faster? When using simple_html_dom, it took about 62.57 seconds. With this method, only 1.86 seconds, roughly 33x faster. Not bad, right? That's it, enjoy.
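For anyone who wants to reproduce numbers like these, a bare-bones timing harness (the timeIt name is mine) is all it takes:

```php
// Minimal wall-clock timer: run a callable N times, return elapsed seconds.
function timeIt($fn, $iterations = 1) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$elapsed = timeIt(function () { str_repeat('a', 1000); }, 100);
// $elapsed is a small non-negative float
```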
 
Converting relative to absolute links in PHP.

Problem: You scrape a page and want to follow links on each page, but it's coded with relative links. E.g. you're scraping "example.com/account", and the developer has used hrefs like:
  • "/logout" (so, should be an absolute of example.com/logout)
  • "../index.html" (absolute: example.com/index.html)
  • "./thisdir.html" (absolute: example.com/account/thisdir.html)

Two functions below handle those (and more) combinations, converting relative to absolute links. These have been modified for WF from a class I'm writing, so let me know if I haven't removed any $this->'s or something stupid.

Function: RewriteRelativeLinks(HTML, Requested URI). Feed it HTML (e.g. straight from cURL), and it finds every href in the page and (using the next function, RFC1808RelAbs) rewrites it to absolute, first using any <base> tag it finds in the code, or, if it can't find a base tag, using the requested URI you specify. Easily modified to rewrite image src's too. See comments in code to add class details for debugging purposes.

Code:
function RewriteRelativeLinks($HTML,$RequestedURI){
//WF v1
	$dom = new DOMDocument(); 
	libxml_use_internal_errors( true ); 
	libxml_clear_errors(); 
	//Check for a <base> tag. 
	$dom->loadHTML($HTML); 
	$base=$dom->getElementsByTagName('base');
	if(($base->length>0)&&($baseURI=$base->item(0)->getAttribute('href'))){	}
	else { $baseURI=$RequestedURI; 	}
//	echo 'Base set to '.$baseURI;
	$allLinks=$dom->getElementsByTagName("a");
	if ($allLinks->length>0){
		foreach ($allLinks as $i=>$link){
			$url=trim($link->getAttribute('href'));
			$newurl=RFC1808RelAbs($url,$baseURI);
//			echo "<p>".$url." becomes ".$newurl;
//			$link->setAttribute('class','LinkModified');	//Use this for debugging if you want to work out what links are changed. 
			$link->setAttribute('href',$newurl);
		}
		return $dom->saveHTML();
	}
	return $HTML; //No links found; hand back the original HTML.
}

Function: RFC1808RelAbs(Relative URL, Base URI). Shitty function name, but you don't call it directly. Takes a relative URI and a base, and combines them to give you an absolute URI. Originally based on s4 of RFC 1808. Messy as fuck but worked for me. Handles users, passwords, ports, different schemes, and most strange paths.
e.g. RFC1808RelAbs('/accounts','http://google.com') should return 'http://google.com/accounts'.

Code:
function RFC1808RelAbs($url,$base){
//WF v1
	#1: Determine base: This function assumes the base has been determined correctly. 
	#2: Parse url into parts.
	$up=parse_url($url);
	$bp=parse_url($base);
	#2a: If url empty, inherit entire base and finish. 
	if (strlen($url)==0) return $base; //Note: test the string, not the parsed array.
	#2b: If url starts with a scheme, interpet as absolute. Note this will pass on any user/pass and port settings, too.
	if ($up['scheme']) return $url;
	#2c: otherwise, new url inherits the base scheme.
	$new['scheme']=$bp['scheme'];
	#3: if url host is empty, inherit from base. Else skip to #7.
	if (!$up['host']){ $new['host']=$bp['host']; }
	#4: if url path starts with slash, skip to 7 (i.e. if does not start with a slash, continue)
	if (substr($up['path'],0,1)<>"/"){ 
		#5: if url path is empty, inherit base, and...
		if (!$up['path']){
			$new['path']=$bp['path']; 
			#5-a: Params inherited from base if not set
			#5-b: Query inherited from base if not set
			if ((!$up['query'])&&($bp['query'])){$new['query'].="?".$bp['query']; }
			if ((!$up['fragment'])&&($bp['fragment'])){$new['fragment']="#".$bp['fragment']; }
		} 
		#6: last segment after last slash removed
		$explodedBasePath=explode("/",$bp['path']);
		end($explodedBasePath);
		unset($explodedBasePath[key($explodedBasePath)]);
		#6..: url path appended
		$new['path']=implode("/",$explodedBasePath).'/'.$up['path'];
		$explodedPath=explode("/",$new['path']);
		foreach ($explodedPath as $i=>$segment){
			if ($segment==".") unset($explodedPath[$i]); #6-a: All ./'s removed.
			elseif ($segment=="") unset($explodedPath[$i]); 	//If segment is empty, unset it. 
		}
		$new['path']=implode("/",$explodedPath);	
		#6-b: If ends with ".", remove the period. 
		//TODO: This shouldn't exist because it's been removed in the foreach loop above. 
		if (substr($new['path'],(strlen($new['path'])-1),1)==".") { $new['path']=substr($new['path'],0,(strlen($new['path'])-1)); }
		
		#6-c: all "<segment>/../"'s removed iteratively, from left to right.
		reset($explodedPath);
		while ($current=current($explodedPath)){
			if ($current=="..") { 
				$key=key($explodedPath); //Key for the dots. 
				prev($explodedPath); //move back one
				$key2=key($explodedPath); //Key for the previous segment
				unset($explodedPath[$key]);  //Unset the dots
				unset($explodedPath[$key2]); //Unset the previous segment.
				reset($explodedPath); //reset and start again from the far left to avoid ../../'s being removed. i.e the second dots remove the first ones. 
			}
			else next($explodedPath);
		}
		$new['path']=implode("/",$explodedPath);
		#6-d: if ends with "<segment>/..", this is removed. - already done with the above stuff;
		//Add the trailing slash if it was on the original url. This was removed when imploding/exploding, and required to (in theory) differentiate folders from files.
		if (substr($url,(strlen($url)-1),1)=="/") {
			$new['path'].="/"; 
		}
	}
	else { #4: if starts with a slash, treat as absolute. 
		//..but remove the first slash to avoid double slashes.
		$new['path']=substr($up['path'],1,strlen($up['path']));

	}
	//Import query and fragment:
	if ($up['query']){$new['query']="?".$up['query']; }
	if ($up['fragment']){$new['fragment']="#".$up['fragment']; }

	if ($bp['user']) $new['userpass']=$bp['user'];
	if ($bp['pass']) $new['userpass'].=':'.$bp['pass'].'@';
	if ($bp['port']) $new['port']=":".$bp['port'];

	#7: everything's combined. 
	$newurl=$new['scheme']."://".$new['userpass'].$new['host'].$new['port'].'/'.$new['path'].$new['query'].$new['fragment'];	
	return $newurl;
}

So, example usage in its simplest form (untested but you get the idea):
Code:
$url='http://example.com'; //Include the scheme so parse_url() reads the host correctly.
$ch=curl_init($url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
$html=RewriteRelativeLinks(curl_exec($ch),$url);
...and the end result is absolute links you can play with.
 
Get website pagerank

Code:
<?php

function getPageRank($url)
{
$pagerank = -1;
$fp = fsockopen("www.google.com", 80, $errno, $errstr, 30);
    if($fp)
    {
        $out = "GET /search?client=navclient-auto&ch=".CheckHash(hashURL($url))."&features=Rank&q=info:".$url."&num=100&filter=0 HTTP/1.1\r\n";
        $out .= "Host: www.google.com\r\n";
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);
        while (!feof($fp))
        {
            $data = fgets($fp, 128);
            $pos = strpos($data, "Rank_");
            if($pos === false)
            {
            }
            else
            $pagerank = substr($data, $pos + 9);
        }
        fclose($fp);
    }
return $pagerank;
}

function strToNum($Str, $Check, $Magic) 
{
    $Int32Unit = 4294967296;
    $length = strlen($Str);
    for ($i = 0; $i < $length; $i++)
    {
        $Check *= $Magic;
        if ($Check >= $Int32Unit)
        {
            $Check = ($Check - $Int32Unit * (int) ($Check / $Int32Unit));
            $Check = ($Check < -2147483648)? ($Check + $Int32Unit) : $Check;
        }
        $Check += ord($Str[$i]);
    }
return $Check;
}

function hashURL($String) 
{
    $Check1 = strToNum($String, 0x1505, 0x21);
    $Check2 = strToNum($String, 0, 0x1003F);

    $Check1 >>= 2;
    $Check1 = (($Check1 >> 4) & 0x3FFFFC0 ) | ($Check1 & 0x3F);
    $Check1 = (($Check1 >> 4) & 0x3FFC00 ) | ($Check1 & 0x3FF);
    $Check1 = (($Check1 >> 4) & 0x3C000 ) | ($Check1 & 0x3FFF);

    $T1 = (((($Check1 & 0x3C0) << 4) | ($Check1 & 0x3C)) <<2 ) | ($Check2 & 0xF0F );
    $T2 = (((($Check1 & 0xFFFFC000) << 4) | ($Check1 & 0x3C00)) << 0xA) | ($Check2 & 0xF0F0000 );

return ($T1 | $T2);
}

function checkHash($Hashnum)
{
    $CheckByte = 0;
    $Flag = 0;

    $HashStr = sprintf('%u', $Hashnum) ;
    $length = strlen($HashStr);

    for($i = $length - 1; $i >= 0; $i --)
    {
        $Re = $HashStr[$i];
        if (1 === ($Flag % 2))
        {
            $Re += $Re;
            $Re = (int)($Re / 10) + ($Re % 10);
        }
        $CheckByte += $Re;
        $Flag ++;
    }

    $CheckByte %= 10;
    if(0!== $CheckByte)
    {
        $CheckByte = 10 - $CheckByte;
        if (1 === ($Flag % 2) )
        {
            if (1 === ($CheckByte % 2))
            {
                $CheckByte += 9;
            }
            $CheckByte >>= 1;
        }
    }
return '7'.$CheckByte.$HashStr;
}

?>

Usage: call as $pr = getPageRank($yoururl);
 
+rep I've been needing something like this.

Care to explain what's going on in the supporting functions (strToNum, checkHash and hashURL)? I don't recognize the algorithm.
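For what it's worth, the doubling-and-digit-summing loop in checkHash looks like the Luhn checksum pattern (the scheme behind credit-card check digits); that's a guess from the code shape, not something documented. The Luhn pattern on its own, as a sketch:

```php
// Luhn check: double every second digit from the right, sum all digits, valid if sum % 10 == 0.
function luhnValid($number) {
    $sum = 0;
    $digits = strrev((string) $number);
    for ($i = 0; $i < strlen($digits); $i++) {
        $d = (int) $digits[$i];
        if ($i % 2 === 1) {
            $d *= 2;
            if ($d > 9) $d -= 9; // same as summing the two digits of the doubled value
        }
        $sum += $d;
    }
    return $sum % 10 === 0;
}
// luhnValid('79927398713') is true (the classic Luhn test number)
```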
 
Anyone got any PHP that can take a page and return one or two keywords that are the topic? E.g. if I ran it against this page it might output 'php functions'?
 
Quick script to generate tons of unique usernames (for automated blogposts or whatever other clever uses you can come up with..)

pre-requisite: twitter account

parse.php:
Code:
<?php
$twitteru = 'foo'; // your twitter username
$twitterp = 'bar'; // your twitter password
$dump_to_file = 'screennames.txt'; // where to output the usernames

$fh = fopen("http://$twitteru:$twitterp@stream.twitter.com/1/statuses/sample.json","r");
$fhw = fopen($dump_to_file,'a');

while($line = fgets($fh)) {
        $tweet = json_decode($line);
        if (!empty($tweet->user->screen_name))
                fwrite($fhw, $tweet->user->screen_name."\n");
        unset($tweet);
}
?>

# php parse.php &
# tail -f screennames.txt
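Each line the stream hands back is one standalone JSON status object, and the script only cares about user->screen_name. The extraction step against a canned line, as a sketch:

```php
// A fake status line in the shape the streaming API emits (only the field we use).
$line  = '{"user":{"screen_name":"example_user"}}';
$tweet = json_decode($line);
$name  = !empty($tweet->user->screen_name) ? $tweet->user->screen_name : null;
// $name is "example_user"
```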
 
Nice. Will break on Aug 16 due to basic/oauth auth though, right?

Since it's using the streaming API, I don't think it'll be affected
Streaming API

Currently the Streaming API is largely intended for integrating services together, not for direct end-user display. Therefore only Basic Auth is supported. As end-user display features are enabled, OAuth will be also supported.

-- Twitter API Wiki / Authentication