Mod_Rewrite Tutorial - First Timer

Status
Not open for further replies.

LazyD

$monies = false;
Dec 7, 2006
Wine Cuntry
wildfoxmedia.com
Since it seems a lot of people have trouble with mod_rewrite, I figured I'd write up a little tutorial...

If anything needs to be changed after I'm done writing this, please let me know...

The Basics
If you aren't familiar with mod_rewrite and .htaccess, here's a little introduction. mod_rewrite is an Apache module that lets you do a variety of things, including masking dynamic URLs and masking affiliate links, to name a couple. In this tutorial I plan to cover just those two basics.

First things first: in the root of your public_html or www folder, make a file called .htaccess. Notice that the file is ".htaccess" with the period in front; it is imperative that it is named exactly like this. The best way to do this if you are using Notepad is to choose Save As, change the file type drop-down in the save dialog to All Files, then type ".htaccess" (including the quotes) in the file name box.

Masking Dynamic URLs
Say you have a site written in PHP where the majority of the URLs look like
Code:
index.php?variable=2152&variable2=hello
Whether or not the search engines (SEs) say they can parse and index URLs like that, it is always best practice to clean up those URLs for the sake of the SEs and your visitors.

So, for this example we will use a website that I have used mod_rewrite on - http://statesrealty.us

Without mod_rewrite the URLs would look like:
Code:
http://statesrealty.us/index.php?State=StateHere&County=CountyHere
These are pretty clean for dynamic URLs, compared to ones stuffed with numbers and other random shit. Still, we want to clean these up...

So we have our .htaccess file open, the first lines we want to put in there are:
Code:
Options +FollowSymlinks
RewriteEngine on
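The first line lets Apache follow symbolic links, which has to be allowed before mod_rewrite will run rules from an .htaccess file, and the second turns the mod_rewrite engine on. Simple enough.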

Next comes the tough part... Here is the line out of my .htaccess file
Code:
RewriteRule ^(.*)/(.*)\.html$ index.php?State=$1&County=$2 [L]
You may be thinking this is garbage, but it makes sense, it really does. Let's break it down, mmmk.

First, we have RewriteRule. This simply tells Apache/mod_rewrite that this is a new rule for it to follow.

Next is
Code:
 ^(.*)/(.*)\.html$
This part of the rule is the pattern that the requested URL is matched against, in other words the clean URL you want your visitors to see. The ^ signifies the start of the string; in an .htaccess file in the site root, that is the first thing to come after http://statesrealty.us/. The 1st (.*) captures whatever comes before the slash and makes it available as $1 (which we will put after State=), and the 2nd (.*) captures whatever sits between the slash and .html as $2 (which we will put after County=). The / between them makes the URL look like it has a folder in it (this will be explained better a little down the line).
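
If you want to see what the two (.*) groups actually grab, here is a minimal sketch in Python. Python's re module is only standing in for the regex engine Apache uses, the captures() helper is my own name, and the paths are made-up examples. Note the paths have no leading slash, since that is how a root .htaccess file sees them.

```python
import re

# Same pattern as in the RewriteRule above.
PATTERN = re.compile(r"^(.*)/(.*)\.html$")

def captures(path):
    """Return ($1, $2) if the path matches the pattern, else None."""
    m = PATTERN.match(path)
    return m.groups() if m else None

print(captures("California/Sonoma.html"))  # ('California', 'Sonoma')
print(captures("California.html"))         # None, there is no slash to match
# (.*) is greedy, so with extra slashes the first group swallows everything
# up to the LAST slash:
print(captures("a/b/c.html"))              # ('a/b', 'c')
```

The last line is worth remembering: (.*) will happily match slashes too, so a deeper path still matches the rule.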

After that, we have \.html$. In a regular expression you need to escape special characters, which means putting a "\" before them. In this instance we are saying we want the URL to end in .html, so we have to escape the ".", otherwise it will be read as "any single character" and the rule will match URLs we didn't intend. You can easily change the extension to php or whatever you want by replacing the html with your preferred extension. You can also make the URL look like a folder with no file extension by replacing:
Code:
^(.*)/(.*)\.html$
with
Code:
^(.*)/(.*)$
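
Two things are worth checking before swapping patterns: what the backslash before the dot buys you, and how much the extension-less pattern will match. A quick sketch, using Python's re module as a stand-in for Apache's regex engine, with invented example paths:

```python
import re

escaped = re.compile(r"^(.*)/(.*)\.html$")    # dot escaped: a literal "."
unescaped = re.compile(r"^(.*)/(.*).html$")   # dot unescaped: any character
no_ext = re.compile(r"^(.*)/(.*)$")           # the "folder" style pattern

# An unescaped dot matches any single character, not just ".".
print(bool(escaped.match("California/Sonomaxhtml")))    # False
print(bool(unescaped.match("California/Sonomaxhtml")))  # True

# Without an extension to anchor on, anything containing a slash matches,
# including paths to real files such as images.
print(bool(no_ext.match("California/Sonoma")))  # True
print(bool(no_ext.match("images/logo.png")))    # True
```

So if you go with the folder-style pattern on a real site, you will probably want a tighter pattern or a RewriteCond %{REQUEST_FILENAME} !-f line in front of the rule so existing files are left alone.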

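Again, if you wanted to, you could even make it appear to be in its own folder by replacing the following (I use a hyphen in the folder name because an unescaped space would end the pattern early and break the rule):
Code:
^(.*)/(.*)\.html$
with
Code:
^United-States/(.*)/(.*)\.html$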

Lastly, the $ signifies the end of the URL, so nothing is allowed to come after .html.

Next we will jump over and grab:
Code:
index.php?State=$1&County=$2
This looks pretty straightforward except for the $1 and $2. For anyone that is familiar with PHP, this should resemble a URL where $_GET vars are being used. This is the "source", if you will: the real URL that Apache serves behind the scenes. mod_rewrite takes whatever both (.*)s matched in the section in front of it and plugs them into $1 and $2 respectively.

For instance, using our example stuff above, this URL:
http://statesrealty.us/index.php?State=California&County=Sonoma
Is the same as:
http://statesrealty.us/California/Sonoma.html

mod_rewrite will take the first captured value, in this case California, and pass it as the State field in our original URL. Sonoma comes next and is passed as the County field in the original URL.
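
The whole rule can be imitated in a few lines of Python. This is only a rough model of what mod_rewrite does with this one rule, not how Apache works internally, and the rewrite() helper is my own name for it:

```python
import re

PATTERN = r"^(.*)/(.*)\.html$"
# Python writes backreferences as \1 and \2 where mod_rewrite uses $1 and $2.
SUBSTITUTION = r"index.php?State=\1&County=\2"

def rewrite(path):
    """Return the internal URL for a clean path, or None if it does not match."""
    if re.match(PATTERN, path) is None:
        return None
    return re.sub(PATTERN, SUBSTITUTION, path)

print(rewrite("California/Sonoma.html"))
# index.php?State=California&County=Sonoma
print(rewrite("about.php"))
# None
```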

One last heads up... If you are going to put multiple RewriteRules into your .htaccess file, put the rule with the most variables at the top. Example:
Code:
RewriteRule ^(.*)/(.*)\.html$ index.php?State=$1&County=$2 [L]
RewriteRule ^(.*)\.html$ index.php?State=$1 [L]

If you put them the other way around it will not work, because ^(.*)\.html$ also matches California/Sonoma.html. That rule would fire first and send California/Sonoma as the State, so the two-variable rule never gets a chance at that URL.
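
Here is a simplified "first matching rule wins" model of that ordering problem in Python. The apply_rules() helper and the tuple layout are my own invention for the demo, and it ignores the other things Apache does after a rewrite:

```python
import re

def apply_rules(rules, path):
    """Return the target of the first matching rule, like a list of [L] rules."""
    for pattern, substitution in rules:
        if re.match(pattern, path):
            return re.sub(pattern, substitution, path)
    return path  # no rule matched, the path is left alone

two_vars = (r"^(.*)/(.*)\.html$", r"index.php?State=\1&County=\2")
one_var = (r"^(.*)\.html$", r"index.php?State=\1")

print(apply_rules([two_vars, one_var], "California/Sonoma.html"))
# index.php?State=California&County=Sonoma
print(apply_rules([one_var, two_vars], "California/Sonoma.html"))
# index.php?State=California/Sonoma
print(apply_rules([two_vars, one_var], "California.html"))
# index.php?State=California
```

With the one-variable rule on top, the county ends up glued onto the state, which is the "it will not work" case.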

Redirect/Masking Affiliate Links
It may be just a personal habit, but I hate the look of a URL to my affiliate link that goes through an affiliates system, example:

http://publishers.xy7.com/z/34040/CD4095

If I was a visitor to a site, especially one that isn't completely computer savvy, I would be somewhat wary of clicking a URL that looked like that, since I'd have no idea what it does...

A URL like http://statesrealty.us/Free-Home-Valuation/ looks much nicer...

This is very easy to accomplish via your htaccess file...

To mask the above publishers.xy7.com link you can do the following:
Code:
redirectMatch 301 /Free-Home-Valuation/ http://publishers.xy7.com/z/34040/CD4095/

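redirectMatch 301 is saying: if the requested URL matches /Free-Home-Valuation/, do a 301 (Moved Permanently) redirect to the URL that comes after it, in this case http://publishers.xy7.com/z/34040/CD4095/. Strictly speaking this directive comes from mod_alias rather than mod_rewrite, and its first argument is a regular expression, so it matches any URL on the site that contains /Free-Home-Valuation/.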
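
Because that first argument is a regex with no ^ or $, it will match anywhere in the path. A small sketch of the difference, with Python's re module standing in for Apache's matcher; the anchored variant, the would_redirect() helper and the /blog/ path are my own additions for illustration:

```python
import re

# The pattern exactly as written in the redirectMatch line above.
PATTERN = re.compile(r"/Free-Home-Valuation/")
# A stricter variant (a suggestion, not from the original .htaccess).
ANCHORED = re.compile(r"^/Free-Home-Valuation/$")

def would_redirect(pattern, path):
    """True if the pattern is found anywhere in the URL path."""
    return pattern.search(path) is not None

for path in ["/Free-Home-Valuation/", "/blog/Free-Home-Valuation/tips.html"]:
    print(path, would_redirect(PATTERN, path), would_redirect(ANCHORED, path))
# /Free-Home-Valuation/ True True
# /blog/Free-Home-Valuation/tips.html True False
```

If you only ever link to the exact URL this makes no difference, but anchoring the pattern avoids surprise redirects later on.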

Now when I put a link on my page to the Home Valuation affiliate offer, instead of linking to that scary publishers.xy7.com URL I link to http://statesrealty.us/Free-Home-Valuation/ and when clicked my user is automatically redirected to the offer...

This tutorial was written pretty quick and my writing skills aren't the greatest, plus mod_rewrite is just fuckin hard to explain sometimes, so if anything needs to be clarified please feel free to let me know....
 


Hey, well, I hire this shit out.. but yeah, I'm sure the community will like this for those that need to know how. :) +rep for effort :)
 
Ummm, just a little word of advice: you might wanna change the (.*) in that state/county mod_rewrite regex to something like ([a-zA-Z]+). It is just another way to stop people hacking you.
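
To show what that buys you, here is a rough comparison in Python (standing in for Apache's regex engine). The STRICT pattern is my reading of the suggestion applied to both captures, and the matches() helper and the nasty-looking path are made up for the demo:

```python
import re

LOOSE = re.compile(r"^(.*)/(.*)\.html$")
STRICT = re.compile(r"^([a-zA-Z]+)/([a-zA-Z]+)\.html$")

def matches(pattern, path):
    """True if the path would be picked up by the rule."""
    return pattern.match(path) is not None

print(matches(LOOSE, "California/Sonoma.html"))     # True
print(matches(STRICT, "California/Sonoma.html"))    # True
# A path carrying quote characters only gets through the loose pattern.
print(matches(LOOSE, "California/x' OR '1.html"))   # True
print(matches(STRICT, "California/x' OR '1.html"))  # False
```

It is not a replacement for validating input in the PHP script, just one more layer in front of it.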
 
Thanks for going out of your way, I really know shit about Mod_Rewrite, thanks for the help.

Respect given!

Jer
 
Thanks all for the kind words...

illusion, I am currently working on that. Recently I've realized all the sites in my network rely solely on mysql_real_escape_string as the only form of security for $_GET variables. I've gone through and updated it with a regex for another site and I'm almost done adding it to the rest. Thank you for pointing that out, and most of all for not hacking my ass. If you have any insight on how I can add a + to my regex I would be grateful. I have states and counties that are 2 words and display as New+Hampshire, but for the life of me I can't get the + to pass the regex..:)
 
[a-zA-Z+]+

just put it in your set
 
The plus outside the square brackets means "one or more of these". The plus inside includes the + character itself in the set.
[a-zA-Z\+\ ]+

That would include + signs and spaces. (I escape everything not alphanumeric out of habit.)
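
A quick way to convince yourself of the inside/outside difference, sketched in Python rather than PHP (the allowed() helper is just a name I picked, and the character classes behave the same way for these cases):

```python
import re

def allowed(char_class, value):
    """True if the whole value is made up of characters from the class."""
    return re.fullmatch(char_class, value) is not None

# "+" after the brackets is the quantifier, "+" inside is a literal plus.
print(allowed(r"[a-zA-Z]+", "New+Hampshire"))      # False
print(allowed(r"[a-zA-Z+]+", "New+Hampshire"))     # True

# The escaped version also lets spaces through.
print(allowed(r"[a-zA-Z\+\ ]+", "New+Hampshire"))  # True
print(allowed(r"[a-zA-Z\+\ ]+", "New Hampshire"))  # True
print(allowed(r"[a-zA-Z+]+", "New Hampshire"))     # False
```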
 
Turns out I had it right, but when the $_GET var was being pulled PHP was turning the "+" into a " ", and therefore it wasn't passing through eregi or preg_match. It's all fixed...
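
For anyone hitting the same thing, the effect is easy to reproduce. This sketch uses Python's urllib.parse as an analogue of how PHP fills $_GET (it is not PHP itself), and the decode_state() helper is a made-up name:

```python
import re
from urllib.parse import parse_qs

def decode_state(query):
    """Decode a query string the form-style way and return the State value."""
    return parse_qs(query)["State"][0]

state = decode_state("State=New+Hampshire")
print(repr(state))  # 'New Hampshire', the "+" has become a space

# The class with only "+" now rejects the value; allowing a space fixes it.
print(re.fullmatch(r"[a-zA-Z+]+", state) is not None)   # False
print(re.fullmatch(r"[a-zA-Z+ ]+", state) is not None)  # True
```

So the check has to allow for what the value looks like after decoding, not what it looked like in the URL.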
 
"+" has no special meaning in character classes. Everything is taken as a plain character except "-", "\", "^" and "]", and "-" defines a range. If you want a literal "-" in your character class, make it the first char in that class. For example, [-a-z] is a class of "-" OR any character from the range a-z.
 