Whatterz


Enabling Search Engine Safe URLs with Apache and htaccess

by Simon.

An increasingly popular technique among websites, and blogs in particular, is making URLs search engine friendly, or safe, on the premise that doing so helps search engine optimisation. Replacing the obscure query string element of a URL with keyword-rich alternatives makes it more readable not only for human beings, but also for the venerable robots that allow our page content to be found in the first place.

For example, the following is WordPress’ default URL configuration for a post:

http://www.domain.com/?p=1635

However, by using the URL rewriting available in the Apache webserver, we can achieve a far better result, such as the following:

http://www.domain.com/search-engine-safe-urls

NB. It is also possible to achieve a similar result with an ISAPI rewrite for Microsoft’s IIS webserver, but this topic will not be included in this post.

To get your website working with SES URLs, you need to enable both the mod_rewrite module and the AllowOverride directive in the Apache configuration file.

Uncomment the following line (remove the leading #) to load the rewrite module:

LoadModule rewrite_module modules/mod_rewrite.so

Change the AllowOverride directive from None to All:

<Directory />
    Options FollowSymLinks
    AllowOverride All
    Order deny,allow
    Deny from all
</Directory>

<Directory "C:/WebRoot">
    # Possible values for the Options directive are "None", "All",
    # or any combination of:
    #   Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
    #
    # Note that "MultiViews" must be named *explicitly* --- "Options All"
    # doesn't give it to you.
    #
    # The Options directive is both complicated and important.  Please see
    # http://httpd.apache.org/docs/2.2/mod/core.html#options
    # for more information.
    #
    Options Indexes FollowSymLinks

    #
    # AllowOverride controls what directives may be placed in .htaccess files.
    # It can be "All", "None", or any combination of the keywords:
    #   Options FileInfo AuthConfig Limit
    #
    AllowOverride All

    #
    # Controls who can get stuff from this server.
    #
    Order allow,deny
    Allow from all
</Directory>

On Apache webservers, .htaccess (hypertext access) is the default name of directory-level configuration files. An .htaccess file is placed in a particular directory, and the directives in it apply to that directory and all its subdirectories. It provides the ability to customize configuration for requests to that directory, which in our case means enabling search engine safe (SES) URLs.

Setting the AllowOverride directive to All in effect allows the .htaccess file to override the main configuration settings for that directory.
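
For readers on Apache 2.4, where the Order/Allow/Deny access directives shown above were superseded by Require, the document-root block might look something like the sketch below. It also assumes that only rewrite rules need to live in .htaccess, in which case the narrower AllowOverride FileInfo should be enough, since the mod_rewrite directives fall under that override class. The C:/WebRoot path is carried over from the example above; adjust it to your own document root.

```apache
<Directory "C:/WebRoot">
    # FollowSymLinks (or SymLinksIfOwnerMatch) must be enabled for
    # per-directory rewrite rules to run.
    Options Indexes FollowSymLinks

    # Assumption: .htaccess only needs mod_rewrite directives,
    # which belong to the FileInfo override class.
    AllowOverride FileInfo

    # Apache 2.4 access control, replacing "Order allow,deny" / "Allow from all".
    Require all granted
</Directory>
```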

An example .htaccess file could include the following code to rewrite the URLs:

RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L,QSA]

Search engine friendly URLs are implemented with rewrite engines. A rewrite engine modifies the requested URL based upon a number of rewrite conditions and rules.

The RewriteBase directive explicitly sets the base URL for per-directory rewrites. The two RewriteCond directives define the conditions under which the rule applies: the requested path must not be an existing file (!-f) or an existing directory (!-d), so real files such as images and stylesheets are still served as normal. Finally, the RewriteRule directive is the real rewriting workhorse. In this example, the regular expression ^(.*)$ captures the path portion of the URL, i.e. everything after the protocol (HTTP/S) and domain name. This is then appended to the default file reference, index.php, as a back reference ($1). The [L,QSA] flags mark this as the last rule to be processed and append any query string parameters from the original request.

It is important to note that this is all done on the server side, so the user will never see the website address changing in the browser’s address bar. Furthermore, simply replacing index.php with your own default file name (e.g. index.cfm or default.aspx) will have the same result. Indeed, the above rewrite rules are becoming a de facto standard for web applications.

To fully understand the mod_rewrite rules above, consult the Apache mod_rewrite documentation.
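
The same rules can be read line by line in the annotated sketch below. The IfModule wrapper is an addition that is not part of the original example: it makes Apache skip the block when mod_rewrite is not loaded, rather than fail with a server error. The request path in the final comment is illustrative only.

```apache
# Only process these directives if mod_rewrite is loaded (added safeguard).
<IfModule mod_rewrite.c>
    # Switch the rewrite engine on for this directory.
    RewriteEngine On

    # Rewritten paths are relative to the site root.
    RewriteBase /

    # Apply the rule only if the request is not an existing file...
    RewriteCond %{REQUEST_FILENAME} !-f
    # ...and not an existing directory.
    RewriteCond %{REQUEST_FILENAME} !-d

    # Capture the whole requested path as $1 and hand it to index.php.
    # L = stop processing further rules, QSA = keep the original query string.
    RewriteRule ^(.*)$ index.php/$1 [L,QSA]

    # Illustration: /search-engine-safe-urls?page=2 is served internally as
    # /index.php/search-engine-safe-urls?page=2
</IfModule>
```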

Once you have your SES functionality in place on the webserver, it is then the responsibility of your application framework to understand the URL construction and handle it accordingly. Fortunately, frameworks such as ColdBox and Fusebox for ColdFusion, and Zend and Symfony for PHP, all contain functionality to do this, but that is the subject of an entirely different post.
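
As one illustration of how the application might receive the rewritten URL, the variant below passes the captured path to index.php as a query string parameter instead of appending it after the file name. This is a sketch rather than what any of the frameworks above require; the parameter name path is hypothetical, so check your framework's documentation for the form it expects.

```apache
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# "path" is a hypothetical parameter name chosen for this example.
# QSA matters here: because the substitution has its own query string,
# the original parameters would be dropped without it.
RewriteRule ^(.*)$ index.php?path=$1 [L,QSA]
```

With this in place, a request for /search-engine-safe-urls?page=2 should reach the application as index.php?path=search-engine-safe-urls&page=2.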

Users of web applications prefer short, neat URLs to raw query string parameters. A concise URL is easy to remember and less time-consuming to type in. If the URL can be made to relate clearly to the content of the page, then not only are errors less likely to happen, but our good friends the search engine robots are also able to draw a stronger inference about the page’s relevance and content.


