CodeIgniter: mod_rewrite

This article explains how to remove “index.php” from your CodeIgniter (CI) application URLs. It does NOT remove the need for index.php itself, which is the CI front controller: even though index.php no longer appears in the URL, the file must still be present at the top level of your site (in the directory that contains the /system/ directory). To quote the User Guide:

You can easily remove this file by using a .htaccess file with some simple rules.

You need to perform the following steps to get this working:

1. Create a .htaccess file to configure the rewrite engine

2. Set $config['index_page'] to an empty string

3. Make sure Apache loads the mod_rewrite module

4. Make sure Apache is configured to accept the needed .htaccess directives

5. Restart Apache and test

1. Create your .htaccess file

Create a new file named .htaccess and put it in your site's web directory, next to index.php:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

#Removes access to the system folder by users.
#Additionally this will allow you to create a System.php controller,
#previously this would not have been possible.
#'system' can be replaced if you have renamed your system folder.
#REQUEST_URI always starts with a slash (prefix a subfolder name here if you use one).
RewriteCond %{REQUEST_URI} ^/system.*
RewriteRule ^(.*)$ /index.php?/$1 [L]

#When your application folder isn't in the system folder
#This snippet prevents user access to the application folder
#Submitted by: Fabdrol
#Rename 'application' to your application folder's name.
#REQUEST_URI always starts with a slash.
RewriteCond %{REQUEST_URI} ^/application.*
RewriteRule ^(.*)$ /index.php?/$1 [L]

#Checks to see if the user is attempting to access a valid file,
#such as an image or css document, if this isn't true it sends the
#request to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?/$1 [L]
</IfModule>

<IfModule !mod_rewrite.c>
# If we don't have mod_rewrite installed, all 404's
# can be sent to index.php, and everything works as normal.
# Submitted by: ElliotHaughin

ErrorDocument 404 /index.php
</IfModule> 

The above configuration behaves as follows:

1. If your installation is not in the server root, change the RewriteBase line from “RewriteBase /” to “RewriteBase /folder/”.

2. The first rule checks whether the requested URL starts with “system”. All such requests are routed to index.php. This is a security feature that prevents anyone from accessing your system folder directly. You can use the same syntax to hide other folders inside your root.

3. If the URL doesn’t start with “system”, the web server checks whether a physical resource matches the URL, such as an image, script file, or directory.

4. If such a resource exists, the web server returns it with no rewriting performed. If no such resource exists, the URL is rewritten to index.php (that is, passed to CodeIgniter).
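
The decision flow described above can be sketched in Python. This is only a rough model for illustration, not how Apache works internally: the function name route, the existing set and the protected tuple are made up for this sketch, and request paths are assumed to start with a slash.

```python
def route(request_path, existing, protected=("/system", "/application")):
    """Roughly model which target the step 1 rules pick for a request.

    request_path: URL path with a leading slash, e.g. "/news/view/5"
    existing:     set of paths that exist on disk as files or directories
    protected:    folder prefixes that must never be served directly
                  (hypothetical defaults, adjust to your folder names)
    """
    relative = request_path.lstrip("/")
    # Requests into a protected folder always go to the front controller.
    if request_path.startswith(protected):
        return "/index.php?/" + relative
    # Real files and directories (images, CSS, ...) are served as they are.
    if request_path in existing:
        return request_path
    # Everything else is handed to CodeIgniter through index.php.
    return "/index.php?/" + relative


on_disk = {"/css/site.css", "/system/core/CodeIgniter.php"}
print(route("/css/site.css", on_disk))
print(route("/news/view/5", on_disk))
print(route("/system/core/CodeIgniter.php", on_disk))
```

Note that the system file exists on disk in this example, yet the request still ends up at index.php, which is the point of the first rule.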

Notes for Windows users:
To create this file, open Command Prompt and type:
copy con .htaccess [Enter]
[Press CTRL + Z, then Enter]
A blank .htaccess file will be created. You can then edit it in Notepad or your favorite text editor and paste in the configuration above.

Note: Most Windows editors will assume that you are attempting to save an .htaccess file as a file with an extension and no filename. The Crimson Editor can be used to create and save .htaccess files and other files that have no filename.

Note: If your site is placed in a subfolder, specify the path in the RewriteBase line, for example “RewriteBase /subfolder/”.

Note: On some systems it may be necessary to set the uri_protocol configuration value to get reliable results with the above example. Otherwise, values containing periods that are passed via the URI are converted to underscores in CodeIgniter 1.7.1 (e.g. some.value becomes some_value).

$config['uri_protocol'] = 'QUERY_STRING'; 

2. Set $config['index_page'] to an empty string

Open your

system/application/config/config.php 

and find the line that assigns $config['index_page'] a value, usually:

$config['index_page'] = "index.php"; 

and change it to:

$config['index_page'] = ''; 

Save the file.

3. Make sure Apache has mod_rewrite activated

This means that Apache must be configured to load the mod_rewrite module (or have it compiled in). For a loadable module, look for a line like this in httpd.conf or a file included by it (hint: use a file search utility such as grep to find lines containing the string ‘rewrite’):

LoadModule rewrite_module /usr/lib/apache2/modules/mod_rewrite.so 

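If you’re running Apache 2 on a Debian-based system (such as Debian or Ubuntu), type

a2enmod rewrite

in the console to enable mod_rewrite. If you run a2enmod without an argument, it prompts for the module name; answer rewrite.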

On a Windows machine the line might look like this:

LoadModule rewrite_module modules/mod_rewrite.so 

If it is commented out (# in front), make sure to uncomment it and save the file. Checking if the corresponding module exists may be a good idea as well (but it usually does).
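
If you prefer a script to a manual search, the check can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name rewrite_module_status is made up, it only inspects the configuration text you pass in (it does not follow Include directives), and it cannot detect a module that is compiled into Apache.

```python
import re

# Matches a LoadModule line for rewrite_module, optionally commented out.
LOAD_REWRITE = re.compile(
    r"^\s*(#[#\s]*)?LoadModule\s+rewrite_module\b", re.IGNORECASE
)


def rewrite_module_status(config_text):
    """Return 'enabled', 'commented out' or 'missing' for the given config text."""
    status = "missing"
    for line in config_text.splitlines():
        match = LOAD_REWRITE.match(line)
        if match is None:
            continue
        if match.group(1) is None:
            # An active LoadModule line wins over any commented-out copies.
            return "enabled"
        status = "commented out"
    return status


sample = "#LoadModule rewrite_module modules/mod_rewrite.so\n"
print(rewrite_module_status(sample))
```

To check a real installation you would read your own httpd.conf into a string and pass it in; the path to that file depends on your system.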

4. Make sure Apache accepts the needed .htaccess directives

This means that Apache must be explicitly configured to allow .htaccess files to override the directives that you use in the .htaccess file from step 1.

It seems to be sufficient if you add these two lines to your <Directory> section where you configure the document root for your CI application:

<Directory "/some/absolute/path/htdocs">
...
Options FollowSymLinks
AllowOverride FileInfo
...
</Directory> 

There might be other Options listed, just make sure you have FollowSymLinks as well.

Should you get a 500 Internal Server Error, try the following syntax:

<Directory "/some/absolute/path/htdocs">
Options Indexes Includes FollowSymLinks MultiViews
AllowOverride AuthConfig FileInfo
Order allow,deny
Allow from all
</Directory> 

5. Restart Apache and test your application

Works? Congratulations!

Doesn’t work? Ehrrr… well, do not give up; equip yourself with patience, double check all steps above and if it still does not work, post on the forum giving all details of your setup.

How does URL rewriting work?

<IfModule mod_rewrite.c>
...
</IfModule> 

Apply the enclosed directives only if Apache has mod_rewrite available (either compiled in or loaded as a module).

RewriteEngine On 

Activate the URL rewriting engine, if that has not already been done in the main Apache configuration file.

RewriteBase / 

Define the part of the URL that does not change and is not used for rewriting. In per-directory context this prefix is removed before the rules are processed and added back to relative substitutions afterwards. This is a good way to keep rewrite rules independent of the subfolder they live in. For example, if your CodeIgniter index.php is placed in a subdirectory of a virtual host, like /tests/, set RewriteBase to /tests/.

RewriteCond %{REQUEST_FILENAME} !-f 

A condition that must be met for the following RewriteRule to apply. Here, we test that the requested filename is not an existing file.

RewriteCond %{REQUEST_FILENAME} !-d 

Same as above, but we test for directory existence.

RewriteRule ^(.*)$ index.php/$1 [L] 

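If all RewriteCond conditions are met, this rule is applied. It inserts index.php before the requested URI. The $1 stands for the part of the string captured by the parentheses in the pattern on the left. The [L] flag marks this as the last rule: if it is applied, no further rules are processed. Note that this variant passes the path after index.php/ (as path info), whereas the .htaccess file in step 1 uses index.php?/$1, which passes it in the query string.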
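
To see what the pattern and the $1 back-reference do, here is a small Python illustration. It only mimics the string substitution, not Apache itself; the function name apply_rule is made up, and the path is assumed to arrive without a leading slash, as it does in per-directory (.htaccess) context.

```python
import re


def apply_rule(requested_path):
    """Mimic RewriteRule ^(.*)$ index.php/$1 on a per-directory path."""
    match = re.match(r"^(.*)$", requested_path)
    if match is None:
        # Pattern did not match, so the path is left untouched.
        return requested_path
    # match.group(1) plays the role of $1 in the substitution.
    return "index.php/" + match.group(1)


print(apply_rule("news/article/5"))  # index.php/news/article/5
```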

Configuring mod_rewrite in the httpd.conf file

The Apache mod_rewrite docs say

While URL manipulations in per-server context are really fast and efficient, per-directory rewrites are slow and inefficient…

If you have access to your httpd.conf file, you’ll get better performance by configuring the rewrite rules there.

You can add something like this to your httpd.conf:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} !^(/index\.php|/img|/js|/css|/robots\.txt|/favicon\.ico)
RewriteRule ^(.*)$ /index.php/$1 [L]
</IfModule> 

Configuring mod_rewrite and virtual hosting with Apache 2.2

<VirtualHost *>
ServerName www.mydomain.com
DocumentRoot /path/to/ci/directory
<Directory /path/to/ci/directory>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]
</Directory>
</VirtualHost> 

.htaccess “Down For Maintenance” Page Redirect

I recently needed to move one website from a shared web host to our internal server. After some discussion, we decided to simply add a “Site Down For Maintenance” page to the site to prevent users from submitting orders during the hosting change. Using the following .htaccess code snippet, we were able to send all users to a maintenance.html page no matter which page they requested:

RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/maintenance\.html$
RewriteRule ^(.*)$ http://domain.com/maintenance.html [R=307,L]

Once we posted the maintenance.html page and .htaccess code on both the old hosting environment AND new hosting environment, we switched the DNS settings. Before making the switch, we had ported the website’s code to a “utility” domain and made adjustments so that the website would function well in the new hosting environment. Now that the DNS had been changed, we wanted to make sure that the website would function well on the new domain within the new hosting environment. Unfortunately the code above blocks EVERYONE from accessing any file besides the maintenance.html file. Fortunately my gifted IT team had the answer:

RewriteEngine On
RewriteBase /
RewriteCond %{REMOTE_ADDR} !^11\.111\.111\.111
RewriteCond %{REQUEST_URI} !^/maintenance\.html$
RewriteRule ^(.*)$ http://domain.com/maintenance.html [R=307,L]

The above code sends all users to maintenance.html EXCEPT those with the specified IP, which just so happened to be us. We got to test the website while others were locked out. When we were satisfied with the website, we removed the .htaccess code and the site was back up immediately!
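
The logic of the two conditions can be modelled in a few lines of Python. This is a sketch for illustration only: the function name needs_redirect is made up, and the visitor address 203.0.113.9 is just a documentation example.

```python
import re

# Same patterns as in the two RewriteCond lines (without the leading "!").
ALLOWED_IP = re.compile(r"^11\.111\.111\.111")
MAINTENANCE_PAGE = re.compile(r"^/maintenance\.html$")


def needs_redirect(remote_addr, request_uri):
    """True if both negated conditions hold, so the redirect rule fires."""
    not_allowed_ip = ALLOWED_IP.search(remote_addr) is None
    not_maintenance = MAINTENANCE_PAGE.search(request_uri) is None
    return not_allowed_ip and not_maintenance


print(needs_redirect("203.0.113.9", "/shop/cart"))
print(needs_redirect("203.0.113.9", "/maintenance.html"))
print(needs_redirect("11.111.111.111", "/shop/cart"))
```

The second condition matters: without it, the request for maintenance.html itself would be redirected again, producing a redirect loop.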

Quoted from: davidwalsh.name

Preventing your site from being indexed, the right way

It keeps amazing me that I keep seeing people use robots.txt files to prevent sites from being indexed and thus showing up in the search engines. You know why it keeps amazing me? Because robots.txt doesn’t actually do the latter, even though it does prevent your site from being indexed.

Let’s go through some terms here:

Indexed / Indexing
The process of downloading a site or a page’s content to the server of the search engine, thereby adding it to its “index”.

Ranking / Listing / Showing
Showing a site in the search result pages (aka SERPs).

So, while the most common process goes from Indexing to Listing, a site doesn’t have to be indexed to be listed. If a link points at a page, domain or wherever, that link will be followed. If the robots.txt on that domain prevents the search engine from indexing that page, it’ll still show the URL in the results if it can gather from other variables that it might be worth looking at.

If my explanation above doesn’t make sense, have a look at Matt Cutts’s video explanation.

So, if you want to effectively hide pages from the search engines, and this might seem contradictory, you need them to index those pages. Why? Because when they index those pages, you can tell them not to list them. There are two ways of doing that. The first is by using robots meta tags, like this (and I’ve got an article on robots meta tags that’s more extensive):

<meta name="robots" content="noindex,nofollow"/>

The issue with a tag like that is that you have to add it to each and every page. That’s why the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag, and set the value as you would the meta robots tags value. The cool thing about this is that you can do it for an entire site. So, if your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:

Header set X-Robots-Tag "noindex, nofollow"

And it’d have the effect that that entire site can be indexed, but will never be shown in the search results. So, get rid of that robots.txt file with Disallow: / in it, and use the X-Robots-Tag instead!
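
Once the header is in place, you may want to confirm that responses actually carry it. The sketch below, in Python, inspects a dictionary of response headers; the function names are made up, how you obtain the headers is up to you, and the sketch ignores the optional user-agent prefix that the header value may contain.

```python
def robots_directives(headers):
    """Collect the X-Robots-Tag directives from a dict of response headers."""
    directives = set()
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            for part in value.split(","):
                part = part.strip().lower()
                if part:
                    directives.add(part)
    return directives


def hidden_from_results(headers):
    """True if the response asks search engines not to list the page."""
    return "noindex" in robots_directives(headers)


response_headers = {"Content-Type": "text/html", "X-Robots-Tag": "noindex, nofollow"}
print(sorted(robots_directives(response_headers)))
print(hidden_from_results(response_headers))
```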

quoted from: Joost de Valk @yoast.com