Using Far-future and Cache Headers to Increase Your Website's Performance

Created by josh
November 12, 2016 6:56:58 AM PST


Understand the importance of allowing clients to cache your site's resources - no matter what devices are being used

Today, users browse the web on devices of all shapes and sizes. For mobile users, it's important that web pages adapt quickly to any screen size while keeping the user experience intuitive. But that's just the start. An efficient web application should also load quickly while keeping data usage and costs in mind.

 

How do we deliver web pages quickly and efficiently? By understanding headers and how browsers interact with them for caching resources.

 

A quick 101 on the 304 HTTP status code:

By definition, the 304 status code means "Not Modified." According to the HTTP/1.1 specification (RFC 2616), "If the client has performed a conditional GET request and access is allowed, but the document has not been modified, the server SHOULD respond with this status code."
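
To make the conditional GET concrete, here is a minimal sketch in PHP of the decision a server makes. The function name isNotModified and its parameters are hypothetical, chosen only for illustration. It compares timestamps and nothing more, so treat it as a simplified picture rather than a complete implementation of HTTP conditional requests.

```php
<?php
// Hypothetical helper: decide whether a conditional GET can be answered with 304.
// $ifModifiedSince is the raw If-Modified-Since request header (null if absent),
// $lastModified is the resource's modification time as a Unix timestamp.
function isNotModified(?string $ifModifiedSince, int $lastModified): bool
{
    if ($ifModifiedSince === null) {
        return false; // not a conditional request: send the full resource
    }

    $since = strtotime($ifModifiedSince);

    if ($since === false) {
        return false; // unparseable date: fall back to a full response
    }

    // unchanged if the resource is no newer than the client's copy
    return $lastModified <= $since;
}

$mtime  = gmmktime(12, 0, 0, 11, 11, 2016);
$header = gmdate('D, d M Y H:i:s', $mtime) . ' GMT';

var_dump(isNotModified($header, $mtime));      // bool(true): answer 304, no body
var_dump(isNotModified($header, $mtime + 60)); // bool(false): file changed, answer 200
var_dump(isNotModified(null, $mtime));         // bool(false): first visit, answer 200
```

The first call is the case the 304 status code exists for: the client already holds the current version, so the server can skip the body entirely.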

When your browser first visits a new site (or first visits a site since you last cleared the cache), it reads the source code and downloads the resources it needs to render the page - images, CSS, JavaScript... you name it. Each resource the page needs in order to render properly is a separate request, so your browser makes several calls to the server during a single page load.

 

In this example HTML snippet, this page makes three calls to the server after the page begins to load:

Now that was a pretty simple example... think about really large pages with banners, icons, pictures down the page, sidebars, large (and sometimes multiple) CSS/JS files... that's a lot of requests just to see one page.

 

Each of those requests takes time and data.

 

As a web application developer, you have ways to tell browsers to hold on to the resources they just requested, so that the next time they visit the page they can use everything they have already downloaded. This allows browsers to read from their cache instead of requesting the same images, CSS files, JavaScript files, etc. again.
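
As a rough sketch of what "telling the browser to hold on" looks like, the PHP below builds the two response headers most commonly used for this. The function name farFutureHeaders and the one-year lifetime are assumptions made for illustration, not something this post's code depends on.

```php
<?php
// Hypothetical helper: build the response headers that let a browser cache a resource.
// $maxAge is the cache lifetime in seconds, $now the current Unix timestamp.
function farFutureHeaders(int $maxAge, int $now): array
{
    return [
        // HTTP/1.1 clients use max-age, which takes priority over Expires
        'Cache-Control' => 'max-age=' . $maxAge,
        // Expires is the older, absolute-date way of saying the same thing
        'Expires'       => gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
    ];
}

$oneYear = 60 * 60 * 24 * 365; // 31536000 seconds

foreach (farFutureHeaders($oneYear, time()) as $name => $value) {
    // in a real response you would call header("$name: $value") instead
    echo $name . ': ' . $value . "\n";
}
```

Returning the headers as an array rather than calling header() directly keeps the sketch easy to run from the command line.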

 

There are three common ways to make this happen:

1) Set up your web server to deliver far-future headers, such as those provided by the Apache module mod_expires.

2) Use cloud storage, such as Amazon S3, to upload your assets and set the headers programmatically through their SDK or web-based GUI.

3) [Preferred Choice!] Use PHP to serve assets and intervene with headers based on the request from the client.

 

#1 and #2 are both very valid ways of doing this, but I went with number 3 since it's my preference to store this system in my PHP codebase vs. reconfiguring my server or going to a third-party resource. However, because of the URL structures, I did set a rewrite rule in Apache so that all image filenames in the assets folder of my site would look the same before and after implementation.

 

Apache rewrite rules:

# Images
RewriteRule ^/assets/img/(.*)$         /service/images.php?src=$1

# JS or CSS
RewriteRule ^/assets/(css|js)/(.*)$    /service/css_and_js.php?type=$1&src=$2

 

The rewrite rule ensures that the URL "https://www.example.com/assets/img/banner.png" doesn't have to be changed to something like "https://www.example.com/service/images.php?src=banner.png" after implementation (although that URL does not technically work).
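
To see what the two rules do to a path, here is a small PHP illustration that mimics the same patterns with preg_match. Apache performs the real rewriting. The function rewriteAssetPath and the file name site.css are hypothetical and exist only for this example.

```php
<?php
// Illustration only: mimic the two rewrite patterns with preg_match.
// Apache does the actual rewriting; this function is hypothetical.
function rewriteAssetPath(string $path): string
{
    if (preg_match('#^/assets/img/(.*)$#', $path, $m)) {
        return '/service/images.php?src=' . $m[1];
    }

    if (preg_match('#^/assets/(css|js)/(.*)$#', $path, $m)) {
        return '/service/css_and_js.php?type=' . $m[1] . '&src=' . $m[2];
    }

    return $path; // anything else is left alone
}

echo rewriteAssetPath('/assets/img/banner.png'), "\n"; // /service/images.php?src=banner.png
echo rewriteAssetPath('/assets/css/site.css'), "\n";   // /service/css_and_js.php?type=css&src=site.css
```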

Next, we write the services. Both services are controllers:

  • images.php
  • css_and_js.php

... while the object they both use is a class file:

  • Headers.class.php

 

images.php

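<?php
/**
 * Created by Josh L. Rogers.
 * Copyright (c) 2016 All Rights Reserved.
 *
 * images.php
 *
 * Sets image headers (requires rewrite rule)
 *
 */

require('config.php');

// set img bucket ($_SERVER['WEB_ROOT'] is not a built-in key; it is assumed to be defined by config.php or the server)
$local_image_bucket = realpath($_SERVER['WEB_ROOT'] . '/assets/img');

// get param - realpath() resolves any "../" and returns false if the file doesn't exist
$requested = $local_image_bucket && isset($_GET['src']) && is_string($_GET['src'])
    ? realpath($local_image_bucket . '/' . $_GET['src'])
    : false;

// only accept a regular file that really lives inside the bucket (blocks path traversal)
$filename = $requested && is_file($requested) && strpos($requested, $local_image_bucket . DIRECTORY_SEPARATOR) === 0
    ? $requested
    : false;

// 404 if img not provided or doesn't exist locally
if (!$filename) {

    http_response_code(404);

    exit;

}

// get the client headers - see what they're asking for
$client_headers = apache_request_headers();

// invoke headers object - pass in the entire request and the filename
$headers = new \Data\Common\Headers($client_headers, $filename);

// sends the cache headers, and exits here with a 304 if the client's copy is current
$headers->images();

// compress with gz (if available) - ob_gzhandler checks the client's Accept-Encoding header itself
if (!ob_start('ob_gzhandler'))
    ob_start();


// output the file
readfile($filename);

ob_end_flush();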

 

css_and_js.php (very clever names for these services, amiright?...)

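<?php
/**
 * Created by Josh L. Rogers.
 * Copyright (c) 2016 All Rights Reserved.
 *
 * css_and_js.php
 *
 * Sets css and js headers (requires rewrite rules)
 *
 */

require('config.php');

$type = isset($_GET['type']) && in_array($_GET['type'], array('css', 'js'), true)
    ? $_GET['type']
    : false;

// set bucket ($_SERVER['WEB_ROOT'] is assumed to be defined by config.php or the server)
$local_bucket = $type
    ? realpath($_SERVER['WEB_ROOT'] . '/assets/' . $type)
    : false;

// get param - realpath() resolves any "../" and returns false if the file doesn't exist
$requested = $local_bucket && isset($_GET['src']) && is_string($_GET['src'])
    ? realpath($local_bucket . '/' . $_GET['src'])
    : false;

// only accept a regular file that really lives inside the bucket (blocks path traversal)
$file = $requested && is_file($requested) && strpos($requested, $local_bucket . DIRECTORY_SEPARATOR) === 0
    ? $requested
    : false;

// 404 if the type is invalid or the file doesn't exist locally
if (!$file) {

    http_response_code(404);

    exit;

}

// get client headers
$client_headers = apache_request_headers();

// invoke headers object with client headers and filename
$headers = new \Data\Common\Headers($client_headers, $file);

// sends the cache headers, and exits here with a 304 if the client's copy is current
$headers->CSSandJS();

// compress with gz (if available) - ob_gzhandler checks the client's Accept-Encoding header itself
if (!ob_start('ob_gzhandler'))
    ob_start();

// output the file
readfile($file);

ob_end_flush();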

 

Headers.class.php

<?php
/**
 * Created by Josh L. Rogers.
 * Copyright (c) 2016 All Rights Reserved.
 *
 * Headers.class.php
 *
 * Various header checks from client and/or server
 *
 */

namespace Data\Common;

class Headers
{

    // one year in seconds (60 * 60 * 24 * 365)
    const MAX_AGE = 31536000;

    private $client_headers;
    private $filename;
    private $filetype;
    private $last_modified;

    public function __construct(array $apache_request_headers, $filename)
    {

        // a constructor cannot return a value, so end the request here if the file is unusable
        if (!is_string($filename) || !is_readable($filename)) {

            http_response_code(404);

            exit;

        }

        // header names are case-insensitive, so normalize the keys once
        $this->client_headers = array_change_key_case($apache_request_headers, CASE_LOWER);
        $this->filename = $filename;
        $this->filetype = mime_content_type($filename); // this only works for images (css and js are text/plain)
        $this->last_modified = filemtime($filename);

    }

    /**
     * @return string the file's modification time as an HTTP date
     */
    private function lastModified()
    {

        return gmdate('D, d M Y H:i:s', $this->last_modified) . ' GMT';

    }

    /**
     * @return string the ETag value (ETags are sent wrapped in double quotes)
     */
    private function etag()
    {

        return '"' . md5((string)$this->last_modified) . '"';

    }

    /**
     * @return bool true if the client's cached copy is still current
     */
    private function notModified()
    {

        // If-None-Match takes precedence over If-Modified-Since (simple exact match only)
        if (isset($this->client_headers['if-none-match']))
            return trim($this->client_headers['if-none-match']) == $this->etag();

        return isset($this->client_headers['if-modified-since'])
            && $this->client_headers['if-modified-since'] == $this->lastModified();

    }

    /**
     * @return bool
     */
    public function CSSandJS()
    {

        // always send headers
        header('Last-Modified: ' . $this->lastModified());
        header('ETag: ' . $this->etag());
        header('Expires: ' . gmdate('D, d M Y H:i:s', strtotime('+1 year')) . ' GMT');
        header('Cache-Control: max-age=' . self::MAX_AGE);

        // exit if not modified
        if ($this->notModified()) {

            // 304 Not Modified
            http_response_code(304);
            header('X-304: true');

            exit;

        }

        if (substr($this->filename, -3) == '.js')
            header('Content-Type: application/javascript');

        if (substr($this->filename, -4) == '.css')
            header('Content-Type: text/css');

        header('X-304: false');

        return true;

    }

    /**
     * @return bool
     */
    public function images()
    {

        // always send headers
        header('Last-Modified: ' . $this->lastModified());
        header('ETag: ' . $this->etag());
        header('Expires: Thu, 31 Dec 2099 23:59:59 GMT');
        header('Cache-Control: max-age=' . self::MAX_AGE);
        header('Content-Type: ' . $this->filetype);

        // exit if not modified
        if ($this->notModified()) {

            // 304 Not Modified
            http_response_code(304);
            header('X-304: true');

            exit;

        }

        header('X-304: false');

        return true;

    }

}

 

As you can see, all we're doing here is telling the new visitor: here is the resource, don't come back to download it again if it hasn't been modified since you last saw it. Until the cached copy expires, the browser can reuse it without contacting the server at all. When it does check back (for example, on a reload), it sends along the ETag and modified date it stored - and if nothing has changed, the PHP file answers with a 304 and exits without delivering content the client presumably has already.
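
The ETag half of that check can be sketched as a pair of pure functions. The names etagFor and etagMatches are hypothetical. The tag is an MD5 of the modification time, as in the class above, wrapped in double quotes because that is how ETag values are written on the wire. Weak validators (the W/ prefix) are deliberately not handled here.

```php
<?php
// Hypothetical helper: derive an ETag from a file's modification time.
function etagFor(int $lastModified): string
{
    // ETag values are sent wrapped in double quotes
    return '"' . md5((string)$lastModified) . '"';
}

// Hypothetical helper: $ifNoneMatch is the raw If-None-Match header, or null if absent.
function etagMatches(?string $ifNoneMatch, int $lastModified): bool
{
    if ($ifNoneMatch === null) {
        return false;
    }

    // the header may carry several comma-separated tags, or "*" for "any version"
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);

        if ($candidate === '*' || $candidate === etagFor($lastModified)) {
            return true;
        }
    }

    return false;
}

$mtime = 1478952000;

var_dump(etagMatches(etagFor($mtime), $mtime));     // bool(true): client copy is current
var_dump(etagMatches(etagFor($mtime), $mtime + 1)); // bool(false): file changed since
```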

 

Happy Friday!

Josh