WEB Advent 2009 / Automate Your Deployment

Deploying your web site to a server can be an exercise in frustration, and it is easy to make mistakes. There’s a lot to remember during deployment, from using the correct transfer system to ensuring that caches are cleaned. We aren’t very good at doing the same thing twice in exactly the same way. Computers, however, are. Automating the deployment steps reduces our stress and the number of mistakes we make.

Getting your house in order

Before we start, there’s a certain amount of housekeeping that needs to be in place to make deployment easy and consistent.

Source code control

It all starts with control of your source code. If you’re not using a version control system such as Subversion or Git, then sort this out first! These systems add discipline to your development processes. You should use a branch for each new feature (and bug fix) that you develop.

The most important reason to branch is that many clients will approve work in a different order than it is requested. You may have multiple features in development, simultaneously, and need to put them live independently of each other, as they are approved. Branching allows this to happen without accidentally releasing features that are not yet ready.

You also need to be able to handle an urgent bug fix for the live site. This situation always seems to occur when you are in the middle of a significant block of work that has dramatically changed the code in question. A bug that is noticed on the live site needs to be fixed, immediately. If you are not using branches, then you have to separate the work you have been doing, which is not fun and is prone to mistakes.

Using branches ensures that you can always have a version of the site ready to go live. There are two approaches that can be used to organize your source code repository to achieve this: trunk-based deployment, where trunk is always ready to go live, and live-based deployment, where you create a branch called live that is always ready to go live.

With trunk-based deployment, you use branches for integration work that may be required between features. With live-based deployment, you create a branch called live and use trunk for integration work. The latter allows you to cherry-pick and merge just the wanted code into the live branch, and then deploy. In general, for smaller teams, trunk-based deployment works well, and for larger teams, the additional control provided by live-based deployment is very beneficial.

Database considerations

While you are making deployment of the source code easier, you should also pay some attention to your database management strategies. Specifically, you need to control the schema across your live, staging, test, and development servers. As you develop new features on your web site, you will find that you need to change the database schema. The best way to handle this is to version your database with a special table in the schema containing the current version number. You can then write the schema changes that describe how to change to the next version. These are known as up deltas. For each one, you should also write a corresponding down delta that describes how to revert to the previous version of the schema.
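As a rough illustration of the up and down delta idea, here is a minimal sketch that works out which statements to run to get from one schema version to another. The planMigration() function, the table, and the SQL are all made up for this example, and this is not how DbDeploy or any other tool does it internally. After running the statements, you would also update the version number stored in your version table.

```php
<?php
// Hypothetical deltas: entry N moves the schema from version N-1 to N ("up")
// and back from N to N-1 ("down"). Table and column names are invented.
$deltas = [
    1 => ['up'   => 'CREATE TABLE users (id INT PRIMARY KEY)',
          'down' => 'DROP TABLE users'],
    2 => ['up'   => 'ALTER TABLE users ADD COLUMN email VARCHAR(255)',
          'down' => 'ALTER TABLE users DROP COLUMN email'],
];

/**
 * Return the SQL statements, in order, that move the schema
 * from version $current to version $target.
 */
function planMigration(array $deltas, int $current, int $target): array
{
    $statements = [];
    if ($target > $current) {
        // Going up: apply each up delta from the next version onwards.
        for ($v = $current + 1; $v <= $target; $v++) {
            if (!isset($deltas[$v])) {
                throw new RuntimeException("No delta for version $v");
            }
            $statements[] = $deltas[$v]['up'];
        }
    } else {
        // Going down: apply down deltas in reverse, newest first.
        for ($v = $current; $v > $target; $v--) {
            if (!isset($deltas[$v])) {
                throw new RuntimeException("No delta for version $v");
            }
            $statements[] = $deltas[$v]['down'];
        }
    }
    return $statements;
}
```

Calling planMigration($deltas, 0, 2) gives both up deltas in order, while planMigration($deltas, 2, 1) gives only the down delta for version 2.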

There are a number of tools that will manage this type of database migration for you. The most commonly used projects are DbDeploy, LiquiBase, and the migration code in Doctrine. Of course, you could also write your own scripts in PHP.

Code considerations

You need to ensure that your source code is prepared for automatic deployment. This means that you need to be aware that the same source is running in different environments, and the code must be able to handle this. Usually, this means that you need to control your configuration, and you should load the correct settings based on where the code is deployed. One mechanism is to automatically detect the environment, based on the URL being used to access the site. This usually means that you have specific keywords in the URL for development, testing, and staging, and then assume that you are loading the live configuration if none of these are present. Another option is to set an environment variable (using SetEnv) within the Apache VirtualHost definition, and then use the getenv() function to retrieve it within your source code.
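Both mechanisms can be combined in a few lines. This is only a sketch: the detectEnvironment() function, the APP_ENV variable name, and the host name keywords are assumptions for the example, so substitute whatever your own servers use.

```php
<?php
/**
 * Work out which configuration to load.
 * $envVar is whatever getenv() returned: a string, or false if unset.
 */
function detectEnvironment(string $host, $envVar = false): string
{
    // Prefer an explicit setting, e.g. "SetEnv APP_ENV staging"
    // in the Apache VirtualHost (APP_ENV is a made-up name).
    if (is_string($envVar) && $envVar !== '') {
        return $envVar;
    }

    // Otherwise look for a keyword among the parts of the host name,
    // so dev.example.com is development and staging.example.com is staging.
    $keywords = [
        'dev'     => 'development',
        'test'    => 'testing',
        'staging' => 'staging',
    ];
    foreach (explode('.', strtolower($host)) as $label) {
        if (isset($keywords[$label])) {
            return $keywords[$label];
        }
    }

    // No keyword found: assume this is the live site.
    return 'live';
}

$environment = detectEnvironment($_SERVER['HTTP_HOST'] ?? 'localhost', getenv('APP_ENV'));
```

Checking the environment variable first means a server can always override the guess made from the host name.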

Automation strategies

When deploying to a server, there are a number of things that you need to consider when planning your strategy. The most important is how you will transport the files to the server. FTP, rsync, and SVN (checkout or export) are often easiest, depending on what is available on the server. Once the files are uploaded, you need to consider file permissions on the server and ensure that your HTTP daemon can write to the necessary locations, such as cache folders. You also need to ensure that the files within caches are cleaned out, and, if required, you must prime these caches with new data.
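Cleaning out a cache folder is a good example of a step that is simple to script. The sketch below assumes a file-based cache in a single directory; clearCache() is a name invented for this example, and the path in the usage comment is hypothetical.

```php
<?php
/**
 * Delete everything inside $cacheDir but keep the directory itself.
 * Returns the number of files removed.
 */
function clearCache(string $cacheDir): int
{
    $removed = 0;

    // CHILD_FIRST visits a directory's contents before the directory,
    // so each directory is already empty by the time we remove it.
    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($cacheDir, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::CHILD_FIRST
    );

    foreach ($items as $item) {
        if ($item->isDir() && !$item->isLink()) {
            rmdir($item->getPathname());
        } else {
            // Regular files, and symlinks (the link only, not its target).
            unlink($item->getPathname());
            $removed++;
        }
    }

    return $removed;
}

// Usage (hypothetical path):
// clearCache('/var/www/example.com/cache');
```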

The deployment plan

For every different way that a site needs to be deployed, you should write a deployment plan. The deployment plan is a set of step-by-step instructions on how to deploy your site to the server. By declaring these steps, you make it possible to follow a specific procedure for deployment. Once the plan is complete, you should test the procedure manually before automating it.

The typical steps in a deployment plan are:

  1. Tag the release.
  2. Publish an “under maintenance” page.
  3. Transfer the files to the server.
  4. Set permissions on these files, as required.
  5. Delete old cache files.
  6. Run database migration scripts, as necessary.
  7. Remove the “under maintenance” page.

Obviously, a specific plan will detail exactly how each step is performed, at a level that even your boss could follow!
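Once written down, the plan maps naturally onto an ordered list of named steps. The following is a minimal sketch of that idea, not a finished tool: runPlan() is a hypothetical helper, and the empty closures stand in for the real commands for your site.

```php
<?php
/**
 * Run the named steps in order and stop at the first one that throws.
 * Returns the names of the steps that completed.
 */
function runPlan(array $steps): array
{
    $completed = [];
    foreach ($steps as $name => $step) {
        try {
            $step();
        } catch (Exception $e) {
            // Report which step failed; later steps are not run.
            throw new RuntimeException("Step '$name' failed: " . $e->getMessage(), 0, $e);
        }
        $completed[] = $name;
    }
    return $completed;
}

// Placeholder steps in the order given in the plan above.
$plan = [
    'tag release'          => function () { /* e.g. svn cp trunk to a tag */ },
    'maintenance page on'  => function () { /* publish the holding page */ },
    'transfer files'       => function () { /* rsync, FTP, or svn export */ },
    'set permissions'      => function () { /* make cache folders writable */ },
    'clear caches'         => function () { /* delete old cache files */ },
    'migrate database'     => function () { /* run the up deltas */ },
    'maintenance page off' => function () { /* remove the holding page */ },
];

$completed = runPlan($plan);
```

Stopping at the first failure is a deliberate choice here: if the file transfer fails, for example, you do not want the database migration to run anyway.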

Tools for automation

With a deployment plan in place, it’s time to automate! There are a lot of tools out there; you should pick the one that suits you best. Common tool choices are Phing, Ant, and Capistrano, but many projects have custom deployment scripts.

Write your own

Essentially, a deployment plan boils down to a set of steps that are usually discrete command-line scripts. At its most basic level, an automated deployment script merely has to glue these commands together. It doesn’t matter what you use, but given that I know PHP, it would be my choice. Each step should be its own discrete method, and a typical method would use system() or exec() to run external applications. For example:

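function tag($date)
{
    $trunk = "{$this->baseUrl}/{$this->website}/trunk";
    $tag   = "{$this->baseUrl}/{$this->website}/tags/$date";

    // Keep the command on a single line (a newline inside the string would
    // end the shell command early) and escape every argument.
    $cmd = 'svn cp -m ' . escapeshellarg('Tag for automatic deployment')
        . ' ' . escapeshellarg($trunk) . ' ' . escapeshellarg($tag) . ' 2>&1';

    // exec() collects the output lines and the exit status for us.
    exec($cmd, $output, $returnValue);

    if ($returnValue !== 0) {
        throw new Exception("Tagging failed.\n" . implode("\n", $output));
    }

    return "Tagged to $date\n";
}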

Of course, all key variables would be set using the local configuration, as the core algorithm is generic. An entire deployment script would then simply require a number of similar methods that are called in the correct order.

The obvious advantage of writing a custom deployment script is that it can be designed to fit your way of working exactly. There are also a number of disadvantages to writing your own script, such as maintenance, so you might want to explore one of the common build systems. There are many choices, but I recommend Phing, which is written in PHP and based on Apache Ant.

Phing

Phing is easily installed using PEAR and integrates easily with Subversion and the DbDeploy database migration system. It uses XML configuration files that contain targets that can depend on other targets. This allows creation of targeted deployment recipes that can be easily maintained.

A typical Phing build script that does the same as the tag() function, above, looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
<project name="BRIBuild" default="deploy" basedir=".">
    <tstamp>
        <format property="date" pattern="%Y%m%d-%H%M" />
    </tstamp>
    <property file="build.properties" />

    <property name="trunkpath" value="${svnpath}/${website}/trunk" />
    <property name="tagpath" value="${svnpath}/${website}/tags/${date}" />

    <target name="deploy" depends="tag" />

    <target name="tag" description="Tag trunk">
        <exec command="svn cp -m 'Tag for automatic deployment' 
            ${trunkpath} ${tagpath}" />

        <echo msg="Tagged trunk to ${date}" /> 
    </target>
</project>

This can be executed on the command line simply by typing:

phing deploy

One nice thing about XML is that with well-chosen tag and attribute names, the file becomes self-explanatory. The build.properties file that is referenced above allows for all configuration values to be stored outside of your build file, and each target contains a number of tasks. In this case, we use the exec task to execute the Subversion command-line client to perform the tag for us. Again, it is easy to create a set of targets that perform all of the tasks in your deployment plan. A set of meta targets, like deploy in the script above, is then used to tie the actions together in the correct order.

Phing addresses the main disadvantages of writing a custom deployment script. Because it is widely used, it also benefits from a wide ecosystem that provides hooks into continuous integration systems like Xinc. There is also very good documentation available for Phing. It even supports many different file transports, including FTP, so there’s no excuse not to use it.

Rollback

No discussion of automatic deployment is complete without mentioning rollback. With automated deployment and a good staging system, mistakes tend to take place at a higher level, where the wrong code has been deployed or a release has been made at the wrong time. Rollback is the process of backing out of a deployment and returning the server to its previous state. As with your deployment, you should automate the rollback as a Phing target. The steps you are likely to take include:

  • Move the symlink back to the previous version of the deployed code.
  • Delete the deployed directory.
  • Roll the database back, using the down delta in your migration tool.

Obviously, you should be careful with database rollbacks. If your up delta causes a destructive change to the data, it may not be possible to go down again. In those cases, make sure you plan for the situation in your rollback strategy.
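The symlink step in the list above might look something like the sketch below. It assumes a layout where each release lives in its own directory and the web server's document root is a symlink to the current one; switchRelease() and the paths in the usage comment are hypothetical, and it relies on the operating system replacing the old link when the new one is renamed over it, as Linux does.

```php
<?php
/**
 * Point the symlink $link at $releaseDir.
 * A new link is created under a temporary name and then renamed over
 * the old one, so there is no moment when $link is missing.
 */
function switchRelease(string $link, string $releaseDir): void
{
    if (!is_dir($releaseDir)) {
        throw new RuntimeException("Release not found: $releaseDir");
    }

    $tmp = $link . '.tmp';
    if (is_link($tmp)) {
        // Left over from an earlier, interrupted switch.
        unlink($tmp);
    }

    if (!symlink($releaseDir, $tmp) || !rename($tmp, $link)) {
        throw new RuntimeException("Could not switch $link to $releaseDir");
    }
}

// Usage (hypothetical paths): roll back to the previous release.
// switchRelease('/var/www/current', '/var/www/releases/20091201-1030');
```

The same function serves for deployment and rollback, because the only difference between the two is which release directory you point the link at.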

I hope that this article has persuaded you to consider an automated deployment and rollback system. The lowered blood pressure alone is worth the time investment, and the number of deployment mistakes you will make will decrease dramatically. Fortunately, writing deployment scripts in a build system like Phing is easy — we set up our internal deployment scripts in less than half a day. With automated deployment and rollback, going live is no longer a scary process and is (almost) a stress-free exercise.
