I may be a pedant, but I’m here to tell you that continuous deployment is a total misnomer. It’s discrete. A true continuous process would have code flowing naturally from your brain, through your fingers, keystroke by keystroke, to production, through some sort of bizarre stream of consciousness pipeline.
Now that I’ve got that out of the way, I’ll give you the common definition, which is that you ship changes on a commit-by-commit basis, or alternatively, as frequently as you desire. Most people picture a push to production for every single change, which is often not what actually happens.
In practice, most CD setups are automated, not automatic. That is, although every step from code commit to ready-to-push may be automated, at the end of the day there’s some kind of Big Red Button, one-line script, or IRC bot that you activate in order to deploy.
Why would I want to do CD?
Hipster cred. Seriously, though, the goal is to minimize the amount of work needed between finishing an idea and making it real. That’s a worthy goal.
A side benefit is that building the tools, processes, and skills needed for CD will make your working life better.
Good practices for CD
I put up a list of practices in a talk and called them “requirements.” Laura Beth Denker called me on it. She said these things are not true requirements for doing CD, because people do it with none of these requirements satisfied. She’s right. Let me suggest instead that it would be a good idea to do some, if not all, of these things before adopting CD. Most of these practices are a good idea in any case.
The practices in the following sections are not all-encompassing, and I’m sure there are great practices of which I am totally unaware.
Continuous integration
This is the practice of having developer changes land on a shared version control system as they are completed. (The original version, in Extreme Programming, says “several times a day” but velocity is irrelevant in my view.) CI presumes you are using a VCS. You are, right? Since GitHub made version control a commodity, there is no longer any excuse to not have a source repository.
Build-on-commit
“But, I’m writing code in PHP/Python/Ruby, and I don’t need to build anything. That’s for compiled languages.” It may be true that, in your environment, the build part of the process is a NOOP. However, it may involve steps like:
- Minification of CSS and JS
- Running a localization script to combine templates with string files
- Converting your dynamic code to some kind of intermediate format
The net result of a build process like this is to spit out at the other end a build artifact which will be deployed.
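To make that concrete, here is a minimal sketch of a build step for a dynamic-language project, assuming templates that use `$placeholder` substitution and one dictionary of strings per locale. The function names, file naming scheme, and hash-based build ID are my own illustration, not a prescription.

```python
import hashlib
from string import Template
from typing import Dict


def render_localized(template_text: str, strings: Dict[str, str]) -> str:
    """Substitute $placeholders in a template with the strings for one locale."""
    return Template(template_text).substitute(strings)


def build_artifact(template_text: str,
                   locales: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Return {output filename: rendered content} for every locale."""
    return {
        f"index.{locale}.html": render_localized(template_text, strings)
        for locale, strings in locales.items()
    }


def artifact_id(files: Dict[str, str]) -> str:
    """Hash the built files so identical inputs always yield the same build ID."""
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode("utf-8"))
        digest.update(files[name].encode("utf-8"))
    return digest.hexdigest()[:12]


if __name__ == "__main__":
    template = "<h1>$greeting</h1>"
    locales = {"en": {"greeting": "Hello"}, "fr": {"greeting": "Bonjour"}}
    files = build_artifact(template, locales)
    print(artifact_id(files), sorted(files))
```

The point is less the templating than the shape: inputs go in, a named artifact comes out, and that artifact is the thing you test and deploy.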
Test-on-commit
This is where you take that build and run your tests on it. Tests. Yep, I said it. I sincerely hope this part of your process isn’t a NOOP. There are great tools for this now. My team uses Jenkins, and lots of people like Travis CI.
Recently, we’ve been automatically running tests on pull requests using Leeroy. Travis CI does this, too. It’s so nice to avoid a code review, because I can see that your pull request doesn’t pass tests!
Good test coverage
Having 100% unit test coverage is nice, but it can make developers over-confident. Full unit test coverage doesn’t cover integration testing or performance testing, and in an ideal world, your automation would do both of those things. The most important thing is to know what your tests cover and to have a realistic assessment of where the holes are and how much risk they carry.
A staging environment that reflects production
The developer’s own machine is a horrible staging environment. Get a real one. Make it as close to production as possible. The biggest issue I see in staging environments is what I call the “single box of fail.” In staging, you have one machine. In prod you have, say, ten web heads, one DB master, three DB slaves, some Redis or Memcache machines, and a queue, all running on different servers. With that many moving parts, you will have failures in production that a single staging box never shows you.
I have been bitten by this a hundred times, and it still bites me. It bit me last Wednesday, in fact, when it turned out our PostgreSQL puppet manifests were ever-so-slightly different in staging and production. Don’t be me. Make your own novel mistakes.
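One cheap guard against this kind of drift is to diff the effective settings of the two environments. The sketch below assumes you can already export each environment’s settings as a flat dictionary; how you get them (Puppet facts, a query against the database, a config file) is up to you. The setting names are real PostgreSQL parameters, but the values are made up for illustration.

```python
MISSING = "<missing>"


def config_drift(staging: dict, production: dict) -> dict:
    """Return {key: (staging value, production value)} for every key that differs."""
    drift = {}
    for key in sorted(set(staging) | set(production)):
        staging_value = staging.get(key, MISSING)
        production_value = production.get(key, MISSING)
        if staging_value != production_value:
            drift[key] = (staging_value, production_value)
    return drift


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    staging = {"max_connections": "100", "shared_buffers": "1GB"}
    production = {"max_connections": "200", "shared_buffers": "1GB", "ssl": "on"}
    for key, (s, p) in config_drift(staging, production).items():
        print(f"{key}: staging={s} production={p}")
```

Run something like this on a schedule and an ever-so-slight difference becomes a report instead of a surprise.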
Managed configuration
Speaking of Puppet, you should use something to manage configuration of your machines. Puppet, Chef, CFEngine — pick something.
Single-button deployment
You should have scripted, single-button deployment to all of the machines in your infrastructure. (You should also be able to deploy to individuals or groups.) Again, there are a ton of tools that will do this for you. Pick one, or roll your own.
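If you do roll your own, the core can be very small. This is a rough sketch, assuming a hypothetical inventory of host groups and a `push` callable that you supply (for example, a wrapper around whatever copy-and-restart mechanism you use) that returns `True` on success. None of these names come from a real tool.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical inventory; real host names would come from your own infrastructure.
INVENTORY: Dict[str, List[str]] = {
    "web": ["web1.example.com", "web2.example.com"],
    "db": ["db1.example.com"],
}


def select_hosts(inventory: Dict[str, List[str]],
                 groups: Optional[List[str]] = None) -> List[str]:
    """Return hosts in the named groups, or every host when groups is None."""
    names = groups if groups is not None else sorted(inventory)
    hosts: List[str] = []
    for name in names:
        hosts.extend(inventory[name])
    return hosts


def deploy(build_id: str,
           inventory: Dict[str, List[str]],
           push: Callable[[str, str], bool],
           groups: Optional[List[str]] = None) -> List[str]:
    """Push one build to each selected host, stopping at the first failure."""
    done: List[str] = []
    for host in select_hosts(inventory, groups):
        if not push(host, build_id):
            raise RuntimeError(
                f"deploy of {build_id} failed on {host}; completed: {done}"
            )
        done.append(host)
    return done
```

Calling `deploy("abc123", INVENTORY, push)` hits everything; passing `groups=["web"]` limits it to the web heads. Stopping at the first failure is a deliberate choice here, so that a bad build does not spread further than it has to.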
Failure plan
You need a plan for what you’re going to do if things go horribly wrong. In practice, there are two basic approaches:
- Rollback plan
  - This can be as simple as being able to deploy a previous build. Remember that rolling back code also means reversing any data migrations that shipped with it, so you need reverse migrations.
- Feature flagging
  - Build your code so that you can turn features off and on, or turn them on for sub-groups of users (beta testers, or employees, or a random 10% of the user population). This requires more forethought, but it’s like writing tests: an investment.
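Feature flagging is easier to picture with code. Here is a minimal sketch, assuming flags live in a plain dictionary with hypothetical `on`, `groups`, and `percent` keys. Hashing the flag name together with the user ID gives each user a stable bucket, so the same user keeps getting the same answer as you widen the rollout.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, flags: dict, groups=frozenset()) -> bool:
    """Decide whether a feature is on for this user."""
    config = flags.get(flag)
    if config is None or not config.get("on", False):
        return False  # unknown or switched-off flags are off for everyone
    if set(groups) & set(config.get("groups", ())):
        return True  # user belongs to an allowed group, e.g. employees
    # Otherwise fall back to the percentage rollout (0 means nobody else).
    return bucket(flag, user_id) < config.get("percent", 0)


# Hypothetical flag definition, for illustration only.
FLAGS = {"new_checkout": {"on": True, "groups": ["employees"], "percent": 10}}
```

With this shape, turning a misbehaving feature off is a one-key change (`"on": False`) rather than a deploy, which is the whole appeal when things go horribly wrong.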
Track, trend, monitor
Measurement is key to process improvement. From page load times, to conversions, to TCP sockets, every measurement gives you insight into something. And, remember, if you don’t have good visualization tools for your information, it’s only data.
Excellent source code management
Know thy tools, craftsman. At least one person on your team should know your SCM tool inside and out.
High levels of trust
If you have separate development and operations teams, or even if you have an awesome devops team, you won’t get far without trust. Trust is easy when things are going well; it’s when everything turns to crap and you’re firefighting that the finger pointing starts. So, start from the premise that people are trustworthy. Trust them to do their jobs. (And, if they can’t, coach them or sack them. It’s that simple.) It doesn’t matter what your job title is. You’re all on the same team. You all want the same thing.
Realistic risk assessment and tolerance
Know what your risks are, and have an appropriate level of tolerance. If you are trying to build something with five nines of uptime, you will want more assurance before deployment, and it may be that CD is not for you. Know what parts of your site are acceptable to break, and what parts need to be up, and act accordingly. Availability requirements are not the same across all components of a system. Managing this is key to resilient systems.
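It helps to turn “nines” into minutes. The arithmetic is simple: the allowed downtime is the unavailable fraction multiplied by the length of the period. A small sketch, assuming a 365-day year (the function name is mine):

```python
def downtime_budget_minutes(nines: int, period_days: float = 365.0) -> float:
    """Minutes of downtime allowed per period at the given number of nines."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10.0 ** -nines  # five nines -> 0.00001
    return unavailability * period_days * 24 * 60


if __name__ == "__main__":
    for nines in (2, 3, 4, 5):
        minutes = downtime_budget_minutes(nines)
        print(f"{nines} nines: {minutes:.2f} minutes per year")
```

Five nines works out to roughly 5.26 minutes of downtime a year, while three nines allows about 8.76 hours. A single botched deploy can eat the first budget entirely, which is why the tolerance you pick should drive how much assurance you want before pushing.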
Excellent code review
Code review has its place; on many projects I work on, every line of code is reviewed. On things that are less mission critical, or that have very experienced teams, this is less the case. Code review is useful both as developmental editing (“I know a better way to do that”) and as copy editing (“You made a critical typo here”).
Finally
If there are only two things you remember after you read this article, I would like them to be these:
- You should build the capability for continuous deployment, even if you never intend to do continuous deployment.
- As with everything else in life, the only true way to get good at deployment is to deploy a lot.
I’d be interested to hear about your thoughts and practices for CD — get in touch!