WEB Advent 2009 / 1,500 Lines of Code

Even the best of us can only write 1,500 lines of code a day, so we need to make those lines count.

There were so many great articles in PHP Advent this year, I couldn’t think of a good topic — I like to believe my peers stole all the good ideas. :-)

A decade of PHP

This Advent marks the day I’ve been a professional PHP programmer for a decade. Ten years ago, I was consulting for a startup consisting mostly of Asians, which meant a lot of smoke breaks. During one of those particular breaks, a programmer asked me, “Hey, Terry, have you heard of PHP?”

“No, what’s that? Some new designer drug?”

“Haha! No, it’s a web language like ASP 1, but it’s free. You should check it out; you might like it more than Perl.”

Instead of smoking a pack of this in honor of that day, let’s begin with some startup stories spanning that decade.

The John Henry of C++

A few startups ago, I worked with a guy who was a better programmer than you.

One day, we got into an argument over a piece of open source import code — written in Python — that he had ported to C++. He had just finished telling me how much faster he had made it, when I asked, “What’s the point in that? Now that you’ve rewritten it, you own the maintenance of it. (Stoyan would disagree.) There is no evidence this code is even the bottleneck.” 2

The point? The code was crap, and he had fixed it — the massive improvement in efficiency was an added bonus.

“Look, you’re right. It’s true I prefer to use a crappy, ugly, underperformant language like PHP, and you crank out C++ like John Henry drives steel. But, while you’re busting code with a hammer in each hand, I’m the guy with the steam-powered jackhammer. Sure, you win, but your heart will burst, and you’ll die.”

“And here’s the thing,” I added with a devilish grin, “There’s only one you; there are a lot of people like me out there.”

1,500 lines of code a day

That philosophy started out with an observation I made while being the director of engineering at an earlier startup. It was the kind of startup that all programmers have to cut their teeth on at some point — the kind where you, like Laura, have to put in 170-hour weeks and program uphill both ways.

— Who would have thought 10 years ago that I’d be writing a PHP Advent article drinking Château d’chatalé?

Although I did not actually put in 170-hour weeks 3, I did sleep under my desk. I also didn’t program uphill both ways, but I did spend my days programming and my nights integrating all my engineers’ daily commits. It was during this nightly integration that I made a startling observation. No matter what the programming language, no engineer — including myself — wrote more than about 1,500 lines of code a day.

This observation affected my engineering judgements and decisions for the next decade. It is a maxim I live by today. If we can only write 1,500 lines of code a day, then a quarter of a million lines of code takes about a year. If we can only write 1,500 lines of code, we had better choose the right language and use it in the best way to maximize the expression of our creativity.
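As a rough check of that arithmetic, here is a minimal Python sketch. The 1,500 lines a day and the quarter-million target come from the text; the figure of 250 working days in a year is an assumption added for illustration, not something stated above.

```python
import math

LINES_PER_DAY = 1_500          # the daily ceiling observed in the text
TARGET_LINES = 250_000         # "a quarter of a million lines of code"
WORKING_DAYS_PER_YEAR = 250    # assumption: roughly 50 weeks of 5 days


def days_to_write(total_lines: int, lines_per_day: int = LINES_PER_DAY) -> int:
    """Whole days needed to write total_lines at a fixed daily rate."""
    return math.ceil(total_lines / lines_per_day)


days = days_to_write(TARGET_LINES)
fraction_of_year = days / WORKING_DAYS_PER_YEAR
print(f"{days} working days, about {fraction_of_year:.0%} of a working year")
```

Under these assumptions the flat-out figure is 167 working days. That leaves no room for integration, debugging, or days off, which is why "about a year" is a fair round number.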

When I told this to a friend, he added, “1,500 lines a day… except in Java. J2EE programmers only write five lines of code a day.” When I laughed, he added, “But they are five really good lines.”

— Java: It only takes three years to put in a hard day’s work!

The Death of PHP

Writing software is about making choices, and it helps to ask what language we choose to develop in. That is a choice.

Obviously, I think PHP is very frequently the right choice. The reason I choose PHP is that it is a web-based templating language that is simple, scalable, and pragmatic.

Choices have consequences. Everyone knows what consequences are. If not, there’d be a One Language to rule them all. And, we’re not Java developers. ;-)

One consequence of PHP is that it is now stuck between a rock and a hard place.

— Unfortunately, the rock and the hard place are not a bottle of Jameson and a pint of Guinness. Given the drinking habits of most PHP developers at major conferences, it is crazy not to be very, very worried for the future of the language.

On one end, the ubiquity of rich, Ajax-driven web sites means the inherent advantage of a templating language is no longer there, having been replaced by a much larger demand for design, as Helgi mentioned. If your view is entirely in HTML and CSS, your controller is in Javascript, and all your web end is doing is feeding the API data as JSON, as Christian discussed, then being able to embed stuff in a template is no longer a plus. So many real-world apps are now written this way that discussions of MVC and templating systems sound like they are so last Advent.

On the other end, social networks have sped up demands of data, so that they live more in RAM in the form of memcached than on disk in the form of a relational database. When making a web page was tied to a disk-bound database, performance discussions were pushed to database performance discussions, which really were discussions of disk performance. Discussing app server performance was as pointless as porting some ancillary import code from Python to C++, because the PHP part was always waiting on the database. In those cases, it made sense to trade off performance for simplicity, scalability, and directness.

As Ilia has already noted, web performance is now a complicated beast. With the advent of memcached and a highly connected social graph — which must be partitioned 4 — more computations have been pushed from the database to the app server. Now the inherent performance of PHP may become the rate-determining step, and PHP has never been a speed demon.

Good programming is a practice. It is reasonable to ask ourselves if PHP is still a good language to be applying our practice on.

Perhaps we should move on to Ruby on Rails?

Rails Fails

For years, I worked at a C- and C++-based web startup and kept quiet. But, when I returned to a startup based on PHP a few years ago, it was time to unleash a few years of pent-up Ruby on Rails fury in the form of an article. As Matt mentioned earlier, this article created a firestorm of defense from the Rails community and provided some easy jokes for my talks for the next couple years.

“First they ignore you, then they laugh at you, then they fight you, then you win.”

— Mahatma Gandhi

The ideas behind the article were simple:

  • Rails does not represent a threat to PHP.
  • If the PHP world had Twitter, we’d be trumpeting it, not pillorying it. 5
  • Nobody cares about this, because your language or framework choice isn’t going to get you a date.

In my days before startups — college years — the most “different” dorms 6 were the ones where residents 7 tended to walk around campus barefoot. It always caught me as a bit strange that all these people were being unique by doing exactly the same thing — walking around barefoot. 8 The fad that was Ruby on Rails reminded me of that; it always seemed to me that Ruby adoption was chasing something new, simply because it was new — being different without Thinking Different.

A couple of years ago, a few ex-Ruby developers who worked in the field of search told me that the Ruby world had even made a Rails version of MapReduce. That’s akin to trying to write a web server in PHP, and indeed they mentioned that the software was incapable of implementing any sort of reduce function—it was just a MapNoReduce. We had a good laugh about that over a couple beers and pizza. Championing such pointless stupidity while pillorying Twitter 9 — that’s the Rails community of the time.

Two and a half years later, I see that Twitter has broken into the top 20 web sites — being one of only two Rails-based sites in the top 100. 10 I can gloat about how the Rails community shouldn’t have been such assholes about a showcase application like Twitter.

— The creator of Ruby on Rails has two words for us.

But instead, I’ll point out that I was wrong.

Rails and PHP

Don’t fear. Rails is still no threat to PHP. 11 (I don’t think using Rails is going to get you more women, either.)

My mistake was implying that because Rails can’t be run easily in a shared hosted environment, Ruby might never find a niche as a web development language:

“Ruby is really good at what it does. The problem is, for what Ruby does really well, I can download WordPress. [Ruby is] really good at building those apps that have already been built before. PHP is good at finding out what the next WordPress is.”

For Rails and Rails apps to have anywhere close to the uptake of PHP and PHP apps like WordPress, SugarCRM, Joomla, Drupal 12, Magento, and MediaWiki, it’d have to run in such an environment. Without shared hosting, open source Rails applications were doomed. The entire thesis of what Matt said explains this symbiosis for PHP: PHP is almost always mod_php, and it integrates so well with Apache, the preferred web server for shared hosting, that there is an acronym for the combination: LAMP 13. All mod_ruby instances will share the same Ruby interpreter in Apache. Besides performance not scaling the way Apache was designed to scale, it means that someone else’s Rails code can trample your Rails code on the same server.

— Whenever @spooons gives me a batch of these to give away, it never lasts very long.

Ick!

Strength from weakness

I can trace the exact moment I found out that I was wrong. It was seven months before I wrote that article and the day after I became a PHP Terrorist. 14

— If the PHP world puts your face on a deck of cards, you must be a PHP terrorist.

In the Fall of 2006, I was being interviewed by Cal Evans. He asked me, “What was the most interesting thing you saw today at ZendCon?”

“The S3 data storage that Amazon and SmugMug showed is impressive, but the EC2 cloud computing stuff was the most interesting, because it is disruptive.”

In the book, The Innovator’s Dilemma, Clayton M. Christensen introduces the concept of disruptive technology, an innovation that breaks the way things have been done previously.

Virtual hosting is disruptive.

Yes, it’ll never be as cost-efficient as shared hosting, managed hosting, or colocation, but last I checked, film cameras are still cheaper than digital, and notebook hard drives store less and cost more than their desktop companions. All disruptive technologies start out inferior to that which they eventually supplant. Regardless of whether you were sitting there in that talk in 2006 or looking at all the Web 2.0 startups built on top of slicehost or EC2 today, virtual hosting was a rare instance where the disruption it would create is obvious.

If you are a Ruby on Rails developer who can’t choose shared hosting, what can you do? You do virtual hosting. And what do you need to do that? You need tools to manage it. And what language do you write those tools in? You write them in the language you know, Ruby.

The best documentation on doing virtual hosting is written by Ruby developers. The best tools for managing and setting up virtual hosts are in Ruby. Very often, the best web services built around virtual hosting are written in Rails.

Ruby found a niche in the web world.

The flaws in the design of Ruby on Rails forced Ruby to adopt technologies that work around them. Ruby is not a particularly good language, and Rails is not a particularly good web framework. Ruby on Rails never had a chance treading the same path as PHP via symbiosis with MySQL, Apache, and Linux and ubiquity on shared hosts. But to the extent that was a limitation, they were able to turn it into a strength, because it forced them to adopt solutions like virtual hosting, and the Ruby attitude spread adoption of New New Things like Git that actually turned out to be useful. You get the MapNoReduce, but you also get GitHub.

PHP was never about the language

What lessons can we learn from that?

If we want to find PHP’s strengths, we have to look at PHP’s weaknesses. Just as Rails didn’t tread PHP’s path, PHP won’t tread Rails’. I once said Rails is like a rounded rectangle, and PHP is like a ball of nails. When you throw PHP at something, it sticks.

— In fact, that became a conference shirt.

PHP has always been a web templating language that has never been especially performant. This weakness, which is causing its share to be eroded by Javascript UI libraries on one end, and the demands of app server performance on the other, points to the salient fact of the language: PHP is glue code.

PHP is simple, direct, and scalable. Simplicity made it a dynamically-typed programming language — performance would never be the best. The directness meant it is a templating language first — as went the Web, so went PHP. And, scalability meant it’d be dependent upon other software by pushing the difficult problems to them and outside PHP.

As Esser noted, PHP was never Ruby or Java. You’d never see us make a MapNoReduce when a real MapReduce is available called Hadoop, written in Java. Although CakePHP is based on Ruby on Rails, it assumes the existence of the Apache web server; Rails’s “convention over configuration” eliminates even that.

It was never about the language, and we always knew it. We invented the term LAMP in homage to other parts. It’s just a matter of finding more parts to bind to, to overcome those weaknesses.

There is no “Not Invented Here” here, because almost all of the PHP world was not.

Without PHP, it’d just be LAM(e)

Now, some of you might be saying, “I can see where you are headed here, but, this is the PHP Advent; shouldn’t you be advocating PHP? Aren’t you a PHP terrorist?”

As Ben said, PHP’s not just a language. PHP, the language, is only supposed to do one thing: solve the web problem, and do it well.

A Java person says, “Look at it this way, Java is like a knife. A knife can do many things. A good one is essential for cooking; it can cut open packages, cut a cord, and it really comes in handy in a fist fight. Java can do a lot of things and do them very well.”

A PHP person, if they had to fight, would rather have something that is designed to solve that one problem, and solve it well.

A PHP person says, “That knife is nice, but I’d rather have a gun.”

PHP is not about the language; it’s about the attitude.

— Next time I meet this really short girl who notices my scar, I’ll tell her I got it in a knife fight with a Java developer.

Be a force multiplier

Let’s go back to the John Henry story. My basic rant centered around two points:

  1. If it’s not the bottleneck, it doesn’t need to be performant.
  2. Writing in a more performant language means writing more lines of code to do the same amount of work.

The first tells us that the solution to our problems is the same one it always has been: Don’t solve things in PHP; move the tough stuff outside the language, and bind to it. In the early days, it was the database; now it has to be something else. I’ll get to that later.

The second requires more explanation.

It is easy to forget that programming is not construction. We are not construction workers; we are engineers. We do not build things; our compiler does that. We design design documents that our compilers compile to machine code (or virtual machine code in the case of scripting languages). Our design documents may be written in a variety of ways called programming languages.

Some languages, like C, generate about 3 to 5 lines of machine code for every line of C. If your design document is that detailed, the builder (compiler) is bound to get it exactly the way you specified it, and you can make things very efficient in terms of performance.

But, higher level languages work differently. Something like BASIC generates about 15 to 20 lines of assembly code for every line of BASIC. It’s less efficient in terms of performance, but far more efficient in terms of time, because each line does about five times more than a line of C. If you can only write 1,500 lines of code, your code does more work this way.

It’s hard to estimate how many lines of work a line of PHP does, because PHP is not simply a high level language; it is a scripting language, a dynamically-typed one at that, and one that is built on so many libraries. Just as Linux means not only the kernel, but also the entire GNU stack on top of it, PHP is not only the Zend interpreter, but also all the extensions and practices that we’ve stacked on top of it. In other words, a line of PHP leverages between thirty and thousands of lines of machine code.
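To put rough numbers on that leverage, here is a small Python sketch. The per-line ratios are the ones quoted above; the upper bound of 1,000 for PHP is only a placeholder standing in for "thousands", and the function name is made up for illustration.

```python
LINES_PER_DAY = 1_500  # the daily ceiling from the text, whatever the language

# (low, high) machine-level lines generated per source line, as quoted above.
# The PHP upper bound of 1,000 is a placeholder for "thousands".
EXPANSION = {
    "C": (3, 5),
    "BASIC": (15, 20),
    "PHP": (30, 1_000),
}


def daily_leverage(language: str, lines_per_day: int = LINES_PER_DAY) -> tuple[int, int]:
    """Return the (low, high) machine-level lines one day of coding produces."""
    low, high = EXPANSION[language]
    return lines_per_day * low, lines_per_day * high


for language in EXPANSION:
    low, high = daily_leverage(language)
    print(f"{language:>5}: {low:>9,} to {high:>9,} machine-level lines per day")
```

With these ratios, a day of C comes to between 4,500 and 7,500 machine-level lines, while even the low end for PHP is 45,000. The numbers are crude, but the gap is the point.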

Even if you are the John Henry of C++, it’s hard to compete with that, and your heart is going to burst trying.

The attitude of PHP means adopting tools that act as force multipliers for the lines of code you do write — we all seek that steam-powered jackhammer to make our coding lives easier.

Sure, we write only 1,500 lines of code a day, but nobody said those lines have to all be PHP.

Be the Borg

As I mentioned earlier, over half a decade ago, memcached changed the Web. It was originally written in Perl — later C — for a Perl-based web site. Instead of writing their own version of this like the Java world, the PHP world adopted it. Those that did thrived. 15 You can see on the homepage of memcached, where the plurality of sites are PHP 16 by a large margin: Wikipedia, Flickr, YouTube 17, Digg, WordPress, Craigslist, &c. If you look today, you see most of the recent contributors to it and libmemcached hail from the LAMP world or work on PHP-based websites — Facebook being the most notable website driving development.

If it’s better, why fight it? Why not join it? In other words, if it makes code management easier, who cares if GitHub is written in Rails? If it automates your deployment, as Rob mentioned, who cares if Capistrano is written in Ruby? And, if Twitter gets you noticed for a job, like Snook mentioned, or located like Andrei described, who cares what language it’s in?

It’s not like you are leaving PHP. The pragmatic attitude stays the same; it just adds the biological and technological distinctiveness of another language or project to your own.

Resolution for a new year (and a new decade)

I know it is the Advent, but I’d like to look ahead to the beginning of the new year — the beginning of a new decade.

A decade ago, someone on their smoke break pointed me in a direction that would consume the next 10 years of my life.

Five years ago, a good New Year’s resolution might have been to learn memcached, but what about now? How will we level up in the World of PHPCraft, as Sara asks? What was true years ago is no longer true today, for software is infinitely malleable, and change will happen, as Luke says.

They say it takes a decade of diligent practice to achieve mastery. I’ve put in my ten years, but I don’t feel like a master of anything. I can’t give you something specific with any certainty. Hopefully, I can — as my peers have already — point you in a direction.

If you believe virtual hosting is disruptive, then it couldn’t hurt to adopt some tools written for managing your AWS or slicehost, even if they’re Ruby-based. The same can be said for mastering other tools to manage and automate deployments, testing a la Greg, and other operational considerations. As design takes over development, similarly, development means more than just programming.

Furthermore, there are a lot of projects out there to overcome PHP’s weaknesses. Just one example is Gearman, which is a distributed API infrastructure. Getting PHP to work with Gearman has already been explained indirectly by Sean Coates. Other areas are Hadoop, MogileFS, CouchDB, and Tokyo Cabinet.

The Pragmatic Programmer recommends you learn a new language every year. I don’t necessarily agree, because what language you choose to learn is very important. From the weaknesses of PHP, two languages seem to be standout choices.

Heading in the direction of the user interface, you would have traditionally studied design and gotten more than a passing understanding of CSS and HTML, which Marco points out is essential today. Now, with Ajax-based UI, the development language to learn is definitely Javascript, as Ed has already mentioned. If I could add anything, I’d add a good UI framework like YUI or jQuery, or you could go into multi-domain problems like using server-side code to improve client-side bottlenecks, as Ilia mentioned.

Of course, that’s Javascript for the Web. Ed noted how Javascript is more prevalent than that, and Derick showed how PHP can also be found in unusual places.

Heading in the direction of solutions that offset performance, al3x, a Twitter developer, recommends Scala 18, but the most exciting language here for the PHP developer is Erlang. As a functional programming language focused on high concurrency, it may not seem like a language at all like PHP. It is, however, a very high-level scripting language that is strong where PHP is weak.

You can even go in all those directions at once as a PHP developer — Jan did — to CouchDB, a distributed MapReduce-based data store written in Erlang with a Javascript-based reduce library.

— Relax, Jan is on the job.

Summary

I used to joke that Matt didn’t want to create WordPress; he just wanted to post his crummy photos on his blog, and PHP was the shortest distance for him to that solution. WordPress was the result. He’s my boss now, so I can’t get away with saying that 19, but the spirit still holds true, as does the reason I made a mistake about Ruby on Rails. It’s not about writing a piece of blog software, it’s about making the next WordPress.

— Matt accidentally was late to this USA Today photo shoot, because I talked his ear off at lunch that day.

Maybe one of you will do that. That may seem an insurmountable goal now, but, as Lorna said, you get there one step at a time. When, a decade hence, you ring in the New Year sipping on a glass of Château d’chatalé, you can recall this article being part of the moment that pointed you in the right direction.

It’s not about PHP; it’s about the attitude.

As Paul said, code is our conversation — we can only express ourselves in 1,500 lines a day, so we had better choose them wisely. To do so, look to integrate the strengths that overcome your weaknesses. Writing good code is about force multiplication — make every line you write do the most.

(And, maybe one day, we can aspire to getting away with only 5 lines like those enterprise Java and J2EE developers.)

Coda

Whatever happened to that John Henry of C++? He’s now at Facebook as the technical lead of HPHP, the migration of their PHP codebase to C and C++. See, even after a decade of practice, we continue to fight the same battles testing our differing philosophies. :-)

They are not different for the sake of being different, but because we really are. The distinctiveness we forge with each choice we make, and every one of those 1,500 lines we choose to write, express that difference.

And that difference is one I can really celebrate this holiday season.

Footnotes

  1. Actually, ASP is a copy of PHP.

  2. Since the import code isn’t run continuously, nor is it part of a long chain where it is rate-determining, performance gains don’t help the overall site performance.

  3. Because there are only 168 hours in a week.

  4. Mr. Hansson doesn’t get to shart on sharding. Basically, while read access can now be scaled as fast as RAM, write access is still on a non-volatile store like a database. Horizontally scaling this access across multiple partitions allows it to be scaled independently and for the performance characteristic of such writes to be homogenous across servers.

  5. The original article is gone, but if you read the Twitter-DHH debate and look at the reality today, Zed was right. Zed wrote Mongrel, which is one of the workarounds to mod_ruby mentioned.

  6. There were seven (now eight) on-campus dorms at Caltech; they were called “houses.”

  7. Technically “members,” because you didn’t necessarily have to live in the house to belong to one.

  8. About 10% of the student body by my estimate. It was so prevalent that the lampoon edition of the newspaper had a letter to the editor, “I like sex with black crusty feet.”

  9. Translation from PR-Speak to English of Selected Portions of Rails Developer David Heinemeier Hansson’s Response to Alex Payne’s Interview. The funny thing is, after reading this, you’d think it would be Alex who would be fired later, but no, it was the CTO who chose Rails. Since then, Twitter has been a much more reliable site.

  10. I’m just bullshitting by assuming there might be another one. If you can’t find another Ruby on Rails site, let’s pick Hulu at 33 in the US and 155 on Alexa.

  11. We’ve been over this. The threats to PHP are the demands of performance and Ajaxian Javascript-based user-interface frameworks.

  12. While WordPress won best open source CMS this year, Drupal headlined politics by powering the White House, Recovery.gov, and Huffington Post. Congratulations!

  13. Yes, the P can be for Perl or Python, but PHP is really the software that created the concept of the stack.

  14. The origin of PHP trading cards. As part of the promotion of ZendCon, Cal Evans published a deck of playing cards with the names of PHP developers on it. (I was one of them.) It looked eerily similar to the Iraqi “Most-Wanted” deck of cards.

  15. Others that missed this, like Friendster, unfortunately did not.

  16. Of the 12 mentioned, 5 are PHP (Wikipedia, Flickr, Digg, WordPress, and Craigslist), 3 are Python (Yellowbot, YouTube, and Mixi), 2 are Perl (LiveJournal and TypePad, both Danga-related companies), 1 is Java (Bebo), and 1 is Ruby and Scala (Twitter).

  17. YouTube was re-written in Python sometime before or around the Google buyout in 2006, but the earlier versions were written in PHP. On the other hand, Hulu is written in Rails, so you know performance is not a big factor in linking static Flash streams from a CDN to people. Because of this, I count YouTube as more Python than PHP and claim the plurality instead of saying “half.”

  18. My major disagreement is that he thinks Scala replaces Ruby. I believe Amdahl’s law says you only need Scala for the parts that need to be concurrent.

  19. I jest. I never used the word crummy. Matt’s photos aren’t crummy.