One of the greater challenges of writing good software is avoiding bugs. And, if that can’t be done, finding them. There is much to say about proper bug reporting, identifying issues and tracking down problems, but that is not what this post is about. This post is about what it means and what to do when someone specifically asks you, “Hey, can you optimize this database?”
Firstly, when you hear someone saying this, do not run! The problem may appear dire, but alas, it is most likely either something very simple, or something extremely complicated. When you’ve sat down and taken a deep breath, analyze. There are two types of analysis you can do:
- What does this person mean? You can figure this out by asking them more questions about the cause of their concern. Do they experience slow response times? Are there any specific pages in your application that load slowly? If it is not possible to get this information, proceed to:
- What does the application do? This will be more time consuming, but ultimately more accurate (and satisfying) than any human witness accounts you can gather. Incidentally, this task will make you painfully aware of how important it is to document the architectural components of your application, not just “the code.” For example, it’s much nicer when you know that your application more or less looks like this:
At this point, do not try to attempt to see how long things take to complete. Your job is to simply figure out how things work together. I’m saying “simply” because you can take a dummy approach here and follow through the diagram:
- I seek lolcats.
- I go through a load balancer.
- Am I being balanced to a correct web server?
- Is the load balancer strangely sticky?
- Do we need a load balancer?
- I’m at the web server—this is where the PHP magic happens.
- How are users recognized?
- What do users do?
- How do users find the lolcats?
- I’m at the line between web servers and the database.
- Is there something silly going on like queries in loops?
- Are we caching these requests? (memcached anyone?)
- Does every user need to come to the database anyway?
- I’m in the database!
- Whoa, that’s a giant query!
- Are we using proper indices?
- How big are these tables anyway?
- Too many/not enough joins?
- lolcats found!
After your analysis is complete—that is, once you know how things work together—you can start identifying points of concern, places that a lot of work is done, and where little or no work should be happening. In my experience, this mainly happens somewhere between points 4.5 and 5. (I will leave JavaScript and image load times optimization to someone more qualified…)
Independently, the PHP part of the application is simple and straightforward. The database executes each query very quickly, but what happens when the two combine? Are there any pages that send a lot of requests to the database? Do giant tables spend most of their time in a locked state? Are results that never change (and thus could be cached) loaded on each request? These are the questions you will have to answer, and you will only be qualified to do so after you have learned enough about the system.
Once you’ve analyzed the functionality, it’s time to check how often the troublesome pieces are used. Many tools can help you do this: load testing tools, page trackers, query logs, server logs. If the application is already public, you should be able to retrieve this data from the normal places. However, if you’re building a brand new application, you will have to depend a little on your Developer Sense (kinda like Spidey Sense, but better for building web apps) and estimate which requests/pages will get most traffic. I am going to leave this paragraph a bit short because the way to go about this estimation can be a topic in itself—and frankly, I have some peppermint bark to eat instead.
At the top of this article, I mentioned that optimization problems are either easy or difficult, but there is no middle ground. The reason for this is application design, or architecture. (That’s a dirty word to some.) Application design is a game that software architects play, where they try to remain really flexible so that you can add new features to the application, but at the same time meet specific requirements. There is also usually some discussion in regards to “Premature optimization is the root of all evil.” Sure, we have denormalized lookup tables, but is that kosher? In any case, if the software architects know what they’re doing—and with a little bit of luck (e.g. clients won’t change their mind at the last moment)—the application is designed well and the problems you will have to fix will be easy. Not necessarily obvious, but easy. Under this category fall things like (but not limited to):
- A query was slow because it was missing an index—by the way, indexing is kind of cliché in optimization. Sometimes I wonder why we can’t just throw some indices at our energy problem; surely it would make things more efficient.
- Caching was not turned on for a specific feature
- Data was not being retrieved from pre-generated tables
These and similar scenarios are things you should hope for if you do not have lots of time or energy to fix the underlying causes, but sometimes you will uncover larger problems in the application: poor design choices or no design at all will yield to a lot of refactoring or even rewriting code. The most common example of poorly designed applications are the various forum softwares floating around teh Interwebs. Forums are a good example because they are usually installed by developers with little experience who are looking for off-the-shelf products. They also tend to grow rapidly, with little experience on the maintainers’ parts, leading to poor performance. The only solution here is to redesign this part of the application. While some fixes may improve performance, those fixes will inevitably be temporary and won’t withstand additional stress on your system. (This is not to say those fixes aren’t ok—you should just remember that they’re not perfect solutions.)
At this point, you will be able to go to the person who asked you to “optimize the database” and provide some recommendations: either easy fixes or difficult fixes.
Over the years, I have learned and have taught others that any system you deal with has finite resources. Yes, it may be scalable—resources can be added—but at any given time, its total resources are fixed. Therefore, optimization of a system is a game of what you’re willing to give up to gain something else. This should be included in your recommendations, as well. The important part here is to provide very specific details while keeping the “big picture” (bleh, what a corporate phrase) in mind.
Here’s an example: Query X
takes 4 seconds to execute. Execution time can be decreased to 0.008 seconds on the current hardware by adding an index on field name
. It will take a developer one hour to add and test an index. This query is executed on the home page, which is responsible for approximately 70% of the traffic to the site.
Once your have your recommendations collected in a detailed fashion, you will want to talk to the stakeholders for the application and inform them of your findings. If not already available, you will want to guide them to have proper expectations of the system (“no, we can’t teleport things into the future”) and guidelines for performance: page load times, supported browsers, “How long will I wait for this page to load on my iPhone?” These guidelines should be very specific to the application and can only be created by those who have the most knowledge of the application. The analysis you have performed should be very helpful in creating these guidelines.
Optimizing systems is interesting because it is broad and specific at the same time. If you’re tasked with “optimizing a database,” you will often find yourself feeling the need for a psychic, or that you’re required to be an omniscient being, knowing what’s going on in every part of the system at the same time. Trust me, though, you can handle it.
Happy holidays!