When I chose to write about output buffering for this year’s PHP Advent, my depth of knowledge on the subject was very limited. I picked a topic that I could learn well, and then explain thoroughly without writing an entire book. It’s a feature that will likely be new to beginners, but which even intermediate and advanced users may not have used much. Output buffering has simple, practical applications, and it can also play a role in more complicated systems. It is one of those tools that you might not realize you need if you don’t know that it exists, and it is my pleasure to introduce you to it.
PHP’s output buffering trinity
A typical PHP installation actually has three different layers of output buffering. The layer closest to the client is controlled by the output_buffering directive in php.ini. This setting can be set to On, Off, or an integer that represents the number of bytes at which PHP should flush the buffer. The purpose of this buffer is to control how much data is sent to the browser at a time. The options are fairly self-explanatory; Off sends the data immediately, and On collects the entire output of the script and sends it all at once. I’ll call this layer the output buffering layer.
I call the next layer the flush layer. It is another control that can simply be turned On or Off using the implicit_flush directive in php.ini. When implicit_flush is on, every output operation flushes immediately to the output buffering layer; otherwise, you have to call flush() to manually flush this buffer. By default, implicit_flush is disabled, except when using the CLI SAPI. This is a sensible default, because the constant flushing can generate a lot of overhead, particularly when output_buffering is disabled.
If the purpose of the output buffering layer is to control how much data is output, this layer’s purpose is to control when data is output.
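To see how these two layers are configured on a given installation, a script can read both directives at runtime. This is only a minimal sketch; the variable names are my own, and the values you see will depend on your php.ini and on the SAPI you run it under.

```php
<?php
// Read the two php.ini directives described above.
// ini_get() returns the value as a string, or false if the directive is unknown.
$outputBuffering = ini_get('output_buffering');
$implicitFlush   = ini_get('implicit_flush');

// var_export() makes empty strings and "0" easy to tell apart.
echo "output_buffering: ", var_export($outputBuffering, true), "\n";
echo "implicit_flush:   ", var_export($implicitFlush, true), "\n";
```

Running it once from the command line and once through a web server is a quick way to compare the defaults of the two environments.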
The last layer is the userspace output buffer, which is controlled by the various ob_* functions. It provides far greater control than the other layers, as well as greater flexibility. While this layer can be used to control how much data is sent and when the sending occurs, those are just two of its many tricks. The true purpose of this layer is to provide control over which data is output. I call this layer the ob layer, and it is the primary focus of this article.
The ob layer
Let’s start with a simple example:
<?php
ob_start();
echo "Here is some text.\n";
header('X-Some-Header: Some value');
ob_flush();
The above example is very simple, and it should be fairly obvious what is going on. First, we create an output buffer by calling ob_start(). From this point on, anything we output will be stored in this buffer. When we call ob_flush(), the contents of the buffer created by ob_start() are flushed to the next output buffer layer, which should be the flush layer, in this case. It’s that simple to create an output buffer.
One use of output buffers that you are sure to hear about is the ability to send a header after you output something. Since output is held in a buffer, you can still send headers and avoid the infamous “headers already sent” warning. This is particularly useful if you need to use a function that writes directly to the output, but you aren’t quite ready for it to do so. Some people argue that buffering output so that you can send headers later adds to the complexity of the code. Regardless of whether you agree, this feature only scratches the surface of output buffer utility. Let’s take a look at another example:
<?php
// The handler receives the buffered output and returns the processed version.
function output_handler($output) {
    return "<OB>\n" . $output . "</OB>\n";
}
ob_start('output_handler');
echo "This output just got handled.\n";
ob_end_flush();
echo "Some text outside of the buffer.\n";
You’ll notice two important additions to this code. First, ob_start() takes a callback or closure as its first argument. (I highly recommend using a string callback, which I’ll explain later.) The function referenced by that argument should take the content of the output buffer as its first argument, and it should return a string containing the processed output. The second thing you ought to notice is the call to ob_end_flush(). This will flush the current buffer and close it, so that future output does not use it. If you ran this code, you would see that only the content from the first echo is wrapped in the <OB> tags:
<OB>
This output just got handled.
</OB>
Some text outside of the buffer.
At this point, it should be easy to start dreaming up some uses for output buffers. The output handler argument gives you a lot of flexibility to process any amount of output without having to manage concatenating all of your output into a single variable.
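As one possible use, here is a rough sketch of a handler that collapses runs of whitespace to shrink the output. The function name collapse_whitespace() is hypothetical, and the regular expression is deliberately naive (it would also mangle preformatted text), so treat it as an illustration rather than something to deploy.

```php
<?php
// Hypothetical handler: replace every run of whitespace with a single space.
function collapse_whitespace($output) {
    return preg_replace('/\s+/', ' ', $output);
}

ob_start('collapse_whitespace');
echo "<p>\n    Hello,\n    world!\n</p>\n";
// Flush the buffer through the handler and close it.
ob_end_flush();
```

Every echo between ob_start() and ob_end_flush() passes through the handler, so none of the calling code has to know that the whitespace is being rewritten.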
You can already do some neat things with what we’ve learned so far, but PHP’s output buffer support goes much further. If you would prefer to apply different buffers to different pieces of output, simply call ob_end_flush() followed by a second ob_start():
<?php
function handler1($output) {
    return "<OB1>\n" . $output . "</OB1>\n";
}
function handler2($output) {
    return "<OB2>\n" . $output . "</OB2>\n";
}
ob_start('handler1');
echo "Output from the first output buffer.\n";
ob_end_flush();
ob_start('handler2');
echo "Output from the second output buffer.\n";
ob_end_flush();
Predictably, this outputs the following:
<OB1>
Output from the first output buffer.
</OB1>
<OB2>
Output from the second output buffer.
</OB2>
Nesting output buffers
You might discover that sometimes you want to use one output buffer on most things, and a separate output buffer on a small portion of your output. In this case, you can nest two (or more) output buffers. To do so, simply call ob_start(), then call it again before calling ob_end_flush(). Your output will be handled by the most recently opened buffer first, and work its way back to the first buffer that you opened.
<?php
function parent_handler($output) {
    return "<PARENT>\n" . $output . "</PARENT>\n";
}
function child_handler($output) {
    return "<CHILD>\n" . $output . "</CHILD>\n";
}
ob_start('parent_handler');
echo "Part of the parent ob.\n";
echo ob_get_level() . "\n";
ob_start('child_handler');
echo "Part of the child ob.\n";
echo ob_get_level() . "\n";
ob_end_flush();
echo "Back in the parent ob.\n";
echo ob_get_level() . "\n";
ob_end_flush();
Here is the output:
<PARENT>
Part of the parent ob.
1
<CHILD>
Part of the child ob.
2
</CHILD>
Back in the parent ob.
1
</PARENT>
Because it is so easy to start a new output buffer, it is important to keep track of which buffers you already have open. In the previous example, you’ll notice the calls to ob_get_level(), which always returns an integer to describe how many output buffers are open. Since there is no way to switch to a parent without closing the child, this also happens to describe the level of the current output buffer.
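One way to put ob_get_level() to work is to record the level before opening any buffers and later unwind back to it. The sketch below assumes you want to discard whatever is still buffered; the variable names are my own.

```php
<?php
// Remember how many buffers were already open (php.ini may have started one).
$baseLevel = ob_get_level();

ob_start();
ob_start();
echo "This text is discarded.\n";
$openedLevel = ob_get_level(); // two levels above $baseLevel

// Close everything we opened, discarding whatever is still buffered.
while (ob_get_level() > $baseLevel) {
    ob_end_clean();
}

echo "Levels opened: ", $openedLevel - $baseLevel, "\n"; // prints "Levels opened: 2"
```

Comparing against a recorded baseline, rather than against zero, avoids closing a buffer that the configuration or some other piece of code opened before yours.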
There are two more functions which are useful for keeping track of your open output buffers: ob_get_status() and ob_list_handlers(). ob_get_status() returns an array with the current level and the name of the callback function, as well as some other details about the buffers. If you call it with TRUE as the first argument, you will not only get details about the current output buffer level, but also all of the other levels that are currently open. ob_list_handlers() will return an array of all of the output handlers that will process the current output buffer. Earlier, I recommended using a string callback instead of a closure, because these functions can only tell you the name of the handler function if the handler function actually has a name.
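Here is a small sketch of both functions in action. The handler name tag_handler() is hypothetical, and the number of buffers reported depends on whether your configuration has already opened one, so I have not shown the output.

```php
<?php
// Hypothetical named handler; a named function is listed under its own name.
function tag_handler($output) {
    return "<TAG>" . $output . "</TAG>";
}

ob_start('tag_handler');

// Collect the information while the buffer is still open...
$handlers = ob_list_handlers();
$status   = ob_get_status(true); // one entry per open buffer

// ...then discard and close the buffer before printing anything.
ob_end_clean();

print_r($handlers);
echo count($status), " buffer(s) were open.\n";
```

The results are gathered into variables first because anything printed while the buffer is open would itself be buffered and passed to the handler.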
There are some other important output buffering functions. ob_clean() immediately discards the contents of a buffer but leaves the buffer open. ob_end_clean() discards the contents of the buffer and then destroys it. ob_get_length() returns an integer which represents the size of the current buffer in bytes. ob_get_contents() returns a string of the entire contents of the buffer, leaving the buffer in place.
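The following sketch exercises these four functions on a single buffer. The variable names are my own, and nothing from inside the buffer reaches the client because the buffer is cleaned rather than flushed.

```php
<?php
ob_start();
echo "draft";

$length   = ob_get_length();     // 5: bytes currently held in the buffer
$contents = ob_get_contents();   // "draft": the buffer itself is untouched

ob_clean();                      // throw the draft away, keep the buffer open
echo "final";
$afterClean = ob_get_contents(); // "final": only the new output remains

ob_end_clean();                  // discard "final" too, and close the buffer

echo "$length $contents $afterClean\n"; // prints "5 draft final"
```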
There are two curiously named functions: ob_get_clean() and ob_get_flush(). You might expect ob_get_clean() to work more or less like ob_get_contents(), followed by ob_clean(), but you’d be wrong. In fact, ob_get_clean() actually works more like ob_get_contents() followed by ob_end_clean(). It returns the output buffer’s contents as a string, and discards and closes the buffer. ob_get_flush() follows the same pattern: it returns the contents as a string, flushes them, and closes the buffer. ob_end_get() seems like a more reasonable name to me, but would it really be PHP if all of the names made sense?
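A common use of ob_get_clean() is capturing the output of a function that writes directly to the output. In this sketch, print_greeting() is a hypothetical stand-in for such a function.

```php
<?php
// Hypothetical helper that writes straight to the output.
function print_greeting($name) {
    echo "Hello, $name!";
}

$levelBefore = ob_get_level();

ob_start();
print_greeting('World');
$captured = ob_get_clean();   // returns the contents AND closes the buffer

$levelAfter = ob_get_level(); // back to $levelBefore: the buffer is gone

echo strtoupper($captured), "\n"; // prints "HELLO, WORLD!"
```

Because the buffer is closed by ob_get_clean() itself, there is no matching ob_end_clean() call to remember.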
Conclusion
People are using output buffers to solve all sorts of problems already, and a bit of Googling will lead you to some interesting, terrible, and amazing ideas. Some of them are simple problems like replacing content (censoring profanity, adding HTML abbreviations, etc.) or stripping the unnecessary white space from output to reduce the output size. Others have found more complicated uses for output buffers, including templating engines, custom caching solutions, streaming, and more robust output buffer interfaces. The wide variety of applications that people have found for output buffers is a testament to their flexibility and utility. I hope this introduction has been helpful and has left you with some ideas about how to use output buffers in your next project.