WEB Advent 2010 / Text and Media Files

With a team of developers, I recently built a mission-critical web app that relied heavily upon the manipulation of files. I’d like to give you a tour of the tools we used for media conversion, PDF generation, and JS minification.

PHP functions

You can get the entire content of a file using file_get_contents($filename). Similarly, you can write the entire contents of a file using file_put_contents($filename, $contents). There are many more useful filesystem functions in the manual.
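As a minimal sketch of the two functions working together, the helper below reads a file, replaces a string, and writes the result back. The name replace_in_file is hypothetical and only serves the illustration.

```php
<?php
// Hypothetical helper: read a file, replace a string, write it back.
function replace_in_file(string $filename, string $search, string $replace): int
{
    // file_get_contents() returns false on failure, so check before using it.
    $contents = file_get_contents($filename);
    if ($contents === false) {
        throw new RuntimeException("Cannot read $filename");
    }

    $contents = str_replace($search, $replace, $contents);

    // LOCK_EX takes an exclusive lock while writing.
    $bytes = file_put_contents($filename, $contents, LOCK_EX);
    if ($bytes === false) {
        throw new RuntimeException("Cannot write $filename");
    }

    return $bytes; // number of bytes written
}
```

Both functions return false on failure, which is why the sketch checks the result with === rather than relying on truthiness.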

Command line

A lot of great tools run from the shell. Fortunately, you can call them easily from within your PHP script. Use either backticks (`chmod 777 mydir`) or a function (shell_exec('chmod 777 mydir')).
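When part of the command comes from a variable (an uploaded filename, for instance), it is worth escaping it before it reaches the shell. Here is one possible sketch, assuming a Unix-like shell; build_command and run_command are hypothetical names, while escapeshellcmd(), escapeshellarg(), and shell_exec() are standard PHP functions.

```php
<?php
// Hypothetical helper: build a shell command with every argument escaped.
function build_command(string $program, array $args): string
{
    $parts = [escapeshellcmd($program)];
    foreach ($args as $arg) {
        // On Unix, escapeshellarg() wraps the value in single quotes.
        $parts[] = escapeshellarg((string) $arg);
    }
    return implode(' ', $parts);
}

// Hypothetical helper: run the command and return its output.
function run_command(string $program, array $args): string
{
    $output = shell_exec(build_command($program, $args));
    // shell_exec() returns null when the command produces no output,
    // and can return false when the process cannot be started.
    return is_string($output) ? $output : '';
}
```

Calling run_command('chmod', ['755', 'my dir']) would then execute chmod '755' 'my dir', so the space in the directory name cannot split the argument.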

Media conversion

One of the most useful tools for our needs was FFmpeg. Users would upload video or audio, and the media had to be converted to FLV for playback in a browser player. (We chose Flowplayer.)

The conversion is done in the shell, but the PHP extension ffmpeg-php allows you to easily retrieve information from movie files.

Here is a simple statement that will convert an uploaded file to FLV:

`ffmpeg -i $uploaded_file -b 1024k $converted_file`;

This will convert the uploaded file with a bitrate of 1024 kilobits per second.

You can specify a huge number of options for the conversion, such as video size, frame rate, metadata, subtitles, cropping, and number of audio channels. Note that FFmpeg can convert audio-only files as well. Once converted, you can feed the media file to your player.
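Some of these options can be assembled from an array. The following is only a sketch under a few assumptions: a Unix-like shell, a hypothetical helper named ffmpeg_command, and sample values chosen for illustration. The flags themselves (-s for frame size, -r for frame rate, -ac for audio channels) are FFmpeg's own. Note that newer FFmpeg builds expect -b:v instead of the bare -b.

```php
<?php
// Hypothetical helper: build an ffmpeg command line from an options array.
// Keys are flag names without the leading dash, e.g. ['b' => '1024k'].
function ffmpeg_command(string $input, string $output, array $options = []): string
{
    $parts = ['ffmpeg', '-i', escapeshellarg($input)];

    foreach ($options as $flag => $value) {
        // Accept only plain flag names so a key cannot inject shell syntax.
        if (!preg_match('/^[a-z][a-z0-9:_-]*$/i', (string) $flag)) {
            throw new InvalidArgumentException("Invalid flag: $flag");
        }
        $parts[] = '-' . $flag;
        $parts[] = escapeshellarg((string) $value);
    }

    // The output file goes last; ffmpeg infers the format from its extension.
    $parts[] = escapeshellarg($output);

    return implode(' ', $parts);
}
```

For example, ffmpeg_command($uploaded_file, $converted_file, ['b' => '1024k', 's' => '640x360', 'r' => 25, 'ac' => 2]) produces a command that can be passed to shell_exec().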

The ffmpeg-php extension can be used to obtain some useful information about a file. For example, you might want to know whether the FLV is an audio or video file to serve it correctly. Perhaps you want to display the duration of the movie or audio in search results. You can even create a snapshot of a particular frame of the video.

// Open the media file.
$movie = new ffmpeg_movie($converted_file);

// Determine whether the file has video to display.
$has_video = $movie->hasVideo();

// Get the duration, in seconds.
$duration = $movie->getDuration();

// Take a snapshot of a frame.
$frame = $movie->getFrame(50);
$image = $frame->toGDImage();
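To show a duration in search results, the number of seconds usually needs to be formatted first. A small sketch, where format_duration is a hypothetical helper and the input is assumed to be a number of seconds such as the one ffmpeg-php reports:

```php
<?php
// Hypothetical helper: turn a duration in seconds into m:ss or h:mm:ss.
function format_duration(float $seconds): string
{
    $total   = (int) round($seconds);
    $hours   = intdiv($total, 3600);
    $minutes = intdiv($total % 3600, 60);
    $secs    = $total % 60;

    // Only show the hours part when the media runs an hour or longer.
    return $hours > 0
        ? sprintf('%d:%02d:%02d', $hours, $minutes, $secs)
        : sprintf('%d:%02d', $minutes, $secs);
}
```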

PDF generation

I have used a multitude of libraries to generate PDFs, and the one that struck me as easy to use and exceptionally fast is wkhtmltopdf. It converts HTML to PDF using the shell. Even better, it uses WebKit. Converting HTML to PDF means that you do not need to learn additional syntax and can use your existing templates.

It’s important to note that wkhtmltopdf needs absolute paths to your resources (images, stylesheets, scripts) in order to include them in the page.
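If your templates use relative paths, one option is to rewrite them before conversion. This is a rough sketch rather than a full HTML parser: absolutize_resources is a hypothetical helper, it only looks at double-quoted src and href attributes, and the base directory is an assumption about where your files live.

```php
<?php
// Hypothetical helper: prefix relative src/href values with a base directory.
function absolutize_resources(string $html, string $base_dir): string
{
    $base = rtrim($base_dir, '/');

    $result = preg_replace_callback(
        '/\b(src|href)="([^"]+)"/i',
        function (array $m) use ($base): string {
            // Leave URLs with a scheme, absolute paths, and anchors untouched.
            if (preg_match('~^(?:[a-z][a-z0-9+.-]*:|/|#)~i', $m[2])) {
                return $m[0];
            }
            return $m[1] . '="' . $base . '/' . $m[2] . '"';
        },
        $html
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $html;
}
```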

The basic usage is incredibly simple:

`wkhtmltopdf $html_file $pdf_file`;

If you want to get fancy, you can set options such as headers, footers, table of contents, page size and orientation, margins, &c.
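As an illustration, page size, orientation, and margins can be passed as long options before the input file. The sketch below assumes a Unix-like shell; wkhtmltopdf_command is a hypothetical helper, while --page-size, --orientation, and --margin-top are wkhtmltopdf options.

```php
<?php
// Hypothetical helper: build a wkhtmltopdf command with long options.
// Keys are option names without the leading dashes, e.g. 'page-size'.
function wkhtmltopdf_command(string $html_file, string $pdf_file, array $options = []): string
{
    $parts = ['wkhtmltopdf'];

    foreach ($options as $name => $value) {
        // Accept only lowercase letters and dashes in option names.
        if (!preg_match('/^[a-z][a-z-]*$/', (string) $name)) {
            throw new InvalidArgumentException("Invalid option: $name");
        }
        $parts[] = '--' . $name;
        $parts[] = escapeshellarg((string) $value);
    }

    // Options come first, then the input and output files.
    $parts[] = escapeshellarg($html_file);
    $parts[] = escapeshellarg($pdf_file);

    return implode(' ', $parts);
}
```

A call such as wkhtmltopdf_command($html_file, $pdf_file, ['page-size' => 'A4', 'orientation' => 'Landscape', 'margin-top' => '15mm']) returns a string ready for shell_exec().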

JS minification

Our app is very dynamic. It can run an entire day without refreshing a single page, because it relies so heavily on JavaScript. To better manage (and reuse) the multitude of modules, the JavaScript code was split into so many files that some pages would load up to fifty of them.

This obviously created a problem in terms of loading time. The files were small, but the cost of having so many requests was great. Additionally, browsers cap the number of concurrent requests per host, and older ones allow as few as two.

A simple solution is to minify the files and merge them into a smaller number of files. To minify, we used YUI Compressor. Once again, the tool runs in the shell. Here is the basic usage:

$minified_content = `java -jar yuicompressor-x.y.z.jar --type js input.js`;

The type option can be set to either js or css. Other options include charset and line-break.

Once you get the minified output, you can merge all the content into a single file (or a small number of files).
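The merge step might look something like this. It is a sketch, not the code from our app: yui_minify and merge_scripts are hypothetical names, the jar name is the same placeholder used above, and the minifier is passed in as a callable so it can be swapped out.

```php
<?php
// Hypothetical helper: minify one file with YUI Compressor.
// The default jar name is a placeholder; use the path to your own copy.
function yui_minify(string $file, string $jar = 'yuicompressor-x.y.z.jar'): string
{
    $command = 'java -jar ' . escapeshellarg($jar)
             . ' --type js ' . escapeshellarg($file);
    $output = shell_exec($command);
    if (!is_string($output)) {
        throw new RuntimeException("Minification failed for $file");
    }
    return $output;
}

// Hypothetical helper: minify each file and join the results.
function merge_scripts(array $files, callable $minify): string
{
    $parts = [];
    foreach ($files as $file) {
        // Trim trailing whitespace so files are separated by one newline.
        $parts[] = rtrim($minify($file));
    }
    return implode("\n", $parts) . "\n";
}
```

The merged string from merge_scripts($files, 'yui_minify') can then be written out with file_put_contents() and served as a single script.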


The tools presented above allowed us to accomplish much with very little effort. They can apply to almost any project. When you do use them, please give them credit.

Thanks for reading!
