WEB Advent 2009 / JSON Gotchas

JSON is a subset of the JavaScript language. JavaScript itself is based on ECMAScript, which is standardized in the ECMA-262 standard. Just this month, the 5th edition of the language standard was released. Section 15.12.1.2 gives an overview of the syntactic grammar, which defines values (such as strings and numbers), arrays, and objects. Arrays are delimited by square brackets; objects are delimited by curly braces. Here are a few examples:

var a = [1, 2, "three", true];
var o = {"name1": "value1", "name2": 2};
var mixed = [{"day": 13, "month": "December"}, {"day": 24, "month": "December"}];

JSON is very useful when building RIAs — what we often refer to as Ajax apps. These kinds of apps are constantly communicating with the server, sending and retrieving data. Since bandwidth is scarce, and unresponsiveness due to latency must be avoided, it is vital that data being sent is serialized in a format that is as concise as possible. Here, JSON comes into play, since it is much less bloated than something like XML. Another benefit of JSON is that it is trivial to convert a JSON string back into a JavaScript value:

var a = eval('([1, 2, "three", true])');
var o = eval('({"name1": "value1", "name2": 2})');
var mixed = eval('([{"day": 13, "month": "December"}, {"day": 24, "month": "December"}])');

Note the extra parentheses around the JSON strings. These turn the JSON strings into JavaScript expressions.
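To see why the parentheses matter, here is a minimal sketch that compares both forms. The helper names evalJson and evalBare are made up for this illustration. Without parentheses, the leading curly brace is read as the start of a block statement rather than an object literal.

```javascript
// Hypothetical helper: wraps the JSON text in parentheses so that eval()
// treats it as an expression.
function evalJson(text) {
  return eval('(' + text + ')');
}

// Hypothetical helper: evaluates the text as-is and returns the name of
// the error if evaluation fails.
function evalBare(text) {
  try {
    return eval(text);
  } catch (e) {
    return e.name;
  }
}

var source = '{"name1": "value1", "name2": 2}';

var wrapped = evalJson(source);  // an object with name1 and name2
var bare = evalBare(source);     // "SyntaxError": "{" opened a block
var emptyBare = evalBare('{}');  // undefined: an empty block, not an object
```

The same caveat about trusting the source of the string applies to both helpers, since both still call eval().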

Notation or not?

If you have a closer look at the specification regarding the notation of objects, you will see that the keys (the parts on the left of the colons) are of type JSONString. A JSONString is defined in section 15.12.1.1 (Lexical Grammar) as an arbitrary list of string characters, delimited by double quotes. This is a stricter definition than what the JavaScript implementations of modern browsers accept when evaluating object literals. For example, as far as most browsers are concerned, it is perfectly fine to use single quotes as the string delimiters:

var o = eval("({'name1': 'value1', 'name2': 2})");
var mixed = eval("([{'day': 13, 'month': 'December'}, {'day': 24, 'month': 'December'}])");

If an object member’s key does not contain special characters (and if it also conforms with a few other rules), you can even get rid of its delimiters, saving a few more bytes:

var o = eval("({name1: 'value1', name2: 2})");
var mixed = eval("([{day: 13, month: 'December'}, {day: 24, month: 'December'}])");

There is, however, an issue with this approach: most JSON libraries adhere to the specification, not to the browsers’ behaviors. PHP’s JSON extension (shipped with versions 5.2+), for instance, expects double quotes:

var_dump(json_decode("[1, 2, 'three', true]")); //NULL
var_dump(json_decode("[{'day': 13, 'month': 'December'}, {'day': 24, 'month': 'December'}]")); //NULL
var_dump(json_decode("[{day: 13, month: 'December'}, {day: 24, month: 'December'}]")); //NULL

Most frameworks behave identically, both for decoding and encoding. For example, Zend_Json in the Zend Framework requires double quotes. If you generate a JSON string in your client app and send it to the server for decoding, make sure to adhere to the specification. If you don’t, json_decode() and Zend_Json::decode() will fail to decode the JSON string.
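One way to stay within the specification on the client is to let a serializer build the string instead of concatenating it by hand. The sketch below assumes an environment that provides JSON.stringify(), which the 5th edition of ECMA-262 defines alongside JSON.parse(); in older browsers a library with the same signature would be needed.

```javascript
// Keys may be written unquoted in the JavaScript source; the serializer
// emits them with double quotes, as the specification requires.
var payload = [{ day: 13, month: 'December' }, { day: 24, month: 'December' }];

var json = JSON.stringify(payload);
// json is '[{"day":13,"month":"December"},{"day":24,"month":"December"}]'
```

A string produced this way uses double quotes throughout, so a strict decoder such as json_decode() should accept it.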

Security

There is a reason that eval has the same metaphone key as evil. The JSON string is executed in the same JavaScript context as the current page. It is an absolute necessity that you trust the source of the JSON data — if the source is untrusted, another safer and more restrictive parser must be used. The 5th edition of the ECMA-262 standard defines a JSON.parse() method which only accepts JSON text, but does not evaluate script code. Until browsers implement the current ECMAScript version, you can use some of the alternatives listed on the JSON home page. The appropriately named json-sans-eval library is a useful alternative, and it even uses the same method signature as JSON.parse(). If you use a different parser, at least check the source code to see whether or not eval() is used.
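The difference is easy to demonstrate. The following sketch assumes JSON.parse() is available (natively or through a compatible library); parseJsonOr is a hypothetical helper name, not part of any standard.

```javascript
// Hypothetical helper: returns the parsed value, or the given fallback
// if the text is not valid JSON.
function parseJsonOr(text, fallback) {
  try {
    return JSON.parse(text);
  } catch (e) {
    return fallback;
  }
}

var sideEffect = false;

// Valid JSON text is parsed into a JavaScript value.
var ok = parseJsonOr('{"name1": "value1", "name2": 2}', null);

// Script code is rejected instead of being executed.
var rejected = parseJsonOr('(function () { sideEffect = true; })()', null);

// Single-quoted strings are rejected as well.
var singleQuoted = parseJsonOr("{'name1': 'value1'}", null);
```

After this runs, sideEffect is still false, because the function in the second string was never called.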

There are some scenarios in which you cannot avoid using eval(). For example, JavaScript's object literal syntax (unlike JSON proper) lets you define methods by using anonymous functions:

var o = eval("({name: 'value', toString: function() { return this.name; }})");
alert(o.toString());

Note that when you create such a JSON string on the server, you need to make sure that you are really returning the syntax for an anonymous function, and not just a string. When using Zend Framework, you can use its Zend_Json_Expr type. Here is an example:

require 'Zend/Json.php';
$o = new stdClass();
$o->name = 'value';
$o->toString = new Zend_Json_Expr('function() { return this.name; }');
$json = Zend_Json::encode($o, false, array('enableJsonExprFinder' => true));

Alternatives

If your app exchanges specific data patterns between client and server, you might be able to shave a few more bytes off your data payload. For example, if only arrays of integral numbers are exchanged, you do not need the square brackets:

1,2,3

If your keys and values only consist of strings and digits, this object syntax may be viable:

name1:value1,name2:value2

You would, however, need to write your own client-side parser in these cases.
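A minimal sketch of such a parser for the two formats shown above might look like this. The function names parseIntList and parsePairs are made up for this example, and the code assumes that keys and values never contain a comma or a colon.

```javascript
// Parses "1,2,3" into [1, 2, 3].
// Assumes every item is a base-10 integer.
function parseIntList(text) {
  var result = [];
  if (text === '') {
    return result;
  }
  var parts = text.split(',');
  for (var i = 0; i < parts.length; i++) {
    result.push(parseInt(parts[i], 10));
  }
  return result;
}

// Parses "name1:value1,name2:value2" into an object.
// All values come back as strings; no type information is preserved.
function parsePairs(text) {
  var result = {};
  if (text === '') {
    return result;
  }
  var pairs = text.split(',');
  for (var i = 0; i < pairs.length; i++) {
    var sep = pairs[i].indexOf(':');
    result[pairs[i].slice(0, sep)] = pairs[i].slice(sep + 1);
  }
  return result;
}

var numbers = parseIntList('1,2,3');
var o = parsePairs('name1:value1,name2:value2');
```

Note the trade-off: the payload is smaller, but every value in the second format arrives as a string, so any conversion to numbers or booleans is up to the caller.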
