XML isn’t sexy. The markup language has been a recommended standard for 12 years now, and does nothing but structure, store and transport data. But a new adaptation of the standard—the Efficient XML Interchange (EXI)—is bringing XML into sexy new places such as smart energy grids and Wall Street.
EXI was developed by a company called AgileDelta, which was already working to solve the pervasive problem of using XML in environments that are performance-critical or constrained by limited bandwidth or battery life, according to AgileDelta’s John Schneider, the EXI project’s lead editor.
“There was a demanding set of requirements” for the specification, he said. “[The World Wide Web Consortium] wanted to make XML as small and as fast as anything you could design from scratch.”
The problem, of course, was that if one could make XML 2X faster or 2X smaller, someone would need it 10X faster or smaller. A number of organizations submitted technology solutions to address the problem, and the naysayers expected there would be two or three different data exchange formats for different use cases. But the W3C, which oversees XML and other Web technologies, did not want to see the standard fragment.
When the W3C got to EXI, Schneider said, it found “a single format that performed better than every other candidate in every test case across every use case, while retaining extensibility in a form that’s compact and flexible and works with all the XML tools that are already out there. It’s simply a better data format and more efficient than any format we’ve seen before,” he said.
Schneider said that EXI has a sound theoretical basis, bringing Claude Shannon’s work in information theory together with formal language theory.
He said that when the team first tackled the problem of making XML as small and as fast as possible, they looked at information theory, which defines the theoretical minimum number of bits needed to represent a piece of information. The understanding here is that the more you know about what’s likely to occur in a given context, the fewer bits you need to represent it. Typical compression represents the most frequently occurring things in the fewest bits, but it requires analyzing what has already gone through the pipe to determine how to do the compression.
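To make the "minimum number of bits" idea concrete, here is a minimal Python sketch of Shannon entropy, the information-theory measure of the average number of bits per symbol a message needs given how often each symbol occurs. The function name and sample strings are illustrative assumptions, not anything taken from EXI itself.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(message: str) -> float:
    """Shannon entropy: the theoretical minimum average bits per symbol,
    based only on how often each symbol occurs in the message."""
    counts = Counter(message)
    total = len(message)
    # Each symbol with probability p contributes p * log2(1/p) bits.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical sample messages, chosen only for illustration.
    for sample in ("aaaa", "abab", "abcd"):
        print(sample, entropy_bits_per_symbol(sample))
```

A message made of one repeated symbol is fully predictable and needs no bits per symbol, while a message of four equally likely symbols needs two. The more predictable the content, the lower the floor.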
By bringing in formal language theory, in this case the formal language of XML, Schneider and his team were able to predict more precisely what will occur. “You know for a well-formed XML document what comes next due to the formal language definition,” he said. So his team created a grammar-driven algorithm that identifies the most frequently occurring parts of XML messages and represents them in the fewest bits, thus creating a smaller, faster XML exchange that doesn’t sacrifice extensibility and flexibility.
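The grammar-driven idea can be sketched with a toy example. The Python below is not the EXI algorithm or its actual encoding; it is a simplified illustration under stated assumptions. The grammar table, state names and event names are all hypothetical. It shows only the core intuition: if the grammar says that just a few things can legally come next, each one can be identified with a very short code.

```python
# Hypothetical toy grammar: for each state, the events that may legally
# come next. The names are invented for illustration only.
TOY_GRAMMAR = {
    "document": ["start:order"],
    "order": ["start:item", "end"],
    "item": ["attr:sku", "attr:qty", "text", "end"],
}


def code_width(state: str) -> int:
    """Bits needed to tell apart the alternatives allowed in this state."""
    n = len(TOY_GRAMMAR[state])
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1.
    return (n - 1).bit_length()


def encode_event(state: str, event: str) -> str:
    """Return the event's position among the legal alternatives as a bit string."""
    index = TOY_GRAMMAR[state].index(event)
    width = code_width(state)
    # When only one event is possible, nothing needs to be written at all.
    return format(index, f"0{width}b") if width else ""


if __name__ == "__main__":
    print(repr(encode_event("document", "start:order")))
    print(repr(encode_event("order", "end")))
    print(repr(encode_event("item", "text")))
```

In this sketch, an event that is the only legal possibility costs zero bits, one of two possibilities costs one bit, and one of four costs two bits, compared with the many bytes a textual tag name would occupy.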