XML is an extremely successful technology, but it has flaws that some say could be fixed by MicroXML, a simpler, backward-compatible specification of XML.
James Clark first proposed the MicroXML specification in a Dec. 13, 2010 blog post. In it, he described MicroXML as a “subset of XML 1.0 that is not intended to replace XML 1.0, but is intended for contexts where XML 1.0 is, or is perceived as, too heavyweight.”
Interest in MicroXML has recently swelled again. On June 12, Uche Ogbuji, partner at Zepheira (an information management solution provider), wrote about MicroXML in an IBM developerWorks blog post. He explained that MicroXML is an attractive alternative because “People are dealing with the complexity of XML namespaces, and with recent XML processing specs, such as XPath, XSLT and XQuery 3.0. Some influential core XML experts looked at the bold possibility of starting over with a simplification of XML itself.”
In a June interview with SD Times, Ogbuji said that “MicroXML has been a move from not inside the establishment; that’s the best way to put it. MicroXML is a bit of a movement of outsiders, but that’s okay because some of the best things in XML have come from outsiders.
“James Clark is one of the most respected computer scientists and engineers in the XML world,” he continued. “So when he came out and said, ‘Let’s simplify XML, and actually I have some specific ideas on how we should do it,’ a lot of people really paid attention.”
MicroXML has been supported and advanced by John Cowan, who wrote an editor’s draft of a specification for MicroXML and also created its first parser, MicroLark. MicroLark is open source (Apache 2.0 license), written in the Java language, and implements several modes of parsing: pull mode, push mode and tree mode.
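For readers unfamiliar with the three parsing modes, the sketch below illustrates the general idea. It does not use MicroLark or its API; it is an assumption-laden illustration that uses Python's standard-library XML parsers instead, which is possible because MicroXML documents are also well-formed XML. The sample document and the function names (tree_mode, pull_mode, push_mode) are hypothetical.

```python
import xml.sax
from xml.etree import ElementTree as ET

# Hypothetical sample document, chosen only for illustration.
DOC = '<note lang="en"><to>Ann</to><body>Hello</body></note>'


def tree_mode(text):
    """Tree mode: build the whole document in memory, then walk it."""
    root = ET.fromstring(text)
    return [child.tag for child in root]


def pull_mode(text):
    """Pull mode: the application asks the parser for each event."""
    parser = ET.XMLPullParser(events=("start", "end"))
    parser.feed(text)
    parser.close()
    return [(event, elem.tag) for event, elem in parser.read_events()]


class TagCollector(xml.sax.ContentHandler):
    """Push mode handler: the parser calls back as it reads each element."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def push_mode(text):
    """Push mode: hand the parser a handler and let it drive."""
    handler = TagCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.tags
```

In broad terms, tree mode trades memory for convenience, while push and pull modes stream through the document and differ mainly in whether the parser or the application controls the loop.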
“MicroXML is a simplified version of XML, in the same way that XML is a simplified version of [Standard Generalized Markup Language],” Cowan told SD Times in an interview.
He said that MicroXML is not intended to compete with existing applications of XML, but to extend the range of XML into places where XML has historically been disfavored for its perceived complexity.
When asked for more details, Cowan explained, “In particular, MicroXML is not specially applicable to the Web, although it is easy to write documents which are both valid HTML5 and well-formed MicroXML; this makes them easier to process server-side before delivering as HTML.”
According to Cowan, simplicity is chief among MicroXML's benefits. “The specification is shorter (10 pages vs. about 50 pages for XML + XML Namespaces + XML Base + xml:id), and hopefully easier to understand,” he explained. “The data model is much simpler to use. Consequently, both parsers and applications are easier to write.”
MicroXML is backward-compatible with XML. In fact, “All MicroXML documents are also well-formed XML. Almost any XML document can be converted to MicroXML without loss of information,” Cowan said.
He said MicroXML is easier to use than XML “because there are fewer special cases, and because programs based on the MicroXML data model don’t have to handle all those special cases. The entire data model of MicroXML is composed of elements, attribute-value pairs and character content.”
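A data model that small can be written down in a few lines. The following is a minimal sketch, not code from the MicroXML draft or from any MicroXML parser: the Element class and the from_etree and parse helpers are hypothetical names, and Python's standard XML parser stands in for a real MicroXML parser, so it will also accept XML features that MicroXML leaves out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union
from xml.etree import ElementTree as ET


@dataclass
class Element:
    """One element: a name, attribute-value pairs, and ordered content."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    # Content is a mix of child elements and runs of character data.
    children: List[Union["Element", str]] = field(default_factory=list)


def from_etree(node):
    """Convert an ElementTree node into the three-part model above."""
    children = []
    if node.text:
        children.append(node.text)
    for child in node:
        children.append(from_etree(child))
        # ElementTree stores text that follows a child on the child itself.
        if child.tail:
            children.append(child.tail)
    return Element(node.tag, dict(node.attrib), children)


def parse(text):
    """Parse with the standard XML parser (an assumption, see above)."""
    return from_etree(ET.fromstring(text))
```

For example, parse('<p class="intro">Hello <b>world</b>!</p>') yields an element named p with one attribute and three children: a text run, a b element, and a second text run.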
But MicroXML has been slow to gain attention, despite a seeming lack of active opposition. Cowan went so far as to say, “There is a lot more indifference than support. Most people who use XML are content with it, warts and all, and those who aren’t content and don’t use it have never heard of MicroXML.”
Some of those who have heard of MicroXML doubt that the new specification is even necessary. Among them is blogger Anne van Kesteren, who wrote in a Dec. 21, 2010 blog post titled “Why do we need MicroXML?”: “Dropping a few features from XML would certainly make it less complex… But I am not convinced that is really worth it. Going to XML from SGML made sense. Nobody managed to implement SGML fully. Implementing XML, while non-trivial, has been done a fair number of times.
“I do not think XML is sufficiently complex to warrant a new language,” van Kesteren continued.
In his blog, Ogbuji admitted that “MicroXML has no official standing in any recognized standards organization, but it is of great interest to XML developers for several reasons. John Cowan has already developed MicroLark—a Java implementation—and I developed one for Python.”
Ogbuji insisted that there is a great deal of interest in the MicroXML specification. “Remember, many of the most important modern specs, such as JSON and Markdown, had similarly informal roots,” he said.
To gain legitimacy and more interest, Cowan suggested that the World Wide Web Consortium’s XML Core Working Group (of which he is a member) consider MicroXML, but there was insufficient interest, he said. Instead, he added, the working group suggested that a community group be created. Community groups are much more informal than working groups, but no such group has been formed yet either, he said.
Ian Jacobs, head of W3C communications, said, “To my knowledge, MicroXML is not currently on the W3C agenda—although some of the people involved, such as John Cowan, participate in relevant W3C Working Groups. If the community of supporters of MicroXML wishes to bring this to W3C, there are several options.”
What is the MicroXML specification’s fate going forward? Will it be adopted by users or approved by the W3C? “I’d like to see it adopted by users to help them solve their problems,” Cowan said. “And if standardization by the W3C or another standards organization helps that goal, then I’m all for standardization and I’d be willing to work on it.”