As the World Wide Web Consortium (W3C) winds down its work standardizing the Extensible Markup Language (XML), it is looking back at the history that brought XML to its success today.
“W3C XML, the Extensible Markup Language, is one of the world’s most widely-used formats for representing and exchanging information. The final XML stack is more powerful and easier to work with than many people know, especially for people who might not have used XML since its early days,” Liam Quin, XML activity lead, who recently announced he would be leaving W3C after almost 17 years working with XML, wrote in a post.
XML 1.0 was first published as a W3C recommendation on Feb. 10, 1998, as a way to tackle large-scale electronic publishing problems. Today, it is a markup language used to define rules for encoding documents that are both human- and machine-readable.
According to Alexander Falk, president and CEO of the software development company Altova, the evolution and success of XML has been widely misunderstood. “Today, much of what we take for granted – and sometimes don’t even think of as being related to XML anymore – is, in fact, based on XML. Every Word document, Excel spreadsheet, and PowerPoint presentation is stored in OOXML (Office Open XML) format. Every time you e-file your taxes in the U.S. (and many other jurisdictions), the information is sent from your tax software provider to the government in XML format. Every time a public company provides its quarterly and annual financial reports to the SEC, the data is transmitted in XBRL (an XML format). Every time you talk to your Alexa device, you’re interacting with an app that uses SSML (Speech Synthesis Markup Language, an XML format). And the list goes on and on,” Falk wrote in an email to SD Times.
According to W3C’s Quin, XML can work with JSON, linked data, documents, large databases, the Internet of Things, automobiles, aircraft and even music players. “There are even XML shoes. It’s everywhere,” he said.
But how did we get here? The W3C created the Web SGML (Standard Generalized Markup Language) Working Group to create an SGML specification for documents that could be shared and displayed on the web and within browser plug-ins. While XML looks very similar to HTML, the W3C explained the intent was not to replace HTML. XML is designed to carry data; HTML is designed to display data. HTML tags are predefined and XML tags are not, so there are still many differences between the two.
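To make the "tags are not predefined" point concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The element names (`order`, `customer`, `item`) and the `summarize` helper are invented for illustration and do not come from any W3C vocabulary; the point is only that the document's author chooses the tags, and a program reads the data they carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: these tag and attribute names are made up for
# this example. XML itself does not predefine any of them.
DOC = """<order id="42">
  <customer>Ada</customer>
  <item sku="X1" qty="2">Widget</item>
  <item sku="Y9" qty="1">Gadget</item>
</order>"""


def summarize(xml_text):
    """Parse the order document and return its data as plain Python values."""
    root = ET.fromstring(xml_text)
    items = [(item.get("sku"), int(item.get("qty"))) for item in root.findall("item")]
    return {
        "id": root.get("id"),
        "customer": root.findtext("customer"),
        "items": items,
    }


if __name__ == "__main__":
    print(summarize(DOC))
```

Nothing in the markup says how the order should look on screen; presentation is left to whatever consumes the data.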
At the time the Web SGML Working Group was working on the SGML specification, there were two plug-ins: Panorama from SoftQuad and another from EBT/Inso, which was never released. The W3C realized the need for a standard because it was clear that it would be too complex to develop an SGML document that would support both plug-ins. “XML has some redundancy in its syntax. We knew from experience with SGML that documents are generally hard to test, unlike program data, and the redundancy helped to catch errors early and could save up to 80 [percent] of support costs (we measured it at SoftQuad). The redundancy, combined with grammar-based checking using schemas of various sorts, helped to improve the reliability of XML systems. And the built-in support for multilingual documents with xml:lang was a first, and an enduring success,” wrote Quin.
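One plausible reading of the redundancy Quin describes is that every XML end tag repeats the element name, so a parser can reject a mistyped tag immediately instead of guessing. The sketch below, again using Python's standard `xml.etree.ElementTree`, illustrates that reading along with `xml:lang`. The sample documents and the helper names `is_well_formed` and `titles_by_language` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the reserved xml: prefix through its namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample documents for illustration.
GOOD = '<doc><title xml:lang="en">Hello</title><title xml:lang="fr">Bonjour</title></doc>'
BAD = '<doc><title xml:lang="en">Hello</titel></doc>'  # end tag is misspelled


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def titles_by_language(text):
    """Map each title's xml:lang value to the title text."""
    root = ET.fromstring(text)
    return {title.get(XML_LANG): title.text for title in root.iter("title")}


if __name__ == "__main__":
    print(is_well_formed(GOOD))
    print(is_well_formed(BAD))
    print(titles_by_language(GOOD))
```

The misspelled end tag is caught at parse time, before any application logic runs. The schema-based checking Quin also mentions is a separate layer that this sketch does not cover.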
Today, Quin believes most of the work with XML is finished. “People are using the specifications in production and the rate of errata has slowed to a crawl,” he explained.
However, the end of the W3C’s specification work does not mean XML is ending; it simply means XML has reached a mature stage where it is widely deployed, according to Quin. “People aren’t reporting many new problems because the problems have already been worked out,” Quin wrote.
Altova’s Falk believes the future of XML looks bright. “As it gets even more ubiquitous, it will be easier for people to forget that much of the data that flows between different systems is based on XML, but that doesn’t mean it is becoming less important,” wrote Falk. “As the core of XML has matured and been refined over the years, we’ve seen a whole range of supporting standards emerge that help process, structure, transform, query, and format XML data – all coming together to establish a rich infrastructure of related technologies, including XML Schema, XSLT, XSL-FO, XPath, XQuery, XBRL, etc., that enable standards-based information processing that spans operating systems, platforms, and software products.”
“But for the most part, it’s time to sit back and enjoy the ability to represent information, process it, interchange it, with robustness and efficiency. There’s lots of opportunities to explore in making good, sensible use of XML technologies,” Quin added.