The XML Revolution

by Steven Aoki
November 2, 1998
Original draft for the Cal Poly Society for Technical Communication newsletter
Cal Poly State University, San Luis Obispo



Watch out, HTML--a popular new language, XML, has burst onto the scene. Ever since the World Wide Web Consortium (W3C) approved XML as a standard on February 10, 1998, numerous industries have been planning XML applications like mad. At the forefront, Microsoft, Netscape, and Adobe have promised advanced XML tools in Internet Explorer 5.0 (in addition to the XML parser in Internet Explorer 4.0), Netscape Communicator 5.0, and FrameMaker 5.5, respectively. Other applications in the works include Internet payment processing (OFX), advanced multimedia (SMIL), vector graphics (VML, PGML), simplified voice-driven technology (VoxML), chemical interchange (CML), mathematical typesetting (MathML), and many more.

What is XML?

Simply put, XML (eXtensible Markup Language) lets designers define their own tags instead of relying on HTML's fixed set. Much as "styles" work in Microsoft Word, these tags let designers mark up document contents by name and description rather than by appearance. Using this simple premise, XML expands upon HTML in four primary ways.
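As a rough illustration of tagging by name rather than by appearance, the sketch below reads a small XML fragment with Python's standard xml.etree.ElementTree module. The tag names (episode, series, title) and the sample text are invented for this example; they do not come from any published XML vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these tag names are made up for illustration.
DOC = """
<episode>
  <series>Law &amp; Order</series>
  <title>Sample Title</title>
</episode>
"""

def describe(xml_text):
    """Return a dict mapping each child tag name to its text content."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: child.text for child in root}

if __name__ == "__main__":
    # Each piece of content is found by what it is, not by how it looks.
    for tag, text in describe(DOC).items():
        print(tag, "=", text)
```

Nothing in the fragment says how the series name should look on screen; that decision is left to whatever displays it.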

XML vs. HTML

Precise formatting in any medium. Due to HTML's fixed formatting rules, basic page layout capabilities like hanging indents, full justification, hyphenation, kerning, and precise white-spacing remain beyond a Web designer's reach. Enter stylesheet languages like XSL (eXtensible Stylesheet Language), which let a designer describe exactly how to format a document. A designer can thus specify a separate stylesheet for each medium desired.
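The idea of one document with a stylesheet per medium can be sketched in a few lines of Python. This is not XSL: the STYLESHEETS table, the medium names, and the article/headline/para tags are all assumptions made up for this example, standing in for what a real stylesheet language would express.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; tag names are invented for illustration.
DOC = "<article><headline>The XML Revolution</headline><para>XML has arrived.</para></article>"

# A toy stand-in for stylesheets: one set of formatting rules per medium.
STYLESHEETS = {
    "web": {"headline": "<h1>{}</h1>", "para": "<p>{}</p>"},
    "plain": {"headline": "{}\n", "para": "  {}"},
}

def render(xml_text, medium):
    """Format each child element using the rules for the chosen medium."""
    rules = STYLESHEETS[medium]
    root = ET.fromstring(xml_text)
    return "\n".join(rules[child.tag].format(child.text) for child in root)

if __name__ == "__main__":
    print(render(DOC, "web"))
    print(render(DOC, "plain"))
```

The document itself never changes; only the set of rules applied to it does.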

Advanced searching. Whenever I type in a Web search query for one of my favorite TV shows, "Law & Order", I get flooded with sites about legal procedure instead. Since HTML does not describe content, most search engines can only look for keywords. Enter metadata. A designer can describe contents in the tags themselves to optimize searching. For example, a tag for the GIF file "cardinal05.gif" can describe the image as a bird to prevent unnecessary hits from baseball fans.
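The cardinal example might look something like the following sketch, which searches by described subject instead of by keyword. The images/image/subject tag names and the second file, "cardinal12.gif", are assumptions added for illustration; only "cardinal05.gif" comes from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: tag names and the second entry are invented.
CATALOG = """
<images>
  <image file="cardinal05.gif"><subject>bird</subject></image>
  <image file="cardinal12.gif"><subject>baseball</subject></image>
</images>
"""

def find_images(xml_text, subject):
    """Return the file names of images whose described subject matches."""
    root = ET.fromstring(xml_text.strip())
    return [
        img.get("file")
        for img in root.findall("image")
        if img.findtext("subject") == subject
    ]

if __name__ == "__main__":
    print(find_images(CATALOG, "bird"))
```

A keyword search for "cardinal" would match both files; a search on the subject tag separates the bird from the ball club.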

Efficient data-driven web sites. Because XML resembles fields in a database, it lends itself to web sites that deal with voluminous amounts of constantly changing information. Hence, designers can input data into a database and then publish a Web page automatically.
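A minimal sketch of that pipeline, assuming the database rows arrive as plain Python dictionaries: the rows become XML elements, and a Web page fragment is generated from the XML. The products/product/name/price names and the two helper functions are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

def records_to_xml(records):
    """Turn database-style rows (dicts) into an XML tree, one element per field."""
    root = ET.Element("products")
    for rec in records:
        item = ET.SubElement(root, "product")
        for field, value in rec.items():
            ET.SubElement(item, field).text = str(value)
    return root

def xml_to_html(root):
    """Publish the XML data as a simple HTML list."""
    rows = []
    for item in root.findall("product"):
        name = escape(item.findtext("name", default=""))
        price = escape(item.findtext("price", default=""))
        rows.append(f"<li>{name}: {price}</li>")
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"

if __name__ == "__main__":
    data = [{"name": "Widget", "price": "9.95"}]
    print(xml_to_html(records_to_xml(data)))
```

When a row changes in the database, rerunning the two steps regenerates the page with no hand editing.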

Fidelity across XML-supported browsers. Until XML, different browsers forced their own proprietary HTML code onto the public, such as a "background midi" tag that only works in Internet Explorer or alignment tags that only work in Netscape. As a result, Web pages look different across browsers, and often choke when viewed through an older browser version than intended. Since XML focuses on content rather than how to display it, any XML-supported browser can view an XML document.

XML vs. SGML

XML sprang from an international standard called SGML, which used the same concepts. Although various industries including the U.S. Department of Defense, the U.S. Government Printing Office, publishers, and many other businesses adopted SGML successfully, SGML remained unpopular due to its confusing complexity. The small vendor markets and the need for professional consultants made SGML systems both difficult and expensive to maintain. In creating XML, publishers and Web designers stripped down SGML for the less technically literate.

Beyond the Web

In the near future, simple XML extensions on current software will allow a single XML document to work in a cornucopia of systems: databases, spreadsheets, word processors, page layout programs, World Wide Web browsers, multimedia stations, CD-ROMs, and more. Using different stylesheets, publishers can specify precisely how the document should look depending on the medium or even the audience. For example, a publisher will have the ability to reprint an American article as a Japanese CD-ROM--automatically translated with altered cliches and monetary units.

XML's future

Since XML is compatible with HTML and SGML, it probably won't extinguish those languages. But as the ideal strategy for complex data management, publishing, and online commerce, XML has the potential to revolutionize all informational exchange.
