XML: Contrary to popular belief, it doesn't always kill babies

2011-02-16

XML. The "eXtensible Markup Language". What happened to you, man? You used to be so cool. People loved you. They took you to meet their parents. They went to the movies with you. They walked hand-in-hand with you down a moonlit beach! Those were heady days. Good days. Now, you're nothing. XML is the bogeyman in the closet and the monster under your bed. We've woken up from the drunken night of partying that was XML's prime to a morning hangover and a heap of bloated syntax that scars the human soul.

Look. XML isn't all that bad. XML is a medium. Sure, people have done terrible things with it (SOAP), but I choose to believe that XML was an unwilling accomplice in those deeds, not a malevolent co-conspirator. And yes, it's certainly true that there are a lot of good alternatives to XML.

First, if you're making a config file, is XML really called for? Have you considered a nice, human-readable flat file with key/value pairs? What about YAML? These formats probably won't drive your users insane. Please do not expect humans to edit XML config files by hand on a regular basis. Computers adore XML. It's nice and easy to parse, so structured and precise. Computers love that shit. People don't. So keep the human editing to a minimum.

Second, are you trying to define a data interchange format? JSON is a great way to go. It's extremely simple, and it maps almost directly onto the native data structures of most programming languages. For something like a web service API, only use XML once you've shown JSON to be unsuitable through trial by fire.
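To make the "maps almost directly" point concrete, here is a minimal sketch in Python using the standard `json` module. The payload and its field names (`user`, `roles`, `active`, `quota`) are made up for illustration; they don't come from any real API.

```python
import json

# A hypothetical API response; the field names are invented for this example.
payload = '{"user": "alice", "roles": ["admin", "dev"], "active": true, "quota": null}'

# One call turns the text into plain dicts, lists, strings, booleans and None.
data = json.loads(payload)

print(type(data).__name__)           # dict
print(type(data["roles"]).__name__)  # list
print(data["active"], data["quota"]) # True None

# Going back the other way is just as direct.
round_tripped = json.loads(json.dumps(data))
print(round_tripped == data)         # True
```

There is no schema, no namespace handling, and no decision about attributes versus child elements. That lack of ceremony is most of the argument for reaching for JSON first.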

OK, so XML isn't always the right tool for the job. You know what they say: "When your only tool is XML, every problem looks like it's a schema declaration and a few XSLT transformations away from a nail." People have used it the wrong way a lot, and that makes XML sad. So when should you use XML? I'd like to make two proposals. First, XML is a good tool to use if you're creating an actual document. I'm not saying that people should necessarily edit their documents in XML, mind you (something like Markdown might be a better fit there), but that XML is a reasonable choice for storing documents. Microsoft might not be totally crazy for basing their docx file format on XML. Honestly, I'd expect them to be uniquely qualified by years of hindsight, shame, and regret to design a document format.

The second situation where XML is OK is when you've got a very specific form for your data: a heterogeneous tree. XML is a natural choice for representing trees. The example I experienced personally was a parse tree. In the course of writing a compiler for a Java-based language, I championed XML as a parse tree format, and it paid off nicely. There just isn't a really nice way to represent a tree of differently typed nodes in something like JSON. XML was a good format for debugging what was going on, and furthermore, XPath turned out to be a godsend. Using a nice query syntax to select elements out of the parse tree for various semantic analyses made the rest of the compiler a lot easier. XML's great entrenched tool base and natural tree structure are sometimes exactly what a problem calls for.
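For a feel of what querying a parse tree looks like, here is a rough sketch in Python. It is not the compiler described above: the element names (`CompilationUnit`, `VarDecl`, `Assign`, `Var`, and so on) are invented for illustration, and the standard library's `xml.etree.ElementTree` only supports a limited subset of XPath, so a real project would more likely use a full XPath engine.

```python
import xml.etree.ElementTree as ET

# A made-up parse tree for something like:  int x = 1; x = x + y; print(x);
PARSE_TREE = """
<CompilationUnit>
  <Method name="main">
    <Block>
      <VarDecl name="x" type="int"><IntLiteral value="1"/></VarDecl>
      <Assign><Var name="x"/><Add><Var name="x"/><Var name="y"/></Add></Assign>
      <Call name="print"><Var name="x"/></Call>
    </Block>
  </Method>
</CompilationUnit>
"""

root = ET.fromstring(PARSE_TREE.strip())

# Every declared variable, anywhere in the tree.
declared = {decl.get("name") for decl in root.findall(".//VarDecl")}

# Every variable reference, anywhere in the tree.
used = [var.get("name") for var in root.findall(".//Var")]

# A toy semantic check: names that are referenced but never declared.
undeclared = sorted({name for name in used if name not in declared})
print(undeclared)  # ['y']

# Assignment targets: a Var that is a direct child of an Assign node.
targets = [var.get("name") for var in root.findall(".//Assign/Var")]
print(targets)     # ['x']
```

Each analysis is a one-line query plus a little set arithmetic, with no hand-written tree walker in sight.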
