Half Notes

eXtensible Markup Language

XML (Extensible Markup Language) is a W3C standard for text document markup.

Photo by Massimo Valiani.

I always skip the chapter on XML. This entry is an attempt to keep doing so while still grasping why things are done the way they are in the world of web document authoring.

A Def­i­n­i­tion

eXtensible Markup Language, or XML, is not a markup language in the way HTML is, but rather a set of rules for creating other markup languages. This makes it a metalanguage, a language for describing other languages, which lets you design your own markup languages for different types of documents and gives some insight into the “eXtensible” part of its name. XML can do this because it is a simplified subset of SGML, the international standard metalanguage for text document markup (ISO 8879).

Ele­ments and Structure

The most significant thing about XML is that it offers semantic markup and document structure through elements such as <dog>Lassie</dog>. The tags <dog> and </dog> add meaning to Lassie for humans and machines alike. Elements can contain other elements, which contain yet more elements, and together they give a document its structure:


<?xml version="1.0"?>
<movie>
  <title>Lassie Come Home</title>
  <year>1943</year>
  <plot>Hard times came for Carraclough family and they are forced to sell Lassie to the rich Duke of Rudling.</plot>
  <cast>
    <human>Roddy McDowall</human>
    <dog>Lassie</dog>
    <!-- more cast members added here -->
  </cast>
  <!-- more movie details added here -->
</movie>

Of note is that this representation is both text and data, so it can be stored in a database or as plain text. Because XML documents are not tied to a proprietary format or device that may become obsolete, they can be easily shared between otherwise incompatible systems.
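
To make the "humans and machines alike" point concrete, here is a minimal sketch that reads a shortened copy of the movie document with Python's standard `xml.etree.ElementTree` module and prints its nesting. The `outline` helper and the `MOVIE_XML` name are my own, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the movie document from above (plot and comments left out).
MOVIE_XML = """<?xml version="1.0"?>
<movie>
  <title>Lassie Come Home</title>
  <year>1943</year>
  <cast>
    <human>Roddy McDowall</human>
    <dog>Lassie</dog>
  </cast>
</movie>"""


def outline(element, depth=0):
    """Return one indented line per element, showing how elements nest."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (f": {text}" if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(MOVIE_XML)
print("\n".join(outline(root)))

# The tags carry meaning, so a program can ask for "all the dogs" by name.
dog_names = [dog.text for dog in root.iter("dog")]
print(dog_names)
```

Nothing here depends on the document being about movies; the same few lines would walk any well-formed XML document.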

Also of note is that XML doc­u­ments may be used for all sorts of con­tent, not just Lassie movies. Some XML lan­guages use a Doc­u­ment Type Def­i­n­i­tion (DTD) that defines which ele­ments may be used in the document.

See Also

  • W3C (2000). XML Schema. XML Schemas offer a method for defining XML elements and document structure.
  • W3C (2006). The Extensible Stylesheet Language Family (XSL). Markup languages describe structure, not the presentation of a document. Like HTML, XML documents can use Cascading Style Sheets for presentation (fast and preferable) or Extensible Stylesheet Language (slow but sometimes necessary).

Well-Formedness

This is an impor­tant dis­tinc­tion to make before rush­ing to valid­ity: An XML doc­u­ment must be well-formed, and should be valid, but valid­ity is not essential.

Well-formed documents comply with the XML rules for marking up a document, regardless of the specific language. For example, all elements must be correctly nested and may not overlap. Valid documents are both well-formed and comply with the rules set for a particular XML language. So, in XHTML it is invalid to put a body element inside a link element, even if it is perfectly nested.
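
The difference can be sketched in a few lines of Python. The well-formedness check leans on the standard library parser; the validity check is only a toy stand-in for a real DTD, and the `ALLOWED_CHILDREN` rules are hypothetical ones I made up for the movie example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for a made-up "movie" language:
# parent tag -> tags allowed as its direct children.
ALLOWED_CHILDREN = {
    "movie": {"title", "year", "plot", "cast"},
    "cast": {"human", "dog"},
}


def is_well_formed(text):
    """True if the text obeys the general XML syntax rules."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def is_valid_movie(text):
    """True if the text is well-formed AND follows the toy movie rules."""
    if not is_well_formed(text):
        return False
    root = ET.fromstring(text)
    if root.tag != "movie":
        return False
    for parent in root.iter():
        # Elements missing from the rule table may not have child elements.
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        if any(child.tag not in allowed for child in parent):
            return False
    return True


overlapping = "<movie><title>Lassie<year></title>1943</year></movie>"
misplaced = "<movie><cast><title>Lassie Come Home</title></cast></movie>"
good = "<movie><title>Lassie Come Home</title><cast><dog>Lassie</dog></cast></movie>"

print(is_well_formed(overlapping))  # overlapping tags break the syntax rules
print(is_well_formed(misplaced), is_valid_movie(misplaced))
print(is_valid_movie(good))
```

The `misplaced` document is the interesting one: it is perfectly nested, so the parser accepts it, yet it breaks the rules of this particular language, just like the body-inside-link case in XHTML.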

Of note to authors: browsers may still be able to render sloppy, error-ridden HTML, but they cannot do so with XML documents, because an XML parser is required to treat a well-formedness error as fatal.
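
As a rough illustration of that strictness, the sketch below feeds the same sloppy markup to Python's lenient `html.parser` and to its XML parser. Neither is a browser, so treat this as an analogy under that assumption; `TagCollector`, `html_start_tags` and `xml_error` are names I invented for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient HTML parser that simply records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup):
    """Parse markup as HTML and return the start tags found."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def xml_error(markup):
    """Parse markup as XML; return the error message, or None if it parsed."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


# Two paragraphs that are never closed: common in old hand-written HTML.
SLOPPY = "<p>first paragraph<p>second paragraph"

print(html_start_tags(SLOPPY))  # the HTML parser carries on regardless
print(xml_error(SLOPPY))        # the XML parser gives up with an error
```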

See Also

  • Sall, K. (2000). XML Software Guide: XML Parsers. There are hundreds of explicit criteria for creating well-formed XML documents, many of them common sense. It is always a good idea to check the syntax of your document using one of the well-formedness checkers listed at the Web Developer’s Virtual Library.
  • Eisenberg, J.D. (2001). How to Read W3C Specs. Learning to read a DTD (they begin with <!DOCTYPE ...>) is not easy, but worthwhile if you spend any time authoring XML documents, because the DTD is the ultimate authority for what is and is not syntactically correct for a particular markup language. He also talks about namespaces, which allow you to use elements from different XML applications in the same document.
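
Namespaces are easier to picture with a small example. This is only a sketch: the movie namespace URI is a made-up placeholder, the `m` and `h` prefixes are arbitrary, and the second vocabulary is XHTML, identified by its usual namespace URI.

```python
import xml.etree.ElementTree as ET

MOVIE_NS = "http://example.com/ns/movie"   # hypothetical URI for the movie language
XHTML_NS = "http://www.w3.org/1999/xhtml"  # namespace URI used by XHTML

# One document mixing elements from two XML applications.
DOC = f"""<m:movie xmlns:m="{MOVIE_NS}" xmlns:h="{XHTML_NS}">
  <m:title>Lassie Come Home</m:title>
  <h:p>A note written with an XHTML paragraph element.</h:p>
</m:movie>"""

root = ET.fromstring(DOC)

# ElementTree stores each tag as "{namespace-uri}localname".
print(root.tag)

# A prefix-to-URI mapping lets lookups use short prefixes again.
prefixes = {"m": MOVIE_NS, "h": XHTML_NS}
title = root.find("m:title", prefixes).text
note = root.find("h:p", prefixes).text
print(title)
print(note)
```

The prefix is just shorthand; what actually tells the two vocabularies apart is the URI each prefix is bound to.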

XML on the Web

This is a list of the XML lan­guages that are rel­e­vant to the Web. For now, they are just place­hold­ers; but I want to delve into some of these in more detail at some point.
