7.2 Introducing XSLT

A document written in XSLT, referred to as a stylesheet, describes the transformation of a particular type of XML document into another format. These other formats can include not only markup languages, such as HTML and Scalable Vector Graphics (SVG), but also formats with other syntaxes, such as plain text, comma-separated values (CSV), and any number of others, including proprietary formats of your choosing. In fact, the range of output formats is limited only by the amount of work you want to do to create the appropriate stylesheet, or to locate one created by a third party.

XSLT can be thought of as a little language, providing complete functionality for a limited set of tasks. A little language is a specialized, concise notation designed for a specific family of problems. Much simpler than a general-purpose programming language, a little language does a limited number of things very efficiently.

Although it was designed simply to provide for the transformation of XML documents, XSLT is often used to process XML documents in other ways. XSLT can be used to generate summary statistics about XML documents, store information from an XML file in a database, or communicate data from an XML file to a mobile device. Again, the applications are limited only by your imagination.

7.2.1 A Brief Introduction to the XSLT Specification

The XSLT specification was designed with several goals in mind. First, the XSLT stylesheet itself is an XML document. This allows you to manipulate the stylesheet like any other XML document, up to and including transforming the stylesheet itself into another format via XSLT.
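
As a minimal sketch of that last point, the following stylesheet takes another stylesheet as its source document and writes out the match pattern of each template it contains. The one-pattern-per-line text layout is my own assumption, chosen only for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="text"/>

  <!-- The source document here is itself a stylesheet -->
  <xsl:template match="/">
    <!-- Visit each top-level template that has a match attribute -->
    <xsl:for-each select="*/xsl:template[@match]">
      <xsl:value-of select="@match"/>
      <!-- Emit a line feed after each pattern -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Nothing in this sketch is specific to stylesheets; to the processor, the input is simply an XML document whose elements happen to live in the XSLT namespace.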

Next, the XSLT language is based on pattern matching. In fact, much of the pattern matching power of XSLT comes from the XPath specification, which is discussed in Chapter 6.

Third, like any good functional programming language, each XSLT function is free of side effects. The benefit this design goal creates is that the same function will have the same effect on any source node on which it is invoked, no matter how many times it has already been invoked on that or any other node.

Finally, flow control in XSLT is managed through iteration and recursion. The concept of iteration will be familiar if you've used C#'s foreach statement. The idea is that, given a collection of nodes, the same set of functions will be applied to each one in order. Recursion should also be familiar to developers experienced with modern programming languages; a recursive function is one that calls itself during its execution.
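
To make both ideas concrete, here is a sketch that uses xsl:for-each for iteration and a named template that calls itself for recursion. It assumes an inventory document whose item elements carry productCode, description, unitCost, and quantity attributes, like the one used later in this section. The stock-value template, its parameter names, and the output layout are hypothetical names of my own.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Iteration: the same instructions run once per item, in document order -->
    <xsl:for-each select="inventory/items/item">
      <xsl:value-of select="@productCode"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="@description"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

    <!-- Recursion: hand all items to the named template; total defaults to 0 -->
    <xsl:text>Total stock value: </xsl:text>
    <xsl:call-template name="stock-value">
      <xsl:with-param name="items" select="inventory/items/item"/>
    </xsl:call-template>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Hypothetical named template: adds unitCost * quantity for the first
       item, then calls itself on the remaining items -->
  <xsl:template name="stock-value">
    <xsl:param name="items"/>
    <xsl:param name="total" select="0"/>
    <xsl:choose>
      <xsl:when test="$items">
        <xsl:call-template name="stock-value">
          <xsl:with-param name="items" select="$items[position() &gt; 1]"/>
          <xsl:with-param name="total"
            select="$total + $items[1]/@unitCost * $items[1]/@quantity"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No items left: the accumulated total is the result -->
        <xsl:value-of select="$total"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Recursion is needed here because variables in XSLT cannot be reassigned once set. The running total therefore has to be passed along as a parameter rather than updated inside a loop.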

XSLT processing consists of loading an XML source document into a source tree, applying a series of templates to the nodes in the source tree, and sending the resulting data to a result tree. Where the source document comes from, and where the result document is written to, are left up to the XSLT implementation.

As I've already mentioned, an XSLT stylesheet is an XML document. Any XML document can be considered an XSLT stylesheet if it contains the following namespace declaration, traditionally mapped to the xsl prefix:

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

The stylesheet's document element is either xsl:stylesheet or xsl:transform; the XSLT specification treats the two as synonyms. The remainder of the stylesheet consists of a series of templates, in the form of xsl:template elements. The xsl:template element has a match attribute, the value of which is a pattern, written in a subset of XPath syntax, that is tested against nodes in the source tree. When a node in the source tree matches a template, the template is instantiated, and it may in turn trigger further matching on other nodes. The literal content of the template, together with any data it selects from the source tree, is written to the result tree. At the end of processing, the result tree is serialized to a document whose form is specified in the stylesheet's xsl:output element.

I'm going to construct a simple XSLT stylesheet, which transforms the inventory.xml document from Chapter 5 into an HTML representation of a catalog. Remember that, as with any programming language, there's more than one way to do it. This stylesheet represents just one way to transform the inventory document into HTML.

I'll call this file catalog.xsl. Let's examine it one element at a time. To begin with, since the XSLT stylesheet is an everyday XML document, it never hurts to have an XML declaration:

<?xml version="1.0" encoding="utf-8"?>

The root element of the stylesheet is xsl:stylesheet. Either xsl:stylesheet or xsl:transform must be present in an XSLT stylesheet, and the namespace URI and version must be included exactly as shown. Different XSLT processors may behave differently, but many will throw a warning or an error if the namespace or version is missing or different:

<xsl:stylesheet 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

The xsl:output element indicates the output method for the transformation. The XSLT specification defines three: html, xml, and text. Specific XSLT processor implementations are free to define others. Some output methods allow method-specific attributes; for example, the html and xml output methods allow an indent attribute, to control whether the output is to be indented:

<xsl:output method="html"/>

The xsl:template element defines a template. The match attribute contains an XPath expression indicating which nodes in the document the template is to be executed for. In this case, the expression is /, which matches the document root. This template will be the first one executed when transforming an XML document:

<xsl:template match="/">

Anything within the xsl:template element that does not have the xsl prefix will be copied to output verbatim. In this case, upon reading the beginning of the source tree, this stylesheet will cause the HTML header information to be written to the result tree:

<html>
  <head>
    <title>Angus Hardware | Online Catalog</title>
  </head>

The xsl:apply-templates element indicates that template matching is to continue at this point, by default with the children of the current node. I'll define a number of other templates later in the stylesheet, and any of them that match those nodes will be executed here:

<xsl:apply-templates/>
  </html>

The stylesheet is an XML document, remember? You have to close every element you open in order for the stylesheet to be well-formed:

</xsl:template>

This template matches the inventory element. Since this is the document element, the template's output is the HTML body element, followed by the output of any other matched templates:

<xsl:template match="inventory">
  <body bgcolor="#FFFFFF">
    <h1>Angus Hardware</h1>
    <h2>Online Catalog</h2>
    <xsl:apply-templates/>
  </body>
</xsl:template>

Upon matching the date element, this template outputs the element's month, day, and year attributes, formatted as month/day/year. Here you can see again that anything within the xsl:template element that does not have the xsl prefix is sent to the result tree verbatim, including character data:

<xsl:template match="date">
  <p>Current as of 
    <xsl:value-of select="@month" />/<xsl:value-of select="@day" />/<xsl:value-of select="@year" />
  </p>
</xsl:template>

This template outputs a table element and the table header, and applies any other templates for nodes that are found within the items context:

<xsl:template match="items">
  <p>Currently available items:</p>
  <table border="1">
    <tr>
      <th>Product Code</th>
      <th>Description</th>
      <th>Unit Price</th>
      <th>Quantity in Stock</th>
    </tr>
    <xsl:apply-templates />
  </table>
</xsl:template>

This template is applied to each item element, sending a table row to the result tree:

<xsl:template match="item">
  <tr>
    <td><xsl:value-of select="@productCode" /></td>
    <td><xsl:value-of select="@description" /></td>
    <td><xsl:value-of select="@unitCost" /></td>
    <td><xsl:value-of select="@quantity" /></td>
  </tr>
</xsl:template>

And finally, the stylesheet's document element must be closed:

</xsl:stylesheet>

Example 7-1 shows the complete stylesheet.

Example 7-1. An XSLT stylesheet for inventory.xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>Angus Hardware | Online Catalog</title>
      </head>
      <xsl:apply-templates/>
    </html>
  </xsl:template>

  <xsl:template match="inventory">
    <body bgcolor="#FFFFFF">
      <h1>Angus Hardware</h1>
      <h2>Online Catalog</h2>
      <xsl:apply-templates/>
    </body>
  </xsl:template>

  <xsl:template match="date">
    <p>Current as of 
      <xsl:value-of select="@month" />/<xsl:value-of select="@day" />/<xsl:value-of select="@year" />
    </p>
  </xsl:template>

  <xsl:template match="items">
    <p>Currently available items:</p>
    <table border="1">
      <tr>
        <th>Product Code</th>
        <th>Description</th>
        <th>Unit Price</th>
        <th>Quantity in Stock</th>
      </tr>
      <xsl:apply-templates />
    </table>
  </xsl:template>

  <xsl:template match="item">
    <tr>
      <td><xsl:value-of select="@productCode" /></td>
      <td><xsl:value-of select="@description" /></td>
      <td><xsl:value-of select="@unitCost" /></td>
      <td><xsl:value-of select="@quantity" /></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

Example 7-2 shows the HTML output resulting from processing inventory.xml with catalog.xsl, and Figure 7-1 shows a screenshot of the HTML in a web browser.

Example 7-2. HTML output from the catalog.xsl stylesheet
<html>
  <head>
    <title>Angus Hardware | Online Catalog</title>
  </head>
  <body bgcolor="#FFFFFF">
    <h1>Angus Hardware</h1>
    <h2>Online Catalog</h2>
    <p>Current as of 6/22/2002</p>
    <p>Currently available items:</p>
    <table border="1">
      <tr>
        <th>Product Code</th>
        <th>Description</th>
        <th>Unit Price</th>
        <th>Quantity in Stock</th>
      </tr>
      <tr>
        <td>R-273</td>
        <td>14.4 Volt Cordless Drill</td>
        <td>189.95</td>
        <td>15</td>
      </tr>
      <tr>
        <td>1632S</td>
        <td>12 Piece Drill Bit Set</td>
        <td>14.95</td>
        <td>23</td>
      </tr>
    </table>
  </body>
</html>

[Figure 7-1. Output of the catalog.xsl stylesheet]

This sort of transformation is done with a push model, in which the source document controls the structure of the result document while the stylesheet controls the appearance of the result document. The other way to use XSLT is a pull model, wherein the stylesheet controls both the structure and appearance of the result document, pulling content out of the source document as needed.
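
One way to see the push model at work is to add a more specific template and let the source document decide where it fires. The sketch below assumes catalog.xsl is saved in the same directory and reuses it through xsl:import. The threshold of 20 units and the highlight color are hypothetical choices of mine.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <!-- Reuse every template from catalog.xsl; xsl:import must come first -->
  <xsl:import href="catalog.xsl"/>

  <!-- Items with fewer than 20 units in stock get a highlighted row;
       all other items still fall through to the imported item template -->
  <xsl:template match="item[@quantity &lt; 20]">
    <tr bgcolor="#FFFFCC">
      <td><xsl:value-of select="@productCode" /></td>
      <td><xsl:value-of select="@description" /></td>
      <td><xsl:value-of select="@unitCost" /></td>
      <td><xsl:value-of select="@quantity" /></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet never says which rows to highlight or in what order. Each item element in the source document is pushed through whichever template matches it best, so the structure of the output still follows the structure of the input.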

Next, I'll show you how to construct a pull model stylesheet that transforms the same hardware catalog XML file into a summary text file. First, the XML declaration and stylesheet element remain the same as in the push model stylesheet:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

For this stylesheet, however, I want the output to go to a plain text file. The xsl:output element takes care of this:

  <xsl:output method="text" />

Finally, the stylesheet has only one template. Because the output method is text, there's no need to put any HTML tags in the stylesheet. The text will be copied out to the result tree verbatim, except for the xsl:value-of element, which uses the sum() function to add up the values of the quantity attributes of all the item elements:

  <xsl:template match="/">
Angus Hardware
Inventory Summary
========= =======

There are <xsl:value-of select="sum(/inventory/items/item/@quantity)" /> units in stock.
  </xsl:template>

Example 7-3 shows the complete stylesheet.

Example 7-3. Inventory summary stylesheet
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  
  <xsl:output method="text" />
  
  <xsl:template match="/">
Angus Hardware
Inventory Summary
========= =======

There are <xsl:value-of select="sum(/inventory/items/item/@quantity)" /> units in stock.
  </xsl:template>
</xsl:stylesheet>

Example 7-4 shows the output resulting from this stylesheet.

Example 7-4. Inventory summary stylesheet output
Angus Hardware
Inventory Summary
========= =======

There are 38 units in stock.
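
The same pull approach extends to other XPath functions. As a sketch, assuming the same inventory/items/item structure, the following variation also reports how many item elements the document contains by using count(). The wording of the output lines is my own.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="text" />

  <xsl:template match="/">
    <!-- xsl:text keeps exact control over spaces and line feeds -->
    <xsl:text>Distinct products: </xsl:text>
    <xsl:value-of select="count(inventory/items/item)" />
    <xsl:text>&#10;Units in stock: </xsl:text>
    <xsl:value-of select="sum(inventory/items/item/@quantity)" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Wrapping the literal text in xsl:text is a design choice. Whitespace-only text between instructions in a stylesheet is discarded, so the indentation used to lay out the template does not leak into the text output.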

Like XPath, XSLT itself has much more functionality than I can possibly describe here. Entire books have been written about it; if you are interested in learning more about XSLT, take a look at XSLT (O'Reilly) or Learning XSLT (O'Reilly).

7.2.2 When to Use XSLT

Using XSLT is entirely appropriate when you need to present XML data in a different format. For example, you may be providing a web site that needs to communicate with a variety of devices. Some devices may speak HTML, some may speak WAP, and some may understand some totally unrelated language, such as PDF, EDIFACT, or Minitel. XSLT can transform your XML source documents into the different formats required for diverse clients.

Another appropriate use for XSLT is when you need a common intermediate format to bridge disparate XML data formats. As long as your data is in XML, it can be transformed into any standard or proprietary XML vocabulary for use in another computing environment. For example, you may wish to convert a proprietary XML format into another company's published XML format.
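
As an illustration of such a conversion, the sketch below rewrites the inventory document used earlier into a different XML vocabulary. The target format (catalog, product, sku, name, and price) is hypothetical, invented here to stand in for another company's published format.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="xml" indent="yes" />

  <!-- Wrap all converted items in the hypothetical catalog element -->
  <xsl:template match="/">
    <catalog>
      <xsl:apply-templates select="inventory/items/item" />
    </catalog>
  </xsl:template>

  <!-- Map each item's attributes onto the target vocabulary; the braces
       in sku="{...}" are an attribute value template -->
  <xsl:template match="item">
    <product sku="{@productCode}">
      <name><xsl:value-of select="@description" /></name>
      <price><xsl:value-of select="@unitCost" /></price>
    </product>
  </xsl:template>
</xsl:stylesheet>
```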

The pull model makes up another category of good uses for XSLT. For example, you can use a pull model stylesheet to create summary documents.

In short, you should use XSLT whenever you need to place the content of an XML document into a different structure.
