CSharp in a Nutshell 2nd Edition

8.3 Selecting Nodes Using XPath

Walking the tree of XmlNode instances that results from reading an XML document can be tedious and error-prone, particularly when the nodes you want are scattered throughout the tree. To address this, the W3C defined XPath, a query language for selecting nodes within an Infoset representation (the tree-like hierarchy into which a DOM-compliant parser transforms XML input, according to the W3C specifications). The .NET XML libraries expose XPath through the IXPathNavigable interface: any type that implements the interface supports XPath queries against it.

Without XPath, programmers must write code like the following to list the author names found in a given book element:

using System;
using System.IO;
using System.Xml;
  
class App
{
  public static void Main(string[ ] args)
  {
    string xmlContent =
      "<book>" +
      "  <title>C# in a Nutshell</title>" +
      "  <authors>" +
      "    <author>Drayton</author>" +
      "    <author>Neward</author>" +
      "    <author>Albahari</author>" +
      "  </authors>" +
      "</book>";
    XmlTextReader xtr = 
      new XmlTextReader(new StringReader(xmlContent));
  
    XmlDocument doc = new XmlDocument( );
    doc.Load(xtr);
    XmlNode docElement = doc.DocumentElement;
  
    // This gets us title and authors
    foreach (XmlNode n2 in docElement.ChildNodes)
    {
      if (n2.Name == "authors")
      {
        // This gets us author tags
        foreach (XmlNode n3 in n2.ChildNodes)
        {
          // This gets us the text inside the author tag;
          // could also get the child element, a text node,
          // and examine its Value property
          Console.WriteLine(n3.InnerText);
        }
      }
    }
  }
}

Because XmlNode implements the IXPathNavigable interface, however, we can instead write an XPath query that does the selection for us and returns an XmlNodeList that we can walk in the usual fashion:

using System;
using System.IO;
using System.Xml;
  
class App
{
  public static void Main(string[ ] args)
  {
    string xmlContent =
      "<book>" +
      "  <title>C# in a Nutshell</title>" +
      "  <authors>" +
      "    <author>Drayton</author>" +
      "    <author>Neward</author>" +
      "    <author>Albahari</author>" +
      "  </authors>" +
      "</book>";
    XmlTextReader xtr = 
      new XmlTextReader(new StringReader(xmlContent));
  
    XmlDocument doc = new XmlDocument( );
    doc.Load(xtr);
    XmlNode docElement = doc.DocumentElement;
  
    XmlNodeList result = 
      docElement.SelectNodes("/book/authors/author/text( )");
    foreach (XmlNode n in result)
    {
      Console.WriteLine(n.Value);
    }
  }
}
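
The same selection can also be made through the navigator that IXPathNavigable exposes. The following is a minimal sketch rather than a definitive implementation: the class name NavigatorApp and the helper AuthorNames are hypothetical names introduced for illustration, and the sample loads the string with LoadXml instead of an XmlTextReader for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

// Hypothetical class and method names, introduced only for this sketch
class NavigatorApp
{
  // Collects the text of every author element using an XPathNavigator
  public static List<string> AuthorNames(string xmlContent)
  {
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(xmlContent);

    // XmlNode implements IXPathNavigable, so it can hand out a navigator
    XPathNavigator nav = doc.CreateNavigator();

    // Select returns a forward-only iterator rather than an XmlNodeList
    XPathNodeIterator it = nav.Select("/book/authors/author");

    List<string> names = new List<string>();
    while (it.MoveNext())
    {
      // Current is positioned on an author element; Value is its text
      names.Add(it.Current.Value);
    }
    return names;
  }

  public static void Main(string[] args)
  {
    string xmlContent =
      "<book>" +
      "  <title>C# in a Nutshell</title>" +
      "  <authors>" +
      "    <author>Drayton</author>" +
      "    <author>Neward</author>" +
      "    <author>Albahari</author>" +
      "  </authors>" +
      "</book>";

    foreach (string name in AuthorNames(xmlContent))
    {
      Console.WriteLine(name);
    }
  }
}
```

The navigator works as a cursor over the document instead of materializing a node list, which can matter when the underlying store is not an XmlDocument at all.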

While the preceding code may not seem like much of an improvement over the earlier XPath-free approach, the real power of XPath becomes apparent with more complex queries. For example, consider the following code, which returns the titles of the books authored by either Drayton or Neward:

using System;
using System.IO;
using System.Xml;
  
class App
{
  public static void Main(string[ ] args)
  {
    string xmlContent =
      "<book>" +
      "  <title>C# in a Nutshell</title>" +
      "  <authors>" +
      "    <author>Drayton</author>" +
      "    <author>Neward</author>" +
      "    <author>Albahari</author>" +
      "  </authors>" +
      "</book>";
    XmlTextReader xtr = 
      new XmlTextReader(new StringReader(xmlContent));
  
    XmlDocument doc = new XmlDocument( );
    doc.Load(xtr);
    XmlNode docElement = doc.DocumentElement;
  
    XmlNodeList result = 
      docElement.SelectNodes("/book" +
                             "[authors/author/text( )='Drayton' or " +
                             "authors/author/text( )='Neward']" +
                             "/title/text( )");
    foreach (XmlNode n in result)
    {
      Console.WriteLine(n.Value);
    }
  }
}

Notice that in the XPath-related code, the conditional logic is contained within the XPath statement itself: the subquery inside the square brackets acts as a predicate constraining the rest of the statement. Producing similar results with hand-written C# traversal code would take at least a page. You can find more information on XPath in XML in a Nutshell, by Elliotte Rusty Harold and W. Scott Means, or Learning XML, by Erik T. Ray (both published by O'Reilly).
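
Because the sample document holds only one book, the predicate has nothing to reject. The sketch below runs an equivalent predicate against a document with two books so the filtering is visible. It is illustrative only: the library root element, the second book with its title and author, and the names PredicateApp and MatchingTitles are all assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical class and method names, introduced only for this sketch
class PredicateApp
{
  // Returns the titles of books that list Drayton or Neward as an author
  public static List<string> MatchingTitles(string xmlContent)
  {
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(xmlContent);

    // The bracketed predicate keeps only book elements with a matching author
    XmlNodeList result = doc.SelectNodes(
      "/library/book" +
      "[authors/author='Drayton' or authors/author='Neward']" +
      "/title");

    List<string> titles = new List<string>();
    foreach (XmlNode n in result)
    {
      titles.Add(n.InnerText);
    }
    return titles;
  }

  public static void Main(string[] args)
  {
    // The library wrapper and the second book are invented sample data
    string xmlContent =
      "<library>" +
      "  <book>" +
      "    <title>C# in a Nutshell</title>" +
      "    <authors>" +
      "      <author>Drayton</author>" +
      "      <author>Neward</author>" +
      "      <author>Albahari</author>" +
      "    </authors>" +
      "  </book>" +
      "  <book>" +
      "    <title>Hypothetical Book</title>" +
      "    <authors>" +
      "      <author>Someone Else</author>" +
      "    </authors>" +
      "  </book>" +
      "</library>";

    foreach (string title in MatchingTitles(xmlContent))
    {
      Console.WriteLine(title);
    }
  }
}
```

Only the first book passes the predicate, so the second title is never selected. Comparing the author elements directly, rather than their text( ) children, relies on XPath comparing each element's string value.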
