8.3 Selecting Nodes Using XPath
Walking the tree of XmlNode instances that results from reading an
XML document can be tedious and error-prone, particularly when you
want a specific set of nodes scattered throughout the tree. To
address this requirement, the W3C defined a query language called
XPath for selecting nodes within an Infoset representation (the
tree-like hierarchy into which a DOM-compliant parser transforms XML
input, according to the W3C specifications). The .NET XML libraries
expose XPath through the IXPathNavigable interface: any type that
implements the interface supports XPath queries against it.
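As a minimal sketch of what the interface provides, the following program treats an XmlDocument purely as an IXPathNavigable and queries it through the XPathNavigator it hands back. The one-element document and the class name NavigatorSketch are illustrative assumptions, not part of the examples in this section.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

class NavigatorSketch
{
    public static void Main()
    {
        // Illustrative one-element document (an assumption for this sketch)
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book><title>C# in a Nutshell</title></book>");

        // XmlNode, and therefore XmlDocument, implements IXPathNavigable
        IXPathNavigable navigable = doc;

        // CreateNavigator() returns a cursor positioned on the document
        XPathNavigator nav = navigable.CreateNavigator();

        // Select() evaluates an XPath expression and returns an iterator
        XPathNodeIterator it = nav.Select("/book/title");
        while (it.MoveNext())
        {
            // Value of an element is the concatenation of its text content
            Console.WriteLine(it.Current.Value);
        }
    }
}
```

Run as written, this should print the title text. The same pattern applies to any other IXPathNavigable implementation, such as XPathDocument.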
Without XPath, programmers must write code like the following to
list the author names found in a given book element:
using System;
using System.IO;
using System.Xml;

class App
{
    public static void Main(string[] args)
    {
        string xmlContent =
            "<book>" +
            "  <title>C# in a Nutshell</title>" +
            "  <authors>" +
            "    <author>Drayton</author>" +
            "    <author>Neward</author>" +
            "    <author>Albahari</author>" +
            "  </authors>" +
            "</book>";
        XmlTextReader xtr =
            new XmlTextReader(new StringReader(xmlContent));
        XmlDocument doc = new XmlDocument();
        doc.Load(xtr);
        XmlNode docElement = doc.DocumentElement;
        // This gets us title and authors
        foreach (XmlNode n2 in docElement.ChildNodes)
        {
            if (n2.Name == "authors")
            {
                // This gets us author tags
                foreach (XmlNode n3 in n2.ChildNodes)
                {
                    // This gets us the text inside the author tag;
                    // could also get the child text node and
                    // examine its Value property
                    Console.WriteLine(n3.InnerText);
                }
            }
        }
    }
}
Because XmlNode implements the IXPathNavigable interface, however,
we can instead write an XPath query that does the selection for us
and returns an XmlNodeList that we can walk in the usual fashion:
using System;
using System.IO;
using System.Xml;

class App
{
    public static void Main(string[] args)
    {
        string xmlContent =
            "<book>" +
            "  <title>C# in a Nutshell</title>" +
            "  <authors>" +
            "    <author>Drayton</author>" +
            "    <author>Neward</author>" +
            "    <author>Albahari</author>" +
            "  </authors>" +
            "</book>";
        XmlTextReader xtr =
            new XmlTextReader(new StringReader(xmlContent));
        XmlDocument doc = new XmlDocument();
        doc.Load(xtr);
        XmlNode docElement = doc.DocumentElement;
        // Select the text node inside every author element
        XmlNodeList result =
            docElement.SelectNodes("/book/authors/author/text()");
        foreach (XmlNode n in result)
        {
            Console.WriteLine(n.Value);
        }
    }
}
While the preceding code may not seem like much of an improvement
over the earlier XPath-free approach, the real power of XPath becomes
apparent with more complex queries. For example, consider the
following code, which returns the titles of books authored by either
Drayton or Neward:
using System;
using System.IO;
using System.Xml;

class App
{
    public static void Main(string[] args)
    {
        string xmlContent =
            "<book>" +
            "  <title>C# in a Nutshell</title>" +
            "  <authors>" +
            "    <author>Drayton</author>" +
            "    <author>Neward</author>" +
            "    <author>Albahari</author>" +
            "  </authors>" +
            "</book>";
        XmlTextReader xtr =
            new XmlTextReader(new StringReader(xmlContent));
        XmlDocument doc = new XmlDocument();
        doc.Load(xtr);
        XmlNode docElement = doc.DocumentElement;
        // The bracketed predicate keeps only matching book elements
        XmlNodeList result =
            docElement.SelectNodes("/book" +
                "[authors/author/text()='Drayton' or " +
                "authors/author/text()='Neward']" +
                "/title/text()");
        foreach (XmlNode n in result)
        {
            Console.WriteLine(n.Value);
        }
    }
}
Notice that in the XPath-related code, the conditional logic is
contained within the XPath statement itself: the subquery inside the
square brackets acts as a predicate constraining the rest of the
statement. Writing the code to produce similar results in C# would
be at least a page long. You can find more information on XPath in
the books XML in a Nutshell, by Elliotte Rusty Harold and W. Scott
Means, and Learning XML, by Erik T. Ray (both published by
O'Reilly).
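The sample document above holds a single book, so the predicate never has anything to exclude. The following sketch makes the filtering visible with a hypothetical two-book catalog; the catalog root element, the second book entry, and the class name PredicateSketch are assumptions made for illustration. It also compares the author elements directly rather than their text() children, which XPath permits because an element's string value is its text content.

```csharp
using System;
using System.Xml;

class PredicateSketch
{
    public static void Main()
    {
        // Hypothetical catalog: only the first book has a matching author
        string xmlContent =
            "<catalog>" +
            "  <book>" +
            "    <title>C# in a Nutshell</title>" +
            "    <authors>" +
            "      <author>Drayton</author>" +
            "      <author>Neward</author>" +
            "    </authors>" +
            "  </book>" +
            "  <book>" +
            "    <title>Learning XML</title>" +
            "    <authors>" +
            "      <author>Ray</author>" +
            "    </authors>" +
            "  </book>" +
            "</catalog>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xmlContent);

        // The predicate is evaluated once per book element; a book is
        // kept if any of its author elements equals either name
        XmlNodeList titles = doc.SelectNodes(
            "/catalog/book" +
            "[authors/author='Drayton' or authors/author='Neward']" +
            "/title");

        foreach (XmlNode title in titles)
        {
            Console.WriteLine(title.InnerText);
        }
    }
}
```

With this input, only the first book satisfies the predicate, so only its title should be printed; the second book is dropped without any conditional logic in the C# code.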