
8.1 Accessing XML Documents

Like the I/O mechanism described in Chapter 10, the XML libraries in the .NET FCL follow a pattern of an "abstract base class with concrete backing store implementation classes." The two abstract base classes themselves are XmlReader and XmlWriter, used respectively for consuming and producing XML.

8.1.1 XmlReader

The XmlReader class, as its name implies, provides the ability to consume XML documents. It is an abstract base class, intended to be subclassed for working against a particular source of XML. There are three concrete implementations of XmlReader in the FCL: XmlTextReader (used for parsing XML from any arbitrary stream of text), XmlNodeReader (used for parsing XML from an XmlNode), and XmlValidatingReader, which is an XmlReader that performs DTD and/or Schema validation against the parsed document.
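As a rough illustration of one of the less common implementations, the following sketch walks an in-memory XmlNode with an XmlNodeReader. The class name NodeReaderSketch and the sample markup are assumptions made for illustration only; XmlDocument is used here simply as a convenient way to obtain an XmlNode.

```csharp
using System;
using System.Xml;

class NodeReaderSketch {
  static void Main( ) {
    // Build an in-memory node tree to read from (sample markup is illustrative)
    XmlDocument doc = new XmlDocument( );
    doc.LoadXml("<book><title>C# in a Nutshell</title></book>");

    // XmlNodeReader reads from an existing XmlNode rather than from text
    XmlNodeReader nr = new XmlNodeReader(doc.DocumentElement);
    while (nr.Read( )) {
      // Echo only the element names encountered
      if (nr.NodeType == XmlNodeType.Element)
        Console.WriteLine(nr.Name);
    }
    nr.Close( );
  }
}
```

Because XmlNodeReader is itself an XmlReader, the same Read loop works regardless of where the XML came from.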

Most often, an XmlTextReader is all that's necessary to begin working with XML input, as shown in the following example:

using System.Xml;
class XmlFun {
  static void Main( ) {
    XmlTextReader tr = new XmlTextReader("xmlfun.xml");
    // use tr as described later
  }
}

This code expects to find a file named xmlfun.xml in the current directory. The constructor is shorthand for the more general stream- and reader-based constructors that XmlTextReader also supports:

using System.IO;
using System.Xml;
class XmlFun {
  static void Main( ) {
    XmlTextReader tr = 
      new XmlTextReader(
        new StreamReader(  // TextReader is abstract; StreamReader wraps a Stream
          new FileStream("xmlfun.xml", FileMode.Open)));
    // use tr as described later
  }
}

This highlights an important point: XmlTextReader can pull XML from any Stream-based input source, including HTTP URLs and database text columns. For example, it becomes possible to parse XML out of an in-memory String-based representation:

using System.IO;
using System.Xml;
class XmlFun {
  static void Main( ) {
    string xmlContent =
      "<book>" +
      "  <title>C# in a Nutshell</title>" +
      "  <author>Drayton</author>" +
      "  <author>Neward</author>" +
      "  <author>Albahari</author>" +
      "</book>";
    
    XmlTextReader tr = 
      new XmlTextReader(new StringReader(xmlContent));
  }
}

The ability to parse in-memory strings makes it possible to parse XML generated in-process without first writing it to a file. This can be particularly useful as a means of decoupling components from one another.

Once the XmlReader instance is created, it can be used to extract the XML elements in a pull-driven mode. XmlReader itself serves as a cursor over the various XML constructs contained within the stream, such as elements, attributes, and so forth. The current construct can be accessed using a variety of properties: Name (to return the qualified name of the element or attribute), Value (to return the text value of the current node), and so on. The Read method is commonly used to advance to the next node in the stream; otherwise, one of the various MoveXXX methods, such as MoveToContent or MoveToNextAttribute, can be used to perform higher-level navigation within the stream. (Note that all navigation is forward-only; once bypassed, a node cannot be retrieved.) The following code demonstrates how to pull all the XML elements from an XML stream and echo them back to the console:

using System;
using System.IO;
using System.Xml;
class XmlFun {
  static void Main( ) {
    try
    {  
      string xmlContent =
        "<book>" +
        "  <title>C# in a Nutshell</title>" +
        "  <authors>" +
        "    <author>Drayton</author>" +
        "    <author>Neward</author>" +
        "    <author>Albahari</author>" +
        "  </authors>" +
        "</book>";
      
      XmlTextReader reader = 
        new XmlTextReader(new StringReader(xmlContent));
        
      // Parse the XML and display each of the nodes.
      while (reader.Read( ))
      {
         switch (reader.NodeType)
         {
           case XmlNodeType.Element:
             Console.Write("<{0}>", reader.Name);
             break;
           case XmlNodeType.Text:
             Console.Write(reader.Value);
             break;
           case XmlNodeType.CDATA:
             Console.Write("<![CDATA[{0}]]>", reader.Value);
             break;
           case XmlNodeType.ProcessingInstruction:
             Console.Write("<?{0} {1}?>", reader.Name, reader.Value);
             break;
           case XmlNodeType.Comment:
             Console.Write("<!--{0}-->", reader.Value);
             break;
           case XmlNodeType.XmlDeclaration:
             Console.Write("<?xml version='1.0'?>");
             break;
           case XmlNodeType.Document:
             break;
           case XmlNodeType.DocumentType:
             // Close the declaration with '>' so the echoed DOCTYPE is complete
             Console.Write("<!DOCTYPE {0} [{1}]>", reader.Name, 
               reader.Value);
             break;
           case XmlNodeType.EntityReference:
             Console.Write(reader.Name);
             break;
           case XmlNodeType.EndElement:
             Console.Write("</{0}>", reader.Name);
             break;
         }       
      }           
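    }
    catch (XmlException e)
    {
      // A try block needs a handler; report ill-formed XML here
      Console.WriteLine("XML error: {0}", e.Message);
    }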
  }
}

In the preceding example, the actual data of interest is found in either the Name or the Value property of the XmlTextReader, depending on the type of the node that was just read.
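Attributes are not returned by Read as separate nodes; the cursor has to be moved onto them explicitly. The following is a minimal sketch of that navigation, assuming a hypothetical document whose id and lang attributes are invented for illustration (the class name AttributeSketch is likewise an assumption):

```csharp
using System;
using System.IO;
using System.Xml;

class AttributeSketch {
  static void Main( ) {
    // Hypothetical input; the attribute names and values are illustrative only
    string xmlContent =
      "<book id=\"b1\" lang=\"en\">" +
      "  <title>C# in a Nutshell</title>" +
      "</book>";

    XmlTextReader reader =
      new XmlTextReader(new StringReader(xmlContent));

    while (reader.Read( )) {
      if (reader.NodeType == XmlNodeType.Element && reader.HasAttributes) {
        Console.WriteLine("<{0}>", reader.Name);
        // From an element, MoveToNextAttribute moves to its first attribute,
        // then to each following one, returning false when none remain
        while (reader.MoveToNextAttribute( )) {
          Console.WriteLine("  {0} = {1}", reader.Name, reader.Value);
        }
        // Return the cursor to the element that owns the attributes
        reader.MoveToElement( );
      }
    }
    reader.Close( );
  }
}
```

While the cursor sits on an attribute, Name and Value refer to that attribute rather than to the enclosing element, which is why the sketch calls MoveToElement before continuing.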

8.1.2 XmlWriter

Just as XmlReader serves as the abstract type for consuming XML, XmlWriter serves as an abstract type for producing XML-compliant data. It is always possible to emit XML "in the raw" by creating a String and appending XML tags to it, as in the following:

string xml = "<greeting>" + 
messageOfTheDay + 
"< /greeting>";

However, there are numerous problems with this technique, most notably the possibility that typos and accidental programmer errors will render the XML ill-formed and therefore unparseable. XmlWriter provides a less error-prone way of generating well-formed XML. The following code generates the XML data of the preceding example (without the malformed end tag), complete with an XML declaration:

XmlTextWriter xw = new XmlTextWriter("greetings.xml", null);
xw.Formatting = Formatting.Indented;
xw.Indentation = 2;
xw.WriteStartDocument( );
xw.WriteStartElement("greeting");
xw.WriteString(messageOfTheDay);
xw.WriteEndElement( );
xw.WriteEndDocument( );
xw.Close( );

This writes the XML to a file named greetings.xml, indenting the output two spaces at each level. To write the data to an in-memory string instead, pass a System.IO.TextWriter instance (such as a StringWriter) into the constructor of XmlTextWriter. Alternatively, pass a System.IO.Stream object together with a System.Text.Encoding instance, which tells the writer how to encode character data; if the encoding is null, UTF-8 is used.
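The in-memory case might look like the following sketch, which assumes a StringWriter as the backing TextWriter. The class name StringWriterSketch and the sample value of messageOfTheDay are hypothetical, chosen to show that WriteString escapes markup characters in the text it is given.

```csharp
using System;
using System.IO;
using System.Xml;

class StringWriterSketch {
  static void Main( ) {
    // Hypothetical message containing characters that must be escaped in XML
    string messageOfTheDay = "Hello, <world> & friends";

    // StringWriter is a TextWriter that accumulates its output in memory
    StringWriter sw = new StringWriter( );
    XmlTextWriter xw = new XmlTextWriter(sw);
    xw.Formatting = Formatting.Indented;
    xw.Indentation = 2;

    xw.WriteStartElement("greeting");
    xw.WriteString(messageOfTheDay);  // escapes <, >, and &
    xw.WriteEndElement( );
    xw.Flush( );

    // Retrieve the generated XML as a string
    string xml = sw.ToString( );
    Console.WriteLine(xml);

    xw.Close( );
  }
}
```

The sketch leaves out WriteStartDocument so that only the element itself is produced; whether a declaration is wanted depends on how the resulting string will be used.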
