
Recipe 19.12 Removing Extra Whitespace from XML Objects

19.12.1 Problem

You want to remove whitespace nodes from your XML object because they are interfering with your ability to programmatically extract the values you want.

19.12.2 Solution

Set the ignoreWhite property to true before loading external data into an XML object or before parsing any string into an XML object, or use a custom removeWhitespace( ) method to remove whitespace nodes after loading data or for legacy purposes.

19.12.3 Discussion

XML documents are often written with extra whitespace characters for the purposes of formatting so that it is easier for humans to read. Most of the time, each element appears on its own line (meaning that newline, carriage return, or form feed characters have been added), and child nodes are sometimes indented using tabs or spaces. In what follows, you can see the same XML data formatted in two ways. The first is the data formatted without the whitespace characters, and the second is the same data with formatting. Which is easier to read?

<book><title>ActionScript Cookbook</title><authors><author name="Joey Lott" /></authors></book>

or:

<book>
  <title>ActionScript Cookbook</title>
  <authors>
    <author name="Joey Lott" />
  </authors>
</book>

While the formatted XML data may be easier for you, as a human, to read, the unformatted XML data is better suited for Flash. By default, Flash treats each run of whitespace between tags (newlines, tabs, and spaces) as a text node of its own. The result is that the positions of the child nodes shift, and you might have a more difficult time trying to accurately locate the data you want.
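
To see the effect for yourself, the following minimal sketch parses a small formatted string and inspects the children of the root element. The variable names demo_xml and formatted_str are placeholders chosen for this illustration. With the default parsing behavior, the first child of <book> should be a whitespace text node rather than the <title> element.

```actionscript
// A formatted XML string with a newline and indentation inside <book>.
// (formatted_str and demo_xml are illustrative names only.)
var formatted_str = "<book>\n  <title>ActionScript Cookbook</title>\n</book>";

// Parse the string using the default settings.
var demo_xml = new XML();
demo_xml.parseXML(formatted_str);

// Inspect the children of the root <book> element.
var root_node = demo_xml.firstChild;
trace(root_node.childNodes.length);     // Counts whitespace text nodes as well as <title>
trace(root_node.firstChild.nodeType);   // 3 indicates a text node, 1 an element
trace(root_node.firstChild.nodeName);   // null for a text node
```

Code that assumes root_node.firstChild is the <title> element would read the wrong node here, which is the kind of problem this recipe addresses.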

Fortunately, when you load data from an external source, such as a static file or a server-side script, or when you parse any string into an XML object, you can tell Flash to ignore whitespace nodes when parsing the data. The ignoreWhite property is set to false by default, which means that Flash parses whitespace nodes. However, if you set the ignoreWhite property to true prior to parsing any data into the object, Flash will not create whitespace nodes. This means that you should set the ignoreWhite property for an XML object to true before calling the load( ) method or sendAndLoad( ) method, or before using the parseXML( ) method to parse a string:

// Create the XML object.
my_xml = new XML(  );

// Set ignoreWhite to true to ignore whitespace during parsing.
my_xml.ignoreWhite = true;

// Define an onLoad(  ) method and then load the data. (See Recipe 19.11.)
my_xml.onLoad = function (success) {
   // Actions to perform on load
};
my_xml.load("externalDoc.xml");
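
The same property applies when you parse a string rather than load a file. The following is a minimal sketch under that assumption; the names source_str and parsed_xml are placeholders for this illustration. It creates an empty XML object first so that ignoreWhite can be set before any parsing takes place.

```actionscript
// A formatted XML string to parse (source_str is an illustrative name).
var source_str = "<book>\n  <title>ActionScript Cookbook</title>\n</book>";

// Create an empty XML object so that nothing is parsed yet.
var parsed_xml = new XML();

// Set ignoreWhite to true before parsing the string.
parsed_xml.ignoreWhite = true;
parsed_xml.parseXML(source_str);

// The first child of <book> is now the <title> element.
trace(parsed_xml.firstChild.firstChild.nodeName);
```

Note that passing the string to the constructor, as in new XML(source_str), parses it immediately, before you have a chance to set ignoreWhite on that object. That is why this sketch uses parseXML( ) on an empty object instead.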

Since, in most cases, you want to set ignoreWhite to true, you can set the value of the ignoreWhite property for the entire XML class. This means that you only have to set the property once and all XML objects will inherit that value. This is a common practice among Flash developers:

XML.prototype.ignoreWhite = true;

You can even add the preceding line to your XML.as file. Just be aware that you've effectively changed the default behavior for all XML parsing operations.

Now, if you want to remove whitespace nodes from an XML document after the data has been loaded and/or parsed, you need to create a custom method. Such a method is necessarily recursive, meaning that it calls itself to make sure that all the nodes are checked. Here, then, is our recursive whitespace-stripping function, suitable for inclusion in your XML.as file, if not for framing:

XMLNode.prototype.removeWhitespace = function (  ) {
  var cNode, hasNonWhitespace;

  // Loop through all the child nodes.
  for (var i = 0; i < this.childNodes.length; i++) {
    cNode = this.childNodes[i];

    // If the node is a text node . . . 
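    if (cNode.nodeType == 3) {
      // Reset the flag for each text node so that a result from an earlier
      // text node does not carry over.
      hasNonWhitespace = false;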

      // Check to see if any of the characters it contains are non-whitespace
      // characters. If so, set hasNonWhitespace to true and break out of the for
      // loop since all it takes is one non-whitespace character to mean the node is
      // a non-whitespace node.
      for (var j = 0; j < cNode.nodeValue.length; j++) {
        if (cNode.nodeValue.charCodeAt(j) > 32) {
          hasNonWhitespace = true;
          break;
        }
      }

      // If the node is a whitespace node, remove it with the removeNode(  ) method and
      // decrement i so that we don't skip over a node now that we've removed one.
      if (!hasNonWhitespace) {
        cNode.removeNode(  );
        i--;
      }
    } else {
      // Otherwise, it is not a text node; invoke removeWhitespace(  ) recursively.
      cNode.removeWhitespace(  );
    }
  }
};

Here is an example in which we use the removeWhitespace( ) method to remove whitespace after having loaded the data:

#include "XML.as"
my_xml = new XML(  );
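// Define the onLoad(  ) method before calling load(  ) so that the handler is in
// place when the data arrives.
my_xml.onLoad = function (success) {
  if (success) {
    // Run the removeWhitespace(  ) method and display the results.
    this.removeWhitespace(  );
    trace(this);
  }
};
my_xml.load("books.xml");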