Accessing Data with Microsoft .NET Framework 4: LINQ to XML

  • 6/15/2011

Lesson 1: Working with the XmlDocument and XmlReader Classes

The XmlDocument and XmlReader classes have existed since Microsoft .NET Framework 1.0. This lesson explores each of these classes, showing benefits and drawbacks of using each in your code.

The XmlDocument Class

The W3C has provided standards that define the structure and a standard programming interface called the Document Object Model (DOM) that can be used in a wide variety of environments and applications for XML documents. Classes that support the DOM typically are capable of random access navigation and modification of the XML document.

The XML classes are accessible by setting a reference to the System.Xml.dll file and adding the Imports System.Xml (C# using System.Xml;) directive to the code.

The XmlDocument class is an in-memory representation of XML using the DOM Level 1 and Level 2. This class can be used to navigate and edit the XML nodes.

There is another class, XmlDataDocument, which inherits from the XmlDocument class and represents relational data. The XmlDataDocument class, in the System.Data.dll assembly, can expose its data as a data set to provide relational and nonrelational views of the data. This lesson focuses on the XmlDocument class.

These classes provide many methods to implement the Level 2 specification and contain methods to facilitate common operations. The methods are summarized in Table 5-1. The XmlDocument class, which inherits from XmlNode, contains all the methods for creating XML elements and XML attributes.

Table 5-1. Summary of the XmlDocument Methods

METHOD

DESCRIPTION

CreateNode

Creates an XML node in the document. There are also specialized Create methods for each node type such as CreateElement or CreateAttribute.

CloneNode

Creates a duplicate of an XML node. This method takes a Boolean argument called deep. If deep is false, only the node is copied; if deep is true, all child nodes are recursively copied as well.

GetElementById

Locates and returns a single node based on its ID attribute. This requires a document type definition (DTD) that identifies an attribute as being an ID type. An attribute with the name ID is not an ID type by default.

GetElementsByTagName

Locates and returns an XmlNodeList containing all the descendant elements that match the element name.

ImportNode

Imports a node from a different XmlDocument class into the current document. The source node remains unmodified in the original XmlDocument class. This method takes a Boolean argument called deep. If deep is false, only the node is copied; if deep is true, all child nodes are recursively copied as well.

InsertBefore

Inserts the specified node immediately before the referenced node. If the referenced node is Nothing (or null in C#), the new node is inserted at the end of the child list. If the new node already exists in the tree, it is first removed from its original position.

InsertAfter

Inserts the specified node immediately after the referenced node. If the referenced node is Nothing (or null in C#), the new node is inserted at the beginning of the child list. If the new node already exists in the tree, it is first removed from its original position.

Load

Loads an XML document from a disk file, Uniform Resource Locator (URL), or stream.

LoadXml

Loads an XML document from a string.

Normalize

Ensures that there are no adjacent text nodes in the document. This is like saving the document and reloading it. This method can be desirable when text nodes are being programmatically added to an XmlDocument class, and the text nodes could be side by side. Normalizing combines the adjacent text nodes to produce a single text node.

PrependChild

Inserts a node at the beginning of the child node list. If the new node is already in the tree, it is removed before it is inserted. If the node is an XmlDocument fragment, the complete fragment is added.

ReadNode

Loads a node from an XML document by using an XmlReader object. The reader must be on a valid node before executing this method. The reader reads the opening tag, all child nodes, and the closing tag of the current element. This repositions the reader to the next node.

RemoveAll

Removes all children and attributes from the current node.

RemoveChild

Removes the referenced child.

ReplaceChild

Replaces the referenced child with a new node. If the new node is already in the tree, it is removed before it is inserted.

Save

Saves the XML document to a disk file, stream, TextWriter, or XmlWriter.

SelectNodes

Selects a list of nodes that match the XPath expression.

SelectSingleNode

Selects the first node that matches the XPath expression.

WriteTo

Writes the node, including its descendants, to an XmlWriter.

WriteContentTo

Writes all the children of the node to an XmlWriter.
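
Several rows of Table 5-1 describe behavior that is easier to see in code. The following is a minimal sketch, not part of the lesson's sample application: the class name TableMethodsDemo, its method names, and the inline XML are assumptions made for illustration. It shows the effect of the deep argument on CloneNode, that ImportNode leaves the source document unchanged, and that Normalize merges adjacent text nodes.

```csharp
using System;
using System.Xml;

// Hypothetical demo class; only the XmlDocument/XmlNode calls come from the framework.
static class TableMethodsDemo
{
    const string Sample =
        "<MyRoot><MyChild ID=\"1\"><MyGrandChild /></MyChild></MyRoot>";

    // Child-node counts of a shallow clone and a deep clone of MyChild.
    public static int[] CloneCounts()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Sample);
        XmlNode child = doc.DocumentElement.FirstChild;

        XmlNode shallow = child.CloneNode(false); // the element and its attributes only
        XmlNode deep = child.CloneNode(true);     // the element plus all descendants
        return new[] { shallow.ChildNodes.Count, deep.ChildNodes.Count };
    }

    // MyChild counts in the source and target documents after an import.
    public static int[] ImportCounts()
    {
        var source = new XmlDocument();
        source.LoadXml(Sample);
        var target = new XmlDocument();
        target.LoadXml("<Target />");

        // ImportNode returns a copy owned by target; the copy still has to be attached.
        XmlNode copy = target.ImportNode(source.DocumentElement.FirstChild, true);
        target.DocumentElement.AppendChild(copy);

        return new[]
        {
            source.GetElementsByTagName("MyChild").Count,
            target.GetElementsByTagName("MyChild").Count
        };
    }

    // Child-node counts of the root before and after Normalize.
    public static int[] NormalizeCounts()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<MyRoot />");
        XmlElement root = doc.DocumentElement;

        // Two text nodes added side by side stay separate until Normalize runs.
        root.AppendChild(doc.CreateTextNode("Hello, "));
        root.AppendChild(doc.CreateTextNode("world"));
        int before = root.ChildNodes.Count;

        root.Normalize();
        int after = root.ChildNodes.Count;
        return new[] { before, after };
    }

    static void Main()
    {
        int[] clone = CloneCounts();
        Console.WriteLine("Clone children (shallow, deep): {0}, {1}", clone[0], clone[1]);

        int[] import = ImportCounts();
        Console.WriteLine("MyChild count (source, target): {0}, {1}", import[0], import[1]);

        int[] normalize = NormalizeCounts();
        Console.WriteLine("Text nodes (before, after): {0}, {1}", normalize[0], normalize[1]);
    }
}
```

The methods return counts rather than serialized XML so that the result does not depend on how the document happens to be formatted when it is written out.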

Creating the XmlDocument Object

To create an XmlDocument object, start by instantiating an XmlDocument class. The XmlDocument object contains CreateElement and CreateAttribute methods that add nodes to the XmlDocument object. The XmlElement contains the Attributes property, which is an XmlAttributeCollection type. The XmlAttributeCollection type inherits from the XmlNamedNodeMap class, which is a collection of names with corresponding values.

The following code shows how an XmlDocument object can be created from scratch and saved to a file. Note that Imports System.Xml (C# using System.Xml;) and Imports System.IO (C# using System.IO;) were added to the top of the code file.

Sample of Visual Basic Code

   Private Sub CreateAndSaveXmlDocumentToolStripMenuItem_Click( _
         ByVal sender As System.Object, ByVal e As System.EventArgs) _
         Handles CreateAndSaveXmlDocumentToolStripMenuItem.Click
      'Declare and create new XmlDocument
      Dim xmlDoc As New XmlDocument()

      Dim el As XmlElement
      Dim childCounter As Integer
      Dim grandChildCounter As Integer

      'Create the xml declaration first
      xmlDoc.AppendChild( _
       xmlDoc.CreateXmlDeclaration("1.0", "utf-8", Nothing))

      'Create the root node and append into doc
      el = xmlDoc.CreateElement("MyRoot")
      xmlDoc.AppendChild(el)

      'Child Loop
      For childCounter = 1 To 4
         Dim childelmt As XmlElement
         Dim childattr As XmlAttribute

         'Create child with ID attribute
         childelmt = xmlDoc.CreateElement("MyChild")
         childattr = xmlDoc.CreateAttribute("ID")
         childattr.Value = childCounter.ToString()
         childelmt.Attributes.Append(childattr)

         'Append element into the root element
         el.AppendChild(childelmt)
         For grandChildCounter = 1 To 3
            'Create grandchildren
            childelmt.AppendChild(xmlDoc.CreateElement("MyGrandChild"))

         Next
      Next

      'Save to file
      xmlDoc.Save(getFilePath("XmlDocumentTest.xml"))
      txtLog.AppendText("XmlDocumentTest.xml Created" + vbCrLf)

   End Sub

   Private Function getFilePath(ByVal fileName As String) As String
      Return Path.Combine(Environment.GetFolderPath( _
            Environment.SpecialFolder.Desktop), fileName)
   End Function

Sample of C# Code

private void createAndSaveXmlDocumentToolStripMenuItem_Click(
   object sender, EventArgs e)
{
   //Declare and create new XmlDocument
   var xmlDoc = new XmlDocument();

   XmlElement el;
   int childCounter;
   int grandChildCounter;

   //Create the xml declaration first
   xmlDoc.AppendChild(
      xmlDoc.CreateXmlDeclaration("1.0", "utf-8", null));

   //Create the root node and append into doc
   el = xmlDoc.CreateElement("MyRoot");
   xmlDoc.AppendChild(el);

   //Child Loop
   for (childCounter = 1; childCounter <= 4; childCounter++)
   {
      XmlElement childelmt;
      XmlAttribute childattr;

      //Create child with ID attribute
      childelmt = xmlDoc.CreateElement("MyChild");
      childattr = xmlDoc.CreateAttribute("ID");
      childattr.Value = childCounter.ToString();
      childelmt.Attributes.Append(childattr);

      //Append element into the root element
      el.AppendChild(childelmt);
      for (grandChildCounter = 1; grandChildCounter <= 3;
         grandChildCounter++)
      {
         //Create grandchildren
         childelmt.AppendChild(xmlDoc.CreateElement("MyGrandChild"));
      }
   }

   //Save to file
   xmlDoc.Save(getFilePath("XmlDocumentTest.xml"));
   txtLog.AppendText("XmlDocumentTest.xml Created\r\n");

}

private string getFilePath(string fileName)
{
   return Path.Combine(Environment.GetFolderPath(
      Environment.SpecialFolder.Desktop), fileName);
}

This code starts by creating an instance of XmlDocument. Next, the XML declaration is created and appended as the first child of the document; an exception is thrown if the declaration is not the first child of XmlDocument. After that, the root element is created, followed by the child nodes with their corresponding attributes. Finally, a call is made to the getFilePath helper method to assemble a file path to save the file to your desktop. This helper method will be used in subsequent code samples. The following is the XML file that was produced by running the code sample:

XML File

<?xml version="1.0" encoding="utf-8"?>
<MyRoot>
   <MyChild ID="1">
      <MyGrandChild />
      <MyGrandChild />
      <MyGrandChild />
   </MyChild>
   <MyChild ID="2">
      <MyGrandChild />
      <MyGrandChild />
      <MyGrandChild />
   </MyChild>
   <MyChild ID="3">
      <MyGrandChild />
      <MyGrandChild />
      <MyGrandChild />
   </MyChild>
   <MyChild ID="4">
      <MyGrandChild />
      <MyGrandChild />
      <MyGrandChild />
   </MyChild>
</MyRoot>
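
As a possible shortcut, XmlElement also offers a SetAttribute method that creates or updates an attribute in one call, so the separate CreateAttribute and Attributes.Append steps are not strictly needed. The following sketch builds the same tree in memory; the class name CompactCreateDemo and the Build method with its two parameters are assumptions introduced for this illustration, and the document is not saved to disk.

```csharp
using System;
using System.Xml;

// Hypothetical helper class; Build and its parameters are illustrative names.
static class CompactCreateDemo
{
    public static XmlDocument Build(int children, int grandChildren)
    {
        var xmlDoc = new XmlDocument();

        // The declaration has to be the first child of the document.
        xmlDoc.AppendChild(xmlDoc.CreateXmlDeclaration("1.0", "utf-8", null));

        XmlElement root = xmlDoc.CreateElement("MyRoot");
        xmlDoc.AppendChild(root);

        for (int i = 1; i <= children; i++)
        {
            XmlElement child = xmlDoc.CreateElement("MyChild");
            // SetAttribute creates the ID attribute and assigns its value in one step.
            child.SetAttribute("ID", i.ToString());
            root.AppendChild(child);

            for (int j = 1; j <= grandChildren; j++)
            {
                child.AppendChild(xmlDoc.CreateElement("MyGrandChild"));
            }
        }
        return xmlDoc;
    }

    static void Main()
    {
        XmlDocument doc = Build(4, 3);
        Console.WriteLine("MyChild elements: {0}",
            doc.DocumentElement.ChildNodes.Count);
        Console.WriteLine("MyGrandChild elements: {0}",
            doc.GetElementsByTagName("MyGrandChild").Count);
    }
}
```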

Parsing an XmlDocument Object by Using the DOM

An XmlDocument object can be parsed by using a recursive routine to loop through all elements. The following code has an example of parsing XmlDocument. Note that Imports System.Text (C# using System.Text;) was added.

Sample of Visual Basic Code

Private Sub ParsingAnXmlDocumentToolStripMenuItem_Click( _
      ByVal sender As System.Object, ByVal e As System.EventArgs) _
      Handles ParsingAnXmlDocumentToolStripMenuItem.Click
   Dim xmlDoc As New XmlDocument()
   xmlDoc.Load(getFilePath("XmlDocumentTest.xml"))
   RecurseNodes(xmlDoc.DocumentElement)
End Sub

Public Sub RecurseNodes(ByVal node As XmlNode)
   Dim sb As New StringBuilder()
   'start recursive loop with level 0
   RecurseNodes(node, 0, sb)
   txtLog.Text = sb.ToString()
End Sub

Public Sub RecurseNodes( _
      ByVal node As XmlNode, ByVal level As Integer, _
      ByVal sb As StringBuilder)
   sb.AppendFormat("{0,2} Type:{1,-9} Name:{2,-13} Attr:", _
      level, node.NodeType, node.Name)

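    'Attributes is Nothing for non-element nodes such as text or comments
    If node.Attributes IsNot Nothing Then
       For Each attr As XmlAttribute In node.Attributes
          sb.AppendFormat("{0}={1} ", attr.Name, attr.Value)
       Next
    End If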
    sb.AppendLine()

    For Each n As XmlNode In node.ChildNodes
       RecurseNodes(n, level + 1, sb)
    Next
 End Sub

Sample of C# Code

private void parsingAnXmlDocumentToolStripMenuItem_Click(object sender, EventArgs e)
{
   XmlDocument xmlDoc = new XmlDocument();
   xmlDoc.Load(getFilePath("XmlDocumentTest.xml"));
   RecurseNodes(xmlDoc.DocumentElement);
}

public void RecurseNodes(XmlNode node)
{
   var sb = new StringBuilder();
   //start recursive loop with level 0
   RecurseNodes(node, 0, sb);
   txtLog.Text = sb.ToString();
}

public void RecurseNodes(XmlNode node, int level, StringBuilder sb)
{
   sb.AppendFormat("{0,2} Type:{1,-9} Name:{2,-13} Attr:",
      level, node.NodeType, node.Name);

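   //Attributes is null for non-element nodes such as text or comments
   if (node.Attributes != null)
   {
      foreach (XmlAttribute attr in node.Attributes)
      {
         sb.AppendFormat("{0}={1} ", attr.Name, attr.Value);
      }
   }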
   sb.AppendLine();

   foreach (XmlNode n in node.ChildNodes)
   {
      RecurseNodes(n, level + 1, sb);
   }
}

This code starts by loading an XML file and then calling the RecurseNodes method, which is overloaded. The first call simply passes the xmlDoc root node. The recursive call passes the recursion level and a string builder object. Each time the RecurseNodes method executes, the node information is printed, and a recursive call is made for each child the node has. The following is the result.

Parsing Result

 0 Type:Element   Name:MyRoot        Attr:
 1 Type:Element   Name:MyChild       Attr:ID=1
 2 Type:Element   Name:MyGrandChild  Attr:
 2 Type:Element   Name:MyGrandChild  Attr:
 2 Type:Element   Name:MyGrandChild  Attr:
 1 Type:Element   Name:MyChild       Attr:ID=2
 2 Type:Element   Name:MyGrandChild  Attr:
 2 Type:Element   Name:MyGrandChild  Attr:
 2 Type:Element   Name:MyGrandChild  Attr:
 1 Type:Element   Name:MyChild       Attr:ID=3
 2 Type:Element   Name:MyGrandChild  Attr:
 2 Type:Element   Name:MyGrandChild  Attr:
 2 Type:Element   Name:MyGrandChild  Attr:
 1 Type:Element   Name:MyChild       Attr:ID=4
 2 Type:Element   Name:MyGrandChild  Attr:
 2 Type:Element   Name:MyGrandChild  Attr:
 2 Type:Element   Name:MyGrandChild  Attr:
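
The sample file contains only elements, but real documents usually mix in text and comment nodes. For those node types the Attributes property is null, so a recursive walk needs a null check before enumerating attributes. The following sketch applies the same recursion to a small inline document with mixed content; the class name SafeRecurseDemo, the Dump method, and the inline XML are assumptions made for illustration.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical demo class; the recursion mirrors the lesson's RecurseNodes routine.
static class SafeRecurseDemo
{
    public static void RecurseNodes(XmlNode node, int level, StringBuilder sb)
    {
        sb.AppendFormat("{0,2} Type:{1,-9} Name:{2,-13} Attr:",
            level, node.NodeType, node.Name);

        // Only element nodes expose an attribute collection.
        if (node.Attributes != null)
        {
            foreach (XmlAttribute attr in node.Attributes)
            {
                sb.AppendFormat("{0}={1} ", attr.Name, attr.Value);
            }
        }
        sb.AppendLine();

        // Text and comment nodes simply report an empty child list.
        foreach (XmlNode n in node.ChildNodes)
        {
            RecurseNodes(n, level + 1, sb);
        }
    }

    // Loads the XML string and returns the formatted node listing.
    public static string Dump(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var sb = new StringBuilder();
        RecurseNodes(doc.DocumentElement, 0, sb);
        return sb.ToString();
    }

    static void Main()
    {
        Console.Write(Dump(
            "<MyRoot><!-- note --><MyChild ID=\"1\">some text</MyChild></MyRoot>"));
    }
}
```

In the listing, the comment and text nodes appear with the names #comment and #text and an empty attribute column.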

Searching the XmlDocument Object

The SelectSingleNode method can locate an element; it requires an XPath query to be passed into the method. The following code sample calls the SelectSingleNode method to locate the MyChild element, the ID of which is 3, by using an XPath query. The sample code is as follows:

Sample of Visual Basic Code

Private Sub SearchingAnXmlDocumentToolStripMenuItem_Click( _
      ByVal sender As System.Object, ByVal e As System.EventArgs) _
      Handles SearchingAnXmlDocumentToolStripMenuItem.Click

   Dim xmlDoc As New XmlDocument()
   xmlDoc.Load(getFilePath("XmlDocumentTest.xml"))

   Dim node = xmlDoc.SelectSingleNode("//MyChild[@ID='3']")
   RecurseNodes(node)
End Sub

Sample of C# Code

private void searchingAnXmlDocumentToolStripMenuItem_Click(
   object sender, EventArgs e)
{
   var xmlDoc = new XmlDocument();
   xmlDoc.Load(getFilePath("XmlDocumentTest.xml"));

   var node = xmlDoc.SelectSingleNode("//MyChild[@ID='3']");
   RecurseNodes(node);
}

The SelectSingleNode method can perform an XPath lookup on any element or attribute. The following is a display of the result.

Search Result

 0 Type:Element   Name:MyChild       Attr:ID=3
 1 Type:Element   Name:MyGrandChild  Attr:
 1 Type:Element   Name:MyGrandChild  Attr:
 1 Type:Element   Name:MyGrandChild  Attr:
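
Two details are worth keeping in mind when calling SelectSingleNode: it returns null (Nothing in Visual Basic) when nothing matches, and the XPath query can target an attribute node as well as an element. The following sketch illustrates both against a small inline document; the class name SelectSingleNodeDemo and its method names are assumptions for this example, and building the query by string concatenation is done only for brevity.

```csharp
using System;
using System.Xml;

// Hypothetical demo class; method names are illustrative.
static class SelectSingleNodeDemo
{
    const string Sample =
        "<MyRoot><MyChild ID=\"1\" /><MyChild ID=\"2\" /><MyChild ID=\"3\" /></MyRoot>";

    // Returns true when a MyChild element with the given ID exists.
    public static bool ChildExists(string id)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Sample);

        // Concatenation is acceptable for a fixed demo value only; do not
        // build queries this way from untrusted input.
        XmlNode node = doc.SelectSingleNode("//MyChild[@ID='" + id + "']");
        return node != null;
    }

    // Selects an attribute node directly and returns its value.
    public static string FirstIdValue()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Sample);

        XmlNode attr = doc.SelectSingleNode("//MyChild/@ID");
        return attr == null ? null : attr.Value;
    }

    static void Main()
    {
        Console.WriteLine("ID 3 found: {0}", ChildExists("3"));
        Console.WriteLine("ID 9 found: {0}", ChildExists("9"));
        Console.WriteLine("First ID value: {0}", FirstIdValue());
    }
}
```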

The GetElementsByTagName method returns an XmlNodeList containing all matched elements. The following code returns a list of nodes with the tag name MyGrandChild.

Sample of Visual Basic Code

Private Sub GetElementsByTagNameToolStripMenuItem_Click( _
      ByVal sender As System.Object, ByVal e As System.EventArgs) _
      Handles GetElementsByTagNameToolStripMenuItem.Click

   Dim xmlDoc As New XmlDocument()
   xmlDoc.Load(getFilePath("XmlDocumentTest.xml"))
   Dim elmts = xmlDoc.GetElementsByTagName("MyGrandChild")

   Dim sb As New StringBuilder()
   For Each node As XmlNode In elmts
      RecurseNodes(node, 0, sb)
   Next
   txtLog.Text = sb.ToString()
End Sub

Sample of C# Code

private void getElementsByTagNameToolStripMenuItem_Click(
   object sender, EventArgs e)
{
   var xmlDoc = new XmlDocument();
   xmlDoc.Load(getFilePath("XmlDocumentTest.xml"));

   var elmts = xmlDoc.GetElementsByTagName("MyGrandChild");

   var sb = new StringBuilder();
   foreach (XmlNode node in elmts)
   {
      RecurseNodes(node, 0, sb);
   }
   txtLog.Text = sb.ToString();
}

This method works well, even for a single node lookup, when searching by tag name. The following is the execution result.

Search Result

 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
 0 Type:Element   Name:MyGrandChild  Attr:
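
GetElementsByTagName is also available on XmlElement, where it searches only the descendants of that element, and the special name "*" matches every element. The following sketch compares the counts; the class name TagNameScopeDemo, its Counts method, and the inline XML are assumptions made for illustration.

```csharp
using System;
using System.Xml;

// Hypothetical demo class comparing document-wide and element-scoped searches.
static class TagNameScopeDemo
{
    const string Sample =
        "<MyRoot>" +
        "<MyChild ID=\"1\"><MyGrandChild /><MyGrandChild /></MyChild>" +
        "<MyChild ID=\"2\"><MyGrandChild /></MyChild>" +
        "</MyRoot>";

    // Returns: grandchildren in the whole document, grandchildren under the
    // first MyChild only, and the total number of elements in the document.
    public static int[] Counts()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Sample);

        var firstChild = (XmlElement)doc.DocumentElement.FirstChild;

        int wholeDocument = doc.GetElementsByTagName("MyGrandChild").Count;
        int firstChildOnly = firstChild.GetElementsByTagName("MyGrandChild").Count;
        int allElements = doc.GetElementsByTagName("*").Count;

        return new[] { wholeDocument, firstChildOnly, allElements };
    }

    static void Main()
    {
        int[] counts = Counts();
        Console.WriteLine("Whole document: {0}", counts[0]);
        Console.WriteLine("First MyChild only: {0}", counts[1]);
        Console.WriteLine("All elements: {0}", counts[2]);
    }
}
```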

The SelectNodes method, which requires an XPath query to be passed into the method, can also retrieve an XmlNodeList. The previous code sample has been modified to call the SelectNodes method to achieve the same result, as shown in the following code:

Sample of Visual Basic Code

   Private Sub SelectNodesToolStripMenuItem_Click( _
         ByVal sender As System.Object, ByVal e As System.EventArgs) _
         Handles SelectNodesToolStripMenuItem.Click
      Dim xmlDoc As New XmlDocument()
      xmlDoc.Load(getFilePath("XmlDocumentTest.xml"))
      Dim elmts = xmlDoc.SelectNodes("//MyGrandChild")

      Dim sb As New StringBuilder()
      For Each node As XmlNode In elmts
         RecurseNodes(node, 0, sb)
      Next
      txtLog.Text = sb.ToString()
   End Sub

Sample of C# Code

private void selectNodesToolStripMenuItem_Click(
   object sender, EventArgs e)
{
   var xmlDoc = new XmlDocument();
   xmlDoc.Load(getFilePath("XmlDocumentTest.xml"));

   var elmts = xmlDoc.SelectNodes("//MyGrandChild");

   var sb = new StringBuilder();
   foreach (XmlNode node in elmts)
   {
      RecurseNodes(node, 0, sb);
   }
   txtLog.Text = sb.ToString();
}

This method can perform an XPath lookup on any XML node, including elements, attributes, and text nodes. This provides much more querying flexibility, because the GetElementsByTagName method is limited to a tag name.
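
To make that flexibility concrete, the following sketch runs two queries that a tag-name search cannot express: one selects attribute nodes by using a numeric predicate, and the other selects only those elements that have a particular child. The class name SelectNodesDemo, its method names, and the inline XML are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical demo class; the XPath expressions are standard XPath 1.0.
static class SelectNodesDemo
{
    const string Sample =
        "<MyRoot>" +
        "<MyChild ID=\"1\"><MyGrandChild /></MyChild>" +
        "<MyChild ID=\"2\" />" +
        "<MyChild ID=\"3\"><MyGrandChild /></MyChild>" +
        "<MyChild ID=\"4\" />" +
        "</MyRoot>";

    // Values of the ID attributes whose numeric value is greater than 2.
    public static List<string> IdsGreaterThanTwo()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Sample);

        var result = new List<string>();
        foreach (XmlNode attr in doc.SelectNodes("//MyChild[@ID > 2]/@ID"))
        {
            result.Add(attr.Value);
        }
        return result;
    }

    // Number of MyChild elements that contain at least one MyGrandChild.
    public static int ChildrenWithGrandChild()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Sample);
        return doc.SelectNodes("//MyChild[MyGrandChild]").Count;
    }

    static void Main()
    {
        foreach (string id in IdsGreaterThanTwo())
        {
            Console.WriteLine("ID greater than 2: {0}", id);
        }
        Console.WriteLine("MyChild elements with a grandchild: {0}",
            ChildrenWithGrandChild());
    }
}
```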

The XmlReader Class

The XmlReader class is an abstract base class that provides methods to read and parse XML. One of the more common child classes of the XmlReader is XmlTextReader, which reads an XML file node by node.

The XmlReader class provides the fastest and least memory-consuming means to read and parse XML data by providing forward-only, noncaching access to an XML data stream. This class is ideal when it’s possible that the desired information is near the top of the XML file and the file is large. If random access is required when accessing XML, use the XmlDocument class. The following code reads the XML file that was created in the previous example and displays information about each node:

Sample of Visual Basic Code

Private Sub ParsingWithXmlReaderToolStripMenuItem_Click( _
      ByVal sender As System.Object, ByVal e As System.EventArgs) _
      Handles ParsingWithXmlReaderToolStripMenuItem.Click

   Dim sb As New StringBuilder()
   Dim xmlReader As New  _
      XmlTextReader(getFilePath("XmlDocumentTest.xml"))

   Do While xmlReader.Read()
      Select Case xmlReader.NodeType
         Case XmlNodeType.XmlDeclaration, _
          XmlNodeType.Element, _
          XmlNodeType.Comment
            sb.AppendFormat("{0}: {1} = {2}", _
               xmlReader.NodeType, _
               xmlReader.Name, _
               xmlReader.Value)
            sb.AppendLine()
         Case XmlNodeType.Text
            sb.AppendFormat(" - Value: {0}", _
              xmlReader.Value)
            sb.AppendLine()
      End Select

      If xmlReader.HasAttributes Then
         Do While xmlReader.MoveToNextAttribute()
            sb.AppendFormat(" - Attribute: {0} = {1}", _
              xmlReader.Name, xmlReader.Value)
            sb.AppendLine()
         Loop
      End If
   Loop
   xmlReader.Close()
   txtLog.Text = sb.ToString()
End Sub

Sample of C# Code

private void parsingWithXmlReaderToolStripMenuItem_Click(object sender, EventArgs e)
{
   var sb = new StringBuilder();
   var xmlReader = new XmlTextReader(getFilePath("XmlDocumentTest.xml"));

   while (xmlReader.Read())
   {
      switch (xmlReader.NodeType)
      {
         case XmlNodeType.XmlDeclaration:
         case XmlNodeType.Element:
         case XmlNodeType.Comment:
            sb.AppendFormat("{0}: {1} = {2}",
                              xmlReader.NodeType,
                              xmlReader.Name,
                              xmlReader.Value);
            sb.AppendLine();
            break;
         case XmlNodeType.Text:
            sb.AppendFormat(" - Value: {0}", xmlReader.Value);
            sb.AppendLine();
            break;
      }
      if (xmlReader.HasAttributes)
      {
         while (xmlReader.MoveToNextAttribute())
         {
            sb.AppendFormat(" - Attribute: {0} = {1}",
                              xmlReader.Name,
                              xmlReader.Value);
            sb.AppendLine();
         }
      }
   }
   xmlReader.Close();
   txtLog.Text = sb.ToString();
}

This code opens the file and then performs a simple loop, reading one node at a time until finished. For each node read, a check is made on NodeType, and the node information is printed. A check is then made to see whether the node has attributes; if it does, the reader moves through them and each one is displayed. The following is the result of the sample code execution.

Parse Result

XmlDeclaration: xml = version="1.0" encoding="utf-8"
 - Attribute: version = 1.0
 - Attribute: encoding = utf-8
Element: MyRoot =
Element: MyChild =
 - Attribute: ID = 1
Element: MyGrandChild =
Element: MyGrandChild =
Element: MyGrandChild =
Element: MyChild =
 - Attribute: ID = 2
Element: MyGrandChild =
Element: MyGrandChild =
Element: MyGrandChild =
Element: MyChild =
 - Attribute: ID = 3
Element: MyGrandChild =
Element: MyGrandChild =
Element: MyGrandChild =
Element: MyChild =
 - Attribute: ID = 4
Element: MyGrandChild =
Element: MyGrandChild =
Element: MyGrandChild =

When viewing the results, notice that many lines end with an equals sign because none of the nodes contained text. The MyChild elements have attributes that are displayed.
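
Because the reader is forward-only, code can stop as soon as it has found what it needs, which is where the benefit for large files comes from. The following sketch returns the ID of the first MyChild element and then abandons the rest of the input. It uses the XmlReader.Create factory method and a using block so that the reader is closed even if an exception occurs; the class name ReaderEarlyExitDemo and the FirstChildId method are assumptions made for illustration, and the input is an inline string rather than the file on the desktop.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical demo class; FirstChildId is an illustrative name.
static class ReaderEarlyExitDemo
{
    // Returns the ID attribute of the first MyChild element, or null if none exists.
    public static string FirstChildId(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.Name == "MyChild")
                {
                    // Returning here leaves the remaining nodes unread.
                    return reader.GetAttribute("ID");
                }
            }
        }
        return null;
    }

    static void Main()
    {
        string xml =
            "<MyRoot><MyChild ID=\"1\" /><MyChild ID=\"2\" /></MyRoot>";
        Console.WriteLine("First MyChild ID: {0}",
            FirstChildId(new StringReader(xml)));
    }
}
```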

Practice Work with the XmlDocument and XmlReader Classes

In this practice, you analyze an XML file, called Orders.xml, which contains order information. Your first objective is to write a program that can provide the total price of all orders. You also need to provide the total freight cost and the average freight cost per order. Here is an example of what the file looks like.

Orders.xml File

<Orders>
  <Order OrderNumber="SO43659">
    <LineItem Line="1" PID="349" Qty="1" Price="2024.9940" Freight="50.6249" />
    <LineItem Line="2" PID="350" Qty="3" Price="2024.9940" Freight="151.8746" />
    <LineItem Line="3" PID="351" Qty="1" Price="2024.9940" Freight="50.6249" />
    <LineItem Line="4" PID="344" Qty="1" Price="2039.9940" Freight="50.9999" />
    <LineItem Line="5" PID="345" Qty="1" Price="2039.9940" Freight="50.9999" />
    <LineItem Line="6" PID="346" Qty="2" Price="2039.9940" Freight="101.9997" />
    <LineItem Line="7" PID="347" Qty="1" Price="2039.9940" Freight="50.9999" />
    <LineItem Line="8" PID="229" Qty="3" Price="28.8404" Freight="2.1630" />
    <LineItem Line="9" PID="235" Qty="1" Price="28.8404" Freight="0.7210" />
    <LineItem Line="10" PID="218" Qty="6" Price="5.7000" Freight="0.8550" />
    <LineItem Line="11" PID="223" Qty="2" Price="5.1865" Freight="0.2593" />
    <LineItem Line="12" PID="220" Qty="4" Price="20.1865" Freight="2.0187" />
  </Order>
  <Order OrderNumber="SO43660">
    <LineItem Line="1" PID="326" Qty="1" Price="419.4589" Freight="10.4865" />
    <LineItem Line="2" PID="319" Qty="1" Price="874.7940" Freight="21.8699" />
  </Order>
<!--Many more orders here -->
</Orders>

Your second objective is to determine whether it’s faster to use XmlDocument or XmlReader to retrieve this data, because you need to process many of these files every day, and performance is critical.

This practice is intended to focus on the features that have been defined in this lesson, so a Console Application project will be implemented. The first exercise implements the solution based on XmlDocument, whereas the second exercise implements the solution based on XmlReader.

If you encounter a problem finishing an exercise, the completed projects can be installed from the Code folder on the companion CD.

EXERCISE 1 Creating the Project and Implementing the XmlDocument Solution

In this exercise, you create a Console Application project and add code to retrieve the necessary data by using the XmlDocument class.

  1. In Visual Studio 2010, choose File | New | Project.

  2. Select your desired programming language and then the Console Application template. For the project name, enter OrderProcessor. Be sure to select a desired location for this project. For the solution name, enter OrderProcessorSolution. Be sure that Create Directory For Solution is selected and then click OK.

    After Visual Studio finishes creating the project, Module1.vb (C# Program.cs) will be displayed.

  3. In Main, declare a string variable for your file name and assign “Orders.xml” to it. Add a call to the parseWithXmlDocument method, passing the file name as a parameter, and add an empty parseWithXmlDocument method to your code. Finally, add code to prompt the user to press Enter to end the application. Your code should look like the following:

    Sample of Visual Basic Code

    Module Module1
       Sub Main()
          Dim fileName = "Orders.xml"
          parseWithXmlDocument(fileName)
    
          Console.Write("Press <Enter> to end")
          Console.ReadLine()
       End Sub
    
       Private Sub parseWithXmlDocument(ByVal fileName As String)
    
       End Sub
    
    End Module

    Sample of C# Code

    namespace OrderProcessor
    {
       class Program
       {
          static void Main(string[] args)
          {
             string fileName = "Orders.xml";
             parseWithXmlDocument(fileName);
    
             Console.Write("Press <Enter> to end");
             Console.ReadLine();
          }
    
          private static void parseWithXmlDocument(string fileName)
          {
    
          }
       }
    }

  4. Add the Orders.xml file to your project. Right-click the OrderProcessor node in Solution Explorer and choose Add | Existing Item. In the bottom right corner of the Add Existing Item dialog box, click the drop-down list and select All Files (*.*). Navigate to the Begin folder for this exercise and select Orders.xml. If you don’t see the Orders.xml file, check whether All Files (*.*) has been selected.

  5. In Solution Explorer, click the Orders.xml file you just added. In the Properties window, set the Copy to Output Directory property to Copy If Newer.

    Because this file will reside in the same folder as your application, you will be able to use the file name without specifying a path.

  6. In the parseWithXmlDocument method, instantiate a Stopwatch and assign the object to a variable. Start the stopwatch. Declare variables for the total order price, total freight cost, and order count. In C#, add using System.Diagnostics; to the top of the file. Your code should look like the following:

    Sample of Visual Basic Code

    Private Sub parseWithXmlDocument(ByVal fileName As String)
       Dim sw = New Stopwatch()
       sw.Start()
       Dim totalOrderPrice As Decimal = 0
       Dim totalFreightCost As Decimal = 0
       Dim orderQty As Integer = 0
    
    End Sub

    Sample of C# Code

    private static void parseWithXmlDocument(string fileName)
    {
       var sw = new Stopwatch();
       sw.Start();
       decimal totalOrderPrice = 0;
       decimal totalFreightCost = 0;
       int orderQty = 0;
    
    }

  7. Add code to load the Orders.xml file into an XmlDocument object. You must also add Imports System.Xml (C# using System.Xml;). Add code to get the order count by using an XPath query to select the Order elements and reading the count of the returned node list. Your code should look like the following:

    Sample of Visual Basic Code

    Dim doc = New XmlDocument()
    doc.Load(fileName)
    orderQty = doc.SelectNodes("//Order").Count

    Sample of C# Code

    var doc = new XmlDocument();
    doc.Load(fileName);
    orderQty = doc.SelectNodes("//Order").Count;

  8. Add code to retrieve a node list containing all the line items by implementing an XPath query. Loop over all the line items and retrieve the freight and line price (quantity x price). Add the line price and the freight to the total order price. Add the freight to the total freight cost. Your code should look like the following:

    Sample of Visual Basic Code

    For Each node As XmlNode In doc.SelectNodes("//LineItem")
       Dim freight = CDec(node.Attributes("Freight").Value)
       Dim linePrice = CDec(node.Attributes("Price").Value) _
                         * CDec(node.Attributes("Qty").Value)
       totalOrderPrice += linePrice + freight
       totalFreightCost += freight
    Next

    Sample of C# Code

    foreach (XmlNode node in doc.SelectNodes("//LineItem"))
    {
       var freight = decimal.Parse(node.Attributes["Freight"].Value);
       var linePrice = decimal.Parse(node.Attributes["Price"].Value)
          * decimal.Parse(node.Attributes["Qty"].Value);
       totalOrderPrice += linePrice + freight;
       totalFreightCost += freight;
    }

  9. Add code to display the total order price, the total freight cost, and the average freight cost per order. Stop the stopwatch and display the elapsed time. Your completed method should look like the following:

    Sample of Visual Basic Code

    Private Sub parseWithXmlDocument(ByVal fileName As String)
       Dim sw = New Stopwatch()
       sw.Start()
       Dim totalOrderPrice As Decimal = 0
       Dim totalFreightCost As Decimal = 0
       Dim orderQty As Integer = 0
    
       Dim doc = New XmlDocument()
       doc.Load(fileName)
       orderQty = doc.SelectNodes("//Order").Count
    
       For Each node As XmlNode In doc.SelectNodes("//LineItem")
          Dim freight = CDec(node.Attributes("Freight").Value)
          Dim linePrice = CDec(node.Attributes("Price").Value) _
                          * CDec(node.Attributes("Qty").Value)
          totalOrderPrice += linePrice + freight
          totalFreightCost += freight
       Next
    
       Console.WriteLine("Total Order Price: {0:C}", totalOrderPrice)
       Console.WriteLine("Total Freight Cost: {0:C}", totalFreightCost)
       Console.WriteLine("Average Freight Cost per Order: {0:C}", _
                         totalFreightCost / orderQty)
    
       sw.Stop()
       Console.WriteLine("Time to Parse XmlDocument: {0}", sw.Elapsed)
    End Sub

    Sample of C# Code

    private static void parseWithXmlDocument(string fileName)
    {
       var sw = new Stopwatch();
       sw.Start();
       decimal totalOrderPrice = 0;
       decimal totalFreightCost = 0;
       int orderQty = 0;
    
       var doc = new XmlDocument();
       doc.Load(fileName);
       orderQty = doc.SelectNodes("//Order").Count;
    
       foreach (XmlNode node in doc.SelectNodes("//LineItem"))
       {
          var freight = decimal.Parse(node.Attributes["Freight"].Value);
          var linePrice = decimal.Parse(node.Attributes["Price"].Value)
             * decimal.Parse(node.Attributes["Qty"].Value);
          totalOrderPrice += linePrice + freight;
          totalFreightCost += freight;
       }
    
       Console.WriteLine("Total Order Price: {0:C}", totalOrderPrice);
       Console.WriteLine("Total Freight Cost: {0:C}", totalFreightCost);
       Console.WriteLine("Average Freight Cost per Order: {0:C}",
          totalFreightCost/orderQty);
    
       sw.Stop();
       Console.WriteLine("Time to Parse XmlDocument: {0}", sw.Elapsed);
    }

  10. Run the application. Your total time will vary based on your machine configuration, but your output should look like the following:

    Result

    Total Order Price: $82,989,370.79
    Total Freight Cost: $2,011,265.92
    Average Freight Cost per Order: $529.84
    Time to Parse XmlDocument: 00:00:01.3775482
    Press <Enter> to end
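
One caveat about the parsing calls in this exercise: CDec and decimal.Parse use the current culture, so on a machine whose culture uses a comma as the decimal separator, a value such as 2024.9940 may be rejected or misread. The following sketch shows one possible way to make the parsing culture-independent by passing CultureInfo.InvariantCulture (XmlConvert.ToDecimal is another option). It also serves as a small worked check: for order SO43660 from the sample file, the freight is 10.4865 + 21.8699 = 32.3564 and the order price is 419.4589 + 874.7940 + 32.3564 = 1326.6093. The class name InvariantParseDemo and the Totals method are assumptions made for illustration.

```csharp
using System;
using System.Globalization;
using System.Xml;

// Hypothetical demo class; Totals is an illustrative name.
static class InvariantParseDemo
{
    // Returns { total order price, total freight cost } for all LineItem elements.
    public static decimal[] Totals(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        decimal totalOrderPrice = 0;
        decimal totalFreightCost = 0;
        CultureInfo inv = CultureInfo.InvariantCulture;

        foreach (XmlNode node in doc.SelectNodes("//LineItem"))
        {
            // Invariant culture always treats '.' as the decimal separator.
            decimal freight = decimal.Parse(node.Attributes["Freight"].Value, inv);
            decimal linePrice =
                decimal.Parse(node.Attributes["Price"].Value, inv)
                * decimal.Parse(node.Attributes["Qty"].Value, inv);

            totalOrderPrice += linePrice + freight;
            totalFreightCost += freight;
        }
        return new[] { totalOrderPrice, totalFreightCost };
    }

    static void Main()
    {
        string xml =
            "<Orders><Order OrderNumber=\"SO43660\">" +
            "<LineItem Line=\"1\" PID=\"326\" Qty=\"1\" Price=\"419.4589\" Freight=\"10.4865\" />" +
            "<LineItem Line=\"2\" PID=\"319\" Qty=\"1\" Price=\"874.7940\" Freight=\"21.8699\" />" +
            "</Order></Orders>";

        decimal[] totals = Totals(xml);
        Console.WriteLine("Total Order Price: {0}",
            totals[0].ToString(CultureInfo.InvariantCulture));
        Console.WriteLine("Total Freight Cost: {0}",
            totals[1].ToString(CultureInfo.InvariantCulture));
    }
}
```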

EXERCISE 2 Implementing the XmlReader Solution

In this exercise, you extend the Console Application project from Exercise 1 by adding code to retrieve the necessary data using the XmlReader class.

  1. In Visual Studio 2010, choose File | Open | Project.

  2. Select the project you created in Exercise 1.

  3. In Main, after the call to parseWithXmlDocument, add a call to the parseWithXmlReader method, passing the file name as a parameter. Then add an empty parseWithXmlReader method to your code. Your code should look like the following:

    Sample of Visual Basic Code

    Sub Main()
       Dim fileName = "Orders.xml"
       parseWithXmlDocument(fileName)
       parseWithXmlReader(fileName)
       Console.Write("Press <Enter> to end")
       Console.ReadLine()
    End Sub
    
    Private Sub parseWithXmlReader(ByVal fileName As String)
    
    End Sub

    Sample of C# Code

    static void Main(string[] args)
    {
       string fileName = "Orders.xml";
       parseWithXmlDocument(fileName);
       parseWithXmlReader(fileName);
       Console.Write("Press <Enter> to end");
       Console.ReadLine();
    }
    
    private static void parseWithXmlReader(string fileName)
    {
    
    }
  4. In the parseWithXmlReader method, instantiate Stopwatch and assign the object to a variable. Start the stopwatch. Declare variables for the total order price, total freight cost, and order count. Your code should look like the following:

    Sample of Visual Basic Code

    Private Sub parseWithXmlReader(ByVal fileName As String)
       Dim sw = New Stopwatch()
       sw.Start()
       Dim totalOrderPrice As Decimal = 0
       Dim totalFreightCost As Decimal = 0
       Dim orderQty As Integer = 0
    
    End Sub

    Sample of C# Code

    private static void parseWithXmlReader(string fileName)
    {
       var sw = new Stopwatch();
       sw.Start();
       decimal totalOrderPrice = 0;
       decimal totalFreightCost = 0;
       decimal orderQty = 0;
    
    }
  5. Add a using statement to instantiate an XmlTextReader object and assign the object to a variable named xmlReader. In the using statement, add a while loop to iterate over all nodes. In the loop, add code to check the node type to see whether it is an element. Your code should look like the following:

    Sample of Visual Basic Code

    Using xmlReader As New XmlTextReader(fileName)
       Do While xmlReader.Read()
          If xmlReader.NodeType = XmlNodeType.Element Then
    
          End If
       Loop
    End Using

    Sample of C# Code

    using (var xmlReader = new XmlTextReader(fileName))
    {
       while (xmlReader.Read())
       {
          if (xmlReader.NodeType == XmlNodeType.Element)
          {
    
          }
       }
    }
  6. Inside the if statement, add a Select Case (C# switch) statement that increments the order quantity variable if the element's node name is Order. If the node name is LineItem, add code to retrieve the quantity, price, and freight attributes. Add the freight to the total freight cost variable, and add the total cost of the line (quantity times price, plus freight) to the total order price variable. Your code should look like the following:

    Sample of Visual Basic Code

    Select Case xmlReader.Name
       Case "Order"
          orderQty += 1
       Case "LineItem"
          Dim qty = CDec(xmlReader.GetAttribute("Qty"))
          Dim price = CDec(xmlReader.GetAttribute("Price"))
          Dim freight = CDec(xmlReader.GetAttribute("Freight"))
          totalFreightCost += freight
          totalOrderPrice += (qty * price) + freight
    End Select

    Sample of C# Code

    switch (xmlReader.Name)
    {
       case "Order":
          ++orderQty;
          break;
       case "LineItem":
          var qty = decimal.Parse(xmlReader.GetAttribute("Qty"));
          var price = decimal.Parse(xmlReader.GetAttribute("Price"));
          var freight = decimal.Parse(xmlReader.GetAttribute("Freight"));
          totalFreightCost += freight;
          totalOrderPrice += (qty * price) + freight;
          break;
    }
  7. Add code to display the total order price, the total freight cost, and the average freight cost per order. Stop the stopwatch and display the elapsed time. Your completed method should look like the following:

    Sample of Visual Basic Code

    Private Sub parseWithXmlReader(ByVal fileName As String)
       Dim sw = New Stopwatch()
       sw.Start()
       Dim totalOrderPrice As Decimal = 0
       Dim totalFreightCost As Decimal = 0
       Dim orderQty As Integer = 0
    
       Using xmlReader As New XmlTextReader(fileName)
          Do While xmlReader.Read()
             If xmlReader.NodeType = XmlNodeType.Element Then
                Select Case xmlReader.Name
                   Case "Order"
                      orderQty += 1
                   Case "LineItem"
                      Dim qty = CDec(xmlReader.GetAttribute("Qty"))
                      Dim price = CDec(xmlReader.GetAttribute("Price"))
                      Dim freight = CDec(xmlReader.GetAttribute("Freight"))
                      totalFreightCost += freight
                      totalOrderPrice += (qty * price) + freight
                End Select
             End If
          Loop
       End Using
    
       Console.WriteLine("Total Order Price: {0:C}", totalOrderPrice)
       Console.WriteLine("Total Freight Cost: {0:C}", totalFreightCost)
       Console.WriteLine("Average Freight Cost per Order: {0:C}", _
                         totalFreightCost / orderQty)
       sw.Stop()
       Console.WriteLine("Time to Parse XmlReader: {0}", sw.Elapsed)
    End Sub

    Sample of C# Code

    private static void parseWithXmlReader(string fileName)
    {
       var sw = new Stopwatch();
       sw.Start();
       decimal totalOrderPrice = 0;
       decimal totalFreightCost = 0;
       decimal orderQty = 0;
    
       using (var xmlReader = new XmlTextReader(fileName))
       {
          while (xmlReader.Read())
          {
             if (xmlReader.NodeType == XmlNodeType.Element)
             {
                switch (xmlReader.Name)
                {
                   case "Order":
                      ++orderQty;
                      break;
                   case "LineItem":
                      var qty = decimal.Parse(xmlReader.GetAttribute("Qty"));
                      var price = decimal.Parse(xmlReader.GetAttribute("Price"));
                      var freight = decimal.Parse(
                         xmlReader.GetAttribute("Freight"));
                      totalFreightCost += freight;
                      totalOrderPrice += (qty * price) + freight;
                      break;
                }
             }
          }
       }
       Console.WriteLine("Total Order Price: {0:C}", totalOrderPrice);
       Console.WriteLine("Total Freight Cost: {0:C}", totalFreightCost);
       Console.WriteLine("Average Freight Cost per Order: {0:C}",
          totalFreightCost / orderQty);
    
       sw.Stop();
       Console.WriteLine("Time to Parse XmlReader: {0}", sw.Elapsed);
    }
  8. Run the application. Your total time will vary based on your machine configuration, but you should find that XmlReader is substantially faster. Your output should look like the following, which includes the result from Exercise 1.

    Result

    Total Order Price: $82,989,370.79
    Total Freight Cost: $2,011,265.92
    Average Freight Cost per Order: $529.84
    Time to Parse XmlDocument: 00:00:01.2218770
    Total Order Price: $82,989,370.79
    Total Freight Cost: $2,011,265.92
    Average Freight Cost per Order: $529.84
    Time to Parse XmlReader: 00:00:00.5919724
    Press <Enter> to end
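
The exercise instantiates XmlTextReader directly. Starting with the .NET Framework 2.0, Microsoft's documentation recommends creating readers through the XmlReader.Create factory method instead. The following is a minimal sketch of the same forward-only loop written that way, not a replacement for the exercise code. The class name, the method name, and the inline sample XML (including the Orders root element) are assumptions made for illustration; only the Order and LineItem elements and the Qty, Price, and Freight attributes come from the exercise. The sketch parses with the invariant culture, which is also an assumption about how the numbers are written in the file.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Xml;

public static class XmlReaderCreateSketch
{
    // Streams through the input once and returns the sum of
    // (Qty * Price) + Freight over every LineItem element.
    public static decimal TotalOrderPrice(TextReader input)
    {
        decimal total = 0;

        // XmlReader.Create returns a forward-only, read-only reader.
        using (var reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.Name == "LineItem")
                {
                    // Assumption: numbers use "." as the decimal separator.
                    var culture = CultureInfo.InvariantCulture;
                    var qty = decimal.Parse(reader.GetAttribute("Qty"), culture);
                    var price = decimal.Parse(reader.GetAttribute("Price"), culture);
                    var freight = decimal.Parse(reader.GetAttribute("Freight"), culture);
                    total += (qty * price) + freight;
                }
            }
        }

        return total;
    }

    public static void Main()
    {
        // Hypothetical sample data shaped like the exercise's Orders.xml.
        const string xml =
            "<Orders><Order>" +
            "<LineItem Qty=\"2\" Price=\"10.50\" Freight=\"1.25\" />" +
            "</Order></Orders>";

        Console.WriteLine(TotalOrderPrice(new StringReader(xml)));
    }
}
```

To read from a file as the exercise does, XmlReader.Create also accepts a file name or URI string in place of the TextReader.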

Lesson Summary

This lesson provided detailed information about the XmlDocument and the XmlReader classes.

  • The XmlDocument class provides in-memory, random, read-write access to XML nodes.

  • The XmlReader class provides fast, streaming, forward-only, read-only access to XML nodes.

  • The XmlDocument class is easier to use, whereas the XmlReader class is faster.

  • The XmlDocument class enables you to retrieve XML nodes by using the element name.

  • The XmlDocument class enables you to retrieve XML nodes by using an XPath query.
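
The last two points can be illustrated with a short sketch. It retrieves nodes by element name with GetElementsByTagName and by XPath query with SelectNodes. The class name, the helper method names, the inline XML, and the Id attribute are assumptions made for illustration and do not come from the Orders.xml file used in the practice.

```csharp
using System;
using System.Xml;

public static class XmlDocumentLookupSketch
{
    // Counts every element in the document with the given name.
    public static int CountByName(XmlDocument doc, string elementName)
    {
        return doc.GetElementsByTagName(elementName).Count;
    }

    // Counts the nodes selected by an XPath expression.
    public static int CountByXPath(XmlDocument doc, string xpath)
    {
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        var doc = new XmlDocument();

        // Hypothetical sample data; the Id attribute is an assumption.
        doc.LoadXml(
            "<Orders>" +
            "<Order Id=\"1\"><LineItem Qty=\"2\" /><LineItem Qty=\"5\" /></Order>" +
            "<Order Id=\"2\"><LineItem Qty=\"1\" /></Order>" +
            "</Orders>");

        // By element name: all three LineItem elements.
        Console.WriteLine(CountByName(doc, "LineItem"));                   // 3

        // By XPath: only the line items under the order whose Id is 1.
        Console.WriteLine(CountByXPath(doc, "//Order[@Id='1']/LineItem")); // 2

        // By XPath with a numeric filter on an attribute.
        Console.WriteLine(CountByXPath(doc, "//LineItem[@Qty > 1]"));      // 2
    }
}
```

Because the whole document is held in memory, any number of such queries can be run against the same XmlDocument instance without reading the source again.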

Lesson Review

You can use the following questions to test your knowledge of the information in Lesson 1, “Working with the XmlDocument and XmlReader Classes.” The questions are also available on the companion CD if you prefer to review them in electronic form.

  1. Given an XML file, you want to run several queries for data in the file based on filter criteria the user will be entering for a particular purpose. Which class would be more appropriate for these queries on the file?

    1. XmlDocument

    2. XmlReader

  2. Every day, you receive hundreds of XML files. You are responsible for reading each file to retrieve its sales data and store that data in a SQL database. Which class would be more appropriate for reading these files?

    1. XmlDocument

    2. XmlReader

  3. You have a service that receives a very large XML-based history file once a month. These files can be up to 20 GB in size, and you need to retrieve only the header information, which contains the history date range and the customer information on which the file is based. Which class would be more appropriate to retrieve the data in these files?

    1. XmlDocument

    2. XmlReader