Parsing XML files in C#

1. The structure of the XML file

The relevant terms are: node, element, attribute, and content.

Example XML file:

<bookstore>
  <book category="CHILDREN">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

(1) Node: Roughly speaking, each <...> tag can be thought of as a node. An XML node represents a single XML fragment: for example, a start element and its attributes, an end element, text, or "typed" text content such as an integer or a byte array.

(2) Element: The part from a start tag to its matching end tag, such as <book category="CHILDREN"> through </book> in the example above.

(3) Attribute: The category="CHILDREN" that follows the element name in <book category="CHILDREN"> is an attribute.

(4) Content: The text between a start tag and its end tag, for example the text inside the <year> and <price> tags.
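
To make the four terms concrete, here is a minimal sketch that maps each of them onto the System.Xml object model. It assumes C# 9 or later (top-level statements) and inlines a trimmed copy of the example file as a string; the variable names are illustrative only.

```csharp
using System;
using System.Xml;

// Trimmed copy of the example file, inlined so the sketch is self-contained.
const string sampleXml = @"<bookstore>
  <book category=""CHILDREN"">
    <title>Harry Potter</title>
    <year>2005</year>
  </book>
</bookstore>";

var doc = new XmlDocument();
doc.LoadXml(sampleXml);

// Element: everything from <book ...> through </book>.
var book = doc.SelectSingleNode("/bookstore/book") as XmlElement;
if (book == null) throw new InvalidOperationException("no <book> element found");

// Attribute: category="CHILDREN" on the start tag.
string category = book.GetAttribute("category");

// Content: the text between <year> and </year>.
var year = book.SelectSingleNode("year");
string yearText = year == null ? "" : year.InnerText;

// Node: elements and text are both nodes, distinguished by NodeType.
Console.WriteLine($"{book.Name} is a node of type {book.NodeType}");
Console.WriteLine($"category = {category}");
Console.WriteLine($"year = {yearText}");
Console.WriteLine($"the content of <year> is a node of type {year?.FirstChild?.NodeType}");
```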

2. Parse the XML file

Two common ways to parse XML files in C# are covered here: XmlDocument and XmlReader.

(1) XmlDocument

See: XmlDocument Class (System.Xml) | Microsoft Learn

① Features

Advantages: Nodes in the XML file can be added, deleted, queried, and modified.

Disadvantages: The entire file has to be loaded into memory, so parsing large XML files is slow.
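
The add, delete, query, and modify operations mentioned above can be sketched roughly as follows. This is only an illustration on an inlined, shortened copy of the bookstore example; the extra COOKING entry is made up for the demo, and C# 9 or later is assumed for the top-level statements.

```csharp
using System;
using System.Xml;

var doc = new XmlDocument();
doc.LoadXml(@"<bookstore>
  <book category=""CHILDREN""><title>Harry Potter</title><price>29.99</price></book>
  <book category=""WEB""><title>Learning XML</title><price>39.95</price></book>
</bookstore>");

var root = doc.DocumentElement;
if (root == null) throw new InvalidOperationException("document has no root element");

// Query: find a book by attribute value with an XPath expression.
var web = root.SelectSingleNode("book[@category='WEB']");
if (web == null) throw new InvalidOperationException("WEB book not found");

// Modify: change the text of its <price> element.
var price = web.SelectSingleNode("price");
if (price != null) price.InnerText = "35.00";

// Add: create a new <book> element and append it to the root.
XmlElement added = doc.CreateElement("book");
added.SetAttribute("category", "COOKING"); // hypothetical entry, for illustration only
XmlElement title = doc.CreateElement("title");
title.InnerText = "Everyday Italian";
added.AppendChild(title);
root.AppendChild(added);

// Delete: remove the CHILDREN book.
var children = root.SelectSingleNode("book[@category='CHILDREN']");
if (children != null) root.RemoveChild(children);

// Print the edited document; doc.Save(path) would write it to disk instead.
Console.WriteLine(doc.OuterXml);
```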

② Code implementation (a fragment, not runnable as-is)

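using System.Xml;

// filepath and xpath are placeholders that the caller has to supply.
XmlDocument xmlDocument = new XmlDocument(); // Instantiate
xmlDocument.Load(filepath); // Load the XML file; the file path is an absolute path
XmlNode xmlNode = xmlDocument.SelectSingleNode(xpath); // Select a node by its position in the hierarchy (returns null if nothing matches)
XmlNodeList xmlNodeList = xmlNode.ChildNodes; // Collect the node's child nodes into a list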
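
For a version that can be run directly, the fragment above might be filled in as follows. This is a sketch under a few assumptions: the bookstore example is inlined and loaded with LoadXml instead of Load(filepath), the XPath "/bookstore" is chosen to match that example, and C# 9 or later is used for the top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

const string bookstoreXml = @"<bookstore>
  <book category=""CHILDREN"">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category=""WEB"">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>";

var xmlDocument = new XmlDocument();
xmlDocument.LoadXml(bookstoreXml); // xmlDocument.Load(filepath) would read from a file instead

// Select the root node, then walk its children.
var bookstore = xmlDocument.SelectSingleNode("/bookstore");
if (bookstore == null) throw new InvalidOperationException("no <bookstore> element found");

var titles = new List<string>();
foreach (XmlNode book in bookstore.ChildNodes)
{
    // The string indexer returns the first child element with that name, or null.
    var titleElement = book["title"];
    if (titleElement == null) continue;

    string category = book.Attributes?["category"]?.Value ?? "(none)";
    titles.Add(titleElement.InnerText);
    Console.WriteLine($"{category}: {titleElement.InnerText}");
}
```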

(2) XmlReader

See: XmlReader Class (System.Xml) | Microsoft Learn

① Features

Advantages: Read-only and fast.

Disadvantages: The XML file cannot be added to, deleted from, or modified, nodes cannot be looked up at random, and reading is forward-only.

② Code implementation (a fragment, not runnable as-is)

XmlReaderSettings xmlReaderSettings = new XmlReaderSettings();
// Use the Create method to obtain an XmlReader instance.
// Create uses the XmlReaderSettings class to specify which features the created XmlReader supports.
xmlReaderSettings.IgnoreWhitespace = true; // Ignore insignificant whitespace
xmlReaderSettings.IgnoreComments = true; // Ignore comments
using (XmlReader xmlReader = XmlReader.Create(filepath, xmlReaderSettings))
{
    while (xmlReader.Read())
    {
        // Process the current node here
    }
}
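
As an illustration of what the body of the read loop might do, the sketch below collects each book's category and title in a single forward pass. It is only one possible shape: the bookstore example is inlined and read through a StringReader rather than a file path, and C# 9 or later is assumed for the top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

const string bookstoreXml = @"<bookstore>
  <!-- comments are skipped because of IgnoreComments -->
  <book category=""CHILDREN"">
    <title>Harry Potter</title>
    <price>29.99</price>
  </book>
  <book category=""WEB"">
    <title>Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>";

var settings = new XmlReaderSettings { IgnoreWhitespace = true, IgnoreComments = true };
var categories = new List<string>();
var titles = new List<string>();

using (XmlReader reader = XmlReader.Create(new StringReader(bookstoreXml), settings))
{
    while (reader.Read())
    {
        // Only start tags are interesting here; end tags and text are skipped.
        if (reader.NodeType != XmlNodeType.Element) continue;

        if (reader.Name == "book")
        {
            // Attributes are read while positioned on the start tag.
            categories.Add(reader.GetAttribute("category") ?? "");
        }
        else if (reader.Name == "title" && reader.Read() && reader.NodeType == XmlNodeType.Text)
        {
            // One extra Read() moves from <title> to its text node.
            titles.Add(reader.Value);
        }
    }
}

for (int i = 0; i < titles.Count; i++)
    Console.WriteLine($"{categories[i]}: {titles[i]}");
```

Because the reader only moves forward, anything needed from a node has to be captured while the reader is positioned on it; there is no way to go back to an earlier book.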

3. How the approach evolved

(1) XmlDocument alone: parsing large files becomes unacceptably slow (about three minutes on my own computer).

(2) XmlReader alone: faster than XmlDocument alone, but still somewhat slow, so I started looking for shortcuts based on my understanding of the XML file's structure.

(3) Current approach

① Read the XML file into memory with XmlDocument, and use the SelectNodes method to gather all the frames into an XmlNodeList.

② Use an XmlNodeReader to read each node as needed.
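
The two steps above could look roughly like the following sketch. The author's actual file and its frame nodes are not shown, so the <book> elements of the bookstore example stand in for the frames here; the XPath, the variable names, and the choice to extract prices are all assumptions made for illustration. C# 9 or later is assumed for the top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

const string bookstoreXml = @"<bookstore>
  <book category=""CHILDREN"">
    <title>Harry Potter</title>
    <price>29.99</price>
  </book>
  <book category=""WEB"">
    <title>Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>";

// Step 1: load the document once and gather the repeated nodes.
// <book> is a stand-in for the frame nodes of the real file.
var doc = new XmlDocument();
doc.LoadXml(bookstoreXml);
var books = doc.SelectNodes("/bookstore/book");
if (books == null) throw new InvalidOperationException("SelectNodes returned null");

// Step 2: read each node's subtree with a forward-only XmlNodeReader.
var prices = new List<decimal>();
foreach (XmlNode book in books)
{
    using (var nodeReader = new XmlNodeReader(book))
    {
        while (nodeReader.Read())
        {
            if (nodeReader.NodeType == XmlNodeType.Element
                && nodeReader.Name == "price"
                && nodeReader.Read()
                && nodeReader.NodeType == XmlNodeType.Text)
            {
                // XmlConvert parses numbers independently of the current culture.
                prices.Add(XmlConvert.ToDecimal(nodeReader.Value));
            }
        }
    }
}

foreach (decimal price in prices)
    Console.WriteLine(XmlConvert.ToString(price));
```

An XmlNodeReader reads only the subtree of the node it was created from, so each pass through the inner loop is limited to a single book.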

Origin blog.csdn.net/simplenthpower/article/details/128669633