(13) XML in depth: SGML, RSS, differences from HTML, JSON, DOM, XmlSerializer, reading and writing with XmlDocument and XDocument, recursive loading, examples

 


1. XML


    1. Understand SGML
        
            SGML (Standard Generalized Markup Language) is the predecessor of HTML and XML. It is a
        meta-markup language: it defines a set of rules and syntax for creating other markup languages.
            The main goal of SGML is to provide a common framework that lets users define their own markup language,
        including its document structure and markup, according to their needs. SGML's syntax is very flexible: it allows
        tags, elements, and attributes to be defined as needed, and it lets users create custom entities and document
        type definitions (DTDs).
            Key features of SGML include:
        Separation of content and presentation:
            SGML allows the structure and content of a document to be separated from its presentation. This allows the document to be presented differently in different environments
            without modifying the document's content.
        Extensibility:
            SGML allows users to define markup and document structure according to their needs,
            so they can create custom markup languages and document types.
        Reusability:
            SGML allows users to define entities and entity references to share and reuse content fragments across multiple documents. This
            improves document maintainability and reusability.
            
            Although SGML is powerful, its syntax is complex and bulky, which makes it hard to use directly.
        HTML was therefore originally defined as an application of SGML, and XML as a simplified, stricter subset of it,
        making markup easier to use and understand.
    
    
    2. XML terminology:
        
            XML uses tags, nodes, elements, sub-elements, descendant elements, attributes, namespaces and character data to
        describe and organize data.

        <bookstore>
          <book>
            <title>Harry Potter</title>
            <author>J.K. Rowling</author>
          </book>
          <book>
            <title>The Lord of the Rings</title>
            <author>J.R.R. Tolkien</author>
            <publisher>HarperCollins</publisher>
          </book>
        </bookstore>


        Tag:
            A tag in XML is a name surrounded by angle brackets < > and is used to identify an element or node. Tags can be a start
            tag <tag>, an end tag </tag>, or a self-closing tag <tag />.
        Node:
            A node is the basic unit of an XML document. Nodes can be elements, attributes, comments, processing instructions, text, etc.
        Element:
            An element consists of a start tag, an end tag, and everything between them; it can contain other elements, text, and attributes.
            For example, <title>Harry Potter</title> is an element.
        Child Element:
            A child element in XML is an element located inside a parent element. For example, in the <bookstore> element, the <book>
            element is a child element of <bookstore>.
        Descendant Element:
            A descendant element in XML refers to all elements located below an element, whether they are direct child elements or
            further nested child elements.
        Attribute:
            An attribute provides additional information about an element. An attribute consists of a name and a
            value, for example <book genre="fantasy">.
        Namespace:
            A namespace distinguishes elements and attributes that have the same name. A namespace is identified by a URI
            and is usually bound to a prefix; for example, xs in <xs:element> is the namespace prefix.
        Character Data:
            Character data is text content, such as the text content of an element. For example,
            Harry Potter in the <title> element is character data.
            CDATA is a special syntax in XML for text that contains special characters (such as < and &), so that
            they are not parsed as tags or other markup.
            <description><![CDATA[This is a <b>bold</b> statement.]]></description>
            The text in the <description> element is wrapped in a CDATA section. This means that <b>bold</b>
            is not parsed as a tag, but is handled as plain text.
                Note that inside a CDATA section only the character sequence ]]> needs special handling, because it
            is the CDATA end marker. To include it in the text, split it across two sections with ]]]]><![CDATA[>,
            so that ]] ends the first section and > starts the second.
    
        Question: What goes wrong when ]]> appears inside a CDATA section?
        Answer: The sequence ]]> always marks the end of a CDATA section. If the text itself contains ]]>,
        the parser ends the section at that point, and whatever follows usually causes a parsing error.
            <example>
              <![CDATA[This is some ]]> data]]>
            </example>
            The intended text above is "This is some ]]> data". The parser, however, closes the section at the
        first ]]>, so the section contains only "This is some ". The trailing " data]]>" is then read as ordinary
        content, where a bare ]]> is not allowed, so parsing fails.
            To avoid this, split the ]]> across two CDATA sections. For example:
            <example>
              <![CDATA[This is some ]]]]><![CDATA[> data]]>
            </example>
            The text is now divided into two CDATA sections: "This is some ]]" and "> data". Read together they give
        the original text "This is some ]]> data".

            Summary: ]]> inside a CDATA section is always treated as the end marker. To include it in the text,
        split it across two CDATA sections.
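
        The splitting trick can be sketched in C#. This is a minimal illustration, not a library API: the helper name
        WrapInCData is hypothetical, and the parse at the end only checks that the two sections read back as the original text.

```csharp
using System;
using System.Xml.Linq;

public static class CDataDemo
{
    // Hypothetical helper: wraps text in CDATA and splits every "]]>" so that
    // "]]" ends one section and ">" begins the next.
    public static string WrapInCData(string text)
    {
        return "<![CDATA[" + text.Replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void Main()
    {
        string xml = "<example>" + WrapInCData("This is some ]]> data") + "</example>";
        Console.WriteLine(xml);

        // The parser joins the adjacent sections back into one text value.
        XElement parsed = XElement.Parse(xml);
        Console.WriteLine(parsed.Value);
    }
}
```

        The second line printed should be the original text, including the ]]> sequence.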
    
    
    3. Understand RSS
    
            RSS (Really Simple Syndication) is a standard format for publishing and subscribing to website content. It
        is an XML-based format used to deliver website updates to subscribers in a structured manner.
            The main purpose of RSS is to allow users to easily get the latest updates on website content that interests them without having to
        browse each website. By subscribing to RSS feeds, users can use an RSS reader or other application to automatically receive
        and read the latest content from the website.
        RSS works as follows:
            1. The website owner publishes the website's content in RSS format to a specific URL, called an RSS feed.
            2. The user uses an RSS reader to subscribe to RSS feeds of interest.
            3. When the content of the website is updated, the website owner will publish the updated content to the RSS feed.
            4. The user's RSS reader will regularly check the subscribed RSS feeds. If there is new content, it will be automatically downloaded
                and displayed to the user.
        Advantages and uses of RSS include:
            1. Convenient subscription: Users can obtain content updates from multiple websites by subscribing to RSS feeds without having to
                        visit each site.
            2. Timely information: Users get the latest content of the websites they follow soon after it is published (at the reader's next check), without
                        searching for it manually.
            3. Customized content: Users can choose the RSS feeds to subscribe to according to their own interests and needs, and only receive
                        content they are interested in.
            4. Cross-platform and device: RSS readers can be used on different platforms and devices, such as desktop computers, mobile phones and
                        tablets.
            To sum up, RSS is a standard format for publishing and subscribing website content. By subscribing to RSS feeds, users
        can easily obtain the latest content updates of websites of interest and improve the efficiency of information acquisition.
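
        Since an RSS feed is just XML, a reader can extract the entries with any XML API. The sketch below assumes a
        minimal RSS 2.0 layout (rss > channel > item > title); the feed content and the example.com URLs are made up,
        and a real reader would download the text from the feed URL instead of using a constant.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RssDemo
{
    // Hypothetical sample feed used in place of a downloaded one.
    public const string SampleFeed =
        "<rss version=\"2.0\"><channel>" +
        "<title>Example Blog</title>" +
        "<link>https://example.com/</link>" +
        "<description>Sample feed</description>" +
        "<item><title>First post</title><link>https://example.com/1</link></item>" +
        "<item><title>Second post</title><link>https://example.com/2</link></item>" +
        "</channel></rss>";

    // Returns the title of every <item> in the feed, in document order.
    public static List<string> ItemTitles(string rssXml)
    {
        XDocument doc = XDocument.Parse(rssXml);
        return doc.Descendants("item")
                  .Select(item => (string)item.Element("title"))
                  .ToList();
    }

    public static void Main()
    {
        foreach (string title in ItemTitles(SampleFeed))
        {
            Console.WriteLine(title);
        }
    }
}
```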
    
    
    4. Question: How can Notepad++ open the current XML or HTML file directly in a browser?
        
        Answer: Switching windows to check the result after every edit is tedious. Notepad++ can launch the browser for you.
            Click Run > Run... in Notepad++ and enter a command in the following format:
                full path of the browser + space + full path of the .htm or .xml file
                (the variable $(FULL_CURRENT_PATH) stands for the file being edited),
            then click Save and assign a shortcut key (here, Ctrl+F5). After each edit, press the shortcut key
        to view the file directly.
        
        Question: What does <name/> mean in XML?
        Answer: <name/> is an empty element: it has no child elements and no text content.
            XML tags normally appear in pairs. An element without content can be abbreviated as a self-closing tag,
        as above. It is equivalent to <name></name>.
        
        Question: An attribute value in single quotes does not cause an error. Why?
        Answer: Because it is valid. The XML specification allows an attribute value to be delimited by either single
        or double quotes, as long as the opening and closing quotes match. Double quotes are the more common convention, so use them consistently unless the value itself contains a double quote.
        
        Question: How are comments written?
        Answer: Use <!--Content--> to write a comment. A browser may display it in its tree view, but a comment is not an element or a tag; parsers expose it as a separate comment node.
        
        
        Question: Can HTML escape sequences (entities) be used directly in XML?
        Answer: Only some of them. XML predefines just five entities, and most of HTML's
        named entities are not defined in XML. Compare:
        XML escape sequences:
            & (ampersand): &amp;
            < (less than sign): &lt;
            > (greater than sign): &gt;
            " (double quote): &quot;
            ' (single quote): &apos;
        <root>
            <message>This is a &lt;test&gt; message.</message>
        </root>
        HTML escape sequences:
            <: less than sign &lt;
            >: greater than sign &gt;
            &: ampersand &amp;
            ": double quotation mark &quot;
            ': single quotation mark &apos;
            (non-breaking space): &nbsp;
            ©: copyright symbol &copy;
            ®: registered trademark symbol &reg;
            €: euro symbol &euro;
            £: pound symbol &pound;
            ¥: Japanese yen symbol &yen;
            If you use HTML's &reg; in XML without declaring it in a DTD, the parser reports an error. A numeric character
        reference such as &#174; works in both.
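
        The difference can be checked with a short test. This is a sketch using LINQ to XML; the helper name
        IsWellFormed is hypothetical and simply reports whether XElement.Parse accepts the fragment.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class EntityDemo
{
    // Returns true when the fragment is well-formed XML, false when parsing fails.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XElement.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsWellFormed("<m>a &lt; b &amp; c</m>")); // predefined XML entities
        Console.WriteLine(IsWellFormed("<m>Brand&reg;</m>"));        // HTML-only named entity
        Console.WriteLine(IsWellFormed("<m>Brand&#174;</m>"));       // numeric character reference

        // When XML is built through the API, special characters in text are escaped automatically.
        XElement message = new XElement("message", "This is a <test> message.");
        Console.WriteLine(message);
    }
}
```

        The first and third fragments should be accepted, while the one using &reg; should be rejected as an undeclared entity.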
            
    
    5. Lowercase is recommended for HTML tags, whereas XML is case sensitive and prescribes no particular case.
    
            HTML tag names are not case sensitive: the browser parses and renders the page the same way
        whichever combination of upper and lower case is used.
            However, best practice is to write HTML tags in all lowercase. Most developers and tools follow this
        convention (and XHTML requires it), so lowercase tags keep your code consistent with other developers'
        code and make it easier to read
        and maintain.
 

        string html = @"
            <html>
                <head>
                    <title>My Web Page</title>
                </head>
                <body>
                    <h1>Welcome to my web page</h1>
                    <p>This is a paragraph of text.</p>
                </body>
            </html>

        ";


            XML is case sensitive, and there is no requirement to use lowercase letters for tags. According to the XML
        specification, tag names can use any combination of uppercase and lowercase letters, but the start tag and the
        end tag must match exactly: <Name> cannot be closed by </name>.
            So, when writing XML files, you can choose uppercase, lowercase, or mixed case. For consistency and
        readability, it is best to pick one style and keep to it, as the lowercase example below does.

        <root>
            <person>
                <name>John Doe</name>
                <age>30</age>
            </person>
            <person>
                <name>Jane Smith</name>
                <age>25</age>
            </person>
        </root>   
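
        A quick way to see XML's case sensitivity is to parse mismatched tags and to look up elements by name.
        The following is a small sketch; the helper name Parses is hypothetical.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class CaseDemo
{
    // Returns true when the text parses as XML.
    public static bool Parses(string xml)
    {
        try
        {
            XElement.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Parses("<name>John Doe</name>")); // tags match
        Console.WriteLine(Parses("<Name>John Doe</name>")); // start and end tag differ in case

        // Lookups by name are case sensitive as well.
        XElement person = XElement.Parse("<person><name>John Doe</name></person>");
        Console.WriteLine(person.Element("name") != null);
        Console.WriteLine(person.Element("Name") != null);
    }
}
```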

 
        
    
 

    6. Question: Why do websites generally use json instead of xml when transmitting large amounts of data?
        Answer: XML repeats every element name in a start tag and an end tag, so the same data takes more bytes than in JSON,
        and the nested markup takes more work to parse. With a DOM parser, the whole document tree also has to be built in memory before data can be extracted.
            The JSON format has a more concise structure and is generally easier to parse and process than XML. JSON
        stores data as key-value pairs, so values can be looked up by key. The Newtonsoft.Json library for C#
        provides JSON parsing and manipulation methods for extracting and processing JSON data.
            CSV and other delimited text formats have a simple structure and smaller file sizes, so they are
        generally faster to parse and process. The built-in StreamReader and string.Split methods in C# are enough to parse
        simple files of this kind.
            For CSV or text data, you usually need to parse each row manually, convert it into an object or data
        structure, and then use LINQ query syntax to query and operate on these objects.
            In short: when the same data is available in several formats and you need filtering, statistics and similar operations, JSON is usually
        the best choice, because it deserializes directly into C# objects that LINQ can query.
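
        The manual route for delimited text can be sketched as follows. The sample assumes a two-column "name,age"
        layout with no header row and no quoted commas; PersonRow and CsvDemo are illustrative names, and a StringReader
        stands in for the StreamReader that would read a real file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class PersonRow
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class CsvDemo
{
    // Parses "name,age" lines; assumes no header row and no quoted or escaped commas.
    public static List<PersonRow> Parse(TextReader reader)
    {
        var rows = new List<PersonRow>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0) continue; // skip blank lines
            string[] parts = line.Split(',');
            rows.Add(new PersonRow { Name = parts[0].Trim(), Age = int.Parse(parts[1].Trim()) });
        }
        return rows;
    }

    public static void Main()
    {
        string csv = "John Doe,30\nJane Smith,25\n";
        List<PersonRow> people = Parse(new StringReader(csv));

        // Once the rows are objects, LINQ can filter and project them.
        foreach (string name in people.Where(p => p.Age >= 30).Select(p => p.Name))
        {
            Console.WriteLine(name);
        }
    }
}
```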


2. Reading and writing XML


    1. DOM (Document Object Model)
    
        DOM is a programming interface for accessing and manipulating HTML or XML documents. It represents
        a document as a tree of objects, allowing developers to programmatically modify the structure, content, and style of the document.
            Using DOM to operate on HTML or XML documents has some disadvantages:
            Memory consumption: the entire document is loaded into memory.
            Performance: it is slower than streaming approaches such as SAX parsers.
            Verbosity: complex operations may require a lot of code for navigating between nodes.
            No validation by default: unless validation is configured, the document is not checked against a DTD or schema, which may lead to errors or unexpected behavior later.
            Non-standard input: HTML or XML that does not follow the standard structure is difficult to load and manipulate.
            However, its ease of use generally outweighs these shortcomings.
    
        Syntax:
            Get a DOM object: create a DOM object such as XmlDocument (namespace System.Xml) or HtmlDocument
                        (in .NET, namespace System.Windows.Forms, used with the WebBrowser control).
            Navigate the DOM tree: use the relationships between nodes, such as the
                        parent node, child nodes, and sibling nodes, to reach and operate on nodes.
        Properties:
            InnerText: Gets the text content of an HTML or XML node. When the node has several descendant text nodes, their text is concatenated.
            InnerHtml: Gets or sets the markup inside an HTML node (its children only, including text). The XmlNode counterpart is InnerXml.
            OuterHtml: Gets or sets the markup of an HTML node together with its children. The XmlNode counterpart is OuterXml.
            Attributes: Gets the attribute collection of the node, through which its attributes can be read and changed.
        Methods:
            GetElementsByTagName: Gets the collection of nodes with a given tag name.
            GetElementById: Gets an element node by its ID. (With XmlDocument this only works for attributes that a DTD declares as type ID.)
            AppendChild: Adds a child node to the parent node.
            RemoveChild: Removes a child node from the parent node.
            SetAttribute: Sets an attribute value on the node.
            GetAttribute: Gets an attribute value of the node.
            AddEventListener: Adds an event handler to a node. This is the browser (JavaScript) DOM method; XmlDocument exposes .NET events such as NodeChanged instead.
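
        The members listed above can be tried on the bookstore document from earlier. This is a sketch with
        XmlDocument only; the publisher value and the lang attribute are made-up data for the demonstration.

```csharp
using System;
using System.Xml;

public static class DomDemo
{
    // Loads a compact version of the bookstore example into a DOM tree.
    public static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<bookstore>" +
            "<book genre=\"fantasy\"><title>Harry Potter</title><author>J.K. Rowling</author></book>" +
            "<book><title>The Lord of the Rings</title><author>J.R.R. Tolkien</author></book>" +
            "</bookstore>");
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = Load();

        // GetElementsByTagName searches the whole document.
        XmlNodeList titles = doc.GetElementsByTagName("title");
        Console.WriteLine(titles.Count);

        // Navigate from the root to its first child element.
        XmlElement first = (XmlElement)doc.DocumentElement.FirstChild;
        Console.WriteLine(first.GetAttribute("genre"));
        Console.WriteLine(first.InnerText); // text of all descendants, concatenated

        // Modify the tree: set an attribute, add a child, remove a child.
        first.SetAttribute("lang", "en");
        XmlElement publisher = doc.CreateElement("publisher");
        publisher.InnerText = "Bloomsbury"; // illustrative value
        first.AppendChild(publisher);
        first.RemoveChild(first.SelectSingleNode("author"));

        Console.WriteLine(first.OuterXml);
    }
}
```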
    
        
        Q: What is the difference between XmlDocument and XDocument?
        Answer: XmlDocument and XDocument are two different classes for processing XML data. Differences:
            1. Namespace: XmlDocument is defined in the System.Xml namespace, while XDocument is
        defined in the System.Xml.Linq namespace. XDocument is part of LINQ to XML and provides a simpler,
        cleaner, more modern API.
            2. API and syntax: XmlDocument uses the traditional DOM (Document Object Model) API, while
        XDocument uses the LINQ to XML API. XDocument provides simpler and more intuitive methods and syntax
        for processing XML data, such as querying and manipulating XML with LINQ query expressions.
            3. Readability: XDocument code is generally easier to read and write than XmlDocument code because its
        construction syntax mirrors the shape of the XML itself, making it more readable and maintainable.
            4. Function and performance: XmlDocument is the older API and follows the W3C DOM closely, with a wide range
        of members for processing XML data. XDocument has the advantage in simplicity and readability;
        which of the two is faster depends on the workload, so measure before deciding on performance grounds.
            In summary, XmlDocument is the traditional XML processing class, with a wide range of functions suited to complex XML
        processing tasks. XDocument is part of LINQ to XML and provides a simpler, more modern API suited to
        simple and medium-complexity XML processing tasks. Which class to use depends on specific needs and personal preference.
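
        To make the comparison concrete, here is the same query (all book titles) written against both classes.
        It is only a sketch for comparing the two styles; the class and method names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class CompareDemo
{
    public const string SampleXml =
        "<bookstore>" +
        "<book><title>Harry Potter</title></book>" +
        "<book><title>The Lord of the Rings</title></book>" +
        "</bookstore>";

    // DOM style: an XPath query followed by an explicit loop.
    public static List<string> TitlesWithXmlDocument(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        var result = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("/bookstore/book/title"))
        {
            result.Add(node.InnerText);
        }
        return result;
    }

    // LINQ to XML style: a query over the element tree.
    public static List<string> TitlesWithXDocument(string xml)
    {
        return XDocument.Parse(xml)
                        .Descendants("title")
                        .Select(t => t.Value)
                        .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", TitlesWithXmlDocument(SampleXml)));
        Console.WriteLine(string.Join(", ", TitlesWithXDocument(SampleXml)));
    }
}
```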
    
    
            SAX is an event-driven API for reading XML that is common in Java. .NET uses XmlReader (XmlTextReader) and
        XmlWriter (XmlTextWriter) instead.
            XmlReader and XmlWriter are stream-based, lightweight classes that read and write XML data sequentially.
        Like a SAX parser, XmlReader does not load the entire XML document into memory but works through it node by node.
        Unlike SAX, which pushes events to the application, XmlReader is a pull model: the application calls Read to advance to the next node. As a result, the code can become lengthy and cumbersome.
        This forward-only, node-by-node processing is not well suited to complex XML structures or to tasks that need to jump around in the document.
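
        A minimal sketch of the pull model follows: the loop advances the reader itself and picks out the title
        elements. The method name ReadTitles is illustrative, and a StringReader stands in for a file or network stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderDemo
{
    // Collects the text of every <title> element without building a tree in memory.
    public static List<string> ReadTitles(string xml)
    {
        var titles = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
                {
                    // Reads the text and leaves the reader positioned after </title>,
                    // so Read must not be called again in this iteration.
                    titles.Add(reader.ReadElementContentAsString());
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return titles;
    }

    public static void Main()
    {
        string xml = "<bookstore><book><title>Harry Potter</title></book>" +
                     "<book><title>The Lord of the Rings</title></book></bookstore>";
        foreach (string title in ReadTitles(xml))
        {
            Console.WriteLine(title);
        }
    }
}
```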
        
    
    2. XMLSerializer serializer (review)
        
        The XmlSerializer class serializes objects into XML and deserializes XML back into objects. By default it converts the object's public
        fields and properties into XML elements, with their values as the element content.

        using System;
        using System.IO;
        using System.Xml.Serialization;

        internal class Program
        {
            private static void Main(string[] args)
            {
                Person p = new Person() { Name = "基金", Age = 5 };
                XmlSerializer xmlser = new XmlSerializer(typeof(Person));

                // Serialize the object into an in-memory string.
                StringWriter swriter = new StringWriter();
                xmlser.Serialize(swriter, p);

                string sp = swriter.ToString();
                Console.WriteLine(sp);

                // Deserialize the XML string back into a Person.
                StringReader sreader = new StringReader(sp);
                Person p1 = (Person)xmlser.Deserialize(sreader);
                Console.WriteLine(p1.Name);

                Console.ReadKey();
            }
        }

        // XmlSerializer needs a public type with a public parameterless constructor.
        public class Person
        {
            public int Age { get; set; } = 19;
            public string Name { get; set; } = "刀郎";
        }


        The code creates an XmlSerializer and uses its Serialize method to serialize the Person object to XML and
        write it to a StringWriter. It then gets the serialized XML string from the StringWriter and
        outputs it. Next, a StringReader is created over the XML string, and the string is
        deserialized into a Person object with the Deserialize method of XmlSerializer. Finally, the Name property of the deserialized Person object is output.
    
    
        Introduction to StringWriter and StringReader
            StringWriter and StringReader provide a way to read and write text in memory and are suitable for
        situations where text needs to be processed as a string, such as string operations, unit testing, logging and data serialization.
        They provide a convenient and flexible approach without requiring actual files or streams.
            
            String operations: StringWriter and StringReader provide a convenient way to process text
        data as strings. You can use StringWriter to write text to a string and StringReader
        to read text from a string. This is useful when text needs to be manipulated, processed, or passed around.
            Unit testing: StringWriter and StringReader are very useful when writing unit tests. You can use a
        StringWriter to capture the text output generated in a method or function and compare it to the expected result. Likewise, you
        can use a StringReader to read predefined input from a string to simulate user input during testing.
            Logging: StringWriter can be used to construct log messages in memory rather than writing them to an actual
        log file or database. This is useful when generating and processing log messages in an application, especially during development and
        debugging phases.
            Data serialization: StringWriter and StringReader can be used with other serialization mechanisms such as JSON or XML
        to serialize objects to and deserialize objects from strings. You can use a StringWriter
        to serialize an object into a string representation, and then use a StringReader to read and deserialize the object from the string.
        
        StringWriter Class
            The StringWriter class inherits from the TextWriter class, an abstract base class for writing characters.
        StringWriter overrides methods of TextWriter so that characters are written to an internal string buffer
        (a StringBuilder). This means that StringWriter works on characters in memory rather than on a byte stream. It provides a convenient
        way to write characters to a string without having to worry about an underlying byte stream.
        Properties:
            Encoding: Gets the encoding in which the output is written. For StringWriter this is UTF-16, which is why XML serialized through a StringWriter declares encoding="utf-16".
        Methods:
            Write: Writes the specified string to the StringWriter.
            WriteLine: Writes the specified string followed by a line terminator to the StringWriter.
            Flush: Inherited from TextWriter. It has little practical effect here, because StringWriter writes straight to its in-memory buffer.
            ToString: Returns the text written so far as a string.

        using (StringWriter writer = new StringWriter())
        {
            writer.WriteLine("Hello");
            writer.WriteLine("World");

            string result = writer.ToString();
            Console.WriteLine(result);
        }


    
        StringReader class
            The StringReader class inherits from the TextReader class, an abstract base class for reading characters.
        Like StringWriter, it works on characters in memory rather than on a byte stream.
        Methods:
            Peek: Returns the next available character without consuming it (-1 at the end). The cursor does not move.
            Read: Reads the next character and returns it as an integer (-1 at the end). The cursor advances.
            ReadBlock: Reads the specified number of characters and stores them in the buffer array.
            ReadLine: Reads a line of characters and returns it as a string (null at the end).
            ReadToEnd: Reads all characters from the current position to the end and returns them as a
                        string.

        string s = "Hello World!";
        using (StringReader srd = new StringReader(s))
        {
            int n;
            while ((n = srd.Read()) != -1)
            {
                Console.WriteLine((char)n);
            }

            Console.WriteLine(srd.ReadToEnd());//a
        }


        Because the cursor cannot be reset, nothing is left to read at location a: ReadToEnd returns an empty string, so only a blank line is printed. To read the text again, create a new StringReader.
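
        The cursor behaviour of Peek, Read and ReadToEnd can be seen in a few lines. This is a small sketch; the
        helper name RemainderAfterFirstChar is hypothetical.

```csharp
using System;
using System.IO;

public static class StringReaderDemo
{
    // Consumes one character, then returns whatever is left from the cursor onward.
    public static string RemainderAfterFirstChar(string s)
    {
        using (StringReader reader = new StringReader(s))
        {
            reader.Peek();             // looks at the first character, cursor stays put
            reader.Read();             // consumes the first character, cursor advances
            return reader.ReadToEnd(); // everything after the cursor
        }
    }

    public static void Main()
    {
        string s = "Hello World!";
        Console.WriteLine(RemainderAfterFirstChar(s));

        // To read from the start again, create a new reader over the same string.
        using (StringReader again = new StringReader(s))
        {
            Console.WriteLine(again.ReadToEnd());
        }
    }
}
```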
    
    
    3. Summary
        
        There are many technologies for reading and writing XML:
        1. DOM [XmlDocument, XDocument] (Document Object Model: load the entire XML into memory, then operate on it);
        2. SAX (event-driven reading of XML in Java); .NET uses XmlReader (XmlTextReader) and
            XmlWriter (XmlTextWriter) instead, as well as advanced reading and writing techniques;
        3. XmlSerializer (XML serialization; the classes must be defined first);
        4. LINQ to XML (System.Xml.Linq), etc.
        
            XmlSerializer requires a set of classes to be defined for each different file format, which is troublesome, whereas LINQ to XML
        does not need separate classes. It is lower level and takes more code than XmlSerializer, but it is more flexible.
            The classes under System
        .
            The core class is XElement: one XElement represents one element. new XElement("Order") creates an element
        whose tag name is Order. Call Add to add child elements, which are also XElement objects, much like nodes in a TreeView.
            To get the XML as a string, call ToString.
            To write the XML content to a writer, call the Save method of XElement.
            When creating XML, XDocument is optional; you can use XElement directly.
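
        These XElement operations can be sketched as follows. The Order and Item names and their values are
        illustrative data only.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class XElementDemo
{
    // Builds <Order id="1"> with two <Item> children, using XElement only (no XDocument).
    public static XElement BuildOrder()
    {
        XElement order = new XElement("Order");
        order.SetAttributeValue("id", "1");
        order.Add(new XElement("Item", "Pen"));      // child elements are XElement objects too
        order.Add(new XElement("Item", "Notebook"));
        return order;
    }

    public static void Main()
    {
        XElement order = BuildOrder();

        // ToString returns the element as indented XML, without a declaration.
        Console.WriteLine(order.ToString());

        // Save writes the XML, including a declaration, to a writer.
        using (StringWriter writer = new StringWriter())
        {
            order.Save(writer);
            Console.WriteLine(writer.ToString());
        }
    }
}
```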
    
        Exercise: Use DOM to create the following XML

        <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
        <root>
          <class id="c01">
            <student sid="s011">
              <name>黄林</name>
              <age>18</age>
            </student>
            <student sid="s012">
              <name>许正龙</name>
              <age>19</age>
            </student>
          </class>
        </root>


        The procedure is as follows:

        XmlDocument xmlDoc = new XmlDocument();
        // Declaration
        XmlDeclaration xmlDec = xmlDoc.CreateXmlDeclaration("1.0", "UTF-8", "yes");//a
        xmlDoc.AppendChild(xmlDec);
        // Root node
        XmlElement xmlRoot = xmlDoc.CreateElement("root");
        xmlDoc.AppendChild(xmlRoot);
        // Class node
        XmlElement xmlClass = xmlDoc.CreateElement("class");
        XmlAttribute attr = xmlDoc.CreateAttribute("id");
        attr.Value = "c01";
        xmlClass.Attributes.Append(attr);
        xmlRoot.AppendChild(xmlClass);
        // Student 1 node
        XmlElement xmlStu = xmlDoc.CreateElement("student");
        attr = xmlDoc.CreateAttribute("sid");
        attr.Value = "s011";
        xmlStu.Attributes.Append(attr);
        xmlClass.AppendChild(xmlStu);
        // name and age nodes
        XmlElement xmlName = xmlDoc.CreateElement("name");
        xmlName.InnerText = "黄林";
        xmlStu.AppendChild(xmlName);
        XmlElement xmlAge = xmlDoc.CreateElement("age");
        xmlAge.InnerText = "18";
        xmlStu.AppendChild(xmlAge);
        // Student 2 node
        xmlStu = xmlDoc.CreateElement("student");
        attr = xmlDoc.CreateAttribute("sid");
        attr.Value = "s012";
        xmlStu.Attributes.Append(attr);
        xmlClass.AppendChild(xmlStu);

        xmlName = xmlDoc.CreateElement("name");
        xmlName.InnerText = "许正龙";
        xmlStu.AppendChild(xmlName);
        xmlAge = xmlDoc.CreateElement("age");
        xmlAge.InnerText = "19";
        xmlStu.AppendChild(xmlAge);

        // Save or display
        xmlDoc.Save("school.xml");
        Console.WriteLine("OK");


        Elements are added one by one, each with its child elements appended to it.
        The declaration is added with XmlDocument.CreateXmlDeclaration(version, encoding, standalone):
            standalone indicates whether the XML document is self-contained. The allowed values are "yes" and "no" (or null to omit it). This
        parameter is of type string.
            "yes": the XML document is standalone, that is, it does not depend on any external resources.
            "no": the XML document is not standalone, that is, it depends on external resources.
            When an XML document is standalone, the parser can load and parse the entire document without needing
        to reference or access other external resources; it can process the document entirely on its own.
            When an XML document is not standalone, the parser may need to reference or access external resources, such as
        DTD (Document Type Definition) files or external entities.
        These external resources may contain definitions of entities, elements, or attributes used in the document, or provide the document's structure and validation rules. In this case, the parser needs to be able to access
        these external resources in order to parse and process the document correctly.
            Note: The standalone attribute does not itself load or handle any external resources. It only
        indicates whether the document is self-contained, so that programmers know that other methods or settings may be needed
        to load and process external resources when the parser processes the document.
            In short: it is just a flag that tells the reader whether additional processing is needed.
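
        The effect of the third argument on the generated declaration can be checked directly. This is a sketch;
        the helper name Declaration is hypothetical.

```csharp
using System;
using System.Xml;

public static class DeclarationDemo
{
    // Returns the declaration text produced for the given standalone value ("yes", "no", or null).
    public static string Declaration(string standalone)
    {
        XmlDocument doc = new XmlDocument();
        XmlDeclaration dec = doc.CreateXmlDeclaration("1.0", "UTF-8", standalone);
        doc.AppendChild(dec);
        return dec.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(Declaration("yes")); // declaration with a standalone attribute
        Console.WriteLine(Declaration(null));  // standalone attribute omitted
    }
}
```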
        
        
        To simplify the example above, write the elements in a loop: define a List<Person>, initialize it with the person data,
        and then loop over the list, writing each object in turn:

        internal class Program
        {
            private static void Main(string[] args)
            {
                List<Person> lists = new List<Person>()
                {
                    new Person{ Name= "黄林",Age = 18 },
                    new Person { Name = "许正龙", Age = 19 }
                };
                XmlDocument xmlDoc = new XmlDocument();
                XmlDeclaration xmlDec = xmlDoc.CreateXmlDeclaration("1.0", "UTF-8", null);
                xmlDoc.AppendChild(xmlDec);

                XmlElement xmlRoot = xmlDoc.CreateElement("list");
                xmlDoc.AppendChild(xmlRoot);

                for (int i = 0; i < lists.Count; i++)
                {
                    XmlElement xmlPerson = xmlDoc.CreateElement("Person");
                    XmlAttribute attr = xmlDoc.CreateAttribute("id");
                    attr.Value = (i + 1).ToString();
                    xmlPerson.Attributes.Append(attr);

                    XmlElement xmlName = xmlDoc.CreateElement("name");
                    xmlName.InnerText = lists[i].Name;
                    xmlPerson.AppendChild(xmlName);

                    XmlElement xmlAge = xmlDoc.CreateElement("age");
                    xmlAge.InnerText = lists[i].Age.ToString();
                    xmlPerson.AppendChild(xmlAge);

                    xmlRoot.AppendChild(xmlPerson);
                }

                xmlDoc.Save("school.xml");
                Console.WriteLine("OK");
                Console.ReadKey();
            }
        }

        public class Person
        {
            public int Age { get; set; }
            public string Name { get; set; }
        }


        This removes some of the repetition. The code is better organized, but XmlDocument is still fairly cumbersome.
        
        
        Q: Do [XmlElement] and [XmlAttribute] do the same thing when applied as C# attributes?
        Answer: No. When a field or property is serialized to XML, [XmlElement] sets the name of the element it becomes,
        while [XmlAttribute] turns it into an attribute of the parent element.

        internal class Program
        {
            private static void Main(string[] args)
            {
                Person p = new Person() { Name = "Tom", Age = 8 };
                XmlSerializer ser = new XmlSerializer(typeof(Person));
                using (StringWriter sw = new StringWriter())
                {
                    ser.Serialize(sw, p);
                    Console.WriteLine(sw.ToString());
                }
                Console.ReadKey();
            }
        }

        public class Person
        {
            [XmlElement("yangling")]
            public int Age { get; set; }

            [XmlAttribute("Minzhi")]
            public string Name { get; set; }
        }


        Pay attention to the results:

        <?xml version="1.0" encoding="utf-16"?>
        <Person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Minzhi="Tom">
          <yangling>8</yangling>
        </Person>


        XmlElementAttribute serializes a property as an XML element, and XmlAttributeAttribute serializes a property
        as an XML attribute of the enclosing element.
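
        The mapping also works in the other direction. The following is a minimal sketch, assuming the same Person
        class as above; the SerializerRoundTrip class and its ToXml/FromXml helpers are hypothetical names introduced
        only for this illustration. It serializes an object to a string and reads it back with the same XmlSerializer.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Person
{
    [XmlElement("yangling")]   // Age is written as a <yangling> child element
    public int Age { get; set; }

    [XmlAttribute("Minzhi")]   // Name is written as a Minzhi="..." attribute on <Person>
    public string Name { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class SerializerRoundTrip
{
    // Serialize a Person to an XML string.
    public static string ToXml(Person p)
    {
        XmlSerializer ser = new XmlSerializer(typeof(Person));
        using (StringWriter sw = new StringWriter())
        {
            ser.Serialize(sw, p);
            return sw.ToString();
        }
    }

    // Rebuild a Person from XML produced by ToXml.
    public static Person FromXml(string xml)
    {
        XmlSerializer ser = new XmlSerializer(typeof(Person));
        using (StringReader sr = new StringReader(xml))
        {
            return (Person)ser.Deserialize(sr);
        }
    }

    public static void Main()
    {
        string xml = ToXml(new Person { Name = "Tom", Age = 8 });
        Person copy = FromXml(xml);
        Console.WriteLine(copy.Name + ", " + copy.Age);
    }
}
```

        Because the attribute names are part of the mapping, renaming "yangling" or "Minzhi" changes the XML
        that can be read back as well as the XML that is written.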
        
        
        The same example rewritten with XDocument:

        internal class Program
        {
            private static void Main(string[] args)
            {
                List<Person> lists = new List<Person>()
                {
                    new Person { Name = "黄林", Age = 18 },
                    new Person { Name = "许龙", Age = 19 }
                };
                XDocument xDoc = new XDocument();
                // The declaration is a property; it is not appended with Add.
                xDoc.Declaration = new XDeclaration("1.0", "UTF-8", null);
                // An element is created on its own with new XElement rather than through the document,
                // which reads more naturally.
                XElement xRoot = new XElement("list");
                xDoc.Add(xRoot);

                for (int i = 0; i < lists.Count; i++)
                {
                    XElement xmlp = new XElement("person");
                    xmlp.SetAttributeValue("id", (i + 1).ToString());

                    xmlp.SetElementValue("name", lists[i].Name);
                    xmlp.SetElementValue("age", lists[i].Age);
                    xRoot.Add(xmlp);
                }

                xDoc.Save("school.xml");
                Console.WriteLine("OK");
                Console.ReadKey();
            }
        }

        public class Person
        {
            public int Age { get; set; }
            public string Name { get; set; }
        }


        Q: Which one is more efficient, XmlDocument or XDocument?
        Answer: Generally XDocument.
            When processing XML data, the performance of XDocument is usually higher than that of XmlDocument.
            XDocument is part of LINQ to XML, which provides a more modern and simplified way to process
        XML data. The main differences:
            1. Memory consumption: XDocument uses a more compact data structure, so its memory consumption is usually
                lower than that of XmlDocument when processing large XML documents.
            2. Query language: XDocument is queried with LINQ to XML, a concise and powerful way to select and
                transform XML data that is often more convenient than traditional XmlDocument DOM operations.
            3. Readability: XDocument's API is more concise and intuitive, which makes the code easier to understand
                and maintain. By comparison, XmlDocument's API is more verbose.
            However, for small or simple XML data the difference may not be noticeable. In some cases
        XmlDocument may still be the better fit, especially when complex modifications or manipulations of the XML
        data are required.
            Therefore, for most situations XDocument is the more efficient choice, especially when working with large XML
        documents. Actual performance depends on the use case, so the choice should
        be evaluated against real requirements.
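
        As one illustration of the more concise API, LINQ to XML supports "functional construction", where the whole
        tree is built in a single expression. The following is a sketch only; the FunctionalConstruction class and
        its Build method are hypothetical names, and the element names follow the school.xml example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Person
{
    public int Age { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class FunctionalConstruction
{
    // Build <list><person id="1"><name/><age/></person>...</list> in one expression.
    public static XDocument Build(IEnumerable<Person> people)
    {
        return new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("list",
                people.Select((p, i) =>
                    new XElement("person",
                        new XAttribute("id", i + 1),      // ids start at 1
                        new XElement("name", p.Name),
                        new XElement("age", p.Age)))));
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "Tom", Age = 18 },
            new Person { Name = "John", Age = 19 }
        };
        // ToString() prints the element tree (without the XML declaration).
        Console.WriteLine(Build(people));
    }
}
```

        The nesting of the constructor calls mirrors the nesting of the XML, which is the main readability gain
        over creating and appending each node separately.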
        
        Q: What is the shortcut key to view method overloads when entering code?
        Answer: After typing the method name and the opening parenthesis, press Ctrl + Shift + Space. This brings up the
        parameter hint with the method's list of overloads. Use the up and down arrow keys to step through the overloads, then
        continue typing the arguments.
            In existing code, place the caret inside the parentheses after the method name and press the same shortcut.
        
        
    4. Read XML (recursively load into TreeView)
        
            The following reads XML with XDocument. A Baidu RSS feed is used because it is easy to practice with online. First,
        WebClient downloads the XML string synchronously, and then the elements are read recursively.
            

        private void button1_Click(object sender, EventArgs e)
        {
            WebClient wc = new WebClient();
            wc.Encoding = Encoding.UTF8;
            string s = wc.DownloadString(@"http://news.baidu.com/n?cmd=1&class=civilnews&tn=rss&sub=0");

            XDocument xDoc = XDocument.Parse(s);
            XElement xRoot = xDoc.Root;
            TreeNode tdRoot = treeView1.Nodes.Add(xRoot.Name.ToString());

            LoadTreeViewByXDoc(xRoot, tdRoot.Nodes);
            treeView1.ExpandAll();
            tdRoot.EnsureVisible();
            treeView1.SelectedNode = tdRoot.Nodes[0].Nodes[3];
            treeView1.Focus();
        }

        // Adds every child element of xRoot to t, recursing into elements that have children of their own.
        private void LoadTreeViewByXDoc(XElement xRoot, TreeNodeCollection t)
        {
            foreach (XElement item in xRoot.Elements())
            {
                TreeNode tt = t.Add(item.Name.ToString());
                if (item.HasElements)
                {
                    // Not a leaf: descend one level.
                    LoadTreeViewByXDoc(item, tt.Nodes);
                }
                else if (item.Value != "")
                {
                    // Leaf element: show its text as a child node. The recursion ends here.
                    tt.Nodes.Add(item.Value);
                }
            }
        }


        
        
        HttpClient can also be used to download the XML shown above:

        HttpClient client = new HttpClient();
        HttpResponseMessage resp = await client.GetAsync(@"http://news.baidu.com/n?cmd=1&class=civilnews&tn=rss&sub=0");
        resp.EnsureSuccessStatusCode();
        string s = await resp.Content.ReadAsStringAsync();


            async marks a method as asynchronous: it can wait for a slow operation without blocking its caller.
        When execution reaches an await on an operation that has not finished yet, the method is suspended and control returns to the
        caller. Once the awaited operation completes, the method resumes from the statement that follows the await.
            In the code above, GetAsync sends the HTTP request asynchronously and yields the response, resp.
        EnsureSuccessStatusCode then throws an exception if the status code does not indicate success. If the response is good,
        ReadAsStringAsync reads the XML from the server asynchronously and returns it as a string.
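
        The suspend-and-resume behavior can be observed without a network. The following is a small sketch in which
        Task.Delay stands in for the HTTP request; AsyncDemo, FetchAsync and GetLengthAsync are hypothetical names
        used only for this illustration.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical demo class, for illustration only.
public static class AsyncDemo
{
    // Stand-in for a network call: completes after a short delay.
    public static async Task<string> FetchAsync()
    {
        await Task.Delay(100);          // the method is suspended here
        return "<rss/>";                // resumes here once the delay has elapsed
    }

    public static async Task<int> GetLengthAsync()
    {
        Task<string> pending = FetchAsync();       // returns at FetchAsync's first await
        Console.WriteLine("request started");      // runs while the "request" is in flight
        string xml = await pending;                // suspends until the result is ready
        Console.WriteLine("response received");
        return xml.Length;
    }

    public static void Main()
    {
        // A console Main blocks on the result; UI event handlers would use await instead.
        int length = GetLengthAsync().GetAwaiter().GetResult();
        Console.WriteLine(length);
    }
}
```

        In a WinForms handler such as button1_Click, the handler itself would be declared async and use await, so the
        UI thread stays responsive while the download is in progress.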
            When you get a response from a web server, the server usually includes a character set (charset) in the HTTP response header that
        specifies the encoding of the response content. The ReadAsStringAsync() method decodes the response content based on this character set
        information.
            However, problems can arise. For example, if the server does not provide character set information,
        or the character set it declares does not match the encoding actually used, ReadAsStringAsync() may
        not decode the response correctly. In that case you may need to specify the encoding manually. For example:

        resp.EnsureSuccessStatusCode();
        byte[] bytes = await resp.Content.ReadAsByteArrayAsync();
        string s = Encoding.UTF8.GetString(bytes);
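
        Hard-coding UTF-8 works for this feed, but a slightly more general approach is to try the declared charset first and
        fall back to UTF-8. The following is a sketch under that assumption; ResponseDecoder and Decode are hypothetical
        names, and the charset string would typically come from resp.Content.Headers.ContentType?.CharSet.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class ResponseDecoder
{
    // Decode the body with the declared charset, or with UTF-8 when the
    // charset is missing or not recognized on this platform.
    public static string Decode(byte[] body, string charset)
    {
        Encoding encoding = Encoding.UTF8;
        if (!string.IsNullOrWhiteSpace(charset))
        {
            try
            {
                // Some servers send the value in quotes, e.g. charset="utf-8".
                encoding = Encoding.GetEncoding(charset.Trim().Trim('"'));
            }
            catch (ArgumentException)
            {
                // Unknown or unsupported name: keep the UTF-8 fallback.
            }
        }
        return encoding.GetString(body);
    }

    public static void Main()
    {
        byte[] body = Encoding.UTF8.GetBytes("<title>新闻</title>");
        Console.WriteLine(Decode(body, null));
        Console.WriteLine(Decode(body, "no-such-charset"));
    }
}
```

        This only helps when the header is missing or unusable. If the server declares a valid but wrong charset, the
        encoding still has to be chosen by hand as shown above.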


        
        The remaining lines of button1_Click expand the TreeView, scroll it so that the root node is visible, select a node, and give the
        TreeView focus so that the selected node is highlighted (white text on a blue background).
        
        
        The following uses XmlDocument for the same recursive loading:

        private void button1_Click(object sender, EventArgs e)
        {
            WebClient wc = new WebClient();
            wc.Encoding = Encoding.UTF8;
            string s = wc.DownloadString(@"http://news.baidu.com/n?cmd=1&class=civilnews&tn=rss&sub=0");

            XmlDocument xmlDoc = new XmlDocument();
            xmlDoc.LoadXml(s);
            XmlElement xmlRoot = xmlDoc.DocumentElement;
            TreeNode nodeRoot = treeView1.Nodes.Add(xmlRoot.Name);
            LoadToTreeViewByXmlDoc(xmlRoot, nodeRoot.Nodes);

            treeView1.ExpandAll();
            nodeRoot.EnsureVisible();
            treeView1.SelectedNode = nodeRoot.Nodes[0].Nodes[3];
            treeView1.Focus();
        }

        private void LoadToTreeViewByXmlDoc(XmlElement xmlRoot, TreeNodeCollection nodes)
        {
            foreach (XmlNode item in xmlRoot.ChildNodes)
            {
                if (item.NodeType == XmlNodeType.Element)
                {
                    TreeNode node = nodes.Add(item.Name);
                    LoadToTreeViewByXmlDoc((XmlElement)item, node.Nodes); // cast the node to an element
                }
                else if (item.NodeType == XmlNodeType.Text || item.NodeType == XmlNodeType.CDATA)
                {
                    // Text or CDATA: add the text value. Such a node has no children, so the recursion ends here.
                    nodes.Add(item.Value);
                }
            }
        }


            Note that XmlDocument has no direct counterpart to XElement.Elements(). Children can only be enumerated through ChildNodes
        (direct child nodes only, not grandchildren or deeper descendants), and the type of each node has to be checked with NodeType.
            For example, in <name>Tom</name> the element node <name> has one child node of type Text. The closing tag is reported as
        a separate EndElement node only when reading with XmlReader; it is not a node in the XmlDocument tree. A <![CDATA[...]]> section
        has the CDATA type. Therefore, make good use of NodeType in the recursive checks.
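
        How <name>Tom</name> is represented can be checked directly. The following sketch lists the node types that a
        streaming XmlReader reports and the child node types stored in an XmlDocument tree; ReaderVersusDom and its two
        methods are hypothetical names introduced for this comparison.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class ReaderVersusDom
{
    // Node types reported by a forward-only XmlReader, in reading order.
    public static List<XmlNodeType> ReaderNodeTypes(string xml)
    {
        var types = new List<XmlNodeType>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                types.Add(reader.NodeType);
            }
        }
        return types;
    }

    // Node types of the root element's direct children in the DOM tree.
    public static List<XmlNodeType> DomChildNodeTypes(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var types = new List<XmlNodeType>();
        foreach (XmlNode child in doc.DocumentElement.ChildNodes)
        {
            types.Add(child.NodeType);
        }
        return types;
    }

    public static void Main()
    {
        const string xml = "<name>Tom</name>";
        Console.WriteLine("XmlReader:   " + string.Join(", ", ReaderNodeTypes(xml)));
        Console.WriteLine("XmlDocument: " + string.Join(", ", DomChildNodeTypes(xml)));
    }
}
```

        The reader reports Element, Text and EndElement in turn, while the DOM stores a single Text child under the
        <name> element, so recursive DOM code never has to handle EndElement.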
        
        
        NodeType (node type)
            NodeType is an enumeration value that identifies the category of a node in an XML document.
        In XmlDocument, node types are represented by the XmlNodeType enumeration. Common node types:
            1. XmlNodeType.Element: an element node, representing an XML tag and its contents.
                    For example: <book>, <name>, <title>.
            2. XmlNodeType.Attribute: an attribute node, representing an attribute of an element.
                    For example: id="123", name="John".
            3. XmlNodeType.Text: a text node, representing text content within an element.
                    For example: Tom, Hello, World!.
            4. XmlNodeType.CDATA: a CDATA node, representing text that may contain special characters (such as < and &)
                    without escaping. For example: <![CDATA[This is a CDATA section.]]>.
            5. XmlNodeType.Comment: a comment node, representing a comment in the XML document.
                    For example: <!-- This is a comment -->.
            6. XmlNodeType.ProcessingInstruction: a processing instruction node.
                    For example: <?xml-stylesheet type="text/xsl" href="style.xsl"?>. The XML declaration
                    <?xml version="1.0" encoding="UTF-8"?> looks similar but has its own type, XmlNodeType.XmlDeclaration.
            7. XmlNodeType.DocumentType: the document type node, representing the document type declaration.
                    For example: <!DOCTYPE html>.
            8. XmlNodeType.EndElement: the end tag of an element, reported by XmlReader when it reaches a closing tag.
                    For example: </book>, </name>, </title>.
            Note:
                The type of a node is available through the XmlNode.NodeType property, which returns an XmlNodeType
                        enumeration value.
                When processing nodes, it is usually necessary to act according to the node type, such as reading
                        the attribute values of element nodes or the text content of text nodes.
                The checks on node types should follow the structure and semantics of the XML document.
                        Different types of nodes may require different processing logic.
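
        A typical way to apply these types is a switch on NodeType. The following is a minimal sketch; NodeTypeSwitch,
        Describe and DescribeChildren are hypothetical names, and the sample XML is made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class NodeTypeSwitch
{
    // Produce a short description of a node based on its type.
    public static string Describe(XmlNode node)
    {
        switch (node.NodeType)
        {
            case XmlNodeType.Element:
                return "element <" + node.Name + ">";
            case XmlNodeType.Text:
            case XmlNodeType.CDATA:
                return "text \"" + node.Value + "\"";    // Value holds the character data
            case XmlNodeType.Comment:
                return "comment";
            default:
                return node.NodeType.ToString();
        }
    }

    // Describe the direct children of the root element.
    public static List<string> DescribeChildren(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var lines = new List<string>();
        foreach (XmlNode child in doc.DocumentElement.ChildNodes)
        {
            lines.Add(Describe(child));
        }
        return lines;
    }

    public static void Main()
    {
        const string xml = "<book id=\"1\"><!-- note --><name>Tom</name><![CDATA[a < b]]></book>";
        foreach (string line in DescribeChildren(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

        Note that the id attribute does not show up among the children: attributes are reached through the
        Attributes collection of the element, not through ChildNodes.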
        
        
    5. Access a node
        
            With XmlDocument in C#, there are several common ways to access a node:
        (1) Access by path:
            SelectSingleNode():
                Takes an XPath expression and returns the first node that matches it.
            SelectNodes():
                Takes an XPath expression and returns the set of all nodes that match it.
            Note:
            XPath expressions have to be written to fit the structure of the XML document.
            If the node you want is unique, use SelectSingleNode() to get that single node.
            If several nodes may match, use SelectNodes() to get the node collection.
        (2) Access by name or ID:
            GetElementsByTagName():
                Returns the set of all elements with the given tag name.
            GetElementById():
                Returns the element whose ID attribute has the given value (the attribute must
                be declared as an ID).
            Note:
            GetElementsByTagName() returns an XmlNodeList;
                each node is reached by traversing the collection.
            GetElementById() returns a single element, which can be used directly.
            

        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.Load("example.xml");
        // Access nodes with XPath (by path)
        XmlNode node1 = xmlDoc.SelectSingleNode("//root/element/name");  // single node
        XmlNodeList nodeList1 = xmlDoc.SelectNodes("//root/element");  // node collection
        // Access nodes by tag name or by ID
        XmlNodeList nodeList2 = xmlDoc.GetElementsByTagName("name");  // node collection
        XmlNode elementNode = xmlDoc.GetElementById("I8");  // single node (null unless id is declared as an ID)


        
        Notes on accessing nodes:
            Make sure the XML document has been loaded into the XmlDocument object, for example with the Load() method.
            Write the XPath expression carefully, or specify the correct node name, so that the target node is matched accurately.
            SelectSingleNode() and GetElementById() return a single node that can be used directly;
                SelectNodes() and GetElementsByTagName() return an XmlNodeList that has to be traversed to reach each node.
            When operating on nodes, also check the node type and whether an attribute exists, and choose the
                appropriate members accordingly.
            In short, when using XPath expressions or node names to access nodes, make sure you understand the structure of the
                XML document and the relationships between its nodes, and choose the access method that fits your needs.

        Next, use XDocument to access nodes:

        WebClient wc = new WebClient();
        wc.Encoding = Encoding.UTF8;
        string s = wc.DownloadString(@"https://sspai.com/feed");

        XDocument xDoc = XDocument.Parse(s);
        XElement xRoot = xDoc.Root;
        IEnumerable<XElement> xe = xRoot.Element("channel").Elements("item");
        foreach (XElement item in xe)
        {
            Console.WriteLine(item.Name.ToString());
        }
        // Keep only the items whose pubDate contains "Fri"; the string cast yields null when pubDate is missing.
        xe = xRoot.Descendants("item").Where(x => ((string)x.Element("pubDate") ?? "").Contains("Fri"));
        foreach (XElement item in xe)
        {
            Console.WriteLine(item.Element("pubDate").Value);
        }


            The Descendants() method returns all descendant elements of the current node: its children, grandchildren, and so on,
        found by walking the tree recursively in document order. The current node itself is not included
        (use DescendantsAndSelf() for that). Passing a name, as in Descendants("item"), keeps only the descendants with that name.
            Elements() returns only the direct child elements of the current node; grandchildren and deeper levels are excluded.
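
        The difference is easiest to see by counting. The following sketch uses a made-up document in which <a> appears
        at several depths; AxisDemo and its Xml constant are hypothetical names for this illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical demo class, for illustration only.
public static class AxisDemo
{
    // <a> occurs twice as a direct child of <root> and once nested inside <b>.
    public const string Xml = "<root><a><b><a/></b></a><a/></root>";

    public static void Main()
    {
        XElement root = XElement.Parse(Xml);

        // Direct children named "a" only.
        Console.WriteLine(root.Elements("a").Count());

        // Every "a" at any depth below root.
        Console.WriteLine(root.Descendants("a").Count());

        // All elements below root, regardless of name.
        Console.WriteLine(root.Descendants().Count());

        // The same, plus root itself.
        Console.WriteLine(root.DescendantsAndSelf().Count());
    }
}
```

        For this document the four counts are 2, 3, 4 and 5: Elements("a") misses the nested <a>, and only
        DescendantsAndSelf() counts <root>.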

        XDocument xdoc = XDocument.Load("example.xml");
        IEnumerable<XElement> descendants = xdoc.Descendants("root")
                                                 .Elements("element")
                                                 .Where(e => e.Attribute("id")?.Value == "I8");
        foreach (XElement descendant in descendants)
        {
            // Process each element that matches the condition
            Console.WriteLine(descendant.Name);
        }    


        
    
    6. GetElementById(): special notes
        
            The GetElementById() method returns the element whose ID attribute has the given value.
            In XML, no attribute is an ID by default; an attribute has to be declared as an ID attribute. In HTML, the id attribute
        conventionally plays this role; in other XML documents,
        a specific attribute can be defined as an ID attribute through a DTD or an XML Schema.
            To find out whether an attribute is an ID attribute, check the DTD (Document Type
        Definition) or XML Schema of the XML document.
            To try this out, right-click the project, choose Add -> New Item, type xml in the filter in the upper right corner, select "XML File",
        name it XMLFile.xml and save it. Then copy the file to bin\Debug, the directory the program runs from. The code below loads
        example.xml, so either use that file name or adjust the code. Double-click the file to edit it:

        <root>
            <element id="I01">
                <name>Tom</name>
            </element>
            <element id="I02">
                <name>John</name>
            </element>
        </root>


        You can use the GetElementById() method to get the element node with the corresponding ID:

        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.Load("example.xml");
        XmlElement element = xmlDoc.GetElementById("I01");

        if (element != null)
        {
            // The element was found through its ID attribute
            // and can be read or modified here
            string name = element.SelectSingleNode("name").InnerText;
            Console.WriteLine("Name: " + name);
        }


            In fact, the code above prints nothing: GetElementById() returns null, because the id attribute has not been declared as
        an ID and is therefore just an ordinary attribute. To declare an attribute as an ID attribute in an XML document,
        use a DTD (Document Type Definition) or an XML Schema.
        (1) Use a DTD to declare the ID attribute:
            In a DTD, give the attribute the ID type in its <!ATTLIST> declaration (IDREF, by contrast, declares a reference to an ID). Here is an example:

        <?xml version="1.0" encoding="utf-8"?>
        <!DOCTYPE root [
            <!ELEMENT root (element*)>
            <!ELEMENT element (name)>
            <!ATTLIST element id ID #REQUIRED>
            <!ELEMENT name (#PCDATA)>
        ]>
        <root>
            <element id="I01">
                <name>Tom</name>
            </element>
            <element id="I02">
                <name>John</name>
            </element>
        </root>


            Above, by using the ID type in <!ATTLIST>, the id attribute of the element element is declared as the ID attribute.
        (2) Use XML Schema to declare ID attributes:
            In XML Schema, you can use the xsd:ID type to declare attributes. Here is an example:

        <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
          <xsd:element name="root">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name="element" maxOccurs="unbounded">
                  <xsd:complexType>
                    <xsd:sequence>
                      <xsd:element name="name" type="xsd:string"/>
                    </xsd:sequence>
                    <xsd:attribute name="id" type="xsd:ID" use="required"/>
                  </xsd:complexType>
                </xsd:element>
              </xsd:sequence>
            </xsd:complexType>
          </xsd:element>
        </xsd:schema>


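            The above declares the id attribute of the element element as an ID attribute by setting its type to xsd:ID.
            In short, the DTD or XML Schema tells the parser which attribute the "Id" in GetElementById refers to. Note that, according to
        Microsoft's documentation for XmlDocument.GetElementById, only IDs declared in a DTD are recognized by that method.
    
        Note: Once the DTD or XML Schema above is in place, an ID value must be a valid XML name: it cannot start with a digit and
            must begin with a letter or an underscore. For example, id="01" is flagged as invalid.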
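
        The effect of the DTD declaration can be checked with two in-memory documents, one with the <!ATTLIST> declaration
        and one without. This is a sketch only; GetByIdDemo, FindName and the two constants are hypothetical names, and the
        XML mirrors the example in this section.

```csharp
using System;
using System.Xml;

// Hypothetical demo class, for illustration only.
public static class GetByIdDemo
{
    // Internal DTD subset that declares id as an ID attribute.
    public const string WithDtd =
        "<!DOCTYPE root [" +
        "<!ELEMENT root (element*)>" +
        "<!ELEMENT element (name)>" +
        "<!ATTLIST element id ID #REQUIRED>" +
        "<!ELEMENT name (#PCDATA)>" +
        "]>" +
        "<root>" +
        "<element id=\"I01\"><name>Tom</name></element>" +
        "<element id=\"I02\"><name>John</name></element>" +
        "</root>";

    // Same kind of data, but id is an ordinary attribute here.
    public const string WithoutDtd =
        "<root><element id=\"I01\"><name>Tom</name></element></root>";

    // Returns the <name> text of the element with the given ID, or null if none is found.
    public static string FindName(string xml, string id)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlElement element = doc.GetElementById(id);
        return element == null ? null : element.SelectSingleNode("name").InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(FindName(WithDtd, "I02") ?? "(not found)");
        Console.WriteLine(FindName(WithoutDtd, "I01") ?? "(not found)");
    }
}
```

        When no DTD is available, an XPath query such as SelectSingleNode("//element[@id='I01']") is a common
        alternative, because it matches on the attribute value without needing an ID declaration.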
    
    
    7. Exercise: Read the following xml file

        <?xml version="1.0" encoding="UTF-8"?>
        <Order>
          <CustomerName>杨中科</CustomerName>
          <OrderNumber>BJ200888</OrderNumber>
          <Items>
            <OrderItem Name="电脑" Count="30"/>
            <OrderItem Name="电视" Count="2"/>
            <OrderItem Name="水杯" Count="20"/>
          </Items>
        </Order>


        
        Use XmlDocument first. Note that this method is not recommended!

        string p = @"D:\OneDrive\附件\C#加强\Orders.xml";
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.Load(p);
        XmlElement xmlRoot = xmlDoc.DocumentElement;
        XmlElement xmlCus = (XmlElement)xmlRoot.SelectSingleNode(@"//Order/CustomerName");
        Console.WriteLine($"Customer name: {xmlCus.InnerText}");
        XmlElement xmlOrd = (XmlElement)xmlRoot.SelectSingleNode(@"//Order/OrderNumber");
        Console.WriteLine($"Order number: {xmlOrd.InnerText}");
        XmlNodeList xmlItems = xmlRoot.GetElementsByTagName("OrderItem");
        foreach (XmlNode item in xmlItems)
        {
            XmlElement ele = (XmlElement)item;
            Console.WriteLine(ele.GetAttribute("Name") + ":" + ele.GetAttribute("Count"));
        }


        Reading with XDocument is recommended.
            The XDocument class provides a more intuitive API and usually better performance, especially when working with large
        XML files. It also supports LINQ query syntax for easy filtering and processing of XML data. XDocument is therefore the
        recommended way to read XML files.

        string p = @"D:\OneDrive\附件\C#加强\Orders.xml";
        XDocument doc = XDocument.Load(p);
        XElement root = doc.Root;
        Console.WriteLine($"Customer name: {root.Element("CustomerName").Value}");
        Console.WriteLine($"Order number: {root.Element("OrderNumber").Value}");
        foreach (XElement ele in root.Descendants("OrderItem"))
        {
            Console.WriteLine($"Name: {ele.Attribute("Name").Value}, Count: {ele.Attribute("Count").Value}");
        }


    
    
        Use XDocument to read the following XML:

        <?xml version="1.0" encoding="utf-8" ?>
        <CFX>
            <MSG>
                <交易码 val="1000"/>
                <流水号 val="100000000000000001"/>
                <金额 val="1234567890.12"/>
                <付款机构 val="腾讯销售部"/>
                <付款单位账号 val="12345678901234567890"/>
                <收款机构 val="新浪财务部"/>
                <收款单位账号 val="12345678901234567890"/>
            </MSG>
            <MSG>
                <交易码 val="1000"/>
                <流水号 val="100000000000000002"/>
                <金额 val="1234567890.12"/>
                <付款机构 val="1234"/>
                <付款单位账号 val="12345678901234567890"/>
                <收款机构 val="1234"/>
                <收款单位账号 val="12345678901234567890"/>
            </MSG>
            <MSG>
                <交易码 val="1000"/>
                <流水号 val="100000000000000003"/>
                <金额 val="1234567890.12"/>
                <付款机构 val="1234"/>
                <付款单位账号 val="12345678901234567890"/>
                <收款机构 val="1234"/>
                <收款单位账号 val="12345678901234567890"/>
            </MSG>
        </CFX>    


        Read one by one:

        string p = @"D:\OneDrive\附件\C#加强\ytbank.xml";
        XDocument doc = XDocument.Load(p);

        XElement root = doc.Root;

        foreach (XElement ele in root.Descendants("MSG"))
        {
            foreach (XElement ele2 in ele.Elements())
            {
                Console.WriteLine($"{ele2.Name}:{ele2.Attribute("val").Value}");
            }
        }


    
    
    8. Exercise: add personnel information to a list box, double-click an entry to load it for editing, and save everything to XML on exit.
        


        The core idea is to use a List<Person> to store, modify, and save the data.
        (1) Create a class that holds one person's information:

        public class Person
        {
            public Person()
            {
            }

            public Person(string name, string age, string email, string id)
            {
                Name = name; Age = age; Email = email; ID = id;
            }

            public string Name { get; set; }
            public string Age { get; set; }
            public string Email { get; set; }
            public string ID { get; set; }

            public override string ToString()
            {
                return Name + "," + Age + "," + Email + "," + ID;
            }
        }


        (2) Main interface code:

        public XDocument xDoc;
        public XElement root;
        public bool blEdit = false;
        public List<Person> lists = new List<Person>();

        private void button1_Click(object sender, EventArgs e)
        {
            if (txtName.Text == "")
            {
                MessageBox.Show("Nothing entered. Please try again.");
                return;
            }

            Person p = new Person(txtName.Text, txtAge.Text, txtEmail.Text, txtID.Text);
            if (blEdit == true) // editing an existing entry
            {
                lists[listBox1.SelectedIndex] = p;
                listBox1.Items[listBox1.SelectedIndex] = p;

                blEdit = false; // return to the default "add" mode
            }
            else // adding a new entry
            {
                lists.Add(p);
                listBox1.Items.Add(p.ToString());
            }
            ClearTexts();
        }

        private void button2_Click(object sender, EventArgs e)
        {
            this.Close();
        }

        private void listBox1_MouseDoubleClick(object sender, MouseEventArgs e)
        {
            if (listBox1.SelectedIndex != -1)
            {
                blEdit = true; // a double-click switches to edit mode
                txtName.Text = lists[listBox1.SelectedIndex].Name;
                txtAge.Text = lists[listBox1.SelectedIndex].Age;
                txtEmail.Text = lists[listBox1.SelectedIndex].Email;
                txtID.Text = lists[listBox1.SelectedIndex].ID;
            }
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            if (!File.Exists("class.xml")) return; // no XML file yet: nothing to load
            xDoc = XDocument.Load("class.xml");
            XElement root = xDoc.Element("root");
            foreach (XElement stu in root.Elements("stu"))
            {
                Person p = new Person();
                p.Name = stu.Element("name").Value;
                p.Age = stu.Element("age").Value;
                p.Email = stu.Element("email").Value;
                p.ID = stu.Element("id").Value;
                lists.Add(p);
                listBox1.Items.Add(p);
            }
        }

        private void ClearTexts()
        {
            foreach (TextBox tb in this.Controls.OfType<TextBox>())
            {
                tb.Text = string.Empty;
            }
        }

        private void SaveXml()
        {
            if (lists.Count == 0) return; // nothing to save
            xDoc = new XDocument();
            root = new XElement("root");
            xDoc.Add(root);
            for (int i = 0; i < lists.Count; i++)
            {
                Person p = lists[i];
                XElement stu = new XElement("stu");
                stu.Add(new XElement("name", p.Name));
                stu.Add(new XElement("age", p.Age));
                stu.Add(new XElement("email", p.Email));
                stu.Add(new XElement("id", p.ID));
                root.Add(stu);
            }
            xDoc.Save("class.xml");
        }

        private void Form1_FormClosing(object sender, FormClosingEventArgs e)
        {
            SaveXml();
        }
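
        The load and save logic can also be kept apart from the form, which makes it possible to check it without
        opening a window. The following is a sketch of that idea, not the exercise's required structure; PersonStore, ToXml
        and FromXml are hypothetical names, while the element names (root, stu, name, age, email, id) match the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Person
{
    public string Name { get; set; }
    public string Age { get; set; }
    public string Email { get; set; }
    public string ID { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class PersonStore
{
    // Build the same <root><stu>...</stu></root> layout that SaveXml writes.
    public static XDocument ToXml(IEnumerable<Person> people)
    {
        return new XDocument(
            new XElement("root",
                people.Select(p => new XElement("stu",
                    new XElement("name", p.Name),
                    new XElement("age", p.Age),
                    new XElement("email", p.Email),
                    new XElement("id", p.ID)))));
    }

    // Read the list back; the string casts yield null for missing elements.
    public static List<Person> FromXml(XDocument doc)
    {
        return doc.Root.Elements("stu")
            .Select(stu => new Person
            {
                Name = (string)stu.Element("name"),
                Age = (string)stu.Element("age"),
                Email = (string)stu.Element("email"),
                ID = (string)stu.Element("id")
            })
            .ToList();
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "Tom", Age = "18", Email = "tom@example.com", ID = "1" }
        };
        XDocument doc = ToXml(people);
        Console.WriteLine(doc);
        Console.WriteLine(FromXml(doc).Count);
    }
}
```

        With helpers like these, Form1_Load would reduce to XDocument.Load followed by FromXml, and SaveXml to
        ToXml followed by Save.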


        
 

Origin blog.csdn.net/dzweather/article/details/132135655