One, understanding DOMParser
DOMParser parses a string of HTML or XML source into a DOM tree; the result is either an HTML document or an XML document.
Syntax
var domParser = new DOMParser ();
Here domParser is a DOMParser object.
The DOMParser object has a method called parseFromString, with the following syntax:
var doc = domParser.parseFromString(string, mimeType);
This method returns an HTML document or an XML document depending on the value of the mimeType parameter.
The parameters are as follows:
string
Required. String. The DOM string (DOMString) to be parsed. It must contain HTML, XML, XHTML or SVG markup; otherwise parsing may fail.
mimeType
Required. String. Indicates the type of document to be parsed, and supports the following parameter values:
mimeType value | Return document type |
---|---|
text/html | Document |
text/xml | XMLDocument |
application/xml | XMLDocument |
application/xhtml+xml | XMLDocument |
image/svg+xml | XMLDocument |
With the Document type, <html> and <body> tags are added automatically. With the XMLDocument type, no <html>, <body> or similar tags are added, and in my tests ordinary HTML markup often produced a parsererror when parsed as XML.
For example:
var domParser = new DOMParser();
console.dir(domParser.parseFromString('<p>content</p>', 'text/html'));
The DOM tree of the returned document contains <html> and <body> tags that were not in the string argument.
However, if we set mimeType to text/xml for the same string, a different document tree is returned:
var domParser = new DOMParser();
console.dir(domParser.parseFromString('<p>content</p>', 'text/xml'));
This time the <p> element itself is the root of the returned document, with no <html> or <body> wrapper.
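The difference between the two modes, and the parse errors mentioned earlier, can also be checked in code. The following is a minimal sketch rather than a definitive implementation: `parseXmlStrict` is a hypothetical helper name, and it assumes that a failed XML parse is reported as a `<parsererror>` element inside the returned document instead of a thrown exception, which is how current browsers behave.

```javascript
// Hypothetical helper: parse an XML string and throw if the parser
// reports a problem. parseFromString() does not throw on malformed XML;
// it returns a document that contains a <parsererror> element instead.
function parseXmlStrict(source, parser) {
  var doc = parser.parseFromString(source, 'text/xml');
  var errors = doc.getElementsByTagName('parsererror');
  if (errors.length > 0) {
    throw new Error('XML parse error: ' + errors[0].textContent);
  }
  return doc;
}

// Browser-only demo; the guard keeps the snippet loadable elsewhere.
if (typeof DOMParser !== 'undefined') {
  var parser = new DOMParser();

  // text/html: the parser wraps the fragment in <html> and <body>.
  var htmlDoc = parser.parseFromString('<p>content</p>', 'text/html');
  console.log(htmlDoc.documentElement.nodeName); // "HTML"
  console.log(htmlDoc.body.innerHTML);           // "<p>content</p>"

  // text/xml: the <p> element itself is the document root.
  var xmlDoc = parseXmlStrict('<p>content</p>', parser);
  console.log(xmlDoc.documentElement.nodeName);  // "p"

  // An unclosed <br> is fine in HTML but is not well-formed XML.
  try {
    parseXmlStrict('<p>line one<br>line two</p>', parser);
  } catch (e) {
    console.log(e.message); // browser-specific error description
  }
}
```

The parser is passed in as an argument so that one DOMParser instance can be reused across calls.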
Compatibility
DOMParser is supported in IE9+; parsing with the text/html type requires IE10+.
Two, understanding XMLSerializer
XMLSerializer does the opposite of DOMParser: it serializes a DOM tree into a string.
Syntax
var xmlSerializer = new XMLSerializer();
Here xmlSerializer is an XMLSerializer object.
The XMLSerializer object has a method called serializeToString(), which returns the serialized XML string.
The syntax is as follows:
var xmlString = xmlSerializer.serializeToString(rootNode);
The return value is of type DOMString.
Parameters
rootNode
Required. The root node of the DOM tree to be converted to a string.
For example:
var xmlSerializer = new XMLSerializer();
console.log(xmlSerializer.serializeToString(document.querySelector('.link')));
Running this code in the console on this article's page prints the serialized markup of the .link element.
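Because the two APIs are inverses of each other, they can be combined into a round trip: parse a string, change the tree, and serialize it back. The sketch below illustrates this under stated assumptions. `setRootAttribute` and the sample `<note>` markup are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: parse XML, set an attribute on the root element,
// and serialize the modified root element back to a string.
function setRootAttribute(xmlSource, name, value, parser, serializer) {
  var doc = parser.parseFromString(xmlSource, 'text/xml');
  doc.documentElement.setAttribute(name, value);
  return serializer.serializeToString(doc.documentElement);
}

// Browser-only demo; the guard keeps the snippet loadable elsewhere.
if (typeof DOMParser !== 'undefined' && typeof XMLSerializer !== 'undefined') {
  var result = setRootAttribute(
    '<note><to>Reader</to></note>', 'lang', 'en',
    new DOMParser(), new XMLSerializer()
  );
  console.log(result); // '<note lang="en"><to>Reader</to></note>'
}
```

The root element is serialized rather than the whole document, so the result does not depend on how a given browser serializes the document node itself.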
Differences from the outerHTML property
The serializeToString() method is somewhat similar to outerHTML, but there are differences. The two main ones are:
- outerHTML is only available on Element nodes, not on other node types such as text nodes or comment nodes. The serializeToString() method, by contrast, works on any node type, including:
Node type | Meaning |
---|---|
DocumentType | Document type |
Document | Document |
DocumentFragment | Document fragment |
Element | Element |
Comment | Comment node |
Text | Text node |
ProcessingInstruction | Processing instruction |
Attr | Attribute node |
- The serializeToString() method automatically adds the xmlns namespace to the root element. For example, compare the output of the following two statements:
var div = document.createElement('div');
// 1.
console.log(div.outerHTML);
// 2.
var xmlSerializer = new XMLSerializer();
console.log(xmlSerializer.serializeToString(div));
The output is as follows:
// 1.
<div></div>
// 2.
<div xmlns="http://www.w3.org/1999/xhtml"></div>
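Both differences can be seen side by side with a small comparison. This is a sketch only: `describeNode` is a hypothetical helper that returns what each API yields for a given node.

```javascript
// Hypothetical helper: collect what outerHTML and serializeToString()
// each produce for the same node.
function describeNode(node, serializer) {
  return {
    outerHTML: node.outerHTML, // undefined unless node is an Element
    serialized: serializer.serializeToString(node)
  };
}

// Browser-only demo; the guard keeps the snippet loadable elsewhere.
if (typeof document !== 'undefined' && typeof XMLSerializer !== 'undefined') {
  var serializer = new XMLSerializer();

  var comment = describeNode(document.createComment(' a note '), serializer);
  console.log(comment.outerHTML);  // undefined
  console.log(comment.serialized); // "<!-- a note -->"

  var text = describeNode(document.createTextNode('1 < 2'), serializer);
  console.log(text.outerHTML);     // undefined
  console.log(text.serialized);    // "1 &lt; 2"

  var div = describeNode(document.createElement('div'), serializer);
  console.log(div.outerHTML);      // "<div></div>"
  console.log(div.serialized);     // '<div xmlns="http://www.w3.org/1999/xhtml"></div>'
}
```

Note that the text node is serialized with XML escaping, so `<` comes back as `&lt;`.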
Compatibility
XMLSerializer is supported in IE9+.
Three, application examples
1. Remove line breaks and comments in HTML strings
First, here is the test HTML, placed in a custom script template:
<script id="tpl" type="text/template">
    <!-- This is note 1 -->
    <p>This is text. </p>
    <!-- This is note 2 -->
    <ol>
        <li>List</li>
        <li>List</li>
        <li>List</li>
    </ol>
</script>
To keep them easy to read and maintain, HTML templates include indentation and comments, but these are not needed when the template is actually rendered and should be removed. Besides regular-expression replacement on the string, you can also use the browser's native DOM APIs, such as DOMParser. The JavaScript code is as follows:
var htmlTpl = document.getElementById('tpl').innerHTML;
// Convert the string to a document
var domParser = new DOMParser();
var doc = domParser.parseFromString(htmlTpl, 'text/html');
// Use the native TreeWalker for traversal
var treeWalker = document.createTreeWalker(doc);
var arrNodeRemove = [];
// Collect comment nodes and whitespace-only text nodes
while (treeWalker.nextNode()) {
    var node = treeWalker.currentNode;
    if (node.nodeType == Node.COMMENT_NODE || (node.nodeType == Node.TEXT_NODE && node.nodeValue.trim() == '')) {
        arrNodeRemove.push(node);
    }
}
// Remove the collected nodes
arrNodeRemove.forEach(function (node) {
    node.parentNode.removeChild(node);
});
// Convert back to a string
console.log(doc.body.innerHTML);
// The output is:
// <p>This is text. </p><ol><li>List</li><li>List</li><li>List</li></ol>
Running this in Chrome shows that the comments and the whitespace between tags have been removed from the HTML string. The advantage of the DOMParser approach is that it is easier to understand and get started with than regular-expression replacement. Because it relies on the browser's built-in parser, it is also more tolerant of malformed HTML, whereas a regular expression may not cover every case. The disadvantage is that it takes more code.
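One way to offset the extra code is to wrap the steps in a reusable function. The following is a minimal sketch, not the original author's code: `isRemovable` and `stripCommentsAndBlanks` are hypothetical names, and the walker is restricted to comment and text nodes with `NodeFilter` flags so that elements are never visited.

```javascript
var COMMENT_NODE = 8; // same value as Node.COMMENT_NODE
var TEXT_NODE = 3;    // same value as Node.TEXT_NODE

// Hypothetical predicate: true for comments and whitespace-only text.
function isRemovable(node) {
  if (node.nodeType === COMMENT_NODE) {
    return true;
  }
  return node.nodeType === TEXT_NODE && node.nodeValue.trim() === '';
}

// Hypothetical helper: return `html` without comments and without
// whitespace-only text nodes. Must be called in a browser.
function stripCommentsAndBlanks(html) {
  var doc = new DOMParser().parseFromString(html, 'text/html');
  // The trailing null and false are optional in current browsers;
  // they are passed here for the benefit of older ones.
  var walker = doc.createTreeWalker(
    doc,
    NodeFilter.SHOW_COMMENT | NodeFilter.SHOW_TEXT,
    null,
    false
  );
  var toRemove = [];
  // Collect first and remove afterwards, so the walk is not disturbed.
  while (walker.nextNode()) {
    if (isRemovable(walker.currentNode)) {
      toRemove.push(walker.currentNode);
    }
  }
  toRemove.forEach(function (node) {
    node.parentNode.removeChild(node);
  });
  return doc.body.innerHTML;
}

// Browser-only demo; the guard keeps the snippet loadable elsewhere.
if (typeof DOMParser !== 'undefined') {
  var input = '<!-- note --> <p>Text</p>\n<ul> <li>A</li> </ul>';
  console.log(stripCommentsAndBlanks(input)); // "<p>Text</p><ul><li>A</li></ul>"
}
```

Keeping the node test in its own small function means the removal rule can be changed, for example to keep certain comments, without touching the traversal.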
Four, conclusion
While putting this article together, I realized how much I still don't know about the native DOM APIs. For example, NodeIterator and TreeWalker are native APIs supported in IE9+, and I hope to cover them when I get the chance. Few developers in the community pay attention to native DOM APIs like these, and we rarely need them in day-to-day work. However, if we want to become creators, for example authors of excellent frameworks, this kind of learning is essential.
Thanks for reading!