How to convert HTML strings into DOM objects: use DOMParser
1. Problem description
Sometimes we need to process HTML strings. For example, I need to extract the content and attributes of each <a> tag from the HTML string below.
<pre>
<a href="cc1245.jpg">cc1245.jpg</a>
<a href="image.jpg">image.jpg</a>
<a href="movie.mp4">movie.mp4</a>
</pre>
Matching with regular expressions is possible, but it is very troublesome. It is much more convenient to convert the string directly into a DOM object and then process it with the standard DOM manipulation APIs.
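To make the "troublesome" part concrete, here is a rough sketch of the regular-expression approach. The function name extractLinksWithRegex and the pattern are illustrative assumptions, not code from the original project. It handles the sample input, but only because every tag has exactly the same shape.

```javascript
// Sample input, mirroring the snippet in the problem description.
const html = `<pre>
<a href="cc1245.jpg">cc1245.jpg</a>
<a href="image.jpg">image.jpg</a>
<a href="movie.mp4">movie.mp4</a>
</pre>`;

// Matches only the exact shape <a href="...">text</a>:
// double quotes, href as the first and only attribute, no nested tags.
const anchorPattern = /<a\s+href="([^"]*)"\s*>([^<]*)<\/a>/g;

// Hypothetical helper: collects { href, name } pairs from every match.
function extractLinksWithRegex(source) {
  const links = [];
  for (const match of source.matchAll(anchorPattern)) {
    links.push({ href: match[1], name: match[2] });
  }
  return links;
}

console.log(extractLinksWithRegex(html)); // three { href, name } objects

// An equivalent tag with single quotes and an extra attribute is silently missed.
console.log(extractLinksWithRegex("<a class='x' href='image.jpg'>image.jpg</a>")); // → []
```

Every variation in quoting, attribute order, or nesting needs another tweak to the pattern, which is why handing the string to a real parser is usually less work.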
DOM operation: Element node
https://wangdoc.com/javascript/dom/element
Next, we convert the string above into a DOM object, extract the content of each <a> tag into a JSON object, and then do further processing.
2. String -> DOM
this.htmlContent = `the string above`
let doc = new DOMParser().parseFromString(this.htmlContent, "text/xml") // convert string -> DOM object ("text/xml" requires well-formed markup)
let linkDomList = doc.querySelector('pre').children // use the Element API to get the elements inside the <pre> tag
for (let i = 0; i < linkDomList.length; i++) {
    // iterate over the children and print each one
    console.log(linkDomList[i])
}
Don't worry about the href values; I have already processed them.
Result: [screenshot of the console output]
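One detail worth knowing about this step: with "text/xml", DOMParser does not throw on malformed input. Instead, the returned document contains a <parsererror> element. With "text/html" the parser is tolerant and always produces a document. The sketch below wraps that check in a helper. The name parseMarkup, and the choice to pass the parser in as an argument (so the helper can also be exercised outside a browser), are assumptions made for illustration.

```javascript
// Hypothetical helper: parses markup and throws if the parser reported an error.
// `parser` is expected to be a DOMParser instance.
function parseMarkup(source, mimeType, parser) {
  const doc = parser.parseFromString(source, mimeType);
  // For XML MIME types, a failed parse yields a document containing <parsererror>.
  const error = doc.querySelector('parsererror');
  if (error) {
    throw new Error('Could not parse markup: ' + error.textContent);
  }
  return doc;
}

// Browser-only usage; skipped where DOMParser is not defined.
if (typeof DOMParser !== 'undefined') {
  const source = '<pre><a href="image.jpg">image.jpg</a></pre>';
  const doc = parseMarkup(source, 'text/html', new DOMParser());
  console.log(doc.querySelector('pre').children.length);
}
```

If the input might contain typical HTML that is not well-formed XML, such as an unclosed <br> or an &nbsp; entity, "text/html" is probably the safer MIME type; "text/xml" works for the sample above because that fragment happens to be well-formed.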
3. DOM -> JSON
Extending the code above to extract the data into JSON objects is all that is left:
let doc = new DOMParser().parseFromString(this.htmlContent, "text/xml")
let linkDomList = doc.querySelector('pre').children
let linkArray = []
for (let i = 0; i < linkDomList.length; i++) {
    let currentDom = linkDomList[i]
    linkArray.push({
        name: currentDom.textContent, // text content inside the <a> tag
        href: currentDom.getAttribute('href'), // href attribute of the <a> tag
    })
}
console.log(linkArray)
Result: [screenshot of the console output]
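As a variation on the loop above, the same extraction can be written with Array.from and a mapping function. This is only a sketch: the helper name linksToObjects and the 'pre > a' selector are assumptions, not part of the original code. Strictly speaking, the array holds plain JavaScript objects, and JSON.stringify turns it into an actual JSON string.

```javascript
// Hypothetical helper: converts a collection of <a> elements into plain objects.
// It relies only on textContent and getAttribute, which every DOM Element provides.
function linksToObjects(anchors) {
  return Array.from(anchors, (anchor) => ({
    name: anchor.textContent,           // text content inside the <a> tag
    href: anchor.getAttribute('href'),  // href attribute of the <a> tag
  }));
}

// Browser-only usage; skipped where DOMParser is not defined.
if (typeof DOMParser !== 'undefined') {
  const source = '<pre><a href="image.jpg">image.jpg</a><a href="movie.mp4">movie.mp4</a></pre>';
  const doc = new DOMParser().parseFromString(source, 'text/xml');
  const linkArray = linksToObjects(doc.querySelectorAll('pre > a'));
  // Serialize the array into a JSON string, e.g. for sending to a server.
  console.log(JSON.stringify(linkArray, null, 2));
}
```

Selecting with 'pre > a' instead of reading .children picks up only the <a> elements, which helps if the <pre> block ever contains other child tags.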