Matching paired HTML tags with a regular expression

<(?<HtmlTag>[\w]+)[^>]*?>((?<Nested><\k<HtmlTag>[^>]*>)|</\k<HtmlTag>>(?<-Nested>)|.*?)*</\k<HtmlTag>>

 

Breakdown

1. <(?<HtmlTag>[\w]+)[^>]*?>

  (?<HtmlTag>[\w]+) stores the text matched by [\w]+ in a named group called HtmlTag. The name can be chosen freely, and the captured text can be referenced later with \k<HtmlTag>. For example, when the opening tag is <div>, HtmlTag holds div.

    A capturing group like this is generally used when the tag name is not known in advance. If only one tag is of interest, the name can be fixed, as in (?<HtmlTag>div).

    *? -> by default * is greedy: it matches 0 or more times and takes as many characters as possible. *? is lazy: it matches as few characters as possible.
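As a rough illustration of the named group and of greedy versus lazy matching, here is a minimal sketch in Python. It is an assumption of this example that Python's `re` module is used: Python spells the named group `(?P<HtmlTag>...)` where the pattern above uses `(?<HtmlTag>...)`, and the sample strings are invented for the demonstration.

```python
import re

# Python spells a named group (?P<name>...); the pattern in this article
# uses the (?<name>...) spelling instead.
OPEN_TAG = re.compile(r"<(?P<HtmlTag>[\w]+)[^>]*?>")

m = OPEN_TAG.match('<div class="box">text</div>')
tag_name = m.group("HtmlTag")   # the text captured by [\w]+
open_tag = m.group(0)           # the whole opening tag

# Greedy * versus lazy *? on the same (made-up) input.
text = "<b>one</b><b>two</b>"
greedy = re.search(r"<b>.*</b>", text).group(0)   # runs to the last </b>
lazy = re.search(r"<b>.*?</b>", text).group(0)    # stops at the first </b>

print(tag_name, open_tag, greedy, lazy)
```

The greedy form swallows both `<b>` elements, while the lazy form stops at the first closing tag.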

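2. ((?<Nested><\k<HtmlTag>[^>]*>)|</\k<HtmlTag>>(?<-Nested>)|.*?)*

  (?<Nested><\k<HtmlTag>[^>]*>)   |   </\k<HtmlTag>>(?<-Nested>)    |    .*?

    (?<Nested><\k<HtmlTag>[^>]*>) works like recursion: each time a nested opening tag <\k<HtmlTag>[^>]*> is matched, a capture is pushed onto the Nested group, so the nesting depth goes up by 1.

    </\k<HtmlTag>>(?<-Nested>) does the opposite: each time a closing tag </\k<HtmlTag>> is matched, (?<-Nested>) pops one capture from Nested, so the depth goes down by 1. This is the balancing group syntax of .NET regular expressions.

    .*? lazily matches any other characters. By default . matches any character except a newline, so it stays on a single line.

    | means "or": the three alternatives are tried in order.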
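The +1 / -1 bookkeeping can be sketched outside the regex engine. Python's `re` module has no balancing groups, so the following is not the pattern above but a hand-written depth counter that imitates it. The function name `match_balanced` and the sample input are hypothetical, and the added `\b` (so that `<div` does not also match a longer tag name) is a choice of this sketch, not part of the original pattern.

```python
import re

def match_balanced(html, start=0):
    """Return the text from the first opening tag to its balanced closing tag.

    Hypothetical helper: it imitates the Nested push/pop with a plain counter.
    Returns None when no opening tag or no balanced closing tag is found.
    """
    opening = re.compile(r"<(?P<HtmlTag>\w+)[^>]*>").search(html, start)
    if opening is None:
        return None
    name = re.escape(opening.group("HtmlTag"))
    # Either a nested opening tag of the same name, or its closing tag.
    token = re.compile(rf"<{name}\b[^>]*>|</{name}>")
    depth = 0
    for tok in token.finditer(html, opening.end()):
        if tok.group(0).startswith("</"):
            if depth == 0:
                # This closing tag balances the outermost opening tag.
                return html[opening.start():tok.end()]
            depth -= 1   # plays the role of (?<-Nested>)
        else:
            depth += 1   # plays the role of (?<Nested>...)
    return None

sample = '<div id="a"><div>inner</div>tail</div><div>next</div>'
balanced = match_balanced(sample)
unclosed = match_balanced("<p>unclosed")
print(balanced, unclosed)
```

A counter is enough here because only tags with the same name as the outer tag change the depth, which is the same simplification the pattern makes. Self-closing tags are not handled.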

3. </\k<HtmlTag>>

  This reuses the tag name captured earlier in HtmlTag to match the final closing tag. Tags generally come in pairs, such as <div><span>...test</span></div>
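To see what the backreference alone does, here is a small Python sketch that keeps steps 1 and 3 and replaces step 2 with a plain `.*?`. Python writes the backreference as `(?P=HtmlTag)` where the pattern above uses `\k<HtmlTag>`. The helper name `first_pair` and the inputs are made up for the example.

```python
import re

# Opening tag, lazy filler, then a closing tag with the same captured name.
PAIR = re.compile(r"<(?P<HtmlTag>\w+)[^>]*>.*?</(?P=HtmlTag)>")

def first_pair(html):
    """Return the first opening/closing pair with the same tag name, or None."""
    m = PAIR.search(html)
    return m.group(0) if m else None

paired = first_pair("<div><span>...test</span></div>")
mismatched = first_pair("<div>text</span>")
nested = first_pair("<div><div>a</div></div>")
print(paired, mismatched, nested)
```

The last case stops at the first `</div>` and leaves the outer tag unbalanced, which is the situation the Nested counter in step 2 is meant to handle.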

