Get HTML text that has no tags around it

El-Burritos :

I have this HTML:

<div id="uglyHtml">
    <br> <b>Lead</b>: <a href="#">John</a>
    <br> <b>Boss</b>: <a href="#">Bernard</a>
    <br> <b>Mascot</b>: Patrick
    <br> <b>Designer</b>: Jeanette
    <br> <b>Front</b>: <a href="#">Larry</a>
</div>

For example:

We can simply capture John, Bernard and Larry with: #uglyHtml > a
and Lead, Boss, Mascot, Designer and Front with: #uglyHtml > b

Now I need to capture Patrick and Jeanette, who have no tags around them. For this I can only use CSS and/or regex.

Is there a way to do this?

epascarello :

I would never use a regular expression to match the text, but it seems that is what your tool wants. Something like this would match the role and the person. It will break very easily if the markup changes.

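var html = document.querySelector("#uglyHtml").innerHTML

// Group 1: the role inside <b>...</b>
// Group 2: the person's name, with or without an opening <a ...> before it
var re = /<b>([^<]+)<\/b>: (?:<a[^>]+>)?([^<\n]+)/g
let out
// exec() returns null once there are no more matches, which ends the loop
while ((out = re.exec(html)) !== null) {
  console.log(out)
}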
<div id="uglyHtml">
  <br> <b>Lead</b>: <a href="#">John</a>
  <br> <b>Boss</b>: <a href="#">Bernard</a>
  <br> <b>Mascot</b>: Patrick
  <br> <b>Designer</b>: Jeanette
  <br> <b>Front</b>: <a href="#">Larry</a>
</div>
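
If you only want the names that have no tag around them (Patrick and Jeanette), a narrower pattern might do. This is just a sketch under the same assumptions as above: every entry looks like `<b>Role</b>: value`, and a linked name starts with `<` right after the colon. The function name `untaggedNames` is made up for this example, and like the regex above it will break as soon as the markup changes.

```javascript
// Hypothetical helper (not part of any library): returns the names that
// follow "</b>:" directly, without being wrapped in an <a> tag.
function untaggedNames(html) {
  // After "</b>:", skip whitespace, then require the value to start with a
  // character that is neither "<" nor whitespace. Linked names start with
  // "<a", so they are skipped. Capture up to the next tag or line break.
  const re = /<\/b>:\s*([^<\s][^<\n]*)/g;
  const names = [];
  for (const match of html.matchAll(re)) {
    names.push(match[1].trim());
  }
  return names;
}

const sample = [
  '<br> <b>Lead</b>: <a href="#">John</a>',
  '<br> <b>Mascot</b>: Patrick',
  '<br> <b>Designer</b>: Jeanette',
].join("\n");

console.log(untaggedNames(sample)); // [ 'Patrick', 'Jeanette' ]
```

In the browser you would pass it `document.querySelector("#uglyHtml").innerHTML`. If your tool only accepts a bare pattern, the part that matters is `/<\/b>:\s*([^<\s][^<\n]*)/g`, with the name in capture group 1.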

Origin http://43.154.161.224:23101/article/api/json?id=220570&siteId=1