allowing for missing parent in jsoup selector

elk :

I want to retrieve book details from a website, but the site uses different HTML to show the same thing. On some pages a div contains a ul, which in turn contains the li elements, like this:

<div class="book-description">
   <ul>
      <li>info 1</li>
      <li>info 2</li>
      <li>info 3</li>
   </ul>
</div>

To iterate over the li I would simply do: doc.select("div.book-description > ul > li")

On others it goes directly from div to li, like this:

<div class="book-description">
   <li>info 1</li>
   <li>info 2</li>
   <li>info 3</li>
</div>

The previous selector would not work with this page; I would need to use doc.select("div.book-description > li") instead. Is there a syntax I can use to specify that the ul may be missing?

d-h-e :

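Have you tried doc.select("div.book-description li")?

The space is the descendant combinator, so this matches every li anywhere inside the div, with or without the ul in between. If your lists contain no nested lists, this selector is fine.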
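
Here is a rough sketch of both options, assuming jsoup is on the classpath. The class name BookItems, the method names, and the sample HTML strings are made up for illustration. The second selector is an alternative not mentioned above: a comma-separated group that lists both shapes explicitly, which may be safer if a page ever nests a list inside an li.

```java
import org.jsoup.Jsoup;
import org.jsoup.select.Elements;

public class BookItems {
    // Descendant combinator: any <li> at any depth inside the div.
    static final String DESCENDANT = "div.book-description li";

    // Selector group: <li> either under a direct <ul> child or directly under the div.
    static final String GROUPED =
            "div.book-description > ul > li, div.book-description > li";

    // Hypothetical helper: parse an HTML string and run one selector against it.
    static Elements select(String html, String selector) {
        return Jsoup.parse(html).select(selector);
    }

    public static void main(String[] args) {
        String withUl = "<div class=\"book-description\"><ul>"
                + "<li>info 1</li><li>info 2</li><li>info 3</li></ul></div>";
        String withoutUl = "<div class=\"book-description\">"
                + "<li>info 1</li><li>info 2</li><li>info 3</li></div>";
        String nested = "<div class=\"book-description\"><ul>"
                + "<li>info 1<ul><li>nested</li></ul></li></ul></div>";

        // Both selectors find the three items on either page layout.
        System.out.println(select(withUl, DESCENDANT).eachText());
        System.out.println(select(withoutUl, DESCENDANT).eachText());
        System.out.println(select(withUl, GROUPED).eachText());
        System.out.println(select(withoutUl, GROUPED).eachText());

        // With a nested list the descendant selector also picks up the inner <li>,
        // while the grouped selector returns only the top-level one.
        System.out.println(select(nested, DESCENDANT).size()); // 2
        System.out.println(select(nested, GROUPED).size());    // 1
    }
}
```

If the pages never nest lists, the shorter descendant selector is probably all you need; the grouped form only matters when inner li elements should be skipped.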
