Java HTML parser jsoup 1.13.1 released, with significantly faster parsing

jsoup 1.13.1 has been released. Notable changes include significantly faster parsing than 1.12.x, new selector features, a fix for "Mark Invalid" exceptions, and many other improvements.

jsoup is the best Java HTML parser ("Sweet Potato" certified). It provides a very convenient API for extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. Get a feel for it with the code below:

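import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
// log() below is a small printf-style helper defined in the complete code
// Fetch the Wikipedia home page and parse it into a DOM
Document doc = Jsoup.connect("https://en.wikipedia.org/").get();
log(doc.title());
// Select the headline links in the "In the news" section
Elements newsHeadlines = doc.select("#mp-itn b a");
for (Element headline : newsHeadlines) {
  log("%s\n\t%s",
    headline.attr("title"), headline.absUrl("href"));
}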

The code above fetches the Wikipedia home page, parses it into a DOM, and then selects the headlines from the "In the news" section into a list of Elements. (Online example, complete code)
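
To try the same select-and-iterate pattern without a network connection, the HTML can be parsed from an in-memory string instead. What follows is only a minimal sketch, assuming jsoup is on the classpath. The class name HeadlineSketch, the headlines helper, and the sample markup are invented for illustration (the markup is not real Wikipedia HTML); only the "#mp-itn b a" selector is taken from the example above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class HeadlineSketch {
    // Hypothetical stand-in for the fetched page; not real Wikipedia markup.
    static final String SAMPLE_HTML =
        "<div id=\"mp-itn\"><ul>"
        + "<li><b><a href=\"/wiki/First\" title=\"First story\">First</a></b></li>"
        + "<li><b><a href=\"/wiki/Second\" title=\"Second story\">Second</a></b></li>"
        + "</ul></div>"
        + "<p><b><a href=\"/wiki/Other\" title=\"Outside\">Other</a></b></p>";

    // Returns "title -> absolute URL" for every link matched by the selector.
    static List<String> headlines(String html, String baseUri) {
        // Passing a base URI lets absUrl() resolve relative href values.
        Document doc = Jsoup.parse(html, baseUri);
        List<String> out = new ArrayList<>();
        for (Element a : doc.select("#mp-itn b a")) {
            out.add(a.attr("title") + " -> " + a.absUrl("href"));
        }
        return out;
    }

    public static void main(String[] args) {
        headlines(SAMPLE_HTML, "https://en.wikipedia.org/")
            .forEach(System.out::println);
    }
}
```

The link outside the #mp-itn container is deliberately included to show that the selector scopes the match to the "In the news" block.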

Download: https://jsoup.org/download

<dependency>
  <!-- jsoup HTML parser library @ https://jsoup.org/ -->
  <groupId>org.jsoup</groupId>
  <artifactId>jsoup</artifactId>
  <version>1.13.1</version>
</dependency>

Noteworthy improvements in 1.13.1

  • A new Element.closest() method, which walks up the tree to find the nearest element that matches the selector
  • Memory optimizations: the retained size of a Document is reduced by about 39%, and memory allocations by about 9%:
    1. The Attributes holder in an Element is only created if the element has attributes
    2. An element's baseUri is only tracked when it is set to a new value via the DOM for a given tree
    3. After parsing, the input character reader (and its associated buffers) is no longer retained in Document.parser
  • Substantially faster parsing compared with 1.12.x
  • Old methods and classes that were marked as deprecated in previous releases have been removed
  • New Element.select(Evaluator) and Element.selectFirst(Evaluator) methods, which allow a parsed CSS selector to be reused when the same evaluator is applied many times
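
The two API additions in the list above, Element.closest() and the Evaluator overloads of select() and selectFirst(), might be used roughly as follows. This is a sketch rather than official sample code. It assumes jsoup 1.13.1 or later is on the classpath, and the class name Jsoup113Sketch, both helper methods, and the sample markup are hypothetical. QueryParser.parse() is used here to turn a CSS query into a reusable Evaluator.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Evaluator;
import org.jsoup.select.QueryParser;

public class Jsoup113Sketch {
    // Hypothetical markup, used only for this illustration.
    static final String SAMPLE_HTML =
        "<section id=\"news\"><ul><li><a class=\"ext\" href=\"/a\">A</a></li></ul></section>"
        + "<section id=\"sport\"><p><a href=\"/b\">B</a> "
        + "<a class=\"ext\" href=\"/c\">C</a></p></section>";

    // closest(): start at the element itself and walk up through its
    // ancestors until one matches; returns null when nothing matches.
    static String enclosingSectionId(Element el) {
        Element section = el.closest("section");
        return section == null ? null : section.id();
    }

    // Parse the CSS query once, then reuse the Evaluator for every section
    // instead of re-parsing the selector string on each call.
    static int countExternalLinks(Document doc) {
        Evaluator external = QueryParser.parse("a.ext");
        int total = 0;
        for (Element section : doc.select("section")) {
            total += section.select(external).size();
        }
        return total;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        // selectFirst(Evaluator) returns the first match, or null if none.
        Element link = doc.selectFirst(QueryParser.parse("#sport a.ext"));
        System.out.println(enclosingSectionId(link));
        System.out.println(countExternalLinks(doc));
    }
}
```

Reusing an Evaluator mainly pays off when the same selector runs many times, for example inside a loop over many elements or documents; for a one-off query the String overloads are simpler.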

For the full list of changes, see https://jsoup.org/news/release-1.13.1

Source: www.oschina.net/news/113772/jsoup-1-13-1-released