jsoup 1.17.2 has been released. jsoup is a Java library for working with real-world HTML. It provides a convenient API for extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors.
Download address: https://jsoup.org/download
Specific updates include:
Improvements
- Attribute object accessors: Added Element.attribute(String) and Attributes.attribute(String) to make it easier to get Attribute objects. (#2069)
- Attribute source tracking: If source tracking is turned on and an attribute's key is changed (via Attribute.setKey(String)), the source range is now still tracked in Attribute.sourceRange(). (#2070)
- Wildcard attribute selector: Added [*] as a wildcard attribute selector, matching elements that have any attribute. Also restored support for [^], which selects via an empty attribute name prefix. (#2079)
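The accessors and selectors listed above can be tried with a short sketch like the following. It assumes jsoup 1.17.2 is on the classpath; the class name AttributeAccessSketch, its helper methods, and the sample HTML are made up for illustration and are not part of jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Attribute;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class AttributeAccessSketch {
    /** Parses HTML with source position tracking enabled (hypothetical helper). */
    static Document parseTracked(String html) {
        Parser parser = Parser.htmlParser().setTrackPosition(true);
        return Jsoup.parse(html, "", parser);
    }

    /** Reads an attribute value through the new Element.attribute(String) accessor. */
    static String attributeValue(String html, String css, String key) {
        Element el = parseTracked(html).selectFirst(css);
        if (el == null) return null;
        Attribute attr = el.attribute(key); // null when the attribute is absent
        return attr == null ? null : attr.getValue();
    }

    /** Counts elements that carry at least one attribute, using the [*] selector. */
    static int countWithAnyAttribute(String html) {
        return parseTracked(html).select("[*]").size();
    }

    /** Renames an attribute key, then reports whether its source range is still tracked. */
    static boolean renamedKeyIsTracked(String html, String css, String oldKey, String newKey) {
        Element el = parseTracked(html).selectFirst(css);
        if (el == null) return false;
        Attribute attr = el.attributes().attribute(oldKey); // Attributes.attribute(String)
        if (attr == null) return false;
        attr.setKey(newKey);
        return el.hasAttr(newKey) && attr.sourceRange().nameRange().isTracked();
    }

    public static void main(String[] args) {
        String html = "<p id=intro class=lead>One</p><p>Two</p><div data-x=1>Three</div>";
        System.out.println("id value: " + attributeValue(html, "p", "id"));
        System.out.println("elements with any attribute: " + countWithAnyAttribute(html));
        System.out.println("range tracked after rename: "
                + renamedKeyIsTracked(html, "p", "id", "data-id"));
    }
}
```

The helpers re-parse the HTML on each call to keep the sketch short; real code would parse once and reuse the Document.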
Bug fixes
- Mixed-case source position: When tracking the source position of attributes, if the source attribute name was mixed case but the parser normalized attribute names to lower case, the source position of that attribute was not tracked correctly. (#2067)
- Source position NPE: When tracking source positions during a fragment parse, a null pointer exception was thrown. (#2068)
- Multi-point emoji entity: Multi-point encoded emoji entities could be incorrectly decoded to the replacement character. (#2074)
- Selector sub-expressions: (Regression) In a selector like parent [attr=va], other, the , OR was bound to [attr=va] instead of parent [attr=va], resulting in incorrect selections. The fix includes an EvaluatorDebug class that generates an s-expression (sexpr) representing a query, making query parse tests simpler and more thorough. (#2073)
- XML CData output: When generating XML-syntax output from parsed HTML, script nodes containing (pseudo) CData sections had an extraneous CData section added, causing script execution errors. Now, if the data is not already within a CData section, the data content is emitted in an HTML/XML/XHTML polyglot format. (#2078)
- Thread safety: The :has evaluator held a non-thread-safe iterator, so if an Evaluator object was shared across multiple concurrent threads, a NoSuchElement exception could be thrown and the selection results could be incorrect. Now, the iterator object is thread-local. (#2088)
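The thread-safety fix follows a general pattern: mutable helper state that used to live in a shared object is moved into a ThreadLocal. The sketch below illustrates that pattern in plain Java only. It is not jsoup's actual code, and the Cursor and SumMatcher classes are hypothetical stand-ins for a reusable iterator and the evaluator that owns it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadLocalCursorSketch {
    /** Hypothetical reusable cursor: it is mutable, so sharing one instance between threads is unsafe. */
    static final class Cursor {
        private List<Integer> items;
        private int pos;

        void restart(List<Integer> items) { this.items = items; this.pos = 0; }
        boolean hasNext() { return pos < items.size(); }
        int next() { return items.get(pos++); }
    }

    /** Hypothetical matcher that can be shared, because each thread gets its own Cursor. */
    static final class SumMatcher {
        private final ThreadLocal<Cursor> cursor = ThreadLocal.withInitial(Cursor::new);

        int sum(List<Integer> items) {
            Cursor c = cursor.get(); // per-thread instance, never seen by other threads
            c.restart(items);
            int total = 0;
            while (c.hasNext()) total += c.next();
            return total;
        }
    }

    /** Calls one shared matcher from several threads; returns true if every result was correct. */
    static boolean runConcurrently(int threads, int rounds) throws Exception {
        SumMatcher shared = new SumMatcher();
        List<Integer> items = new ArrayList<>();
        for (int i = 1; i <= 100; i++) items.add(i); // 1 + 2 + ... + 100 = 5050

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Boolean> task = () -> {
                    for (int r = 0; r < rounds; r++) {
                        if (shared.sum(items) != 5050) return false;
                    }
                    return true;
                };
                results.add(pool.submit(task));
            }
            for (Future<Boolean> f : results) {
                if (!f.get()) return false;
            }
            return true;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("all results correct: " + runConcurrently(4, 1000));
    }
}
```

If Cursor were a plain field of SumMatcher instead, two threads could reset and advance the same position at once, which is the kind of interference that produces wrong results or an exception from running past the end.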
Release notes: https://github.com/jhy/jsoup/releases/tag/jsoup-1.17.2