Problem
You have an HTML document that contains relative URLs, and you need to convert them into absolute URLs.
Method
- Make sure that you specify a base URI when parsing the document (when the document is loaded from a URL with Jsoup.connect, that URL is used as the base URI), and then
- Use the abs: attribute prefix to obtain an absolute URL from an attribute. The code is shown below:
Document doc = Jsoup.connect("http://www.open-open.com").get();
Element link = doc.select("a").first();
String relHref = link.attr("href"); // == "/"
String absHref = link.attr("abs:href"); // "http://www.open-open.com/"
Explanation
In HTML elements, URLs are often written relative to the document's location: <a href="/download">...</a>
When you use the Node.attr(String key) method to get an element's href attribute, it returns the value exactly as it is written in the HTML source.
If you need an absolute URL, add the abs: prefix to the attribute name. The attribute value is then resolved against the document's base URI and returned as an absolute URL: attr("abs:href")
For this reason, it is important to specify a base URI when parsing the HTML document.
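The effect of the base URI can be seen when parsing HTML from a string rather than loading it from a URL. The following is a minimal sketch, not taken from the original recipe: the class name BaseUriDemo, the helper absHref, and the address http://example.com/docs/ are made up for illustration. It assumes jsoup is on the classpath and uses Jsoup.parse(String html, String baseUri).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class BaseUriDemo {

    // Hypothetical helper: parses the HTML with the given base URI and
    // returns the absolute form of the first link's href attribute.
    static String absHref(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        Element link = doc.select("a").first();
        return link.attr("abs:href");
    }

    public static void main(String[] args) {
        String html = "<a href=\"/download\">Download</a>";

        // With a base URI, the relative href can be resolved.
        System.out.println(absHref(html, "http://example.com/docs/"));
        // prints "http://example.com/download"

        // With an empty base URI there is nothing to resolve against,
        // so jsoup returns an empty string instead of an absolute URL.
        System.out.println("[" + absHref(html, "") + "]");
        // prints "[]"
    }
}
```

An empty result from abs:href is therefore usually a sign that the document was parsed without a base URI.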
If you do not want to use the abs: prefix, the Node.absUrl(String key) method does the same thing and takes the plain attribute name.
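As a rough illustration of absUrl, the sketch below collects the absolute URL of every link in a small document. The class name AbsUrlDemo, the helper absoluteLinks, the sample HTML, and the example.com addresses are all assumptions made for this example. It assumes jsoup is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AbsUrlDemo {

    // Hypothetical helper: returns the absolute URL of every <a href> in the HTML.
    static List<String> absoluteLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element link : doc.select("a[href]")) {
            // absUrl("href") gives the same result as attr("abs:href")
            urls.add(link.absUrl("href"));
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<a href='/download'>Download</a>"
                + "<a href='guide/intro.html'>Intro</a>"
                + "<a href='http://other.example.org/'>Other</a>";

        for (String url : absoluteLinks(html, "http://example.com/docs/index.html")) {
            System.out.println(url);
        }
        // prints:
        // http://example.com/download
        // http://example.com/docs/guide/intro.html
        // http://other.example.org/
    }
}
```

Links that are already absolute are returned unchanged, so the same call can be applied to every link without checking first.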