d S :
I need help parsing HTML with Jsoup from this URL: https://www.sierra.com/clearance~1/women~d~5324/specdataor~gender!women/colorfamily~red/priceor~%2410-%2414dotdot99/3/
When I try to parse the HTML from this page, I get
java.net.SocketTimeoutException: Read timed out.
With other URLs, the same code works fine.
How can I solve this problem?
private void Parsedata() {
    try {
        String URL = "https://www.sierra.com/clearance~1/women~d~5324/specdataor~gender!women/colorfamily~red/priceor~%2410-%2414dotdot99/3/";
        System.out.println(getPage(URL));
    } catch (IOException e) {
        e.printStackTrace();
    }
}

private static Document getPage(String URL) throws IOException {
    Document page = Jsoup.connect(URL).timeout(0).execute().parse();
    return page;
}
Samuel Philipp :
The page you are trying to connect to requires a valid user agent. You can set one with Connection.userAgent(). For example, you can use the user agent string of a Chrome browser:
private static Document getPage(String URL) throws IOException {
    return Jsoup.connect(URL)
            .userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36")
            .timeout(10_000).execute().parse();
}
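If you want to confirm that the user agent is really what makes the difference, independently of Jsoup, a rough sketch like the one below may help. It swaps Jsoup for the JDK's own java.net.http client (Java 11+), so it is not the code from the answer above. The class name UserAgentCheck and the helpers buildRequest and fetchStatus are hypothetical names made up for this illustration, and the 10 second timeout simply mirrors the value used above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class, not part of Jsoup or of the original answer.
class UserAgentCheck {

    // Browser-like user agent string, copied from the answer above.
    static final String BROWSER_UA =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36";

    // Builds a GET request with a 10 second timeout.
    // Passing null for userAgent leaves the header unset, so the client's default is sent.
    static HttpRequest buildRequest(String url, String userAgent) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .GET();
        if (userAgent != null) {
            builder.header("User-Agent", userAgent);
        }
        return builder.build();
    }

    // Sends the request, discards the body, and returns the HTTP status code.
    // A request that exceeds the timeout fails with HttpTimeoutException (an IOException).
    static int fetchStatus(HttpRequest request) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }
}
```

Calling fetchStatus(buildRequest(URL, null)) and then fetchStatus(buildRequest(URL, UserAgentCheck.BROWSER_UA)) against the same URL should show whether the server treats the two requests differently. If only the first call times out or returns an error status, the missing user agent is the likely cause.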