JSoup doesn't load the whole HTML - Code World

Others 2022-04-21 22:12:22 views: 0

wdc :

I want to scrape a website, but when I connect to it using Jsoup.connect(url), only part of the page is loaded.

When I downloaded the page as HTML, I saw that one part of the page contains only a loader icon, so I concluded that this part is loaded afterwards from some other source.

The funny thing is that "inspect element" shows the missing HTML while "view page source" doesn't. The HTML loaded by jsoup is basically the same as what "view page source" shows.

Is there a way to bypass this and load the whole page as it is displayed in the browser?

The page in question is this: https://www.oddsportal.com/tennis/australia/atp-australian-open-2017/results/page/1/

Ask for any additional information I could provide.

===============

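EDIT: I am connecting to the URL like this:

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

Document doc = null;
try {
    doc = Jsoup.connect(url).get(); // fetch and parse the page
} catch (IOException e) {
    e.printStackTrace();
}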

I am getting this div using a CSS selector:

Elements tournamentTable = doc.select("div[id=tournamentTable]");

The content of tournamentTable is an empty div: <div id="tournamentTable"></div>

Krzysztof Atłasik :

It seems the content of the element with id=tournamentTable is generated dynamically by JavaScript. jsoup does not execute JavaScript, so you'd have to use a library such as HtmlUnit, which does. For example:

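// Assuming HtmlUnit 2.x package names; HtmlUnit 3.x moved them to org.htmlunit
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

WebClient webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setJavaScriptEnabled(true); // enable JavaScript
webClient.getOptions().setThrowExceptionOnScriptError(false); // continue even if a script throws an error
HtmlPage page = webClient.getPage(url);
webClient.waitForBackgroundJavaScript(5000); // important! call this after the page loads: waits up to 5 s for background scripts to finish rendering

DomElement tournamentTable = page.getElementById("tournamentTable");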
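Before pulling in a JavaScript-capable client, it can be worth confirming the diagnosis on the raw HTML itself. The following is only a minimal sketch using the standard library: the class name PlaceholderCheck and the method isEmptyPlaceholder are hypothetical, and the regular expression is a rough heuristic for this one case, not a general HTML parser. It reports whether the raw markup contains the target div with nothing inside it, which is the situation described in the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup or HtmlUnit.
class PlaceholderCheck {

    // Returns true when rawHtml contains a <div> whose id attribute equals the
    // given id and whose body is empty or whitespace only.
    // Rough heuristic: it only handles quoted id attributes and plain <div> tags.
    static boolean isEmptyPlaceholder(String rawHtml, String id) {
        Pattern emptyDiv = Pattern.compile(
                "<div\\b[^>]*\\bid\\s*=\\s*[\"']" + Pattern.quote(id) + "[\"'][^>]*>\\s*</div>",
                Pattern.CASE_INSENSITIVE);
        return emptyDiv.matcher(rawHtml).find();
    }

    public static void main(String[] args) {
        // What "view page source" (and jsoup) sees, per the question
        String viewSource = "<html><body><div id=\"tournamentTable\"></div></body></html>";
        // Illustrative stand-in for what "inspect element" shows after scripts have run
        String inspected = "<html><body><div id=\"tournamentTable\">"
                + "<table><tr><td>row</td></tr></table></div></body></html>";

        System.out.println(isEmptyPlaceholder(viewSource, "tournamentTable")); // prints true
        System.out.println(isEmptyPlaceholder(inspected, "tournamentTable"));  // prints false
    }
}
```

If the check returns true for the HTML you fetched, the element exists but is filled in later by a script, so the selector is not the problem and a client that runs JavaScript is needed.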
