Using the packet capture tool, we find that when we refresh the entire page, the data we want to crawl does not appear in the response to the page request, so we conclude that it is loaded dynamically via ajax.
Note: The requests captured under XHR are loaded dynamically via ajax. Their URLs cannot be derived from the URL of the page they belong to; instead, they can be read from the request headers of each captured request.
Analyzing an ajax request therefore means extracting its URL from the packet capture tool.
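As a rough illustration of replaying a captured request, the sketch below builds one with the standard library. The URL, the form fields, and the choice of POST are all hypothetical placeholders; the real values have to be copied from the XHR entry in the capture tool.

```python
import urllib.parse
import urllib.request

# Hypothetical URL: replace with the one shown in the XHR entry's request headers.
AJAX_URL = "https://example.com/api/list"


def build_ajax_request(url, form):
    """Build a POST request that mimics a captured ajax call (assumed shape)."""
    data = urllib.parse.urlencode(form).encode("utf-8")
    headers = {
        "User-Agent": "Mozilla/5.0",
        # Header that browsers' ajax libraries commonly send; some servers check it.
        "X-Requested-With": "XMLHttpRequest",
    }
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Hypothetical form fields; the real ones are listed in the captured request.
req = build_ajax_request(AJAX_URL, {"page": 1, "pageSize": 15})
```

Passing `req` to `urllib.request.urlopen` would send it. If the captured request is a GET, the parameters would go in the query string instead of the body.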
Through analysis, we find that the unique ID of every company can be obtained from the ajax response on the homepage.
From each company's ajax request, we can see the part of the link that is the same for every detail page.
To conclude, we need to obtain the unique ID of each company and then concatenate it with the common part of the link to get the specific details of each company.
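The two steps can be sketched as follows. This is only a minimal outline under assumptions: the common link part, the `list` and `ID` field names, and the `?id=` query format are invented for illustration and must be replaced with what the capture tool actually shows.

```python
import json
import urllib.parse

# Hypothetical common part shared by every detail link.
DETAIL_BASE = "https://example.com/api/detail"


def extract_ids(list_payload):
    """Collect company IDs from the homepage ajax response (assumed JSON shape)."""
    return [item["ID"] for item in list_payload.get("list", [])]


def build_detail_urls(ids, base=DETAIL_BASE):
    """Concatenate the common link part with each ID (assumed query format)."""
    return [f"{base}?id={urllib.parse.quote(str(i))}" for i in ids]


# Made-up sample standing in for the homepage ajax response.
sample = json.loads('{"list": [{"ID": "a1"}, {"ID": "b2"}]}')
ids = extract_ids(sample)
urls = build_detail_urls(ids)
```

Each URL in `urls` would then be requested in turn to fetch one company's details.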