Python crawler in practice (2): analyzing Ajax requests to fetch data

This time we go deeper into crawling. On some pages the data cannot be obtained directly from the HTML returned by the request, because it is rendered into the page through Ajax. In this post we look at how to analyze those Ajax requests.

For networking we again use Requests, as in the previous post. MongoDB stores the results (the pymongo library must be installed beforehand), and the crawl runs in multiple threads.
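As a rough sketch of how these pieces could fit together, the snippet below runs a fetch function over several page offsets on a thread pool and then writes the results to MongoDB. The names `crawl_offsets` and `save_to_mongo`, the connection URI, and the database and collection names are all assumptions made for illustration; they are not taken from the project code.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl_offsets(fetch, offsets, max_workers=4):
    """Call fetch(offset) for every offset on a thread pool.

    pool.map returns results in the same order as the input offsets.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, offsets))


def save_to_mongo(documents, uri="mongodb://localhost:27017",
                  db_name="crawler", collection_name="results"):
    """Insert the non-empty documents into MongoDB; return how many were written.

    The URI, database name and collection name are placeholder assumptions.
    """
    from pymongo import MongoClient  # third-party: pip install pymongo

    documents = [doc for doc in documents if doc]
    if not documents:
        return 0  # insert_many rejects an empty list, so stop early
    client = MongoClient(uri)
    try:
        result = client[db_name][collection_name].insert_many(documents)
        return len(result.inserted_ids)
    finally:
        client.close()
```

Here `fetch` is any function that takes an offset and returns a dict, so the thread pool part can be tried out with a dummy function before a real request is wired in.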

Analysis: on many pages, the HTML source we fetch does not contain the data we want. In that case the site is most likely loading its data through Ajax.
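One quick way to test this suspicion is to download the raw HTML and check whether a piece of text that is visible in the browser actually appears in it. The sketch below is only an illustration; the function names are made up for this example.

```python
def appears_in_html(html: str, snippet: str) -> bool:
    """Return True if text seen in the browser is present in the raw HTML."""
    return snippet.lower() in html.lower()


def fetch_html(url: str) -> str:
    """Download the page exactly as the server sends it, before any JavaScript runs."""
    import requests  # third-party: pip install requests

    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    response.raise_for_status()
    return response.text
```

If `appears_in_html(fetch_html(url), "some title you can see on the page")` comes back `False`, the content is probably filled in later by a separate request.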

Press F12 to open the browser's developer tools and click the Network tab, then look for the request that carries the data we want.

There we can see that the data we need is loaded by Ajax.
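Once the request has been found in the Network tab, it can usually be replayed from Python by rebuilding its URL and decoding the JSON response. The following is a minimal sketch under stated assumptions: the endpoint `https://example.com/api/search`, the parameter names (`keyword`, `offset`, `count`, `format`) and the response shape (a `data` list whose items have a `title`) are placeholders, so copy the real values from the request you see in the browser.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace it with the request URL shown in the Network tab.
BASE_URL = "https://example.com/api/search"


def build_ajax_url(keyword: str, offset: int, count: int = 20) -> str:
    """Rebuild the Ajax URL for one page of results (parameter names are assumed)."""
    params = {"keyword": keyword, "offset": offset, "count": count, "format": "json"}
    return BASE_URL + "?" + urlencode(params)


def extract_titles(payload: dict) -> list:
    """Collect titles from the decoded JSON, skipping entries that have none."""
    items = payload.get("data") or []
    return [item["title"] for item in items
            if isinstance(item, dict) and item.get("title")]


def fetch_page(keyword: str, offset: int) -> dict:
    """Send the Ajax request ourselves and return the decoded JSON body."""
    import requests  # third-party: pip install requests

    response = requests.get(build_ajax_url(keyword, offset),
                            headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    response.raise_for_status()
    return response.json()
```

Increasing `offset` in steps of `count` then walks through the pages the site would otherwise load as you scroll.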

Project: analyzing Ajax requests to scrape street-snap (jiepai) photos from Toutiao (Today's Headlines)

Code repository: https://gitee.com/dwyui/toutiao_jiepai.git

A quick look at the results of a run:

 


Origin www.cnblogs.com/cxiaocai/p/10958210.html