A small example of parsing a crawler request to obtain data

1. View the content of the Weibo page

First open Weibo, log in to your account, and navigate to the user page you want. Press F12 to open the developer tools, click the Network tab, refresh the page, and select the XHR filter.

The next step is to find, among these requests, the ones that contain the Weibo post content.

You can see that a single request contains 22 posts.

Then look at the request headers.

This is the URL we want to request.

Next is the crawler demo. When requesting data from this URL, you must include the user-agent and cookie, otherwise the request will fail. The cookie also expires easily: if the demo stops working after a few days, log in again and copy a fresh cookie.

Here, just scroll down in the request headers panel and copy the user-agent and cookie values.

Finally, the demo:

import requests

headers = {
    'user-agent': 'your user-agent',
    'cookie': 'your cookie'
}

response = requests.get(
    'https://weibo.com/ajax/statuses/mymblog?uid=2219143801&page=2&feature=0',
    headers=headers
).json()
print(response)

The final result:

The raw output looks cluttered; it is easier to inspect the same response in the browser first to see which fields you need.

In the end, just extract the fields you need.
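As a sketch of that extraction step: assuming the response has the shape observed above, with posts under `data` → `list` and each item carrying a `text_raw` field (these field names are taken from one observed response and may change), pulling out the post text could look like this:

```python
# Sketch: extract post text from the JSON returned by the mymblog endpoint.
# The 'data' / 'list' / 'text_raw' field names are assumptions based on an
# observed response; verify them against what your browser shows.

def extract_posts(resp: dict) -> list:
    """Return the text of each post in one page of the response."""
    return [item.get('text_raw', '')
            for item in resp.get('data', {}).get('list', [])]

# A tiny hand-made sample mimicking the response shape:
sample = {
    'ok': 1,
    'data': {
        'list': [
            {'id': 1, 'text_raw': 'first post'},
            {'id': 2, 'text_raw': 'second post'},
        ]
    }
}

print(extract_posts(sample))  # → ['first post', 'second post']
```

Using `.get()` with defaults keeps the extraction from crashing when a page comes back empty or the login has expired and the `data` key is missing.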

This only fetches one page of content. To get more, observe the pattern in the URL: note the page parameter it contains.

So you only need to change the page number to 2, 3, and so on to get later posts. The uid parameter is the user's id; change it to another user's id to fetch that user's content.
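The page-by-page idea above can be sketched as follows. The URL template mirrors the one used in the demo; the one-second delay and the page count are arbitrary placeholder choices, not values from the original post:

```python
import time

# Template matching the URL used in the demo above.
BASE = 'https://weibo.com/ajax/statuses/mymblog?uid={uid}&page={page}&feature=0'

def build_url(uid: str, page: int) -> str:
    """Fill the uid and page slots of the request URL."""
    return BASE.format(uid=uid, page=page)

def fetch_pages(uid: str, pages: int, headers: dict) -> list:
    """Fetch the first `pages` pages of a user's posts (needs a valid cookie)."""
    import requests  # deferred so URL building can be tried without the library
    results = []
    for page in range(1, pages + 1):
        resp = requests.get(build_url(uid, page), headers=headers).json()
        results.append(resp)
        time.sleep(1)  # be polite; don't hammer the endpoint
    return results

print(build_url('2219143801', 3))
# → https://weibo.com/ajax/statuses/mymblog?uid=2219143801&page=3&feature=0
```

The same `headers` dict from the demo (with your user-agent and cookie) would be passed to `fetch_pages`.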


Origin blog.csdn.net/weixin_44134535/article/details/128001233