Python crawler combat (basic articles) - 1 Get Weibo TOP10 hot search (with complete code)

Today we will talk about how Python crawlers can obtain TOP10 hot search keywords on Weibo. If it is helpful to you, please pay more attention to it, like it a lot, and add a lot of favorites! !

Please add a picture description

get down to business

The first step is to enter the Weibo official website: click me to enter

We can see that the trending search is at the bottom right

insert image description here

Step 2, click [f12], or [right click] to check, check the hot search, URL source

insert image description here

step 3

1. Click Network, refresh the page, all loaded resources will appear on the left

2. We found that there is a 【hotSearch】this is the hot search link

3. Click the [{}] icon below to format the json information

insert image description here

Step 4 find the request URL

insert image description here

Step 5 code request

As shown in the figure, the request is successful

insert image description here

Step 6 Organize the data

1. Convert the returned data (string) to dict

2. After analysis, it is found that hot searches are mainly in a list:json.loads(url.text)['data']['realtime']insert image description here

3. Get

insert image description here

4. Continue to analyze (you can continue to analyze, there are still many categories in it, I will not analyze it here)

insert image description here

All have been obtained here, and then write to Excel

insert image description here

import json
import re

import openpyxl
import requests
from lxml import etree

wb = openpyxl.Workbook()
ws = wb.active
ws.append(['顺序','热搜分类','热搜关键词'])

url = requests.get("https://weibo.com/ajax/side/hotSearch")
# url.encoding= "gbk"
# print(url.text)
data = json.loads(url.text)['data']['realtime']
for i in data:
    # print(i)
    try:
        print(f'热搜:{
      
      i["realpos"]}, 热搜分类[{
      
      i["category"]}], 热搜关键词:{
      
      i["word"]}')
        ws.append([i["realpos"],i["category"],i["word"]])
    except:
        pass
wb.save("热搜.xlsx")

I hope everyone has to help

I've seen this, follow + like + bookmark = don't get lost! !

Guess you like

Origin blog.csdn.net/weixin_42636075/article/details/131935111