Using Python to Crawl the Keywords of Baidu's Today's Hot Event Ranking List - Code World

News 2023-06-12 12:59:32

Baidu Today's Hot Event Ranking URL: http://top.baidu.com/buzz?b=341&c=513&fr=topbuzz_b1_c513

code:

# CrawBaiduTop.py
import requests
from bs4 import BeautifulSoup

tops = []                             # empty list that will hold the hot-list entries
url = 'http://top.baidu.com/buzz?b=341&c=513&fr=topbuzz_b1_c513'
r = requests.get(url, timeout=40)     # fetch the page, with a 40-second timeout
r.raise_for_status()                  # raise an exception for 4xx/5xx responses
r.encoding = r.apparent_encoding      # use the encoding detected from the content
html = r.text                         # the decoded HTML text
table = BeautifulSoup(html, "html.parser").find("table")  # parse the HTML and locate the first <table>
# Link texts that are navigation labels, not entries ('新闻' = news, '视频' = video, '图片' = images)
skip = ('search', '新闻', '视频', '图片')
for words in table.find_all("a"):     # iterate over every <a> inside the <table>
    if words.string not in skip:
        tops.append(words.string)     # append() adds the item to the end of the list
print(tops)
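
The parsing step above can be separated from the network request so that it can be tried without a live connection. The following is a minimal sketch, not the author's code: `extract_keywords`, `SKIP_LABELS`, and `SAMPLE_HTML` are hypothetical names, and the sample markup is made up for illustration, so the real page layout may differ.

```python
from bs4 import BeautifulSoup

# Link texts that are navigation labels rather than hot-list entries
# ('新闻' = news, '视频' = video, '图片' = images), as in the script above.
SKIP_LABELS = {'search', '新闻', '视频', '图片'}


def extract_keywords(html):
    """Return the text of every <a> in the first <table>, minus navigation labels.

    Hypothetical helper for illustration; not part of the original script.
    """
    table = BeautifulSoup(html, "html.parser").find("table")
    if table is None:                      # no <table> found, e.g. the layout changed
        return []
    keywords = []
    for link in table.find_all("a"):
        text = link.get_text(strip=True)   # empty string when the link has no text
        if text and text not in SKIP_LABELS:
            keywords.append(text)
    return keywords


# Made-up sample standing in for the real page (assumption, for illustration only).
SAMPLE_HTML = """
<table>
  <tr><td><a href="#">keyword one</a></td><td><a href="#">新闻</a></td></tr>
  <tr><td><a href="#">keyword two</a></td><td><a href="#">search</a></td></tr>
  <tr><td><a href="#"><img src="x.png"></a></td></tr>
</table>
"""

if __name__ == "__main__":
    print(extract_keywords(SAMPLE_HTML))
```

On the sample this prints `['keyword one', 'keyword two']`. Calling `extract_keywords(r.text)` on the fetched response could stand in for the loop in the script. One difference is that `words.string` can be `None` for a link that wraps only an image or several nested tags, whereas `get_text()` returns an empty string that the sketch skips.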

result:

Origin blog.csdn.net/admiz/article/details/79835969
