Graduation project: fetching and displaying trending data from popular websites with Python

Python is a concise and productive programming language that has become increasingly popular in recent years. In this blog post, we walk through a case study: a Python program that fetches the trending ("hot") data from several popular websites and displays it in a GUI window.

First, let's look at the problem we are trying to solve. When we browse popular websites such as Baidu and Zhihu, we usually see a list of hot topics or recommended news. What should we do if we want to obtain this information and analyze it in our own program?

The basic approach is simple. We use the third-party requests library to download the web page, the lxml library to parse it, and the PyQt5 library to create a GUI window that displays the data.

Next, let's take a look at the specific code implementation.

First, we need to import the corresponding libraries and modules:

import sys
import requests
from lxml import html
from PyQt5.QtWidgets import QApplication, QWidget, QVBoxLayout, QHBoxLayout, QPushButton, QLabel, QTextEdit, QComboBox
from PyQt5.QtGui import QIcon
from PyQt5.QtCore import Qt

Then, we need to define a function, get_data(), that fetches the hot data from a given website.

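def get_data(url):
    # Download the page; the timeout stops the request from waiting forever
    response = requests.get(url, timeout=10)
    page_content = response.content.decode('utf-8')
    # Parse the HTML text into an element tree
    tree = html.fromstring(page_content)
    # Collect the link text of every entry inside the "hot-list" block
    hot_items = tree.xpath('//div[@class="hot-list"]/ul/li/a/text()')
    return hot_items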

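In this function, we use the requests library to download the web page and the lxml library to parse the HTML content into a tree structure. We then use an XPath expression to extract the hot items from the page. Note that the same expression is applied to every URL, so it only returns results for pages whose markup contains a div with the class "hot-list"; a page with a different structure yields an empty list.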
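To see the parse-then-query step in isolation, here is a minimal offline sketch. It assumes a hypothetical, well-formed snippet (SAMPLE_PAGE) in place of a downloaded page, and it uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath, as a stand-in for lxml so that it runs without network access. The name extract_hot_items is illustrative and not part of the original program.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for a downloaded page.
SAMPLE_PAGE = """
<html>
  <body>
    <div class="hot-list">
      <ul>
        <li><a href="/topic/1">First topic</a></li>
        <li><a href="/topic/2">Second topic</a></li>
      </ul>
    </div>
    <div class="footer">
      <ul><li><a href="/about">About</a></li></ul>
    </div>
  </body>
</html>
"""


def extract_hot_items(markup):
    """Return the link texts found under <div class="hot-list">."""
    root = ET.fromstring(markup)
    # Same path as in get_data(), written in ElementTree's XPath subset.
    links = root.findall(".//div[@class='hot-list']/ul/li/a")
    return [link.text for link in links]


print(extract_hot_items(SAMPLE_PAGE))  # ['First topic', 'Second topic']
```

The link in the footer block is skipped because its parent div has a different class. Unlike lxml's HTML parser, ElementTree raises an error on markup that is not well-formed, so this stand-in is only suitable for illustrating the query itself.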

Next, we need to define a class, HotWindow, that creates the GUI window. The window contains a drop-down box and a button. The user selects a website from the drop-down box and clicks the button to fetch its hot data.

class HotWindow(QWidget):
    def __init__(self):
        super().__init__()

        self.initUI()

    def initUI(self):
        self.setFixedSize(400, 300)
        self.setWindowTitle('Fetch hot data from multiple websites with Python')

        self.url_label = QLabel('Select a website:', self)

        self.url_combo_box = QComboBox(self)
        # Baidu, Zhihu, Weibo; get_hot_data() compares against these exact strings
        self.url_combo_box.addItems(['百度', '知乎', '微博'])

        self.hot_button = QPushButton('→ Go', self)
        self.hot_button.clicked.connect(self.get_hot_data)

        self.hot_label = QLabel('Current hot topics:', self)

        self.hot_text_edit = QTextEdit(self)
        self.hot_text_edit.setReadOnly(True)

        hbox = QHBoxLayout()
        hbox.addWidget(self.url_label)
        hbox.addWidget(self.url_combo_box)
        hbox.addWidget(self.hot_button)

        vbox = QVBoxLayout()
        vbox.addLayout(hbox)
        vbox.addWidget(self.hot_label)
        vbox.addWidget(self.hot_text_edit)

        self.setLayout(vbox)

In this class, we use widgets provided by PyQt5 to build the GUI window. The prompt label, the drop-down box, and the button are placed in a horizontal layout. That horizontal layout is then nested in a vertical layout, above the label and the read-only text box that display the fetched hot data.

Finally, we need to define a method, get_hot_data(), in the HotWindow class. It fetches the hot data of the selected website and displays it in the window.

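    def get_hot_data(self):
        # Name of the site currently selected in the drop-down box
        selected_url = self.url_combo_box.currentText()

        # Map the selected site to the URL of its trending page
        if selected_url == '百度':
            url = 'https://top.baidu.com/buzz?b=1&fr=topindex'
        elif selected_url == '知乎':
            url = 'https://www.zhihu.com/hot'
        elif selected_url == '微博':
            url = 'https://s.weibo.com/top/summary?cate=realtimehot'
        else:
            return  # unknown selection: nothing to fetch

        hot_items = get_data(url)

        # Replace the previous results with a numbered list
        self.hot_text_edit.clear()
        for i, item in enumerate(hot_items, start=1):
            self.hot_text_edit.append(f'{i}. {item}')


# Entry point, not shown in the original listing: start the application and show the window
if __name__ == '__main__':
    app = QApplication(sys.argv)
    window = HotWindow()
    window.show()
    sys.exit(app.exec_())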

In this method, we first read the website that the user selected in the drop-down box. Based on that selection, we determine which URL to crawl. Next, we call the previously defined get_data() function to fetch the hot data from that page.

Finally, we clear the text box in the window and append the fetched items to it as a numbered list.
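The if/elif chain in get_hot_data() could also be written as a dictionary lookup, and the numbering loop as a small helper that can be checked without opening a window. The following is only a sketch, not the original author's code: HOT_URLS, resolve_url, and format_hot_items are hypothetical names, and the strip() call is an added assumption that scraped titles may carry surrounding whitespace.

```python
# Site names and URLs copied from get_hot_data().
HOT_URLS = {
    '百度': 'https://top.baidu.com/buzz?b=1&fr=topindex',
    '知乎': 'https://www.zhihu.com/hot',
    '微博': 'https://s.weibo.com/top/summary?cate=realtimehot',
}


def resolve_url(site_name):
    """Return the URL for a site name, or None if the name is unknown."""
    return HOT_URLS.get(site_name)


def format_hot_items(hot_items):
    """Number the items from 1 and trim whitespace around each title."""
    return [
        f'{rank}. {title.strip()}'
        for rank, title in enumerate(hot_items, start=1)
    ]


print(resolve_url('知乎'))                       # https://www.zhihu.com/hot
print(format_hot_items([' Topic A ', 'Topic B']))  # ['1. Topic A', '2. Topic B']
```

With this layout, supporting another website means adding one dictionary entry, and the drop-down box could be filled from the dictionary keys so that the two lists cannot drift apart.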

In this post, we wrote a Python program that fetches the hot data of several popular websites and displays it in a GUI window. The example shows how to use the requests and lxml libraries to download and parse web pages, and how to use the PyQt5 library to build a GUI window that displays the results. It also shows how little code Python needs for this kind of task. I hope this post is helpful to you. Thank you for reading!

It’s still your Xiao Xiao!

Origin blog.csdn.net/m0_55813592/article/details/130230126