Building an intelligent robot with Python (blog series): an automatic weather query system using web crawler technology


Author: qyan.li

Date: 2022.5.29

Topic: Implementing a weather query system in Python with the help of a web crawler


1. Preface:

Recently, one of my courses required a simple human-computer interaction system. I had already implemented a Python-based intelligent chat robot; see the earlier post: Intelligent chat robot based on "machine learning": Python implementation (1) (Neighbor Li Xuechang's blog, CSDN). So this time I plan to build a similar interactive system, but with the focus shifted to functionality: a functional robot written in Python that can understand commands from user input and carry out the corresponding operations, rather than just chatting.

I will publish this system as a series. This post covers the weather query part of the system; the remaining parts will follow in later posts, including:

  • Automatically send WeChat or QQ messages
  • Automatically play specified music and retrieve web content
  • Automatically send email

2. Weather query system

The principle behind the weather query system is fairly simple and relies mainly on web crawler techniques: build the page URL from the city name, parse the source of the returned page, extract the weather information for the target city, and return it. That completes the weather query.
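The first step, building the URL from the city name, can be sketched on its own. This is only an illustration: `build_weather_url` is a hypothetical helper that does not appear in the original code, and it assumes the site keeps the `https://www.tianqi.com/<city>/` layout used in this post.

```python
from urllib.parse import quote

# Base address taken from the class in this post
BASE_URL = "https://www.tianqi.com/"


def build_weather_url(city_slug: str) -> str:
    """Return the page URL for a city given its pinyin name (hypothetical helper)."""
    # Normalize the input: drop surrounding spaces and slashes, use lower case
    slug = city_slug.strip().strip("/").lower()
    if not slug:
        raise ValueError("city name must not be empty")
    # quote() percent-encodes any character that is not URL-safe
    return BASE_URL + quote(slug) + "/"


print(build_weather_url("beijing"))  # https://www.tianqi.com/beijing/
```

Normalizing the name before concatenation avoids malformed addresses such as a doubled slash when the input already ends in `/`.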

The core of the weather query code is parsing the page content. This article does not explain the crawler code in detail. On the one hand, I have been reviewing for exams recently and time is tight; on the other hand, my own knowledge of crawlers is still incomplete and I do not want to mislead readers. Once I have a better grasp of the subject, I may publish several posts introducing crawler techniques.

As usual, here is the code first, for reference and reuse:

# -*- coding: utf-8 -*-

# Author: qyan.li
# Date: 2022/5/12 18:30
# Topic: Weather information returned by a crawler (class implementation)

import urllib.request,urllib.error
from bs4 import BeautifulSoup
import re

class WeatherInfo():
    def __init__(self,cityName):
        self.city = cityName
        self.url = r'https://www.tianqi.com/' + str(cityName) + r'/'
        self.findTime = re.compile(r'<dd class="week">(.*?)</dd>')
        self.html = None
        self.WeatherInformation = ''


    def askURL(self):
        # Mimic a browser's request headers so the site does not identify the program as a crawler
        # User-Agent tells the server what kind of client is making the request
        head = {
            # The header name must be written exactly: no extra spaces, correct capitalization
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:100.0) Gecko/20100101 Firefox/100.0'
        }
        request = urllib.request.Request(self.url, headers=head)  # attach the headers to the request
        html = ''
        try:
            responseInfo = urllib.request.urlopen(request)  # responseInfo holds the server's response
            html = responseInfo.read().decode('utf-8')  # decode the bytes as UTF-8 to avoid garbled text
            # print(html)
        except urllib.error.URLError as e:
            if hasattr(e, 'code'):
                print(e.code)
            if hasattr(e, 'reason'):
                print(e.reason)
        self.html = html


    def getData(self):
        # Nothing to parse if the request in askURL() failed
        if not self.html:
            return
        soup = BeautifulSoup(self.html, 'lxml')
        item = soup.find('div', class_="left")
        # Locate the humidity, weather and air-quality nodes
        ShiDuItem = item.find('dd', class_='shidu')
        WeatherItem = item.find('dd', class_='weather')
        AirItem = item.find('dd', class_='kongqi')
        item = str(item)
        # Get the date: the text before the first ideographic space (\u3000)
        Time = str(re.findall(self.findTime, item)[0]).split('\u3000')[0]
        # print(Time)
        # Get the humidity information
        ShiduInfo = ''
        for tag in ShiDuItem.find_all('b'):
            ShiduInfo = ShiduInfo + str(tag.string)
            ShiduInfo = ShiduInfo + ' '
        # Get the weather information ('摄氏度' means "degrees Celsius")
        temperature = WeatherItem.find('p', class_='now').find('b').string + '摄氏度'
        condition = WeatherItem.find('span').find('b').string
        TempCondition = temperature + condition
        # Get the air-quality information
        AirCondition = AirItem.find('h5').string
        PM = AirItem.find('h6').string
        AirPM = AirCondition + PM

        # Assemble the final string ('温度' means "temperature")
        self.WeatherInformation = Time + ' ' + ShiduInfo + '温度' + TempCondition + AirPM

    def startWeather(self):
        self.askURL()
        self.getData()


if __name__ == '__main__':
    WeatherItem = WeatherInfo('beijing')
    WeatherItem.startWeather()
    print(WeatherItem.WeatherInformation)
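To see what the date-extraction step in getData() does without touching the network, the same regular expression can be run against a static string. This is a minimal sketch: `sample_html` is a made-up fragment for illustration (the live page may look different), and `extract_date` is a hypothetical helper, not part of the class above.

```python
import re

# Same pattern as self.findTime in the class above
find_time = re.compile(r'<dd class="week">(.*?)</dd>')

# Made-up fragment for illustration only; the real page markup may differ
sample_html = '<dl><dd class="week">2022年05月29日\u3000星期日\u3000壬寅年四月廿九</dd></dl>'


def extract_date(html: str):
    """Return the text before the first ideographic space, or None if no match."""
    matches = find_time.findall(html)
    if not matches:
        return None
    # The fields are separated by U+3000; the first field is the date
    return matches[0].split('\u3000')[0]


print(extract_date(sample_html))               # 2022年05月29日
print(extract_date('<p>no weather here</p>'))  # None
```

Returning None when nothing matches avoids the IndexError that indexing `[0]` on an empty result list would raise if the page layout changes.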

A few points about the code are worth noting:

  • The content extracted in getData() can be changed; adjust which fields are collected to suit your needs.

  • This code only retrieves the target city's weather for the current day. To query other dates, you will need to modify the code yourself.

  • To support voice interaction later on, the city name produced by speech recognition must be converted to pinyin before the URL is built. The Python module used for this is xpinyin:

    from xpinyin import Pinyin
    
    P = Pinyin()
    cityName = '北京'
    cityName_pinyin = P.get_pinyin(cityName,'')
    print(cityName_pinyin)
    
  • To call the code, place the file (WeatherClass.py) in the target folder and import the class into your own file:

    from WeatherClass import WeatherInfo

    After that, the weather query class can be called directly in the new file.
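The pinyin conversion mentioned in the points above can be combined with URL construction in one small function. This is a rough sketch under stated assumptions: `city_to_pinyin`, `weather_url_for` and the `FALLBACK_PINYIN` table are hypothetical names introduced here, and the fallback table exists only so the example still runs when xpinyin is not installed.

```python
# Hypothetical fallback table, used only when xpinyin is not installed
FALLBACK_PINYIN = {"北京": "beijing", "上海": "shanghai"}


def city_to_pinyin(city_name: str) -> str:
    """Convert a Chinese city name to the pinyin form used in the URL."""
    try:
        from xpinyin import Pinyin
    except ImportError:
        # xpinyin is unavailable: look the name up in the small table above
        return FALLBACK_PINYIN[city_name]
    # An empty splitter joins the syllables without a separator
    return Pinyin().get_pinyin(city_name, "")


def weather_url_for(city_name: str) -> str:
    """Build the page URL for a city given its Chinese name."""
    return "https://www.tianqi.com/" + city_to_pinyin(city_name) + "/"


print(weather_url_for("北京"))  # https://www.tianqi.com/beijing/
```

The resulting pinyin string is what would be passed to WeatherInfo() in place of the hard-coded 'beijing'.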

3. Summary:

This post implements a simple weather query that can serve as a small functional module in the intelligent robot system to come. It is also a chance to practice writing Python and working with web crawlers.

The post on automatic retrieval of web content with Python has been published; reference link:
https://blog.csdn.net/DALEONE/article/details/125196888?spm=1001.2014.3001.5501


Origin blog.csdn.net/DALEONE/article/details/125036611