Building an Intelligent Robot with Python (Series): An Automatic Weather Query System Based on Web Scraping

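Recently, one of my courses required a simple human-computer interaction system. I had previously built a Python-based intelligent chatbot (reference: Intelligent Chatbot Based on "Machine Learning": Python Implementation (1), on the CSDN blog 隔壁李学长), so this time I plan to build a similar interactive system, but with the focus shifted to functionality: a Python robot that understands commands from user input and performs the corresponding operations, rather than only chatting.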

I will publish this system as a series. This post covers the weather query system; later posts will cover the remaining parts, including:

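Automatically sending WeChat or QQ messages
Automatically playing specified music and automatically retrieving web content
Automatically sending email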

II. Weather Query System

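The principle behind the weather query system is simple and relies mainly on web scraping: build a URL from the city name, parse the page source, extract the weather information for the target city, and return it.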
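
The URL-construction step can be sketched in isolation. This is a minimal sketch that assumes the https://www.tianqi.com/<city>/ pattern used by the code in this post; build_weather_url is a hypothetical helper name, and lowercasing the city slug is my own assumption about what the site expects.

```python
from urllib.parse import quote

BASE_URL = 'https://www.tianqi.com/'  # same base address the WeatherInfo class uses


def build_weather_url(city_slug):
    """Return the page URL for a city given its pinyin slug, e.g. 'beijing'."""
    # Hypothetical helper: trim whitespace, lowercase (assumption),
    # and percent-escape any characters that are unsafe in a URL path.
    slug = quote(str(city_slug).strip().lower())
    return BASE_URL + slug + '/'


print(build_weather_url('beijing'))
```

Keeping this step in its own function makes it easy to check the generated address before any network request is sent.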

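The bulk of the weather query code is the parsing of page content. This post does not explain the scraping code in detail: I am short on time while reviewing for exams, and my own understanding of scraping is not deep enough, so I would rather not mislead anyone. Once I have filled in the gaps, I may write a few posts introducing scraping techniques.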

As usual, here is the code first, for reference and reuse:

# -*- coding: utf-8 -*-

# Author:qyan.li
# Date:2022/5/12 18:30
# Topic: return weather information via a web scraper (class implementation)

import urllib.request,urllib.error
from bs4 import BeautifulSoup
import re

class WeatherInfo():
    def __init__(self,cityName):
        self.city = cityName
        self.url = r'https://www.tianqi.com/' + str(cityName) + r'/'
        self.findTime = re.compile(r'<dd class="week">(.*?)</dd>')
        self.html = None
        self.WeatherInformation = ''


    def askURL(self):
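        # Mimic a browser's request headers so the request is less likely to be identified as a crawler
        # The User-Agent header tells the server which client is making the request
        head = {
            # Write the header name exactly as shown, with no stray spaces
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:100.0) Gecko/20100101 Firefox/100.0'
        }
        request = urllib.request.Request(self.url, headers=head)  # attach the headers to the request
        html = ''
        try:
            responseInfo = urllib.request.urlopen(request)  # response object for the requested page
            html = responseInfo.read().decode('utf-8')  # decode the raw bytes as UTF-8
            # print(html)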
        except urllib.error.URLError as e:
            if hasattr(e, 'code'):
                print(e.code)
            if hasattr(e, 'reason'):
                print(e.reason)
        self.html = html


    def getData(self):
        soup = BeautifulSoup(self.html, 'lxml')
        item = soup.find('div', class_="left")
        # Locate the humidity, weather, and air-quality nodes
        ShiDuItem = item.find('dd', class_='shidu')
        WeatherItem = item.find('dd', class_='weather')
        AirItem = item.find('dd', class_='kongqi')
        item = str(item)
        # Extract the date information
        Time = str(re.findall(self.findTime, item)[0]).split('\u3000')[0]
        # print(Time)
        # Extract the humidity information
        ShiduInfo = ''
        for bTag in ShiDuItem.find_all('b'):  # renamed so the loop no longer shadows item
            ShiduInfo = ShiduInfo + str(bTag.string)
            ShiduInfo = ShiduInfo + ' '
        # Extract the temperature and weather condition ('摄氏度' means "degrees Celsius")
        temperature = WeatherItem.find('p', class_='now').find('b').string + '摄氏度'
        condition = WeatherItem.find('span').find('b').string
        TempCondition = temperature + condition
        # Extract the air-quality information
        AirCondition = AirItem.find('h5').string
        PM = AirItem.find('h6').string
        AirPM = AirCondition + PM

        # Assemble the final report ('温度' means "temperature")
        self.WeatherInformation = Time + ' ' + ShiduInfo + '温度' + TempCondition + AirPM

    def startWeather(self):
        self.askURL()
        self.getData()


if __name__ == '__main__':
    WeatherItem = WeatherInfo('beijing')
    WeatherItem.startWeather()
    print(WeatherItem.WeatherInformation)
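
The date extraction in getData() combines a regular expression with a split on the full-width space '\u3000'. The sketch below runs the same two steps on a hand-written HTML fragment. The fragment's contents are an assumption made for illustration and may not match what the live page returns; the empty-result guard is an addition of mine, not part of the original class.

```python
import re

# Same pattern as self.findTime in the WeatherInfo class.
find_time = re.compile(r'<dd class="week">(.*?)</dd>')

# Hypothetical fragment: the real page content may differ.
sample_html = '<dd class="week">2022年05月12日\u3000星期四\u3000壬寅年四月十二</dd>'

matches = find_time.findall(sample_html)
# Guard against an empty result instead of indexing [0] blindly.
date_text = matches[0].split('\u3000')[0] if matches else ''
print(date_text)
```

If the page layout changes and the pattern stops matching, the guarded version yields an empty string rather than raising an IndexError.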

A few important points about the code:

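The fields collected in getData() can be changed; adjust them to fetch whatever content you need.
This code only retrieves the current day's weather for the target city; to query other dates, modify the code accordingly.
To support voice interaction later, the city name produced by speech recognition must be converted to pinyin to build the URL, which requires Python's xpinyin module: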
```
from xpinyin import Pinyin

P = Pinyin()
cityName = '北京'
cityName_pinyin = P.get_pinyin(cityName,'')
print(cityName_pinyin)
```
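To use the code, place the file in the target folder and import the WeatherInfo class into your own .py file (the import below assumes the code above is saved as WeatherClass.py).

With from WeatherClass import WeatherInfo, the weather query class can be called directly from the new file.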
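
For the pinyin step mentioned in the notes above, one possible wrapper converts a name only when it contains non-ASCII characters, so that input such as 'beijing' passes through unchanged. This is only a sketch: city_to_slug and its to_pinyin parameter are hypothetical names, and the converter is passed in as an argument so the function does not depend on any particular library.

```python
def city_to_slug(city_name, to_pinyin):
    """Return a URL-ready city slug.

    to_pinyin is any callable that maps a Chinese name to pinyin
    (hypothetical parameter; supply your own converter).
    """
    name = city_name.strip()
    if name.isascii():
        # Already Latin letters: only normalize the case.
        return name.lower()
    return to_pinyin(name).lower()


# Stand-in converter for demonstration only; replace it with a real one.
demo_table = {'北京': 'beijing', '上海': 'shanghai'}
print(city_to_slug('北京', demo_table.get))
print(city_to_slug(' Beijing ', demo_table.get))
```

With xpinyin installed, a converter such as lambda s: Pinyin().get_pinyin(s, '') could be passed as to_pinyin, mirroring the xpinyin snippet shown earlier.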

III. Summary

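This post implements a simple weather query that can serve as a small functional module in the later intelligent robot system. It is also good practice for writing Python code and for web scraping.
The post on automatic web content retrieval with Python has been published; see: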
https://blog.csdn.net/DALEONE/article/details/125196888?spm=1001.2014.3001.5501