Python extra: using selenium + matplotlib to crawl the weather forecast and draw a temperature line chart

Hello, everyone, my name is wangzirui32. Today we will learn how to use selenium + matplotlib to crawl the weather forecast and draw a line chart of the upcoming temperatures.
Let's take a look at the finished product first:
[Image: the finished temperature line chart]
Well, having seen the result, I believe you will be more motivated to learn!
Let's get started!

1. Crawl weather data and store it in a json file

1.2 Data sources

The data source used here is the weather forecast provided by China Weather Network ( www.weather.com.cn ).

1.3 Analyze the weather forecast webpage

I take Taizhou City as an example. The weather forecast page for the next 7 days (we actually crawl only the next 6 days, leaving out today's weather data) is http://www.weather.com.cn/weather/101191201.shtml. Here 101191201 is the city number, so the forecast for a different city can be obtained by changing the city number.
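Since only the city number changes in the URL, building it can be wrapped in a small function. This is a minimal sketch: build_forecast_url and BASE_URL are hypothetical names chosen for illustration, and the URL pattern is simply the one shown above.

```python
# URL prefix taken from the forecast page address shown above.
BASE_URL = "http://www.weather.com.cn/weather/"


def build_forecast_url(city_id):
    """Return the 7-day forecast URL for a city number (hypothetical helper)."""
    return f"{BASE_URL}{city_id}.shtml"


# 101191201 is the Taizhou city number used in this article.
print(build_forecast_url(101191201))
# prints http://www.weather.com.cn/weather/101191201.shtml
```

To crawl another city, pass its city number instead; the numbers themselves have to be looked up on the site.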
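Start the analysis:
[Image: the ul tag]
It can be seen that each li tag in the ul holds the weather data for one day. Continue the analysis:
[Image: the li tag]
Inside each li, the p tag with class tem holds the temperatures: the crawling code below reads the high from its span tag and the low from its i tag.
OK, now that the analysis is complete, let's start writing the crawling code.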

1.4 Writing crawling code

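from datetime import datetime, timedelta
from json import dump

from selenium.webdriver import Firefox
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.service import Service

# City number
city_id = 101191201

# Build the URL
url = "http://www.weather.com.cn/weather/" + str(city_id) + ".shtml"

# Set executable_path to the path of the browser driver on your computer.
# Selenium 4 takes it through a Service object rather than a Firefox() argument.
driver = Firefox(service=Service(executable_path="geckodriver.exe"))

driver.get(url)

# Find the ul tag analyzed earlier
ul = driver.find_element(By.XPATH, "//ul[@class='t clearfix']")
# Get all the li tags inside the ul
li_list = ul.find_elements(By.TAG_NAME, "li")
# Delete the first item, which is today's data
del li_list[0]

# List that stores all the temperature data we crawl
temperature_data_list = []

# The first remaining item is the forecast for 1 day from now
day = 1

for li in li_list:
    # Find the p tag analyzed earlier
    p = li.find_element(By.CSS_SELECTOR, "p.tem")

    # Add `day` days to today's date and store the formatted result in `date`
    date = (datetime.now() + timedelta(days=day)).strftime("%Y-%m-%d")

    # Lowest and highest temperature
    # Note: rstrip("℃") removes the ℃ sign at the end of the text, if present
    low_temperature = int(p.find_element(By.TAG_NAME, "i").text.rstrip("℃"))
    high_temperature = int(p.find_element(By.TAG_NAME, "span").text.rstrip("℃"))

    # Build the data dictionary
    dict_temperature = {"date": date,
                        "low": low_temperature,
                        "high": high_temperature}
    # Store it
    temperature_data_list.append(dict_temperature)

    # Move on to the next day
    day += 1

# Close the browser once crawling is finished
driver.quit()

with open("temperature.json", "w") as t:
    # Save the data as JSON
    dump(temperature_data_list, t)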

Run the code to store the weather data in temperature.json.
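The text cleaning and date labelling done inside the crawling loop can also be tried without a browser. The sketch below is only an illustration: parse_temperature and build_records are hypothetical helpers, the sample values are made up, and the first pair is assumed to be tomorrow's forecast, as in the crawler, which skips today's entry.

```python
from datetime import date, timedelta


def parse_temperature(text):
    """Convert text such as "3℃" or "-1℃" to an int (hypothetical helper)."""
    return int(text.strip().rstrip("℃"))


def build_records(pairs, today):
    """Turn (low_text, high_text) pairs into dated records.

    The first pair is assumed to be tomorrow's forecast, so the
    day offset starts at 1.
    """
    records = []
    for offset, (low_text, high_text) in enumerate(pairs, start=1):
        records.append({
            "date": (today + timedelta(days=offset)).strftime("%Y-%m-%d"),
            "low": parse_temperature(low_text),
            "high": parse_temperature(high_text),
        })
    return records


# Made-up sample values, not real forecast data.
sample = [("3℃", "10℃"), ("-1℃", "6℃")]
records = build_records(sample, date(2021, 2, 20))
```

Each resulting record has the same date, low and high keys that the crawler writes to temperature.json.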

2. Read the data and use matplotlib to make a chart

Now we load the data from temperature.json and draw a chart with matplotlib:

import matplotlib.pyplot as plt
from json import load
from datetime import datetime

# matplotlib's default font cannot display Chinese, so "泰州" is written as "TaiZhou"
city_name = "TaiZhou"

with open("temperature.json") as t:
    data = load(t)

# Prepare 3 lists to hold the data
date, high, low = [], [], []

for d in data:
    date.append(datetime.strptime(d["date"], '%Y-%m-%d'))
    high.append(int(d["high"]))
    low.append(int(d["low"]))

# Draw the line chart of Taizhou's temperatures for the week
# The highest temperature is the red line, the lowest is the blue line
plt.plot(date, high, c='red', linewidth=2)
plt.plot(date, low, c='blue', linewidth=2)
# Fill the area between the red and blue lines with a light blue
plt.fill_between(date, high, low, facecolor='blue', alpha=0.1)
# Set the title and axis labels
plt.title("A week's temperature in " + city_name, fontsize=24)
plt.xlabel("Date\nData Source:www.weather.com.cn", fontsize=15)
plt.ylabel("High and low temperature (°C)", fontsize=15)
# Set the tick label size
plt.tick_params(axis='both', labelsize=10)
# Save the chart
plt.savefig("A week's temperature in " + city_name + ".png", bbox_inches='tight')
# Show it to the user
plt.show()
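Before plotting, it can help to check the crawled records, since a swapped or mis-parsed pair would produce a misleading chart. This is a minimal sketch, assuming the record layout written by the crawler (date, low and high keys); split_series is a hypothetical helper name and is not part of the original script.

```python
from datetime import datetime


def split_series(data):
    """Split crawled records into (dates, highs, lows) lists.

    Raises ValueError if a record has a high below its low.
    """
    dates, highs, lows = [], [], []
    for record in data:
        high, low = int(record["high"]), int(record["low"])
        if high < low:
            raise ValueError(f"high < low on {record['date']}")
        dates.append(datetime.strptime(record["date"], "%Y-%m-%d"))
        highs.append(high)
        lows.append(low)
    return dates, highs, lows
```

It would be called on the list returned by load(t), and the three lists it returns can be passed to plt.plot and plt.fill_between in place of date, high and low.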

Final words

With this program as a starting point, you can also try something more ambitious, such as crawling the temperature data for the next 40 days. The author will not demonstrate that here; you can try it yourself!


Okay, that's all for today. If you found it interesting, feel free to like and bookmark this post. Thank you!

Origin blog.csdn.net/wangzirui32/article/details/113871177