Crawling the daily forecast results of the Central Meteorological Observatory with Python

1. Data website introduction

The website of the Central Meteorological Observatory (www.nmc.cn) provides forecasts for 1 to 7 days ahead for each meteorological element; precipitation is used as the example here. Based on weather observation data and numerical models, the site publishes precipitation forecasts for all regions of the country for the coming days. Both individuals and professionals can use these forecasts to plan daily life or work.

2. Python crawling code

from selenium import webdriver  
from selenium.webdriver.common.action_chains import ActionChains
# from selenium.webdriver.support.ui import Select
# from selenium.webdriver.chrome.service import Service
import time
import os
import urllib.request
import datetime


# import sys
# sys.path.append("D:\ProgramFiles\Anoconda3\envs\pyht\Lib\site-packages/")

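"""
version:
selenium = 3.141.0
The find_element_by_* helpers and the executable_path argument used below
belong to the Selenium 3 API and were removed in later Selenium 4 releases.
"""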


def get_picture():
    level = ['24小时']  # link text of the tab to click ("24 hours")

    save_map_path = "D:/tmp/中央气象台-降水预报/"
    save_git_path = r"D:\My_Document\GitHub\HthtPre.github.io\Forecast/"

    # chrome_driver = 'D:/Browser_Download/chromedriver_win32/chromedriver.exe'  # location of the chromedriver executable
    chrome_driver = r"D:\Browser_Download\chromedriver_win32113.0.5672.63\chromedriver.exe"
    driver = webdriver.Chrome(executable_path=chrome_driver)  # start the browser driver
    # driver = webdriver.Chrome(service = Service("D:\python\chubanshe\geckodriver.exe"))  # start the browser driver
    # driver.get('http://www.nmc.cn/publish/precipitation/1-day.html')  # open the page
    driver.get('http://www.nmc.cn/publish/precipitation/1-7day-precipitation.html')
    time.sleep(3)
    # Simulate a mouse click to select the level (here the '24小时' tab)
    try:
        for z in level:
            button1 = driver.find_element_by_link_text(z)  # locate the element by its exact link text
            action = ActionChains(driver).move_to_element(button1)  # hover over the element
            action.click(button1).perform()  # left-click
            # time.sleep(1)
            for p in range(7):  # download the charts for the 7 forecast times
                str_p = str(p)
                if p > 0:
                    # Simulate a mouse click to move to the next forecast time
                    button2 = driver.find_element_by_id('next')  # locate the element by id
                    action = ActionChains(driver).move_to_element(button2)  # hover over the element
                    action.click(button2).perform()  # left-click
                    time.sleep(3)
                # Locate the forecast image
                elem_pic = driver.find_element_by_id('imgpath')  # locate the element by id
                action = ActionChains(driver).move_to_element(elem_pic)
                # action.context_click(elem_pic).perform()              # right-click
                filename0 = str(elem_pic.get_attribute('src')).split('/')[-1].split('?')[0]  # file name without the query string
                # Assumed layout: [36:44] is the date (YYYYMMDD), [48:51] the lead time in hours
                filename  = filename0[36:44] + '-day' + str(int(filename0[48:51])//24) + filename0[-4:]
                # print(filename)
                # Get the image src
                src1 = elem_pic.get_attribute('src')
                save_path = save_map_path + '/' + filename0[36:40] + '/' + filename0[40:44] + '/'
                save_git  = save_git_path + '/' + filename0[36:40] + '/' + filename0[40:44] + '/CMA/'
                # Folder for the 7-day forecast results
                os.makedirs(save_path, exist_ok=True)
                # Folder for the 3-day forecast results uploaded to git
                os.makedirs(save_git, exist_ok=True)

                urllib.request.urlretrieve(src1, save_path + filename)
                # if p<3:
                #     urllib.request.urlretrieve(src1, save_git + filename)
                print(filename)
                time.sleep(3)
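    except Exception as exc:
        # Report the cause instead of silently swallowing it
        print("Failed to crawl the results:", exc)
    finally:
        # Always close the browser, even after an error
        driver.quit()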

if __name__ == "__main__":
    get_picture()
    time.sleep(10)
    # Check whether the crawl result is complete
    save_path = "D:/tmp/中央气象台-降水预报/"
    time_tick = datetime.datetime.now().strftime("%Y%m%d")
    year = time_tick[0:4]
    mon  = time_tick[4:6]
    day  = time_tick[6:8]
    pic_path = os.path.join(save_path+year, mon+day)
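    # The folder may not exist yet if the first crawl failed
    pic_list = os.listdir(pic_path) if os.path.isdir(pic_path) else []
    # If the result does not contain 7 days, crawl again, up to three attempts in total
    flag = 1
    while (len(pic_list) != 7) and (flag < 3):
        time.sleep(60)
        get_picture()
        flag += 1
        pic_list = os.listdir(pic_path) if os.path.isdir(pic_path) else []
    if len(pic_list) != 7:
        print("Tried to crawl three times, but all attempts failed")
    else:
        print("Crawl succeeded, images saved!")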
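The slice indices used in get_picture() (36:44 and 48:51) only work for one particular file-name layout. The sketch below spells that assumption out as a small helper. The names `ForecastName`, `parse_forecast_name` and `local_name` are introduced here for illustration, and the sample name is hypothetical: it is constructed so that the date starts at index 36 and a three-digit lead time in hours starts at index 48. The names actually served by the site may differ.

```python
from typing import NamedTuple


class ForecastName(NamedTuple):
    date: str       # issue date as YYYYMMDD
    lead_day: int   # forecast lead time in whole days
    ext: str        # file extension including the dot


def parse_forecast_name(src: str) -> ForecastName:
    """Split an image URL or file name the same way get_picture() does."""
    # Drop the directory part and any query string such as "?v=1".
    name = src.split('/')[-1].split('?')[0]
    date = name[36:44]          # assumed position of YYYYMMDD
    lead_hours = name[48:51]    # assumed position of the lead time in hours
    if len(date) != 8 or not date.isdigit() or not lead_hours.isdigit():
        raise ValueError(f"unexpected file name layout: {name!r}")
    return ForecastName(date, int(lead_hours) // 24, name[-4:])


def local_name(src: str) -> str:
    """Build the name used on disk: '<date>-day<N><ext>'."""
    info = parse_forecast_name(src)
    return f"{info.date}-day{info.lead_day}{info.ext}"


# Hypothetical sample; only the character positions matter here.
sample = ("http://example.invalid/"
          "SEVP_NMC_STFC_SFER_ER24_ACHN_L88_P9_20230629000004800.png?v=1")
print(local_name(sample))  # prints 20230629-day2.png
```

Raising an error on an unexpected layout makes a change in the site's naming scheme visible immediately, instead of producing oddly named files.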

3. Windows scheduled tasks

Follow the steps below to create a scheduled task in Windows. As long as the computer stays switched on and online, the script then fetches the forecast results automatically every day.

  • Open Task Scheduler: Search for "Task Scheduler" in the Start menu and open it.
  • Create a new task: In the Task Scheduler window, click "Create Basic Task" to start the task creation wizard.
  • Name and describe the task: On the first page, enter a name and description for the task, then click "Next".
  • Set the trigger: On the next page, select when the task should run, such as "Daily", "Weekly" or "Monthly" (choose "Daily" for a daily crawl), and click "Next".
  • Set the action: On the next page, select the action to perform, such as starting a program (here, the Python script), and then click "Next".
  • Configure the task: On the next page, set additional options as required, such as whether to display a window on start or whether to require administrator rights, and then click "Next".
  • Finish setup: On the last page, confirm the task configuration and click "Finish".
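The same daily task can also be registered from a script instead of the wizard. The following is a minimal sketch that builds a `schtasks /Create` command and passes it to `subprocess`. The function names, the task name, the script path and the 08:00 start time are placeholders chosen for illustration, not values from the setup described above.

```python
import subprocess
import sys


def build_schtasks_command(task_name, script_path, start_time="08:00",
                           python_exe=sys.executable):
    """Return the argument list for a daily Task Scheduler entry."""
    # /TR takes a single string, so interpreter and script are quoted together.
    run_target = f'"{python_exe}" "{script_path}"'
    return [
        "schtasks", "/Create",
        "/SC", "DAILY",        # trigger: once per day
        "/TN", task_name,      # name shown in Task Scheduler
        "/TR", run_target,     # program to start
        "/ST", start_time,     # start time as HH:mm (24-hour clock)
        "/F",                  # overwrite an existing task with the same name
    ]


def register_daily_task(task_name, script_path, start_time="08:00"):
    """Create the task; only meaningful on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("schtasks is only available on Windows")
    cmd = build_schtasks_command(task_name, script_path, start_time)
    subprocess.run(cmd, check=True)


# Example with a placeholder task name and script path:
# register_daily_task("CrawlNmcForecast", r"D:\tmp\crawl_nmc.py")
```

Keeping the command construction separate from the call that runs it makes it easy to print and inspect the command before anything is registered.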

[Figure: Scheduled task settings]
[Figure: Obtained daily forecast results]

Origin blog.csdn.net/qq_38734327/article/details/131454895