Python data analysis - PMI data graphic display


A modest gentleman, use Dachuan.

Foreword

Earlier articles drew the graphs of ppi-cpi and m0-m1-m2. In this article, I continue with PMI, an indicator that reflects the prosperity of economic activity. We again use a crawler to obtain the data, and then use matplotlib to display the PMI data year by year. Beginners will learn the basics of python, crawlers, and graphic drawing.

PMI data acquisition

Before obtaining the data, a word on what PMI (purchasing managers' index) means. Manufacturing is the foundation of a country's economy, and PMI is an indicator that measures how a country's manufacturing industry is developing and operating. 50% is the dividing line between economic strength and weakness: a reading above 50% means the manufacturing industry is expanding, 40-50 means recession, and below 40 means depression.
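
The bands above can be written as a small helper. This is only a sketch: the function name `classify_pmi` and the handling of readings exactly at 50 and 40 are assumptions, since the text does not say which side the boundary values fall on.

```python
def classify_pmi(value: float) -> str:
    """Map a PMI reading to the bands described in the text."""
    # Above the 50 dividing line: manufacturing is expanding.
    if value > 50:
        return "expansion"
    # 40-50 band. Treating exactly 50 and exactly 40 as part of this
    # band is an assumption; the article does not specify the boundaries.
    if value >= 40:
        return "recession"
    # Below 40.
    return "depression"


for reading in (51.2, 45.0, 38.5):
    print(reading, classify_pmi(reading))
```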

To obtain data, you need an authoritative source. Here I use the data published by Eastmoney (Oriental Fortune); the address of the page is:

# PMI data page address
https://data.eastmoney.com/cjsj/pmi.html

The data source for the purchasing managers' index is shown in the figure below. Only the index values for the manufacturing and non-manufacturing industries are collected here; the year-on-year growth figures are not.

Now that you know where the PMI data comes from, how do you get it? Copying the page into excel for analysis would be time-consuming and labor-intensive. You may have noticed the pagination below the table, which means the page must be talking to the backend through ajax. Watching those requests reveals the following interface; the data it returns is shown in the following figure:

# Purchasing managers' index (PMI)
https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=200&mkt=21

# The money supply, ppi, and cpi interfaces from the earlier articles are listed here too; they are all the same, only the mkt parameter differs
# Money supply interface
https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=200&mkt=11
# ppi data and cpi data
https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=10&mkt=22
https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=10&mkt=19
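
Since the interfaces differ mainly in `mkt`, the address can be assembled from its parameters. The sketch below is illustrative only: `build_url` is a hypothetical helper, and reading `p` as the page number and `ps` as the page size is an assumption based on the parameter names, not something the site documents.

```python
from urllib.parse import urlencode

BASE_URL = "https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx"


def build_url(mkt: int, page: int = 1, page_size: int = 200) -> str:
    """Assemble an interface address for the given mkt value."""
    params = {
        "type": "GJZB",
        "sty": "ZGZB",
        "p": page,        # assumed: page number
        "ps": page_size,  # assumed: records per page
        "mkt": mkt,       # 21 = PMI in the addresses listed above
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_url(21))
```

Calling `build_url(21)` reproduces the PMI address listed above, and `build_url(11)` the money supply one.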

The data is fetched the same way as before, with python, using requests:

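    import requests

    # PMI interface found above (mkt=21)
    req_url = "https://datainterface.eastmoney.com/EM_DataCenter/JS.aspx?type=GJZB&sty=ZGZB&p=1&ps=200&mkt=21"

    body = requests.get(req_url, timeout=10).text
    # Strip the parentheses wrapping the response, then split it into one string per record
    body = body.replace("(", "").replace(")", "")
    data_list = body.split("\",\"")

    # Result lists: dates, manufacturing PMI, non-manufacturing PMI
    date_list, pmi1_list, pmi2_list = [], [], []

    for node in data_list:
        # Remove the remaining brackets and quotes, then split the record into fields
        node = node.replace("]", "").replace("[", "").replace("\"", "")
        arr_list = node.split(",")
        date = arr_list[0]
        # Keep only records from 2010 onward (ISO dates compare correctly as strings)
        if date < "2010-01-01":
            continue
        # Date
        date_list.append(date)
        # Fields 1 and 3 hold the two index values; the year-on-year fields are skipped
        pmi1_list.append(float(arr_list[1]))
        pmi2_list.append(float(arr_list[3]))
        print(node)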


The final data obtained is shown in the following figure:

PMI graphics drawing

Before drawing the graph, the data needs to be processed:

  • 1 Extract the data that needs to be displayed and convert its format.
  • 2 Collect the data into separate lists for the manufacturing index, the non-manufacturing index, and the time.
  • 3 As before, use np.asarray to create the arrays and prepare for graphics drawing.

Following these points, the data processing code is shown in the following figure:
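
As a rough text version of that step, here is a minimal sketch. It assumes the `date_list`, `pmi1_list`, and `pmi2_list` lists built by the crawler code, and dates in `YYYY-MM-DD` form; the helper name `to_arrays` and the sort into ascending date order are additions for illustration, not taken from the original code.

```python
import numpy as np


def to_arrays(date_list, pmi1_list, pmi2_list):
    """Convert the crawler's lists into NumPy arrays ready for plotting."""
    # ISO date strings convert directly to day-resolution datetime64 values.
    dates = np.asarray(date_list, dtype="datetime64[D]")
    mfg = np.asarray(pmi1_list, dtype=float)
    non_mfg = np.asarray(pmi2_list, dtype=float)
    # Sort by date so the lines are drawn left to right,
    # whatever order the interface returned the records in.
    order = np.argsort(dates)
    return dates[order], mfg[order], non_mfg[order]
```

Sorting all three arrays with the same index keeps every value paired with its date.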

For the drawing itself, there are two points:

  • 1 Both the manufacturing and non-manufacturing data need to appear in the graph, with a legend to identify them.
  • 2 Draw horizontal lines at 50 and 40 as reference lines, styled so that they stand apart from the data series.
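
The two points above could be sketched with matplotlib as follows. The function name `plot_pmi`, the colors, line styles, labels, and figure size are illustrative choices rather than the original code, and the inputs are assumed to be arrays of dates and index values prepared with np.asarray.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_pmi(dates, mfg, non_mfg):
    """Plot both PMI series with reference lines at 50 and 40."""
    fig, ax = plt.subplots(figsize=(12, 6))
    # Point 1: both series, each labeled for the legend.
    ax.plot(dates, mfg, label="Manufacturing PMI")
    ax.plot(dates, non_mfg, label="Non-manufacturing PMI")
    # Point 2: horizontal reference lines in a contrasting style.
    ax.axhline(50, color="red", linestyle="--", linewidth=1, label="50")
    ax.axhline(40, color="gray", linestyle=":", linewidth=1, label="40")
    ax.set_xlabel("Date")
    ax.set_ylabel("PMI")
    ax.legend()
    # Slant the date labels so they do not overlap.
    fig.autofmt_xdate()
    return fig, ax
```

Call `plot_pmi(dates, mfg, non_mfg)` and then `plt.show()` to display the figure, or `fig.savefig(...)` to write it to a file.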

Finally, with this code in place, the comparison graph of the manufacturing and non-manufacturing indices is as follows:

Summary

In this article, a simple python crawler fetches the data, numpy does some simple data processing, and matplotlib draws the graph, giving an intuitive view of the manufacturing and non-manufacturing indices. Because the data comes from the interface, you can fetch updates and redraw the graph at any time, which saves the step of collecting the data again by hand.


Origin juejin.im/post/7082787341508542500