Detailed explanation of report automation using Python

Overview

Anyone who has to produce reports knows how tedious they are. Weekly, monthly, quarterly: the same work comes around period after period. Today's protagonist, xlwings, can take over much of it.

Let's first take a quick look at the final table we want to generate.

[Image: the final formatted table]

Here is the case we will work through.

The original data is an Excel workbook with three sheets: raw coal, crude oil, and natural gas. Each sheet contains four indicators: current-period output, cumulative output, year-on-year growth of current-period output, and growth of cumulative output.

The data can be downloaded from the National Bureau of Statistics by anyone interested. The goal of this case is to output the data in the form of the table shown above. The header row uses white text on black cells. Among the data, values greater than the column average are marked in red and values less than the average in blue. The whole table uses the SimSun (宋体) font.

[Images: the three source sheets, one each for raw coal, crude oil, and natural gas]

First, import the relevant libraries and use Python to read the original data.

import pandas as pd
import xlwings as xw

# Read the three sheets of the source workbook
raw_coal = pd.read_excel(r'统计局数据.xlsx', sheet_name='原煤')
crude_oil = pd.read_excel(r'统计局数据.xlsx', sheet_name='原油')
natural_gas = pd.read_excel(r'统计局数据.xlsx', sheet_name='天然气')

# Join the sheets on the shared 指标 (indicator) column
data = pd.merge(raw_coal, crude_oil, on='指标')
data = pd.merge(data, natural_gas, on='指标')

# Keep only the current-period output columns
finally_data = data[['指标', '原煤产量当期值(万吨)', '原油产量当期值(万吨)', '天然气产量当期值(亿立方米)']]
print(finally_data)

[Image: printed output of finally_data]
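The two pd.merge calls above join the three sheets on the shared 指标 column. The following is a minimal sketch of the same join on made-up numbers. The values and the shortened column names are placeholders, not the real statistics. It uses functools.reduce to chain the merges, which may be convenient when there are more than three sheets.

```python
from functools import reduce

import pandas as pd

# Toy stand-ins for the three sheets; the numbers are invented for illustration.
raw_coal = pd.DataFrame({"指标": ["2023-06", "2023-07"], "原煤": [39000.0, 37750.0]})
crude_oil = pd.DataFrame({"指标": ["2023-06", "2023-07"], "原油": [1750.0, 1730.0]})
natural_gas = pd.DataFrame({"指标": ["2023-06", "2023-07"], "天然气": [182.0, 184.0]})

# pd.merge defaults to an inner join, so only 指标 values present in every
# sheet survive; reduce applies the merge pairwise across the list.
frames = [raw_coal, crude_oil, natural_gas]
merged = reduce(lambda left, right: pd.merge(left, right, on="指标"), frames)

print(merged)
```

Because the join is an inner join, a period missing from any one sheet silently disappears from the result, so it is worth comparing the row count of the merged table with the row counts of the inputs.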

The data is now close to the final table we want; only the formatting details remain.

Now it is time to introduce our protagonist, xlwings. It reads and writes Excel files conveniently, and most importantly it can modify cell formatting and works seamlessly with pandas.

Use xlwings to create a new Excel workbook and rename its first sheet to finally_data.

Then copy the data prepared with pandas into the finally_data sheet. There are three ways to do this.

The first: treat a single value as the unit and write the values into the sheet one by one. Here you must keep track of each value's position both in Excel and in the DataFrame to avoid errors.

The second: treat a row as the unit. Here you only need to track the Excel position of the first cell of each row, much like pasting a copied row.

The third: treat the whole table as the unit. It is essentially the same as the second, except that the data is passed in as a single two-dimensional array anchored at one cell.

wb = xw.Book()
sht = wb.sheets['Sheet1']
sht.name = 'finally_data'
columns = list(finally_data.columns)  # get the column names
sht.range('A1').value = columns  # write the column names into the first row

# Method 1: write the values one cell at a time
# for row in range(2, 11):
#     for col in range(1, 5):
#         sht.range((row, col)).value = finally_data.iloc[row-2, col-1]

# Method 2: write the data one row at a time
# for i in range(0, len(finally_data)):
#     data_row = list(finally_data.iloc[i, :])
#     row = i + 2
#     row_clo = 'A' + str(row)
#     sht.range(row_clo).value = data_row

# Method 3: write the whole table at once
finally_data1 = finally_data.values
sht.range('A2').value = finally_data1
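All three approaches rely on the same position bookkeeping: with the column names in row 1, DataFrame row i lands in Excel row i + 2, and DataFrame column j lands in the matching column letter. The helpers below are a hypothetical, standard-library-only sketch of that mapping. The names column_letter and cell_address are introduced here for illustration and are not part of xlwings.

```python
def column_letter(number):
    """Convert a 1-based column number to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_address(df_row, df_col, header_rows=1):
    """Map a zero-based DataFrame position to an Excel address."""
    # Excel rows are 1-based and the header occupies the first row(s).
    return f"{column_letter(df_col + 1)}{df_row + header_rows + 1}"


n_rows, n_cols = 9, 4  # shape assumed from the A1:D10 range used in this case

# Unit = one cell: every value needs its own address.
cell_targets = [cell_address(r, c) for r in range(n_rows) for c in range(n_cols)]
# Unit = one row: only the first cell of each row is needed.
row_targets = [cell_address(r, 0) for r in range(n_rows)]
# Unit = whole table: a single anchor cell is enough.
table_target = cell_address(0, 0)

print(cell_targets[:4], row_targets[:3], table_target)
```

The fewer addresses there are to compute, the fewer places an off-by-one error can hide, which is one argument for writing the whole table at once.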

All three produce the result we want, and each has its own advantages and disadvantages. I prefer the third. After this step, all that remains is to modify the cell formatting.

Before modifying the cells, we first need to find, in each of the three columns (current-period raw coal output, current-period crude oil output, and current-period natural gas output), which values are greater than the column average and which are less. We locate them in the DataFrame and at the same time compute their positions in Excel, which makes it easy to modify the cell formatting.

describe = finally_data.describe()
avg = list(describe.loc['mean', :])

# Excel positions of the values greater than the mean
red_原煤 = list(finally_data.index[finally_data['原煤产量当期值(万吨)'] > avg[0]])
red_position1 = ['B' + str(i + 2) for i in red_原煤]
red_原油 = list(finally_data.index[finally_data['原油产量当期值(万吨)'] > avg[1]])
red_position2 = ['C' + str(i + 2) for i in red_原油]
red_天然气 = list(finally_data.index[finally_data['天然气产量当期值(亿立方米)'] > avg[2]])
red_position3 = ['D' + str(i + 2) for i in red_天然气]
red = red_position1 + red_position2 + red_position3

# Excel positions of the values less than the mean
blue_原煤 = list(finally_data.index[finally_data['原煤产量当期值(万吨)'] < avg[0]])
blue_position1 = ['B' + str(i + 2) for i in blue_原煤]
blue_原油 = list(finally_data.index[finally_data['原油产量当期值(万吨)'] < avg[1]])
blue_position2 = ['C' + str(i + 2) for i in blue_原油]
blue_天然气 = list(finally_data.index[finally_data['天然气产量当期值(亿立方米)'] < avg[2]])
blue_position3 = ['D' + str(i + 2) for i in blue_天然气]
blue = blue_position1 + blue_position2 + blue_position3
print(red)
print(blue)

[Image: printed lists of the red and blue cell positions]
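The same classification can be sketched without pandas to make the index arithmetic explicit. The function name split_by_mean and the sample numbers are made up for illustration. Note that a value exactly equal to the mean ends up in neither list, which matches the strict > and < comparisons in the code above.

```python
from statistics import mean


def split_by_mean(values, column, first_data_row=2):
    """Return Excel addresses of values above and below the column mean."""
    average = mean(values)
    above = [f"{column}{i + first_data_row}" for i, v in enumerate(values) if v > average]
    below = [f"{column}{i + first_data_row}" for i, v in enumerate(values) if v < average]
    return above, below


# Invented sample column; its mean is 20.
sample = [10, 20, 30, 40, 0]
red_cells, blue_cells = split_by_mean(sample, "B")
print(red_cells)   # ['B4', 'B5']
print(blue_cells)  # ['B2', 'B6']
```

Calling the helper once per column letter and concatenating the results would give the same kind of red and blue lists as above, assuming the DataFrame keeps its default 0-based index.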

With all the preparation done, we can finally modify the table's formatting.

The first step is to change the font of the whole table to SimSun (宋体) and add borders to the data area.

# Set the font of the range to SimSun (宋体) and add borders
a_range = 'A1:D10'  # data range
sht.range(a_range).api.Font.Name = '宋体'  # font
sht.range(a_range).api.Borders(8).LineStyle = 1  # top border
sht.range(a_range).api.Borders(9).LineStyle = 1  # bottom border
sht.range(a_range).api.Borders(7).LineStyle = 1  # left border
sht.range(a_range).api.Borders(10).LineStyle = 1  # right border
sht.range(a_range).api.Borders(12).LineStyle = 1  # inside horizontal borders
sht.range(a_range).api.Borders(11).LineStyle = 1  # inside vertical borders
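The numeric arguments to Borders(...) are Excel's XlBordersIndex constants, and LineStyle = 1 is xlContinuous. The sketch below names them and loops instead of repeating six lines. add_grid_borders is a hypothetical helper, and it assumes the object passed in behaves like sht.range(...).api on Windows; on macOS, .api exposes a different interface, so this would not apply as written.

```python
# Excel XlBordersIndex values, named for readability.
XL_EDGE_LEFT = 7
XL_EDGE_TOP = 8
XL_EDGE_BOTTOM = 9
XL_EDGE_RIGHT = 10
XL_INSIDE_VERTICAL = 11
XL_INSIDE_HORIZONTAL = 12
XL_CONTINUOUS = 1  # XlLineStyle: solid line

GRID_BORDERS = (
    XL_EDGE_LEFT, XL_EDGE_TOP, XL_EDGE_BOTTOM, XL_EDGE_RIGHT,
    XL_INSIDE_VERTICAL, XL_INSIDE_HORIZONTAL,
)


def add_grid_borders(range_api, line_style=XL_CONTINUOUS):
    """Draw outer and inner borders on a COM-style range object."""
    for index in GRID_BORDERS:
        range_api.Borders(index).LineStyle = line_style


# Intended usage with the sheet from this article (requires Excel):
# add_grid_borders(sht.range('A1:D10').api)
```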

The second step is to change the font color of the first row to white and fill its cells with black.

# Make the first row's font white and its cells black
b_range = 'A1:D1'  # first row
sht.range(b_range).api.Font.Color = 0xffffff
sht.range(b_range).color = (0, 0, 0)

The last step is to change the font color of values greater than the mean to red and of values less than the mean to blue, then save.

# Change the font colors in the Excel sheet
for i in red:
    sht.range(i).api.Font.Color = 0x0000ff
for i in blue:
    sht.range(i).api.Font.Color = 0xFF0000
wb.save('结果数据.xlsx')
wb.close()
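The values assigned to Font.Color can look reversed: Excel stores colors as blue-green-red integers, so 0x0000ff is red and 0xFF0000 is blue. A minimal sketch of the conversion follows; excel_color is a hypothetical helper name, not an xlwings function.

```python
def excel_color(red, green, blue):
    """Pack RGB components (0-255) into Excel's BGR color integer."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("color components must be in 0..255")
    return red + (green << 8) + (blue << 16)


print(hex(excel_color(255, 0, 0)))      # 0xff (red)
print(hex(excel_color(0, 0, 255)))      # 0xff0000 (blue)
print(hex(excel_color(255, 255, 255)))  # 0xffffff (white)
```

By contrast, the xlwings color property used for the header fill takes an ordinary (R, G, B) tuple.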

[Image: the final formatted table in Excel]

The result meets our requirements.

This completes the case. Of course, a full automated reporting project is far from this simple, and other problems will come up along the way.

 
 


Origin blog.csdn.net/Rocky006/article/details/133208878