Overview
Anyone who has had to produce reports knows how tedious they are. Weekly reports, monthly reports, quarterly reports: the cycle never ends, and it is a constant source of frustration. So let me introduce today's protagonist: xlwings.
Let's take a brief look at the final generated table.
Now let's walk through the case.
The following is our original data. There are three sheets in total, one each for raw coal, crude oil, and natural gas. The indicators in each sheet include the current value of output, the cumulative value of output, the year-on-year growth in output, and the cumulative growth in output.
The data can be downloaded from the National Bureau of Statistics, and interested readers can fetch it themselves. The goal of this case is to output the data in the form of the table above: the header row with the indicator names uses white text on a black fill, values greater than the column mean are marked in red, values less than the mean are marked in blue, and the table font is SimSun (宋体).
First, import the relevant libraries and use Python to read the original data.
import pandas as pd
import xlwings as xw
raw_coal=pd.read_excel(r'统计局数据.xlsx',sheet_name='原煤')
crude_oil=pd.read_excel(r'统计局数据.xlsx',sheet_name='原油')
natural_gas=pd.read_excel(r'统计局数据.xlsx',sheet_name='天然气')
data=pd.merge(raw_coal,crude_oil,on='指标')
data=pd.merge(data,natural_gas,on='指标')
finally_data=data[['指标','原煤产量当期值(万吨)','原油产量当期值(万吨)','天然气产量当期值(亿立方米)']]
print(finally_data)
As far as the data itself is concerned, this is already close to the final table we want; what remains is the formatting.
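The two pd.merge calls above join the three sheets on the shared 指标 column. With pandas' default how='inner', a row survives only if its key appears in every sheet. As a rough illustration of that behaviour in plain Python (the periods and values below are invented for the example, and keys are assumed to be unique within each sheet):

```python
# Conceptual sketch of an inner join on a shared key, in plain Python.
# The sample data is made up for illustration; it is not the real dataset.

def inner_join(left, right):
    """Join two {key: {column: value}} tables, keeping keys present in both."""
    joined = {}
    for key, left_row in left.items():
        if key in right:
            # Combine the columns of both rows under the shared key.
            joined[key] = {**left_row, **right[key]}
    return joined

raw_coal = {"2021-03": {"coal": 340.8}, "2021-04": {"coal": 322.2}}
crude_oil = {"2021-03": {"oil": 16.9}, "2021-04": {"oil": 16.4}}
natural_gas = {"2021-03": {"gas": 1.8}}  # April is missing on purpose

data = inner_join(inner_join(raw_coal, crude_oil), natural_gas)
print(data)
```

A period that is missing from any one sheet drops out of the merged table without warning, so it can be worth comparing the length of the merged data with the length of each sheet.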
It's time to introduce our protagonist, xlwings. xlwings can read and write data in Excel files very conveniently. Most importantly, it can modify cell formats, and it works seamlessly with pandas.
Use the xlwings library to create an Excel workbook, and name a sheet in the workbook finally_data.
Then copy the data assembled with pandas above into the finally_data sheet. There are three ways to copy the data into the sheet.
The first: treat a single value as the unit and write the values into the sheet one cell at a time. Here you need to keep track of each value's position in Excel and its position in the DataFrame to avoid errors.
The second: treat a row of data as the unit. Here you only need to keep track of the position of the first cell of each row in Excel, much like copying and pasting a row.
The third: treat the whole table as the unit. This is essentially the same as the second approach, since both pass a block of values anchored at a starting cell. The difference is that the second writes a one-dimensional list for each row, while the third writes the entire two-dimensional array in a single assignment.
wb=xw.Book()
sht=wb.sheets['Sheet1']
sht.name='finally_data'
columns=list(finally_data.columns)  # get the column names
sht.range('A1').value = columns  # write the column names into the first row
# Method 1: write the values into the sheet one cell at a time
# for row in range(2,11):
#     for col in range(1,5):
#         sht.range((row,col)).value =finally_data.iloc[row-2,col-1]  # (row, column) tuple
# Method 2: write the values into the sheet one row at a time
# for i in range(0,len(finally_data)):
#     data_row=list(finally_data.iloc[i,:])
#     row=i+2
#     row_clo='A'+str(row)
#     sht.range(row_clo).value =data_row
# Method 3: write the whole table into the sheet at once
finally_data1=finally_data.values
sht.range('A2').value = finally_data1
All three approaches produce the result we want, and each has its own advantages and disadvantages. I prefer the third. Once this step is done, all that is left is to modify the format of the cells in the sheet.
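The first two approaches depend on mapping a DataFrame position to an Excel cell correctly. A minimal sketch of that bookkeeping, assuming one header row and a table that starts in column A (the helper names column_letter and cell_address are hypothetical, not part of pandas or xlwings):

```python
def column_letter(col_index):
    """Convert a 0-based column index to an Excel column letter (0 -> 'A', 26 -> 'AA')."""
    letters = ""
    n = col_index + 1  # switch to 1-based numbering
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def cell_address(df_row, df_col, header_rows=1):
    """Map a 0-based DataFrame position to an A1-style address below the header."""
    excel_row = df_row + header_rows + 1  # Excel rows are 1-based
    return f"{column_letter(df_col)}{excel_row}"


# First data value and last value of a 9-row, 4-column table.
print(cell_address(0, 0), cell_address(8, 3))
```

With nine data rows and four columns, the last cell works out to D10, which matches the A1:D10 range used for formatting later on.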
Before modifying the cells, we first need to find, in the DataFrame, which values in the three columns (current value of raw coal output, current value of crude oil output, and current value of natural gas output) are greater than their column mean and which are less than it. At the same time, we work out where each of these values sits in Excel, which makes it convenient to modify the cell format later.
describe=finally_data.describe()
avg=list(describe.loc['mean',:])
# Excel positions of the values greater than the mean
red_原煤=list(finally_data.index[finally_data['原煤产量当期值(万吨)']>avg[0]])
red_position1=['B'+str(i+2) for i in red_原煤 ]
red_原油=list(finally_data.index[finally_data['原油产量当期值(万吨)']>avg[1]])
red_position2=['C'+str(i+2) for i in red_原油 ]
red_天然气=list(finally_data.index[finally_data['天然气产量当期值(亿立方米)']>avg[2]])
red_position3=['D'+str(i+2) for i in red_天然气 ]
red=red_position1+red_position2+red_position3
# Excel positions of the values less than the mean
blue_原煤=list(finally_data.index[finally_data['原煤产量当期值(万吨)']<avg[0]])
blue_position1=['B'+str(i+2) for i in blue_原煤 ]
blue_原油=list(finally_data.index[finally_data['原油产量当期值(万吨)']<avg[1]])
blue_position2=['C'+str(i+2) for i in blue_原油 ]
blue_天然气=list(finally_data.index[finally_data['天然气产量当期值(亿立方米)']<avg[2]])
blue_position3=['D'+str(i+2) for i in blue_天然气 ]
blue=blue_position1+blue_position2+blue_position3
print(red)
print(blue)
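The six near-identical blocks above follow one pattern per column, so they could be folded into a loop. The sketch below shows the idea in plain Python, with ordinary lists standing in for the DataFrame columns. It assumes no missing values and that the data starts in row 2, which corresponds to the default 0-based index used above; split_by_mean and the sample numbers are made up for illustration.

```python
def split_by_mean(values, col, first_row=2):
    """Return (above, below): A1 addresses of values above and below the mean."""
    mean = sum(values) / len(values)
    above, below = [], []
    for offset, value in enumerate(values):
        address = f"{col}{first_row + offset}"
        if value > mean:
            above.append(address)
        elif value < mean:
            below.append(address)
        # A value exactly equal to the mean goes in neither list.
    return above, below


# Hypothetical sample columns keyed by their Excel column letter.
columns = {"B": [10.0, 20.0, 30.0], "C": [5.0, 5.0, 8.0]}
red, blue = [], []
for letter, values in columns.items():
    above, below = split_by_mean(values, letter)
    red += above
    blue += below
print(red, blue)
```

Adding a fourth indicator then only means adding one more entry to the mapping rather than copying another pair of blocks.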
Now everything is in place, and we can finally modify the format of the table.
The first step is to change all fonts to SimSun (宋体) and add borders to the area of the sheet that contains data.
# Set the font in the range to SimSun (宋体) and add borders
a_range = 'A1:D10'  # the data range
sht.range(a_range).api.Font.Name='宋体'  # font
sht.range(a_range).api.Borders(8).LineStyle = 1  # top border
sht.range(a_range).api.Borders(9).LineStyle = 1  # bottom border
sht.range(a_range).api.Borders(7).LineStyle = 1  # left border
sht.range(a_range).api.Borders(10).LineStyle = 1  # right border
sht.range(a_range).api.Borders(12).LineStyle = 1  # inside horizontal borders
sht.range(a_range).api.Borders(11).LineStyle = 1  # inside vertical borders
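The numbers 7 to 12 are Excel's XlBordersIndex constants, and a LineStyle of 1 is xlContinuous. One possible way to make the border code more readable is to name the constants and loop over them. This is only a sketch: add_all_borders is a hypothetical helper, and it assumes the same Windows-style .api access used above.

```python
# XlBordersIndex constants from the Excel object model.
BORDER_INDEXES = {
    "left": 7,                # xlEdgeLeft
    "top": 8,                 # xlEdgeTop
    "bottom": 9,              # xlEdgeBottom
    "right": 10,              # xlEdgeRight
    "inside_vertical": 11,    # xlInsideVertical
    "inside_horizontal": 12,  # xlInsideHorizontal
}


def add_all_borders(rng, line_style=1):
    """Set every outer and inner border of a range (1 = xlContinuous)."""
    for index in BORDER_INDEXES.values():
        rng.api.Borders(index).LineStyle = line_style
```

It would be called as add_all_borders(sht.range('A1:D10')) in place of the six separate assignments.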
The second step is to change the font color of the first row to white and fill its cells with black.
# Make the header font white and the cell fill black
b_range = 'A1:D1'  # first row of the range
sht.range(b_range).api.Font.Color = 0xffffff
sht.range(b_range).color=(0, 0, 0)
The last step is to change the font color of the values greater than the mean to red and the font color of the values less than the mean to blue. Then save the workbook.
# Change the font colors in the Excel sheet
# Font.Color takes a BGR integer, so 0x0000ff is red and 0xFF0000 is blue
for i in red:
    sht.range(i).api.Font.Color = 0x0000ff
for i in blue:
    sht.range(i).api.Font.Color = 0xFF0000
wb.save('结果数据.xlsx')
wb.close()
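The hexadecimal values can look backwards at first, because Excel's Font.Color integer stores the channels in blue-green-red order, computed as red + green × 256 + blue × 65536. A small helper makes the intent explicit (excel_color is a hypothetical name, not an xlwings function):

```python
def excel_color(red, green, blue):
    """Pack an RGB triple into the BGR-ordered integer that Font.Color expects."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be in the range 0..255")
    return red + (green << 8) + (blue << 16)


RED = excel_color(255, 0, 0)
BLUE = excel_color(0, 0, 255)
WHITE = excel_color(255, 255, 255)
print(hex(RED), hex(BLUE), hex(WHITE))
```

Writing sht.range(i).api.Font.Color = RED then reads the same way as the requirement it implements.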
The resulting workbook meets our requirements.
This completes the case. Of course, a complete automated reporting project involves far more than this, and other problems will come up along the way.