Several ways to work with Excel in Python: xlrd, xlwt, openpyxl

While working with Excel data I ran into a limitation of xlwt: it cannot write more than 65536 rows or 256 columns, because it only produces the .xls format of Excel 2003 and earlier, which has these limits. That is not enough for many practical applications. After some searching I found openpyxl, which supports the .xlsx format of Excel 2007/2010/2013. It is very powerful, although not as convenient to use as xlwt. The common operations of these modules are described below.

xlrd

xlrd is for reading data from Excel files; it does not write them, so I use it only for read operations. Reading with xlrd is convenient because the process mirrors working in Excel by hand: open the workbook, select a worksheet, then work with cells. For example, to open the Excel file named "data.xlsx" in the current directory, select the first worksheet, and read and print the entire first row (note that xlrd 2.0 and later read only .xls files, so opening an .xlsx file this way requires an older xlrd release), the Python code is as follows:

import xlrd

# Open the Excel file
data = xlrd.open_workbook('data.xlsx')
# Get the first worksheet (by index)
table = data.sheets()[0]
# data_list is used to store data
data_list = []
# Read the first row of the sheet and add its values to data_list
data_list.extend(table.row_values(0))
# Print all the data of the first row
for item in data_list:
    print(item)

In the code above, table.row_values(number) reads a row, where number is the row index. Similarly, table.col_values(number) reads a column, where number is the column index. In xlrd both rows and columns are indexed from 0, so the top-left cell A1 in Excel is row 0, column 0.
To read a single cell in xlrd, use table.cell(row, col), where row and col are the cell's row and column indices.
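To make the zero-based convention concrete, here is a minimal sketch that turns an Excel label such as A1 into the (row, col) pair that xlrd expects. The helper name label_to_rowcol is hypothetical and is not part of xlrd.

```python
import re


def label_to_rowcol(label):
    """Convert an Excel label such as 'B3' into zero-based (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", label)
    if match is None:
        raise ValueError("not a cell label: %r" % (label,))
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters form a base-26 number whose digits run A=1 ... Z=26.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    # Excel counts from 1, xlrd counts from 0.
    return int(digits) - 1, col - 1


print(label_to_rowcol("A1"))  # (0, 0)
print(label_to_rowcol("B3"))  # (2, 1)
```

With a worksheet object at hand, the result could be passed on as table.cell(*label_to_rowcol("B3")).value.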
The following is a brief summary of the usage of xlrd

Summary of xlrd usage

Open an Excel workbook

data=xlrd.open_workbook(filename)

View the names of all sheets in the workbook

data.sheet_names()

Select a worksheet (by index or sheet name)

# Get the first worksheet
table = data.sheets()[0]
# Get the first worksheet by index
table = data.sheet_by_index(0)
# Select a worksheet by sheet name
table = data.sheet_by_name(u'haha')

Get the number of rows and columns of a sheet

nrows=table.nrows
ncols=table.ncols

Get the values of an entire row or column

table.row_values(number)
table.col_values(number)

Read all rows of a sheet by looping

for rownum in range(table.nrows):
    print(table.row_values(rownum))

Get a cell value

# By row, then position within the row
cell_A1 = table.row(0)[0].value
# Or directly by row and column index
cell_A1 = table.cell(0, 0).value
# Or by column, then position within the column
cell_A1 = table.col(0)[0].value

xlrd has no write operations to summarize; writing is handled by the modules below.

xlwt

If the last two letters of xlrd are read as Reader, then the last two letters of xlwt stand for Writer, and xlwt is a pure writer: it can only write Excel files. xlwt and xlrd not only have similar names; many of their functions and calling conventions are also alike. The common operations are briefly summarized below.

Common operations of xlwt

Create a new Excel file (xlwt can only write new files, not modify existing ones)

data=xlwt.Workbook()

Create a new worksheet

table=data.add_sheet('name')

Write data to cell A1

table.write(0, 0, u'hehe')

Note: writing to the same cell twice raises an exception about overwriting the cell. To allow overwriting, pass cell_overwrite_ok=True when adding the worksheet, like this:

table=data.add_sheet('name',cell_overwrite_ok=True)

Save the file

data.save('test.xls')

Only the .xls format can be written here; the .xlsx format is not supported.

xlwt supports some styling. The operations are as follows:

# Initialize a style
style = xlwt.XFStyle()
# Create a font for the style
font = xlwt.Font()
# Specify the font name
font.name = 'Times New Roman'
# Make the font bold
font.bold = True
# Assign the font to the style
style.font = font
# Use this style when writing to the worksheet
table.write(0, 1, 'just for test', style)

openpyxl

This module supports the newer Excel file format (.xlsx) and provides corresponding read and write operations for Excel files. It has dedicated Reader and Writer classes for this purpose, but I generally just work with the workbook object directly. Common operations are summarized as follows:

Common operations of openpyxl

Read an Excel file

from openpyxl import load_workbook

wb=load_workbook(filename)

Display the named ranges defined in the workbook

wb.defined_names

Display the names of all worksheets

wb.sheetnames

Get the first worksheet

sheetnames = wb.sheetnames
ws = wb[sheetnames[0]]

Get the sheet name

ws.title

Get the number of rows in the sheet

ws.max_row

Get the number of columns in the sheet

ws.max_column

Reading a cell is very similar to xlrd: both read by row and column index. In current openpyxl releases these indices start at 1, not 0.

# Read the content of cell B1
ws.cell(row=1, column=2).value

Reading by Excel coordinate is also supported:

# Read the content of cell B1
ws["B1"].value

To write, address the cell by its coordinate. For example, to write to cell C1, use something like ws["C1"].value = something.
With older openpyxl releases the generally recommended way to save was the ExcelWriter class; current releases save through the workbook's own save() method instead. The code looks like this:

from openpyxl import Workbook

# get_column_letter converts a number to the matching letters, e.g. 1-->A, 2-->B
from openpyxl.utils import get_column_letter

# Create a new workbook
wb = Workbook()

# Set the output path and file name
dest_filename = r'empty_book.xlsx'

# The first sheet is ws
ws = wb.worksheets[0]

# Set the title of ws
ws.title = "range names"

# Write data to a cell
ws["C1"].value = u'哈哈'

# Finally, save the file (older releases did this with ExcelWriter(workbook=wb).save(...))
wb.save(filename=dest_filename)

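To write to a cell you first need to know its row and column. Note that rows are counted from 1 and columns start at the letter A, so the first row of the first column is A1; this is in effect addressing Excel by coordinates. For example, to put the value 1.2 in the third row of the first column, with xlwt you write table.write(2, 0, 1.2), because xlwt indexes rows and columns from 0, while with openpyxl you write ws["A3"].value = 1.2. For a large column number, you generally call get_column_letter to obtain the matching letters and then write to the cell.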
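The index shift between the two libraries can be sketched in plain Python. Both helper names below are hypothetical; column_letter is only an illustration of the kind of conversion get_column_letter performs for 1-based column numbers, not openpyxl's own code.

```python
def column_letter(col_number):
    """Return the Excel letters for a 1-based column number (1 -> 'A', 27 -> 'AA')."""
    if col_number < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while col_number > 0:
        # Shift to 0..25 first, because this numbering has no zero digit.
        col_number, remainder = divmod(col_number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def xlwt_to_label(row, col):
    """Map xlwt's zero-based (row, col) to a coordinate such as 'A3'."""
    return "%s%d" % (column_letter(col + 1), row + 1)


print(xlwt_to_label(2, 0))  # A3
```

So the xlwt call table.write(2, 0, 1.2) and the openpyxl coordinate A3 refer to the same cell.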
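Below is part of some code I wrote earlier, which demonstrates saving a multidimensional array to an Excel file. numpy is used to provide the multidimensional array, and to keep things simple ExcelWriter is not used. The code is as follows: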

# coding: utf-8

from openpyxl import Workbook
from openpyxl.utils import get_column_letter

import numpy as np

# Build a diagonal matrix
a = np.diag([1, 2, 3, 4, 5])

# Create a new workbook
wb = Workbook()
# Use the active worksheet (by default the first sheet in Excel)
ws = wb.active
# Iterate over a; cell rows and columns start at 1, indices in a start at 0
for row in range(1, a.shape[0] + 1):
    for col in range(1, a.shape[1] + 1):
        col_letter = get_column_letter(col)
        ws['%s%s' % (col_letter, row)].value = a[row - 1, col - 1]
wb.save('test.xlsx')
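As a possible alternative to building a coordinate for every cell, the same kind of data can be written one row at a time with the worksheet's append() method. The sketch below assumes openpyxl is installed, uses plain nested lists instead of numpy, and saves to an in-memory buffer rather than to test.xlsx so that it leaves no file behind.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

# A small diagonal matrix as plain nested lists (no numpy needed here).
matrix = [[i + 1 if i == j else 0 for j in range(3)] for i in range(3)]

wb = Workbook()
ws = wb.active
for row in matrix:
    # append() writes the values into the next empty row, starting at column A.
    ws.append(row)

# Save to an in-memory buffer (an assumption made to keep the sketch free of
# side effects; wb.save('test.xlsx') would write a file instead).
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Read the values back to check the round trip.
ws_back = load_workbook(buffer).active
read_back = [list(row) for row in ws_back.iter_rows(values_only=True)]
print(read_back)
```

Whether this is preferable to per-cell writes depends on the data; append() is convenient when the values already arrive row by row.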

That is all for now; it covers most everyday needs.

Summary

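For reading Excel files, openpyxl and xlrd differ little and both do the job (openpyxl for .xlsx, xlrd for .xls).
For writing a small amount of data to an .xls file, xlwt is more convenient.
For writing large amounts of data (beyond the .xls limits), or when the file must be saved as .xlsx, use openpyxl.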
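These rules of thumb for writing can be put into a small decision helper. It is only a sketch of the summary above: the function name pick_writer is hypothetical, and the constants are the .xls sheet limits of 65536 rows and 256 columns.

```python
import os

XLS_MAX_ROWS = 65536  # row limit of the .xls format
XLS_MAX_COLS = 256    # column limit of the .xls format


def pick_writer(filename, nrows, ncols):
    """Suggest a module for writing nrows x ncols values to filename."""
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".xlsx":
        return "openpyxl"
    if ext == ".xls":
        if nrows > XLS_MAX_ROWS or ncols > XLS_MAX_COLS:
            raise ValueError("data does not fit in .xls; save as .xlsx with openpyxl")
        return "xlwt"
    raise ValueError("unsupported extension: %r" % ext)


print(pick_writer("small.xls", 100, 10))
print(pick_writer("big.xlsx", 100000, 10))
```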

Besides the modules above there are others, such as win32com, but I have not used them, so I will not cover them here.

[Reposted from] http://wenqiang-china.github.io/2016/05/13/python-opetating-excel/  author: wenqiang