I ran into a limitation of xlwt when working with Excel data: it cannot write more than 65536 rows or 256 columns, because it only supports the .xls format of Excel 2003 and earlier, which imposes those limits. That is not enough for many practical applications. After some searching I found openpyxl, which supports the .xlsx format of Excel 2007/2010/2013. It is very powerful, but not as convenient to use as xlwt. The common operations of these modules are described below.
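As a rough illustration of that limit, the sketch below checks whether a table of a given size still fits in an .xls sheet and picks a module accordingly. The names XLS_MAX_ROWS, XLS_MAX_COLS, fits_in_xls and pick_writer are made up for this example and are not part of xlwt or openpyxl.

```python
# Size limits of a single worksheet in the .xls format.
XLS_MAX_ROWS = 65536
XLS_MAX_COLS = 256


def fits_in_xls(n_rows, n_cols):
    """Return True if a table of this size fits in one .xls worksheet."""
    return n_rows <= XLS_MAX_ROWS and n_cols <= XLS_MAX_COLS


def pick_writer(n_rows, n_cols):
    """Suggest a module name (hypothetical helper): xlwt if the data fits, else openpyxl."""
    return "xlwt" if fits_in_xls(n_rows, n_cols) else "openpyxl"
```

For instance, pick_writer(100000, 10) suggests openpyxl, because 100000 rows exceed what an .xls sheet can hold.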
xlrd
xlrd reads data from Excel files. It has no write support, so I use it only for reading, which it makes convenient. The process mirrors working in Excel by hand: open the workbook (Workbook), select a worksheet (sheet), then operate on cells (cell). For example, to open the Excel file named "data.xlsx" in the current directory, select the first worksheet, and read and print the entire first row, the Python code is as follows. (xlrd 2.0 and later only read .xls files, so an .xlsx file like this one needs an older xlrd release.)
# Open the Excel file
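The steps described above (open the workbook, select the first worksheet, read the first row) might look like the sketch below. The function name read_first_row and its sheet_index parameter are made up for this example; the xlrd calls are open_workbook, sheet_by_index and row_values. Keep in mind that xlrd 2.0 and later only open .xls files.

```python
import xlrd


def read_first_row(path, sheet_index=0):
    """Return the values of the first row of one worksheet as a list."""
    book = xlrd.open_workbook(path)            # open the Excel file
    table = book.sheet_by_index(sheet_index)   # select a worksheet by index
    if table.nrows == 0:                       # empty sheet: nothing to read
        return []
    return table.row_values(0)                 # row 0 is row 1 in Excel
```

Calling print(read_first_row("data.xls")) would then print the first row of the first worksheet, assuming a file of that name exists in the current directory.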
In the code above, table.row_values(number) reads a row. Similarly, table.col_values(number) reads a column, where number is the row or column index. In xlrd both rows and columns are indexed from 0, so the top-left cell A1 in Excel is row 0, column 0.
To read a cell in xlrd, use table.cell(row, col), where row and col are the cell's row and column indexes; the content is in the .value attribute of the returned cell.
The following is a brief summary of the usage of xlrd
Summary of xlrd usage
-
Open an Excel workbook
data = xlrd.open_workbook(filename)
-
View the names of all sheets in the workbook
data.sheet_names()
-
Select a worksheet (by index or by sheet name)
# Get the first worksheet
table = data.sheets()[0]
# Get the first worksheet by index
table = data.sheet_by_index(0)
# Select a worksheet by sheet name
table = data.sheet_by_name(u'haha')
-
Get the number of rows and columns of a table
nrows = table.nrows
ncols = table.ncols
-
Get the values of an entire row or column
table.row_values(number)
table.col_values(number)
-
Read all rows of a table by looping
for rownum in range(table.nrows):
    print(table.row_values(rownum))
-
Get a cell value
cell_A1 = table.row(0)[0].value
# Or like below
cell_A1 = table.cell(0, 0).value
# Or like below, by column index
cell_A1 = table.col(0)[0].value
xlrd itself has no write operations, so there is nothing to summarize on that side; writing is handled by xlwt, described next.
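Putting the calls from this summary together, a small helper could dump every worksheet of a workbook into a dictionary. This is only a sketch: workbook_to_dict is a name invented for the example, and it relies on nothing beyond open_workbook, sheet_names, sheet_by_name, nrows and row_values.

```python
import xlrd


def workbook_to_dict(path):
    """Return {sheet name: list of rows}, where each row is a list of cell values."""
    data = xlrd.open_workbook(path)
    result = {}
    for name in data.sheet_names():            # names of all worksheets
        table = data.sheet_by_name(name)       # select the worksheet by name
        # Rows are indexed from 0 up to nrows - 1.
        result[name] = [table.row_values(r) for r in range(table.nrows)]
    return result
```

Numbers come back from xlrd as floats, so a cell holding 1 in Excel shows up as 1.0 in the result.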
xlwt
If the "rd" at the end of xlrd is read as Reader, then the "wt" at the end of xlwt is Writer, and xlwt is a pure writer: it can only write Excel files. xlwt and xlrd not only have similar names; many of their functions and calling conventions are the same as well. The common operations are briefly summarized below.
Common operations of xlwt
Create a new Excel file (xlwt can only write new files; it cannot open an existing one)
data = xlwt.Workbook()
Create a new worksheet
table = data.add_sheet('name')
Write data to cell A1
table.write(0, 0, u'hehe')
Note: writing to the same cell twice raises an exception about overwriting the cell. To turn this check off, pass cell_overwrite_ok=True when adding the worksheet, like this:
table = data.add_sheet('name', cell_overwrite_ok=True)
Save the file
data.save('test.xls')
Only the .xls extension can be saved here; the .xlsx format is not supported.
xlwt supports some cell styles; the operations are as follows:
# Initialize style
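To show the xlwt operations above working together, here is a minimal sketch that writes a header row in a bold font followed by plain data rows. The function name write_table_xls, its parameters and the font choice are assumptions made for this example; the xlwt pieces used are Workbook, add_sheet, XFStyle, Font, write and save.

```python
import xlwt


def write_table_xls(path, header, rows, sheet_name="Sheet1"):
    """Write a header row (bold) and data rows to a new .xls file."""
    data = xlwt.Workbook()                                    # new workbook
    table = data.add_sheet(sheet_name, cell_overwrite_ok=True)

    # Initialize a style and give it a bold font.
    style = xlwt.XFStyle()
    font = xlwt.Font()
    font.name = "Times New Roman"
    font.bold = True
    style.font = font

    # Row 0 holds the header, written with the style.
    for col, title in enumerate(header):
        table.write(0, col, title, style)

    # Data rows start at row index 1 and use the default style.
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row):
            table.write(r, c, value)

    data.save(path)                                           # must end in .xls
```

A call such as write_table_xls("test.xls", ["name", "score"], [["alice", 90]]) would produce a two-row sheet with a bold header.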
openpyxl
This module supports the newer Excel file format (.xlsx) and provides the corresponding read and write operations. It has dedicated Reader and Writer classes for this, which make working with Excel files convenient. Even so, I generally just use the default Workbook. Common operations are summarized as follows:
Common operations of openpyxl
Read an Excel file
from openpyxl.reader.excel import load_workbook
wb = load_workbook(filename)
Display the named ranges of the workbook
wb.defined_names  # wb.get_named_ranges() in older openpyxl releases
Display the names of all worksheets
wb.sheetnames  # wb.get_sheet_names() in older openpyxl releases
Get the first worksheet
sheetnames = wb.sheetnames  # wb.get_sheet_names() in older openpyxl releases
ws = wb[sheetnames[0]]
Get the worksheet name
ws.title
Get the number of rows in the worksheet
ws.max_row  # ws.get_highest_row() in older openpyxl releases
Get the number of columns in the worksheet
ws.max_column  # ws.get_highest_column() in older openpyxl releases
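Reading a cell is much like in xlrd: the cell is addressed by its row and column index. Note that in current openpyxl releases these indexes start at 1, not 0.
# Read the contents of cell B1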
Reading by Excel coordinate is of course also supported; the code is as follows:
# Read the contents of cell B1
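The read operations listed above can be tried end to end with the sketch below. It assumes a current openpyxl release and therefore uses wb.sheetnames, ws.max_row and ws.max_column rather than the older get_* methods. The sample file, its sheet title and its contents are invented so the snippet runs on its own.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Build a small sample file first (made-up data, only for this example).
path = os.path.join(tempfile.mkdtemp(), "sample.xlsx")
wb_out = Workbook()
ws_out = wb_out.active
ws_out.title = "first"
ws_out.append(["name", "score"])
ws_out.append(["alice", 90])
wb_out.save(path)

# Read it back.
wb = load_workbook(path)
sheetnames = wb.sheetnames            # names of all worksheets
ws = wb[sheetnames[0]]                # the first worksheet
n_rows = ws.max_row                   # number of rows
n_cols = ws.max_column                # number of columns

# Read the contents of cell B1 by row and column index (both start at 1).
b1_by_index = ws.cell(row=1, column=2).value
# Read the contents of cell B1 by Excel coordinate.
b1_by_coord = ws["B1"].value

print(sheetnames, ws.title, n_rows, n_cols, b1_by_index, b1_by_coord)
```

Both ways of addressing B1 return the same value; which one to use is mostly a matter of whether the position is computed (indexes) or known in advance (coordinate).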
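To write a cell there is essentially one route, which is to address it by coordinate. For example, to write data to cell C1, use something like ws['C1'].value = something (older openpyxl releases spelled this ws.cell('C1').value = something).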
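The generally recommended way is to use the Writer class in openpyxl (in recent openpyxl releases, calling wb.save(filename) on the workbook is the usual route). The code looks something like this:
from openpyxl.workbook import Workbook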
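Before writing to a cell you need to know its row and column. Note that rows are counted from 1 and columns start at the letter A, so the first row, first column is A1; this is in effect addressing Excel by coordinate. For example, to put the value 1.2 in the third row, first column, with xlwt you write table.write(2, 0, 1.2), because xlwt indexes rows and columns from 0; with openpyxl you write ws['A3'].value = 1.2 (ws.cell('A3').value = 1.2 in older releases). For a large column number, use the get_column_letter function to obtain the corresponding letters, then address the cell and write to it.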
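The index conversion described here can be sketched as follows. The helper to_coordinate is a name made up for this example, and the import path openpyxl.utils is the one used by current openpyxl releases; older releases exposed get_column_letter elsewhere.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter


def to_coordinate(row0, col0):
    """Convert 0-based (row, col) indexes, as used by xlwt, to an Excel coordinate."""
    return "%s%d" % (get_column_letter(col0 + 1), row0 + 1)


wb = Workbook()
ws = wb.active

# Third row, first column: table.write(2, 0, 1.2) in xlwt terms.
coord = to_coordinate(2, 0)
ws[coord] = 1.2

# A larger column index: 0-based column 27 is the 28th column.
far_coord = to_coordinate(0, 27)
ws[far_coord] = "far"
```

Keeping the conversion in one small function makes it harder to mix up the 0-based convention of xlwt and the 1-based convention of Excel coordinates.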
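Below is part of some code I wrote earlier, which demonstrates saving a multidimensional array to an Excel file. To represent the multidimensional array it uses numpy, and to keep things simple it does not use ExcelWriter. The code is as follows:
# coding: utf-8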
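The original listing is not reproduced here, so what follows is only a sketch of the same idea and not the author's code: it saves a two-dimensional numpy array to an .xlsx file with openpyxl, without ExcelWriter. The function name save_array_xlsx and its parameters are assumptions made for this example.

```python
import numpy as np
from openpyxl import Workbook


def save_array_xlsx(array, path, sheet_title="data"):
    """Save a 2-D array to a new .xlsx file, one array row per worksheet row."""
    arr = np.asarray(array)
    if arr.ndim != 2:
        raise ValueError("expected a 2-D array, got %d dimension(s)" % arr.ndim)

    wb = Workbook()
    ws = wb.active
    ws.title = sheet_title

    # tolist() turns numpy scalars into plain Python numbers before writing.
    for row in arr.tolist():
        ws.append(row)

    wb.save(path)
```

For example, save_array_xlsx(np.arange(6).reshape(2, 3), "array.xlsx") writes two rows of three numbers each. An array with more than two dimensions would need to be split first, for instance one worksheet per slice.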
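That is all for now; it covers the basics.
Summary
For reading Excel files, openpyxl and xlrd differ little; both do the job.
For writing a small amount of data to an .xls file, xlwt is more convenient.
For writing a large amount of data (beyond the limits of the .xls format), or when the file must be saved as .xlsx, you need openpyxl.
Besides these modules there are others such as Win32com, but I have not used them, so I will not cover them.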
[Reposted from] http://wenqiang-china.github.io/2016/05/13/python-opetating-excel/ author: wenqiang