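Working with Excel in Python

I work with Excel files often, so I'm summarizing the relevant operations here for everyone's reference.

Working with Excel from Python comes down to two things: reading and writing.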

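Inside the Book object

The xlrd.Book object returned by open_workbook contains everything you need to work with the workbook, and it can be used to retrieve the individual sheets in it.

The nsheets attribute is an integer holding the number of sheets in the workbook. Combined with the sheet_by_index method, it is the most common way to retrieve individual sheets.

Starting with reading

The sheet_names method returns a list of unicode strings containing the names of all sheets in the workbook. An individual sheet can be retrieved by passing one of these names to the sheet_by_name method.

The sheets method returns a list of every sheet in the workbook, which you can iterate over.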

from xlrd import open_workbook

book = open_workbook('simple.xls')

# Number of sheets in the workbook
print(book.nsheets)

# Get each sheet by index
for sheet_index in range(book.nsheets):
    print(book.sheet_by_index(sheet_index))

# Get each sheet by name
print(book.sheet_names())
for sheet_name in book.sheet_names():
    print(book.sheet_by_name(sheet_name))

# Iterate over the sheets directly
for sheet in book.sheets():
    print(sheet)

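The xlrd.Book object has other attributes relating to the workbook's contents, but they are rarely needed:

codepage
countries
user_name

If you think you may need these attributes, see the xlrd documentation.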

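The xlrd.sheet.Sheet objects returned by the methods above contain all the information about a worksheet and its contents.

The name attribute is the worksheet's name as a unicode string.

The nrows and ncols attributes hold the number of rows and columns in the worksheet, respectively.

A worksheet's contents can be displayed by iterating over its rows and columns, using nrows and ncols as the bounds.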
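As a minimal sketch of that iteration, the helper below walks any object that exposes nrows, ncols and cell(row, col).value, which is the Sheet interface described above. The names iter_sheet_rows, FakeCell and FakeSheet are hypothetical and are not part of xlrd; the stand-in classes exist only so the sketch runs without an .xls file on disk.

```python
def iter_sheet_rows(sheet):
    """Yield each row of `sheet` as a list of cell values.

    `sheet` only needs nrows, ncols and cell(row, col).value,
    so an xlrd.sheet.Sheet should work here as well.
    """
    for row_index in range(sheet.nrows):
        yield [sheet.cell(row_index, col_index).value
               for col_index in range(sheet.ncols)]


# Hypothetical in-memory stand-ins (not xlrd classes), used for the demo only.
class FakeCell(object):
    def __init__(self, value):
        self.value = value


class FakeSheet(object):
    def __init__(self, name, grid):
        self.name = name
        self.nrows = len(grid)
        self.ncols = len(grid[0]) if grid else 0
        self._grid = grid

    def cell(self, row_index, col_index):
        return FakeCell(self._grid[row_index][col_index])


demo = FakeSheet(u'Sheet1', [[u'name', u'qty'], [u'apple', 3.0]])
print(demo.name)
for row in iter_sheet_rows(demo):
    print(row)
```

With a real workbook, the same helper would presumably be called as iter_sheet_rows(book.sheet_by_index(0)) on a book returned by open_workbook.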

Unicode

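All text attributes produced by xlrd are either unicode objects or, in rare cases, ASCII strings.

Each piece of text in a file written by Microsoft Excel is stored in one of the following encodings:

Latin1, if the text fits
UTF_16_LE, if the text does not fit in Latin1
In older files, a Microsoft codepage (character set) recorded in the file. xlrd maps these codepages to Python encodings, and the result is still a unicode object.

In rare cases, other software writes Excel files with the wrong codepage or with no codepage at all. When that happens, you may need to tell open_workbook which encoding to use.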

from xlrd import open_workbook
book = open_workbook('dodgy.xls', encoding_override='cp1252')
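
To see why the choice of codepage matters, here is a small sketch that uses only Python's built-in codecs and does not touch xlrd. It decodes the same byte under two encodings and shows the two-byte form a character takes in UTF_16_LE. The byte value is picked for illustration and is not taken from a real file.

```python
# Standard library only; this illustrates codepages, not xlrd internals.
raw = b'\x80'  # one byte, chosen for illustration

as_cp1252 = raw.decode('cp1252')   # Windows-1252 maps 0x80 to the euro sign
as_latin1 = raw.decode('latin-1')  # Latin1 maps 0x80 to a control character

print(repr(as_cp1252))
print(repr(as_latin1))

# The euro sign does not exist in Latin1, so it needs UTF_16_LE: two bytes,
# least significant byte first.
euro_utf16 = u'\u20ac'.encode('utf_16_le')
print(repr(euro_utf16))
```

If the wrong codepage is assumed, the text still decodes without an error but contains the wrong characters, which is why such files can go unnoticed.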

xlrd

http://pypi.python.org/pypi/xlrd

Import
import xlrd

Open the Excel file
file = xlrd.open_workbook('demo.xls')

List the names of the sheets in the file
file.sheet_names()

Get the first worksheet, either by position or index, or by sheet name
sheet = file.sheets()[0]
sheet = file.sheet_by_index(0)
sheet = file.sheet_by_name(u'Sheet1')

Get the number of rows and columns
nrows = sheet.nrows
ncols = sheet.ncols

Loop over the rows and get each row's values as a list
for rownum in range(sheet.nrows):
    print(sheet.row_values(rownum))

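Get the values of a whole row or a whole column (as a list)
sheet.row_values(i)
sheet.col_values(i)

Cells (by zero-based row and column index)
cell_A1 = sheet.cell(0,0).value
cell_C4 = sheet.cell(3,2).value

Using the row and column accessors separately
cell_A1 = sheet.row(0)[0].value
cell_A2 = sheet.col(0)[1].value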
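
Because these indexes are zero-based and the row comes first, spreadsheet references such as C4 are easy to transpose by mistake. The helper below is a hypothetical convenience, written for this post rather than taken from xlrd, that converts an A1-style reference into the (row, column) pair expected by sheet.cell.

```python
import re


def cell_ref_to_indices(ref):
    """Convert an A1-style reference such as 'C4' to zero-based (row, col)."""
    match = re.match(r'^([A-Za-z]+)([1-9][0-9]*)$', ref)
    if match is None:
        raise ValueError('not an A1-style reference: %r' % (ref,))
    letters, digits = match.groups()
    col = 0
    for letter in letters.upper():
        # Column letters are a base-26 number with digits A=1 .. Z=26.
        col = col * 26 + (ord(letter) - ord('A') + 1)
    return int(digits) - 1, col - 1


print(cell_ref_to_indices('A1'))
print(cell_ref_to_indices('C4'))
```

With it, a cell could be read as sheet.cell(*cell_ref_to_indices('C4')).value.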

xlwt

http://pypi.python.org/pypi/xlwt

Import xlwt

import xlwt

Create a new Excel file

file = xlwt.Workbook() # note that Workbook starts with a capital W here, annoyingly

Create a new sheet

sheet = file.add_sheet('sheet name')

Write data with sheet.write(row, column, value)

sheet.write(0,0,'test')
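
Writing a whole table is just that call repeated for every cell. As a rough sketch, the helper below writes a list of rows through any object that has a write(row, column, value) method. The names write_rows and RecordingSheet are hypothetical; RecordingSheet is an in-memory stand-in, not an xlwt class, used so the sketch runs without creating a file.

```python
def write_rows(sheet, rows, first_row=0, first_col=0):
    """Write `rows` (a list of lists) to `sheet`, one write call per cell.

    Returns the number of cells written.
    """
    count = 0
    for row_offset, row in enumerate(rows):
        for col_offset, value in enumerate(row):
            sheet.write(first_row + row_offset, first_col + col_offset, value)
            count += 1
    return count


class RecordingSheet(object):
    """Hypothetical stand-in that records write(row, col, value) calls."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


demo = RecordingSheet()
written = write_rows(demo, [['name', 'qty'], ['apple', 3]])
print(written)
print(demo.cells[(1, 0)])
```

A sheet returned by add_sheet should be usable in place of the stand-in, since the helper only calls write with a row, a column and a value.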

Writing to the same cell more than once raises an error:
# Exception: Attempt to overwrite cell:
# sheetname=u'sheet 1' rowx=0 colx=0

To avoid this, pass cell_overwrite_ok=True when adding the sheet

sheet = file.add_sheet('sheet name',cell_overwrite_ok=True)

Save the file

file.save('demo.xls')

You can also use styles

style = xlwt.XFStyle() # create the style

font = xlwt.Font() # create a font for the style

font.name = 'Times New Roman'

font.bold = True

style.font = font # assign the font to the style

sheet.write(0, 0, 'some bold Times text', style) # write the cell using the style

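xlwt lets you format individual cells or entire rows. You can also add hyperlinks and formulas. The source code includes examples worth reading:

dates.py, shows how to set different date formats

hyperlinks.py, shows how to create hyperlinks (hint: you need to use a formula)

merged.py, shows how to merge cells

row_styles.py, shows how to apply a style to an entire row of cells.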

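Working with large Excel files

If you are working with particularly large Excel files, there are two xlrd features you should know about:

Passing on_demand=True to open_workbook causes worksheets to be loaded into memory only when they are requested.
The xlrd.Book object has an unload_sheet method that unloads a worksheet from memory, given either the sheet index or the sheet name.

The following example iterates over a large workbook, inspects only the sheets whose names match a pattern, and unloads each one from memory once it has been processed.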

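from xlrd import open_workbook

# With on_demand=True, a sheet is loaded only when it is requested
book = open_workbook('simple.xls', on_demand=True)

for name in book.sheet_names():
    if name.endswith('2'):
        sheet = book.sheet_by_name(name)
        print(sheet.cell_value(0, 0))
        # Release the sheet's memory once we are done with it
        book.unload_sheet(name)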

Reference: http://blog.sina.com.cn/s/blog_63f0cfb20100o617.html