When people think of tables, Excel is usually the first thing that comes to mind. Besides Microsoft's spreadsheet, there are good spreadsheet programs on Linux, and Google offers a good online one. Spreadsheets have long been put to a wide variety of uses, and a lot of data lives in them, so Python needs to be able to work with spreadsheets too.
1, openpyxl
The openpyxl module is a third-party library for reading and writing Excel 2010 files with the xlsx/xlsm/xltx/xltm extensions, the formats used by Microsoft Excel 2007/2010.
Install (open a shell):
C:\Windows\system32>pip install openpyxl
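A quick way to confirm the installation worked is to import the package and print its version. This is only a minimal check; the version string you see depends on what pip installed.

```python
import openpyxl

# __version__ holds the version string of the installed package.
installed_version = openpyxl.__version__
print("openpyxl version:", installed_version)
```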
Workbook and sheet
The Workbook object provides the following attributes:
active: the currently active Worksheet
worksheets: all Worksheets (sheets), as a list
read_only: whether the Excel document was opened in read_only mode
encoding: the character set encoding of the document
properties: the metadata of the document, such as title, creator, creation date, etc.
sheetnames: the names of the sheets in the workbook, as a list
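It also provides the following methods:
get_sheet_names: returns the names of all sheets (deprecated in newer versions; read the Workbook's sheetnames attribute instead)
get_sheet_by_name: returns the Worksheet object with the given name (also deprecated in newer versions; index the Workbook instead, as in wb['sheet name'])
get_active_sheet: returns the active sheet (newer versions recommend the active attribute)
remove_sheet: deletes a sheet (newer versions offer wb.remove(ws) for this)
create_sheet: creates an empty sheet
copy_worksheet: copies a sheet within the Workbook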
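The sheet-management methods above can be tried out together. The following is a minimal sketch, assuming a recent openpyxl release; the sheet names "data", "summary" and "first" are made up for illustration, and wb.remove() is used in place of the older remove_sheet.

```python
from openpyxl import Workbook

wb = Workbook()                        # a new workbook starts with one sheet
data = wb.active
data.title = "data"
data["A1"] = 1

summary = wb.create_sheet("summary")   # appended after the existing sheets
first = wb.create_sheet("first", 0)    # inserted at position 0
duplicate = wb.copy_worksheet(data)    # the copy is added to the same workbook

names_before = list(wb.sheetnames)
wb.remove(summary)                     # newer replacement for remove_sheet
names_after = list(wb.sheetnames)

print(names_before)
print(names_after)
print(duplicate.title, duplicate["A1"].value)
```

Note that copy_worksheet only works inside one workbook; it does not copy a sheet from one Workbook object into another.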
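The Worksheet object provides the following attributes:
title: the title of the sheet
dimensions: the size of the sheet, meaning the area that actually contains data, written as top-left coordinate:bottom-right coordinate
max_row: the largest row number in use
min_row: the smallest row number in use
max_column: the largest column number in use
min_column: the smallest column number in use
rows: the cells (Cell objects) row by row (generator)
columns: the cells (Cell objects) column by column (generator)
freeze_panes: the frozen panes
values: the contents (data) of the sheet row by row (generator)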
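To make the Worksheet attributes concrete, here is a small sketch that fills a sheet and then inspects it. The sample data is invented, and ws.append(), which adds one row at a time, is not covered in the text above; it is used here only to populate the sheet quickly.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.title = "scores"

# append() writes one row after the last used row (row 1 on an empty sheet).
ws.append(["name", "score"])
ws.append(["alice", 90])
ws.append(["bob", 85])

ws.freeze_panes = "A2"   # keep the header row visible while scrolling

print(ws.title)
print(ws.dimensions)     # area that contains data
print(ws.min_row, ws.max_row, ws.min_column, ws.max_column)

# values yields plain data row by row; rows/columns yield Cell objects.
all_rows = list(ws.values)
for row in all_rows:
    print(row)

first_column = [cell.value for cell in next(iter(ws.columns))]
print(first_column)
```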
(open interactive mode):
>>> from openpyxl import Workbook    # import the Workbook class
>>> wb = Workbook()    # create a new workbook in memory; the file name is supplied later, when calling wb.save()
>>> ws = wb.active
>>> ws1 = wb.create_sheet()    # add a sheet
>>> ws.title = "python"    # name the first sheet
>>> ws01 = wb['python']    # get a sheet by its name
>>> ws is ws01    # check that both names refer to the same sheet
True
>>> print(wb.sheetnames)    # print all sheet names
['python', 'Sheet1']
>>> for sh in wb:    # iterate over the sheets
...     print(sh.title)
...
python
Sheet1
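cell:
A Cell object provides the following attributes:
row: the row the cell is in
column: the column the cell is in
value: the value of the cell
coordinate: the coordinate of the cell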
Cells are the units one level below a sheet, so a particular cell can be obtained like this:
>>> a1 = ws['A1']
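If the cell A1 already exists, this binds it to the variable a1; if the sheet has no such cell yet, the cell object is created.
Note that when we open Excel, a large grid of cells is already drawn by default. A spreadsheet manipulated from Python has no such ready-made grid; nothing exists until it is created. So, following the steps above, the statement creates the cell A1 and makes the variable a1 refer to that object.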
>>> ws['A1'] = 333    # put data into A1
>>> a1.value    # read the value of A1
333
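The Cell attributes listed earlier can be inspected in the same way. The sketch below uses a made-up cell B3; as far as I know, recent openpyxl versions return the column as a number while older ones returned the letter, so treat the printed column value as version-dependent.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

ws["B3"] = "hello"
cell = ws["B3"]          # the same Cell object the assignment wrote to

print(cell.coordinate)   # position in letter-number form
print(cell.row)          # row number, counted from 1
print(cell.column)       # column index (a letter in older versions)
print(cell.value)

# Assigning to .value changes the cell stored in the sheet.
cell.value = "changed"
print(ws["B3"].value)
```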
Cell objects can also be obtained like this:
>>> cells = ws["A1":"B1"]    # fetch a range of cells at once
>>> ws['A2'] = 444
>>> ws['B1'] = "dadasd"
>>> wb.save("D://test.xlsx")    # save the file
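Besides the square-bracket syntax, cells can be addressed by row and column number or walked as a rectangular area. The following sketch reuses the values from the session above; ws.cell() and ws.iter_rows() are not mentioned in the text, so read this as one possible alternative rather than the approach the article itself takes.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# ws.cell() addresses a cell by 1-based row and column numbers.
ws.cell(row=1, column=1, value=333)
ws.cell(row=1, column=2, value="dadasd")
ws.cell(row=2, column=1, value=444)

# A range string returns a tuple of rows, each row a tuple of Cell objects.
block = ws["A1:B2"]
block_values = [[c.value for c in row] for row in block]
print(block_values)

# iter_rows() walks a rectangular area row by row.
coords = []
for row in ws.iter_rows(min_row=1, max_row=2, min_col=1, max_col=2):
    for c in row:
        coords.append(c.coordinate)
print(coords)
```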
Reading an existing file:
>>> from openpyxl import load_workbook    # import the load_workbook function
>>> wb2 = load_workbook("D://test.xlsx")    # open an existing file
>>> print(wb2.sheetnames)    # list the sheets in the file
['python', 'Sheet1']
>>> ws_wb2 = wb2["python"]
>>> for row in ws_wb2.rows:
...     for cell in row:
...         print(cell.value)
...
333
dadasd
444
None
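The write and read steps can be combined into one self-contained script. This sketch saves to a temporary directory instead of D://test.xlsx so that it runs on any system; the temporary path is an assumption made for portability, not something the session above uses.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Hypothetical location: a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "test.xlsx")

# Write the same three values as in the session above.
wb = Workbook()
ws = wb.active
ws.title = "python"
ws["A1"] = 333
ws["B1"] = "dadasd"
ws["A2"] = 444
wb.save(path)

# Read the file back and collect every cell value row by row.
wb2 = load_workbook(path)
ws2 = wb2["python"]
loaded = [[cell.value for cell in row] for row in ws2.rows]

print(wb2.sheetnames)
print(loaded)
```

The empty position B2 comes back as None because the sheet's data area spans A1 to B2 even though only three cells were written.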
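2, Other third-party libraries
Besides openpyxl, there are other third-party libraries for spreadsheets. A few are listed below for reference; they are all used in much the same way.
xlsxwriter: for Excel 2010 formats such as .xlsx. Official site: https://xlsxwriter.readthedocs.org/. The official documentation is richly illustrated and very readable.
The following two handle spreadsheets in the .xls format:
xlrd: online documentation, https://secure.simplistix.co.uk/svn/xlrd/trunk/xlrd/doc/xlrd.html?p=4966.
xlwt: online documentation, http://xlwt.readthedocs.org/en/latest/.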