Python xlrd 读取 Excel 表格基础(一):获取sheet、行、列对象,读取数据类型等

前言:

前面已经介绍了Python 第三方模块xlwt,将数据写入Excel表格的基础操作。从这片博客开始我们来介绍 xlrd 模块读取 Excel 表格数据内容的操作。

本篇主要内容为:

1、获取工作簿、sheet表对象;
2、获取sheet表中行对象,列对象;
3、获取sheet表中所有数据;
4、获取工作簿对象所有sheet表数据。

1、获取工作簿、sheet对象

在CSDN创作中心的数据数据详情中下载了一个Excel表格文件,然后做成了两个sheet表,文件截图如下:
sheet1 截图:

在这里插入图片描述

Sheet2 截图:
在这里插入图片描述

获取工作簿文件对象(打开文件):

# 导入模块
import xlrd

# 打开文件方式1:
work_book = xlrd.open_workbook('test01.xls')
# 方式2:
w2 = xlrd.book.open_workbook_xls('表02.xls')

以上两种方式都可以获取 xls 格式的Excel文件对象,其实也都是在 xlrd 中创建了Book 对象,推荐第一种打开方式,操作更简单。

获取工作簿对象属性:

# 获取工作簿中sheet表数量
print(work_book.nsheets)
# 打印结果:
# 2

# 获取工作簿中所有sheet表对象
sheets = work_book.sheets()
print(sheets)
# ------运行结果------
# [<xlrd.sheet.Sheet object at 0x0000025838B69E80>,
# <xlrd.sheet.Sheet object at 0x0000025838B69E48>]
# ------运行结果------

# 获取工作簿所有sheet表对象名称
sheets_name = work_book.sheet_names()
print(sheets_name)
# ------运行结果------
# ['sheet1', 'Sheet2']
# ------运行结果------

按索引、名称获取sheet表对象:

# 按索引获取sheet对象
sheet_1 = work_book.sheet_by_index(0)
print(sheet_1)
# ------运行结果------
# <xlrd.sheet.Sheet object at 0x000001CE3473C550>
# ------运行结果------

# 按sheet表名称获取sheet对象,名称分大小写
sheet_2 = work_book.sheet_by_name('Sheet2')
print(sheet_2)
# ------运行结果------
# <xlrd.sheet.Sheet object at 0x000001C6A5B7C710>
# ------运行结果------

2、获取sheet表中行、列对象

获取单元格对象及属性:

# 获取sheet表单元格对象,单元格数据类型:单元格值
cell_0 = sheet_1.cell(0,0)
print(cell_0)
# ------运行结果------
# text:'日期'
# ------运行结果------

# 获取sheet表单元格值
cell_0_value = sheet_1.cell_value(0,0)
print(cell_0_value)
# ------运行结果------
# 日期
# ------运行结果------

# 获取单元格类型
cell_0_type = sheet_1.cell_type(0,0)
print(cell_0_type)
# ------运行结果------
# 1
# ------运行结果------

单元格类型索引:

'''
单元格类型索引:
XL_CELL_EMPTY: 'empty',
XL_CELL_TEXT: 'text',
XL_CELL_NUMBER: 'number',
XL_CELL_DATE: 'xldate',
XL_CELL_BOOLEAN: 'bool',
XL_CELL_ERROR: 'error',
XL_CELL_BLANK: 'blank',
'''

获取sheet表行(row)对象及属性:

# 获取sheet表对象有效行数
row_sum = sheet_1.nrows
print(row_sum)
# ------运行结果------
# 29
# ------运行结果------

# 获取sheet表某一行长度
row_len = sheet_1.row_len(0)
"def row_len(self, rowx):"
print(row_len)
# ------运行结果------
# 2
# ------运行结果------

获取行对象数据类型及值:

# 获取sheet表某一行所有数据类型及值
row_0 = sheet_1.row(0)
"def row(self, rowx):"
print(row_0)
# ------运行结果------
# [text:'日期', text:'访问量']
# ------运行结果------

# 获取某一行对象数据类型、值,可指定开始结束列
row_0_s = sheet_1.row_slice(0,0,1)
"def row_slice(self, rowx, start_colx=0, end_colx=None):"
print(row_0_s)
# ------运行结果------
# [text:'日期']
# ------运行结果------

# 获取sheet表对象某一行数据类型,返回一个数组对象
row_0_type = sheet_1.row_types(0)
"def row_types(self, rowx, start_colx=0, end_colx=None):"
print(row_0_type)
# ------运行结果------
# array('B', [1, 1])
# ------运行结果------

# 获取sheet表对象某一行数据值
row_0_value = sheet_1.row_values(0)
"def row_values(self, rowx, start_colx=0, end_colx=None):"
print(row_0_value)
# ------运行结果------
# ['日期', '访问量']
# ------运行结果------

获取sheet表所有行(获得一个生成器):

# 获得sheet对象所有行对象生成器
rows = sheet_1.get_rows()
print(rows)
# ------运行结果------
# <generator object Sheet.get_rows.<locals>.<genexpr> at 0x000001AF2C41DDE0>
# ------运行结果------

for row in rows:
    print(row)
# ------运行结果------
# [text:'日期', text:'访问量']
# [xldate:43974.0, number:17.0]
# [xldate:43975.0, number:36.0]
# [xldate:43976.0, number:22.0]
# [xldate:43977.0, number:20.0]
# [xldate:43978.0, number:18.0]
# [xldate:43979.0, number:223.0]
# [xldate:43980.0, number:1187.0]
# [xldate:43981.0, number:854.0]
# [xldate:43982.0, number:119.0]
# [xldate:43983.0, number:164.0]
# [xldate:43984.0, number:123.0]
# [xldate:43985.0, number:32.0]
# [xldate:43986.0, number:336.0]
# [xldate:43987.0, number:277.0]
# [xldate:43988.0, number:14.0]
# [xldate:43989.0, number:57.0]
# [xldate:43990.0, number:18.0]
# [xldate:43991.0, number:12.0]
# [xldate:43992.0, number:67.0]
# [xldate:43993.0, number:532.0]
# [xldate:43994.0, number:1120.0]
# [xldate:43995.0, number:621.0]
# [xldate:43996.0, number:53.0]
# [xldate:43997.0, number:22.0]
# [xldate:43998.0, number:51.0]
# [xldate:43999.0, number:511.0]
# [xldate:44000.0, number:774.0]
# [xldate:44001.0, number:850.0]
# ------运行结果------

注:Excel表格中 0 代表的是 1900 年 1 月 1日,即:

在这里插入图片描述
获取列对象及属性:

# 获取sheet表有效列数
col_sum = sheet_1.ncols
print(col_sum)
# ------运行结果------
# 2
# ------运行结果------

# 获取列对象
# #################### 该方法好像有问题... ####################3
col_0 = sheet_1.col_slice(1)
"def col_slice(self, colx, start_rowx=0, end_rowx=None):"
print(cell_0)

# 获取某一列的值
col_0_value = sheet_1.col_values(0)
"def col_values(self, colx, start_rowx=0, end_rowx=None):"
print(col_0_value)
# ------运行结果------
# ['日期', 43974.0, 43975.0, 43976.0, 43977.0, 43978.0, 43979.0, 43980.0,
# 43981.0, 43982.0, 43983.0, 43984.0, 43985.0, 43986.0, 43987.0, 43988.0,
# 43989.0, 43990.0, 43991.0, 43992.0, 43993.0, 43994.0, 43995.0, 43996.0,
# 43997.0, 43998.0, 43999.0, 44000.0, 44001.0]
# ------运行结果------

# 获取某一列的数据类型
col_0_type = sheet_1.col_types(0)
"def col_types(self, colx, start_rowx=0, end_rowx=None):"
print(col_0_type)
# ------运行结果------
# [1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
# 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
# ------运行结果------

3、读取sheet对象中所有数据

1、按行读取:

# 按行读取
data_row = []
for row in range(sheet_1.nrows):
    data_row.append(sheet_1.row_values(row))

print(data_row)
# ------运行结果------
# [['日期', '访问量'], [43974.0, 17.0], [43975.0, 36.0],
# [43976.0, 22.0], [43977.0, 20.0], [43978.0, 18.0],
# [43979.0, 223.0], [43980.0, 1187.0], [43981.0, 854.0],
# [43982.0, 119.0], [43983.0, 164.0], [43984.0, 123.0],
# [43985.0, 32.0], [43986.0, 336.0], [43987.0, 277.0],
# [43988.0, 14.0], [43989.0, 57.0], [43990.0, 18.0],
# [43991.0, 12.0], [43992.0, 67.0], [43993.0, 532.0],
# [43994.0, 1120.0], [43995.0, 621.0], [43996.0, 53.0],
# [43997.0, 22.0], [43998.0, 51.0], [43999.0, 511.0],
# [44000.0, 774.0], [44001.0, 850.0]]
# ------运行结果------

2、按列读取:

# 按列读取
data_col = [sheet_1.col_values(i) for i in range(sheet_1.ncols)]
print(data_col)
# ------运行结果------
# [['日期', 43974.0, 43975.0, 43976.0, 43977.0, 43978.0, 43979.0,
# 43980.0, 43981.0, 43982.0, 43983.0, 43984.0, 43985.0, 43986.0,
# 43987.0, 43988.0, 43989.0, 43990.0, 43991.0, 43992.0, 43993.0,
# 43994.0, 43995.0, 43996.0, 43997.0, 43998.0, 43999.0, 44000.0,
# 44001.0],
# ['访问量', 17.0, 36.0, 22.0, 20.0, 18.0, 223.0, 1187.0, 854.0,
# 119.0, 164.0, 123.0, 32.0, 336.0, 277.0, 14.0, 57.0, 18.0, 12.0,
# 67.0, 532.0, 1120.0, 621.0, 53.0, 22.0, 51.0, 511.0, 774.0, 850.0]]
# ------运行结果------

4、按行读取Excel文件所有sheet表数据

构造一个字典,按行读取文件所有数据:

# 按行读取test01.xls 所有 sheet 表数据
all_data = {}

for i,sheet_obj in enumerate(work_book.sheets()):
    all_data[i] = [sheet_obj.row_values(row)
                   for row in range(sheet_obj.nrows)]
print(all_data)
# ------运行结果------
# {0: [['日期', '访问量'], [43974.0, 17.0], [43975.0, 36.0],
# [43976.0, 22.0], [43977.0, 20.0], [43978.0, 18.0], [43979.0, 223.0],
# [43980.0, 1187.0], [43981.0, 854.0], [43982.0, 119.0],
# [43983.0, 164.0], [43984.0, 123.0], [43985.0, 32.0],
# [43986.0, 336.0], [43987.0, 277.0], [43988.0, 14.0],
# [43989.0, 57.0], [43990.0, 18.0], [43991.0, 12.0],
# [43992.0, 67.0], [43993.0, 532.0], [43994.0, 1120.0],
# [43995.0, 621.0], [43996.0, 53.0], [43997.0, 22.0],
# [43998.0, 51.0], [43999.0, 511.0], [44000.0, 774.0],
# [44001.0, 850.0]],
# 1: [['日期', '访问量', '评论数', '收藏数', '粉丝数'],
# [43974.0, 17.0, 0.0, 0.0, 0.0], [43975.0, 36.0, 0.0, 0.0, 0.0],
# [43976.0, 22.0, 0.0, 0.0, 0.0], [43977.0, 20.0, 0.0, 0.0, 0.0],
# [43978.0, 18.0, 0.0, 0.0, 0.0], [43979.0, 223.0, 0.0, 0.0, 0.0],
# [43980.0, 1187.0, 0.0, 0.0, 0.0], [43981.0, 854.0, 0.0, 0.0, 1.0],
# [43982.0, 119.0, 0.0, 7.0, 0.0], [43983.0, 164.0, 0.0, 1.0, 0.0],
# [43984.0, 123.0, 0.0, 0.0, 0.0], [43985.0, 32.0, 1.0, 0.0, 1.0],
# [43986.0, 336.0, 1.0, 1.0, 1.0], [43987.0, 277.0, 0.0, 0.0, 0.0],
# [43988.0, 14.0, 0.0, 0.0, 0.0], [43989.0, 57.0, 3.0, 4.0, 0.0],
# [43990.0, 18.0, 0.0, 0.0, 0.0], [43991.0, 12.0, 0.0, 0.0, 0.0],
# [43992.0, 67.0, 1.0, 3.0, 0.0], [43993.0, 532.0, 2.0, 6.0, 1.0],
# [43994.0, 1120.0, 0.0, 9.0, 7.0], [43995.0, 621.0, 0.0, 0.0, 0.0],
# [43996.0, 53.0, 0.0, 2.0, 0.0], [43997.0, 22.0, 0.0, 0.0, 0.0],
# [43998.0, 51.0, 0.0, 1.0, 0.0], [43999.0, 511.0, 0.0, 7.0, 3.0],
# [44000.0, 774.0, 0.0, 3.0, 0.0], [44001.0, 850.0, 0.0, 10.0, 0.0]]}
# ------运行结果------

最后:

以上就是 xlrd 读取Excel表格中数据的操作介绍,总体来说是比较简单的,感谢阅读。

【Python与Excel表格】专栏 会对Python常用Excel表格处理模块的使用进行讲解,也会搭配一些实例演练,强化对知识点的运用和理解。感兴趣的朋友,可以点个关注、收藏。

猜你喜欢

转载自blog.csdn.net/zhouz92/article/details/106883774