【Python网络爬虫】150讲轻松搞定Python网络爬虫付费课程笔记 篇十五——数据存储:excel文件处理

1. execl 打开和获取sheet

首先需要安装两个库:

  • xlrd 用于读
  •  xlwt 用于写

打开excel文件:

  • xlrd.open_workbook("xxx.xls")

获取sheet :

import xlrd

workbook = xlrd.open_workbook('成绩表.xlsx')

# 获取sheet名字
print(workbook.sheet_names())

sheet = workbook.sheet_by_index(0)
print(sheet.name)

sheet = workbook.sheet_by_name('1班')
print(sheet.name)

sheets = workbook.sheets()
for sheet in sheets:
    print(sheet.name)


# 获取指定sheet的行列列数
sheet = workbook.sheet_by_index(0)
print({"rows": sheet.nrows, "cols": sheet.ncols})

 

2. 获取cell及其属性

# 获取指定sheet的行列列数
sheet = workbook.sheet_by_index(0)
print({"rows": sheet.nrows, "cols": sheet.ncols})

sheet = workbook.sheet_by_index(0)
cell = sheet.cell(0, 0)
print(cell.value)

cell_value = sheet.cell_value(0, 1)
print(cell_value)

cells = sheet.row_slice(1, 1, 4)
for cell in cells:
    print(cell.value)

cells = sheet.col_slice(0, 1, sheet.nrows)
for cell in cells:
    print(cell.value)

cell_values = sheet.col_values(1, 1, sheet.nrows)
print(cell_values)

cell_values = sheet.row_values(1, 1, sheet.ncols)
print(cell_values)
print(sum(cell_values), sum(cell_values)/3.0)

 

3. cell 常用数据类型

# cell类型

sheet = workbook.sheet_by_index(0)
cell = sheet.cell(0, 0)
print(cell.ctype)
print(xlrd.XL_CELL_TEXT)

cell = sheet.cell(1, 1)
print(cell.ctype)
print(xlrd.XL_CELL_NUMBER)

cell = sheet.cell(20, 0)
print(cell.ctype)
print(xlrd.XL_CELL_DATE)

cell = sheet.cell(19, 0)
print(cell.ctype)
print(xlrd.XL_CELL_BOOLEAN)

cell = sheet.cell(0, 0)
print(cell.ctype)
print(xlrd.XL_CELL_EMPTY)

 

4. excel 写入步骤

import xlwt
import random

workbook = xlwt.Workbook()
sheet = workbook.add_sheet('sheet1')

headers = ['姓名', '语文', '英语', '数学']
for index, header in enumerate(headers):
    sheet.write(0, index, header)

names = ['张三', '李四', '王五']
scores = []

for index, name in enumerate(names):
    sheet.write(index+1, 0, name)

for row in range(1, 4):
    for col in range(1, 4):
        sheet.write(row, col, random.randint(90, 100))

workbook.save('成绩表2.xls')

 

5.  编辑 excel 文件

  1. 先读取原来excel文件
  2. 读取sheet上面进行cell 修改,使用sheet.put_cell(row, col, ctype, value, None) 进行实现
  3. 重新创建一个excel文件,把之前读取的数据写入到新的excel 中
import xlwt, xlrd

rwb = xlrd.open_workbook('成绩表.xlsx')
rsheet = rwb.sheet_by_index(0)

rsheet.put_cell(0, 4, xlrd.XL_CELL_TEXT, '总分', None)
nrows = rsheet.nrows

for row in range(1, nrows):
    grades = rsheet.row_values(row, 1, 4)
    print(grades)
    print(type(grades[0]))
    total = sum(grades)
    rsheet.put_cell(row, 4, xlrd.XL_CELL_NUMBER, total, None)

nrows = rsheet.nrows
ncols = rsheet.ncols
for col in range(1, 5):
    grades = rsheet.col_values(col, 1, nrows)
    avg = sum(grades) / len(grades)
    rsheet.put_cell(nrows, col, xlrd.XL_CELL_NUMBER, avg, None)

wwb = xlwt.Workbook()
wsheet = wwb.add_sheet('sheet1')
for row in range(0, rsheet.nrows):
    for col in range(0, rsheet.ncols):
        wsheet.write(row, col, rsheet.cell_value(row, col))

wwb.save('abc.xls')

猜你喜欢

转载自blog.csdn.net/weixin_44566432/article/details/108735567