Reading and writing Excel files in Python with openpyxl

Foreword

According to the official documentation, openpyxl is a third-party library for processing Excel files in xlsx/xlsm format ("A Python library to read/write Excel 2010 xlsx/xlsm files").
There are three main concepts in openpyxl: Workbook (the whole file), Sheet (one worksheet in the workbook) and Cell (a single cell in a sheet).
The main operations in openpyxl follow from these: open a Workbook, locate a Sheet, and operate on a Cell.
The main methods for reading and writing with openpyxl are described below.

Main text

Installation

With pip: pip install openpyxl (prefix with sudo only if you are installing into a system-wide Python)

From source: python setup.py install (download link at the bottom)

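Workbook operations
  • Read a Workbook
from openpyxl import load_workbook
# Load an existing Excel file: readable and writable by default
wb = load_workbook("sample.xlsx")
# Open the file in read-only mode
wb = load_workbook("sample.xlsx", read_only=True)
  • Write a Workbook
from openpyxl import Workbook
# Create a new workbook (not yet saved)
wb = Workbook()
# Write-only mode (a write-only workbook can only be saved once)
wb = Workbook(write_only=True)
# Save the file; if the path is the one the workbook was loaded from, that file is overwritten
wb.save(r"F:\sample.xlsx")
# Save the workbook as a template. Older openpyxl releases accepted
# wb.save("template.xltx", as_template=True); recent releases use the
# wb.template attribute instead (False by default)
wb.template = True
wb.save("template.xltx")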
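As a rough end-to-end illustration of the open/save cycle described above, the sketch below creates a workbook, saves it, and loads it back in both the default and the read-only mode. The file name roundtrip.xlsx and the temporary directory are made up for the example, and the values_only argument of iter_rows() assumes openpyxl 2.6 or later.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Hypothetical file name inside a throwaway directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "roundtrip.xlsx")

# A new workbook always starts with one (active) sheet.
wb = Workbook()
wb.active["A1"] = "hello"
wb.save(path)

# Default mode: the workbook can be read and modified.
wb_rw = load_workbook(path)
value_rw = wb_rw.active["A1"].value

# Read-only mode: cells are streamed from the file, which saves memory.
wb_ro = load_workbook(path, read_only=True)
first_row = next(wb_ro.active.iter_rows(min_row=1, max_row=1, values_only=True))
value_ro = first_row[0]
wb_ro.close()  # read-only workbooks keep the file open until closed

print(value_rw, value_ro)
```

Read-only mode is mainly worth it for large files; for small ones the default mode is simpler to work with.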
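Sheet operations
  • Read a sheet
# Get the names of all sheets (wb.get_sheet_names() is deprecated)
name_list = wb.sheetnames
# Get each sheet by its name (wb.get_sheet_by_name(name) is deprecated)
for name in name_list:
    my_sheet = wb[name]
    # Get the sheet title
    print(my_sheet.title)
# Get the currently active sheet (wb.get_active_sheet() is deprecated)
my_sheet = wb.active
# Get a sheet by index; the index starts at 0
my_sheet = wb.worksheets[0]
# Largest row number in use
my_sheet.max_row
# Largest column number in use
my_sheet.max_column
# Set the color of the sheet tab (white by default)
my_sheet.sheet_properties.tabColor = "FF0000"
  • Write a sheet
# Get the names of all sheets
wb.sheetnames
# Rename the sheet
my_sheet.title = "Sheet1"
# Create a new sheet at a given position; 0 is the first position
data_sheet = wb.create_sheet("Data", index=1)
# By default the new sheet is appended at the end of the workbook
new_sheet = wb.create_sheet()
# Delete a sheet by passing the worksheet object
wb.remove(new_sheet)
# Delete a sheet by name (the key must be the sheet name, not the worksheet object)
del wb["Data"]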
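The sheet operations above can be tried out on an in-memory workbook without touching any file. This is only a small sketch; the sheet names Summary, Data and Cover are invented for the example.

```python
from openpyxl import Workbook

wb = Workbook()

# Rename the sheet that every new workbook starts with.
first = wb.active
first.title = "Summary"

wb.create_sheet("Data")            # appended at the end
wb.create_sheet("Cover", index=0)  # inserted at the first position
scratch = wb.create_sheet()        # appended with an automatically chosen name

names_before = list(wb.sheetnames)

wb.remove(scratch)  # delete by worksheet object
del wb["Data"]      # delete by sheet name

names_after = wb.sheetnames
print(names_before)
print(names_after)
```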
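Cell operations
  • Read a Cell
# Get a cell; as in Excel, the letter comes before the number, i.e. column first, then row
c3 = my_sheet["C3"]
# Column: 3 in openpyxl 2.6 and later (older releases return "C"); c3.column_letter gives "C"
c3.column
# Row, i.e. 3
c3.row
# Coordinate, i.e. C3
c3.coordinate
# The value stored in the cell
c3.value
# Besides subscripting, the cell() method takes numeric indices; this is also C3
c3_cell = my_sheet.cell(row=3, column=3)
print(c3_cell.value)
# Get the largest row and the largest column
print(my_sheet.max_row)
print(my_sheet.max_column)
# Read row by row: cells come back in the order A1, B1, C1, ...
for row in my_sheet.rows:
    for cell in row:
        print(cell.value)
# Read column by column: cells come back in the order A1, A2, A3, ...
for column in my_sheet.columns:
    for cell in column:
        print(cell.value)

# Get the data of a single row, e.g. the third row (a tuple of cells)
for cell in list(my_sheet.rows)[2]:
    print(cell.value)

# Get the data of a rectangular region (here A1:B3)
for i in range(1, 4):
    for j in range(1, 3):
        print(my_sheet.cell(row=i, column=j))

# Use iter_rows() to get several cells (here A1:C2); recent releases take
# row/column bounds instead of a range string such as "A1:C2"
for row in my_sheet.iter_rows(min_row=1, max_row=2, min_col=1, max_col=3):
    for cell in row:
        print(cell)

# Use the sheet like a slice
for row_cell in my_sheet["A1":"B3"]:
    for cell in row_cell:
        print(cell)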
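To make the different read styles easier to compare, here is a small self-contained sketch that fills a 3x3 grid with made-up numbers and reads it back several ways. It assumes openpyxl 2.6 or later, where cell.column is an integer and column_letter and values_only are available.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Fill A1:C3 with sample values: row 1 -> 11, 12, 13; row 2 -> 21, 22, 23; ...
for r in range(1, 4):
    for c in range(1, 4):
        ws.cell(row=r, column=c, value=r * 10 + c)

# A single cell and its position attributes.
c3 = ws["C3"]
coords = (c3.row, c3.column, c3.column_letter, c3.coordinate)

# A rectangular block (A1:C2) as plain values instead of Cell objects.
block = [
    list(row)
    for row in ws.iter_rows(min_row=1, max_row=2, min_col=1, max_col=3, values_only=True)
]

# A whole column, and a slice-style range.
first_column = [cell.value for cell in ws["A"]]
sliced = [[cell.value for cell in row] for row in ws["A1":"B3"]]

print(coords, block, first_column, sliced)
```

Passing values_only=True avoids the inner loop over Cell objects when only the values are needed.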
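  • Write a Cell
# Simply assign a value to the cell
my_sheet["A1"] = "test"
# Write the average of B2:B8 into B9
my_sheet["B9"] = "=AVERAGE(B2:B8)"
# Append one row
row = [1, 2, 3, 4, 5]
my_sheet.append(row)
# Append several rows; append() takes a single row, so loop over them
rows = [
    ["ID", "data1", "data2"],
    [2, 40, 20],
    [3, 40, 25],
    [4, 40, 30],
    [5, 40, 35],
    [6, 45, 40],
    [7, 40, 45],
]
for row in rows:
    my_sheet.append(row)
# Append the same data column-wise: transpose it, then append each resulting row
columns = list(zip(*rows))
for column in columns:
    my_sheet.append(column)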
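Since append() accepts exactly one row per call, a loop is needed for tabular data. The sketch below shows this on an empty in-memory workbook; the sheet name Transposed and the shortened sample data are assumptions for the example.

```python
from openpyxl import Workbook

rows = [
    ["ID", "data1", "data2"],
    [2, 40, 20],
    [3, 40, 25],
]

wb = Workbook()
ws = wb.active

# One append() call per row; the first call lands in row 1 of an empty sheet.
for row in rows:
    ws.append(row)

# A formula is stored as a plain string starting with "=".
ws["B4"] = "=AVERAGE(B2:B3)"

# Transposed copy on a second sheet: each original column becomes a row.
ws_t = wb.create_sheet("Transposed")
for column in zip(*rows):
    ws_t.append(column)

print(list(ws.values))
print(list(ws_t.values))
```

openpyxl does not evaluate the formula; it only writes it to the file, and Excel computes the result when the file is opened.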

Converting between column letters and column numbers

from openpyxl.utils import get_column_letter, column_index_from_string
# Return the letter for a column number
print(get_column_letter(3))  # C
# Return the column number for a letter
print(column_index_from_string("C"))  # 3
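The two helpers also handle columns beyond Z, which is where doing the conversion by hand gets error-prone. A short sketch with a few arbitrarily chosen column numbers:

```python
from openpyxl.utils import column_index_from_string, get_column_letter

# Column numbers are 1-based; after Z (26) the letters continue with AA, AB, ...
letters = [get_column_letter(i) for i in (1, 26, 27, 52, 703)]
print(letters)

# The two functions are inverses of each other.
round_trip_ok = all(
    column_index_from_string(get_column_letter(i)) == i for i in range(1, 1000)
)
print(round_trip_ok)
```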

Setting cell styles

from openpyxl.styles import Alignment, Font, PatternFill
# Font: 等线 (DengXian), size 24, bold and italic, red. A hex string is used for the
# color because the colors.RED constant is missing from some recent openpyxl releases
bold_italic_24_font = Font(name="等线", size=24, italic=True, color="FF0000", bold=True)
my_sheet["A1"].font = bold_italic_24_font
# Fill color: solid red
my_sheet["A2"].fill = PatternFill(fill_type="solid", fgColor="00FF0000", bgColor="00FF0000")
# Alignment: center the data in B1 both vertically and horizontally
my_sheet["B1"].alignment = Alignment(horizontal="center", vertical="center")
# Set the row height and the column width
my_sheet.row_dimensions[2].height = 40
my_sheet.column_dimensions["C"].width = 30
# Merging and unmerging cells
# After merging, data can only be written to the top-left cell,
# i.e. the first coordinate of the range
my_sheet.merge_cells("B1:G1")  # merge several cells in one row
my_sheet.merge_cells("A1:C3")  # merge the cells of a rectangular region
my_sheet.unmerge_cells("A1:C3")  # after unmerging, the value stays in A1
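The style settings can be checked by reading the attributes back, as in the sketch below. The font name Calibri, the yellow fill and the range A1:C1 are arbitrary choices for the example, not values from the text above.

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Font, PatternFill

wb = Workbook()
ws = wb.active
ws["A1"] = "Title"

# Style objects are assigned as a whole; existing ones are not modified in place.
ws["A1"].font = Font(name="Calibri", size=24, bold=True, italic=True, color="FF0000")
ws["A1"].fill = PatternFill(fill_type="solid", fgColor="FFFF00")
ws["A1"].alignment = Alignment(horizontal="center", vertical="center")

ws.row_dimensions[1].height = 40
ws.column_dimensions["A"].width = 30

# Merge A1:C1; the value of the top-left cell (A1) is the one that is kept.
ws.merge_cells("A1:C1")
merged = [rng.coord for rng in ws.merged_cells.ranges]

ws.unmerge_cells("A1:C1")
merged_after = [rng.coord for rng in ws.merged_cells.ranges]

print(merged, merged_after, ws["A1"].font.color.rgb)
```

Six-digit hex colors are stored internally with a two-character alpha prefix, which is why the color reads back as "00FF0000".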

Other notes:

  • To stay consistent with Excel, rows and columns in openpyxl are numbered from 1, not from 0 as is customary in programming languages.
  • wb.worksheets[index], however, is indexed from 0.
  • Suppose sheet["B9"] = "=AVERAGE(B2:B8)". If the workbook is loaded with data_only=True, reading B9 returns the value computed by the formula (the result Excel cached when it last saved the file). Without this parameter, the formula itself, "=AVERAGE(B2:B8)", is returned.
  • If the text is encoded as "gb2312", it shows up garbled after reading; convert it to Unicode first.
  • A newly created worksheet contains no cells. A cell is created only when it is first accessed, so cells that are never used are never created, which reduces memory consumption.
  • When saving, the file extension should match the file format.
  • To save a file in xlsm format, load it with keep_vba=True.
  • With wb = load_workbook("sample.xltm", keep_vba=True): to save it as a template, pass as_template=True; to save it as a regular document, pass as_template=False. (Recent releases set wb.template = True or False before saving instead of passing an argument to save().)
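The data_only behaviour from the notes above is easy to misread, so here is a small sketch of it. It also shows a consequence worth knowing: openpyxl never evaluates formulas itself, so a file that was written by openpyxl and never opened in Excel has no cached result, and data_only=True yields None. The file name formula.xlsx is made up for the example.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Hypothetical file name inside a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "formula.xlsx")

wb = Workbook()
ws = wb.active
for value in (10, 20, 30):
    ws.append([value])
ws["A4"] = "=AVERAGE(A1:A3)"
wb.save(path)

# Default: the formula text itself is returned.
formula = load_workbook(path).active["A4"].value

# data_only=True: the cached result is returned. Excel has never opened
# this file, so there is no cached result to return.
cached = load_workbook(path, data_only=True).active["A4"].value

print(repr(formula), repr(cached))
```

Once the file has been opened and saved in Excel, the second load would return the computed average instead.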

Links:

openpyxl official documentation: http://openpyxl.readthedocs.io/en/default/
Common examples: http://openpyxl.readthedocs.io/en/default/usage.html
BitBucket address: https://bitbucket.org/openpyxl/openpyxl
openpyxl source download: https://pypi.python.org/pypi/openpyxl
A good tutorial: https://automatetheboringstuff.com/chapter12/


If there are any mistakes, please point them out.

email: dxmdxm1992#gmail.com

blog: http://blog.csdn.net/david_dai_1108
