Basic use of openpyxl

Use of openpyxl

1 Basic structure of Excel

Workbook: An Excel spreadsheet document is called a workbook. A workbook is saved in a file with the .xlsx extension.


Sheet: A workbook can contain multiple sheets (also called worksheets). The sheet the user is currently viewing (or the last one viewed before closing Excel) is called the active sheet.


Row: A row in the sheet. Rows are addressed by numbers starting from 1.

Column: A column in the sheet. Columns are addressed by letters starting from A.

Cell: The box at the intersection of a specific row and column. Each cell contains a number or text value. The grid of cells and the data they hold make up the sheet.
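
Because rows use numbers and columns use letters, it is often handy to convert between column letters and column numbers. A minimal sketch, using the helper functions in openpyxl.utils (the variable names are only illustrative):

```python
from openpyxl.utils import column_index_from_string, get_column_letter

# Column letters map to 1-based column numbers and back.
first_letter = get_column_letter(1)            # column 1 is "A"
wide_letter = get_column_letter(28)            # column 28 is "AB"
wide_index = column_index_from_string("AB")    # and "AB" is column 28 again

print(first_letter, wide_letter, wide_index)
```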


2 Install the openpyxl module

Python does not come with openpyxl, so it needs to be installed separately before it can be used normally.

On Windows, open a command prompt and enter:

pip install openpyxl

On macOS, open a terminal and enter:

pip3 install openpyxl

If you want to test whether the installation is correct, you can enter the following code in an interactive environment:

import openpyxl

If the module is installed correctly, the import runs without any error message.

Official openpyxl documentation:

https://openpyxl.readthedocs.io/en/stable/index.html

3 Read an existing Excel table


Simple example

from openpyxl import load_workbook
# fileName is the path to the file
fileName = "工作簿.xlsx"
# Open the workbook in read-only mode
wb = load_workbook(filename=fileName, read_only=True)
# sheetName is the name of the sheet
sheetName = "Sheet1"
# Get the worksheet by its name
ws = wb[sheetName]
# Read the worksheet row by row
for row in ws.rows:
    for cell in row:
        # Print the data in the cell
        print(cell.value)
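
When only the values are needed, iter_rows(values_only=True) returns plain tuples instead of cell objects. The following is a sketch under stated assumptions: it first writes a small sample file to a temporary folder (the file name and contents are made up) so that it runs without an existing workbook.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Build a small sample file so the example does not depend on an existing workbook.
path = os.path.join(tempfile.mkdtemp(), "sample.xlsx")  # hypothetical file name
wb_out = Workbook()
ws_out = wb_out.active
ws_out.append(["name", "score"])
ws_out.append(["Ann", 90])
wb_out.save(path)

# Reopen it in read-only mode and collect plain values instead of Cell objects.
wb = load_workbook(filename=path, read_only=True)
ws = wb[wb.sheetnames[0]]
rows = list(ws.iter_rows(values_only=True))
wb.close()  # a read-only workbook keeps the file open until it is closed

print(rows)
```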

3.1 Get workbook objects

After importing the openpyxl module, you can use the load_workbook() function to read an Excel file. openpyxl.load_workbook() returns a Workbook object. For convenience in later code, this object is usually named wb.

Note: The load_workbook() method can only read an existing Excel table file, and cannot create a new Excel table.

import openpyxl

wb = openpyxl.load_workbook(filename, read_only=False, keep_vba=False, data_only=False, keep_links=True)

The openpyxl.load_workbook() function accepts several parameters that control how the file is read:

filename: string type, the path of the Excel file to read. It can be a relative or an absolute path.

read_only: Boolean type, chooses between read-only mode and read-write mode. Read-only mode speeds up reading. The default is False.

keep_vba: Boolean type, whether to preserve any VBA content (this does not mean you can use it). The default is False.

data_only: Boolean type. If set to True, a cell containing a formula returns the result that Excel last calculated and stored, or None if no result is stored. If set to False, the cell returns the formula itself. The default is False. Note: if you open a file with data_only=False and then save it with save(), the saved xlsx file contains only the formulas, without stored results. To get both the formulas and their results again, open the file in Excel and save it.

keep_links: Boolean type, whether to keep links to external workbooks. The default is True.
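
The effect of data_only can be seen with a file written by openpyxl itself. This is a minimal sketch that saves to a temporary folder (the file name is made up); since openpyxl does not calculate formulas, the file it writes holds no stored result.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "formula.xlsx")  # hypothetical file name

wb = Workbook()
ws = wb.active
ws["A1"] = 1
ws["A2"] = 2
ws["A3"] = "=SUM(A1:A2)"
wb.save(path)  # openpyxl stores the formula text; it does not calculate it

# Default mode returns the formula string.
formula = load_workbook(path).active["A3"].value
# data_only=True returns the stored result, which is missing in this file.
cached = load_workbook(path, data_only=True).active["A3"].value

print(formula, cached)
```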

3.2 Get worksheet object

3.2.1 Get all worksheets in the workbook

The sheetnames attribute returns the names of all sheets in the workbook as a list. The older get_sheet_names() method returns the same list but is deprecated.

from openpyxl import load_workbook
# fileName is the path to the file
fileName = "工作簿.xlsx"
# Open the workbook in read-only mode
wb = load_workbook(filename=fileName, read_only=True)
# Get the names of all worksheets in the workbook as a list
wb.sheetnames
# Deprecated equivalent: wb.get_sheet_names()

3.2.2 Select the worksheet to be operated

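Each worksheet is represented by a Worksheet object. You can get one by indexing the workbook with the sheet name, like a dictionary key, or through the older get_sheet_by_name() method, which is deprecated. You can also get the active sheet through wb.active (the last sheet viewed before Excel was closed).

# Get a worksheet by its name
sheet = wb['worksheet name']
# Deprecated equivalent: sheet = wb.get_sheet_by_name('worksheet name')

# Get the active sheet of the workbook
sheet = wb.active
# Get the name of the active sheet
sheet.title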
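
Indexing the workbook with a name that does not exist raises KeyError, so it can help to check sheetnames first. A small in-memory sketch (the sheet names "report" and "summary" are made up):

```python
from openpyxl import Workbook

wb = Workbook()                      # starts with one sheet named "Sheet"
wb.create_sheet(title="report")      # "report" is a made-up name

names = wb.sheetnames
sheet = wb["report"]
active_title = wb.active.title

# Check for a sheet before indexing, to avoid a KeyError.
has_summary = "summary" in wb.sheetnames

print(names, sheet.title, active_title, has_summary)
```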

3.2.3 Get some attributes in the table

# Get the size of the sheet (the range that contains data)
sheet.dimensions
# Get the largest row number or column number on its own
sheet.max_row
sheet.max_column
# Hide the sheet
sheet.sheet_state = 'hidden'
# Show the sheet
sheet.sheet_state = 'visible'
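
As a quick illustration of these attributes, the sketch below fills two cells of a new in-memory workbook and reads the resulting size (the cell contents are arbitrary):

```python
from openpyxl import Workbook

wb = Workbook()
sheet = wb.active
sheet["A1"] = "top-left"
sheet["C5"] = "bottom-right"

size = sheet.dimensions        # range covering every cell that holds data
last_row = sheet.max_row
last_col = sheet.max_column

print(size, last_row, last_col)
```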

3.3 Get the cell object

3.3.1 Get the cell object

With the sheet object, you can get a cell object either by its coordinate or by its row and column numbers.

# Get a cell object by its coordinate
cell = sheet['A1']
# Get a cell object by its row and column numbers
cell = sheet.cell(row=1, column=1)
# Assign a value to the cell
cell.value = 'A1'

3.3.3 Get some attributes of a cell

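# Get the row number of the cell
cell.row
# Get the column of the cell (an integer in current openpyxl; cell.column_letter gives the letter)
cell.column
# Get the coordinate of the cell, such as 'A1'
cell.coordinate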

3.3.4 Take a row or column of cells and return a tuple object

You can slice the sheet object to get all the cell objects in a row, a column, or a rectangular area of the spreadsheet. You can then loop over the slice to visit every cell in it.

# Get one column of cells
cells = sheet['A']
# Get one row of cells
cells = sheet[1]
# Get all columns in the sheet, one group per column
cells = sheet.columns
# Get all rows in the sheet, one group per row
cells = sheet.rows
# When the coordinates of the rectangular range are known
cells = sheet['A1:C5']
# When the first and last row and column numbers are known
cells = sheet.iter_cols(min_row=1, max_row=5, min_col=1, max_col=3)
cells = sheet.iter_rows(min_row=1, max_row=5, min_col=1, max_col=3)
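
The following sketch shows what these slices contain, using a small in-memory sheet with made-up numbers. Note that a string or integer index returns tuples, while iter_rows() and iter_cols() are generators.

```python
from openpyxl import Workbook

wb = Workbook()
sheet = wb.active
sheet.append([1, 2, 3])
sheet.append([4, 5, 6])

# A rectangular slice is a tuple of rows, each row a tuple of cells.
by_slice = [[cell.value for cell in row] for row in sheet["A1:C2"]]

# values_only=True yields plain values instead of cell objects.
by_row = list(sheet.iter_rows(min_row=1, max_row=2, min_col=1, max_col=3, values_only=True))
by_col = list(sheet.iter_cols(min_row=1, max_row=2, min_col=1, max_col=3, values_only=True))

first_column = [cell.value for cell in sheet["A"]]
first_row = [cell.value for cell in sheet[1]]

print(by_slice, by_row, by_col, first_column, first_row)
```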

4 Create a new Excel table

4.1 Create a new table

To create a new Excel workbook, use openpyxl.Workbook(). It builds the workbook in memory with one default worksheet.

When you modify the contents of a workbook, nothing is written to the spreadsheet file until you call the workbook's save() method.

import openpyxl
# Create a workbook; it is new, so no path argument is needed
wb = openpyxl.Workbook()
# Get the active sheet
ws = wb.active
# Rename the active sheet
ws.title = 'test1'
# Change the data in cell A1
ws['A1'] = 'A1'
# List the names of all sheets
wb.sheetnames
# Save the Excel file
wb.save('test1.xlsx')

4.2 Some operations on table objects

4.2.1 Create table

The create_sheet() method returns a new worksheet object. By default it is named SheetX and placed after the last sheet of the workbook. You can use the index and title parameters to set the position and name of the new worksheet.

# Use the workbook object to create a new sheet named test2
wb.create_sheet(index=None, title='test2')

index: integer type, the position of the new worksheet. The default None puts it at the end; 0 puts it first.

title: string type, the name of the new worksheet. If the name already exists, openpyxl appends a number to it, so a second 'test2' becomes 'test21'.
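
A minimal in-memory sketch of both parameters (the sheet names "data" and "first" are made up for the example):

```python
from openpyxl import Workbook

wb = Workbook()                              # contains one sheet named "Sheet"
wb.create_sheet(title="data")                # appended at the end
duplicate = wb.create_sheet(title="data")    # name is taken, so a number is added
wb.create_sheet(title="first", index=0)      # index=0 places the sheet first

order = wb.sheetnames
print(order, duplicate.title)
```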

4.2.2 Delete table

The remove() method takes a worksheet object, not the sheet name as a string.

wb = openpyxl.load_workbook('test.xlsx')
# Use the workbook object to delete the sheet
wb.remove(wb['test2'])
wb.save('test.xlsx')

4.2.3 Copy table

The copy_worksheet() method takes a worksheet object and adds a copy of it to the same workbook.

wb = openpyxl.load_workbook('test.xlsx')
sheet = wb.active
# Copy the selected sheet
wb.copy_worksheet(sheet)
wb.save('test.xlsx')
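
Copying and removing can be tried without a file on disk. In this sketch the sheet name "data" and the cell content are made up; the copy is returned by copy_worksheet() and gets the original title followed by " Copy".

```python
from openpyxl import Workbook

wb = Workbook()
source = wb.active
source.title = "data"
source["A1"] = "hello"

duplicate = wb.copy_worksheet(source)   # returns the new worksheet
copy_title = duplicate.title
copied_value = duplicate["A1"].value

wb.remove(source)                       # remove() takes the worksheet object
remaining = wb.sheetnames

print(copy_title, copied_value, remaining)
```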

4.2.4 Insert and delete rows and columns in the table

Insert rows and columns

wb = openpyxl.load_workbook('test.xlsx')
sheet = wb.active

# Insert rows
# Insert amount rows above row idx
sheet.insert_rows(idx=2, amount=2)

# Insert columns
# Insert amount columns to the left of column idx
sheet.insert_cols(idx=2, amount=2)

wb.save('test.xlsx')

Delete rows and columns

wb = openpyxl.load_workbook('test.xlsx')
sheet = wb.active

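# Delete rows
# Delete amount rows going down from row idx, including row idx itself
sheet.delete_rows(idx=2, amount=2)

# Delete columns
# Delete amount columns going right from column idx, including column idx itself
sheet.delete_cols(idx=2, amount=2)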

wb.save('test.xlsx')
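
To see how existing data shifts, the sketch below inserts and then deletes two rows in a one-column sheet held in memory. The helper column_a() is only for this example; columns behave the same way with insert_cols() and delete_cols().

```python
from openpyxl import Workbook

wb = Workbook()
sheet = wb.active
for number in (1, 2, 3, 4):
    sheet.append([number])

def column_a(last_row):
    """Return the values of A1 through A<last_row> as a list."""
    return [sheet.cell(row=r, column=1).value for r in range(1, last_row + 1)]

sheet.insert_rows(idx=2, amount=2)    # two empty rows appear above the old row 2
after_insert = column_a(6)

sheet.delete_rows(idx=2, amount=2)    # remove rows 2 and 3 again
after_delete = column_a(4)

print(after_insert, after_delete)
```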

4.2.5 Add data to the table

Adding new data here means appending rows to an existing sheet. To add to an existing file, first open it with load_workbook().

The append() method accepts one iterable argument, such as a list, tuple, range, or generator, or a dictionary.

If you pass in a list: the values are written in order starting from the first column, one list element per cell of the new row.

If you pass in a dictionary: each value is assigned to the column indicated by its key (column number or letter).

Note: append() can only add one row of data at a time. To add multiple rows, combine it with a loop or a similar construct.

wb = openpyxl.load_workbook('test.xlsx')
sheet = wb.active
# General form: sheet.append(iterable) adds one row of values below the current data
# Append data
sheet.append([1, 2, 3])
sheet.append({'A': 'This is A1'})
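
The different argument types can be compared side by side in memory. This is a small sketch with made-up values; the loop at the end is one way to add several rows.

```python
from openpyxl import Workbook

wb = Workbook()
sheet = wb.active

sheet.append([1, 2, 3])              # list: fills columns A, B, C of the next row
sheet.append({"A": "x", "C": "z"})   # dict keyed by column letter
sheet.append({2: "middle"})          # dict keyed by column number

# append() writes one row per call, so loop to add several rows.
for record in [("a", "b"), ("c", "d")]:
    sheet.append(record)

rows = list(sheet.iter_rows(values_only=True))
print(rows)
```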

4.3 Some operations on cells

4.3.1 Modify cell data and define formula

wb = openpyxl.load_workbook('test.xlsx')
sheet = wb.active
cell = sheet['A1']
# Change the data in the cell
cell.value = 'A1'
# Put a formula in the cell
cell.value = '=SUM(A2:C4)'
wb.save('test.xlsx')

# See which formula names openpyxl recognizes
from openpyxl.utils import FORMULAE
FORMULAE
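
A formula is stored as text, and openpyxl marks the cell as a formula cell without computing anything. A minimal in-memory sketch:

```python
from openpyxl import Workbook
from openpyxl.utils import FORMULAE

wb = Workbook()
sheet = wb.active
sheet["A1"] = "=SUM(A2:C4)"

is_known = "SUM" in FORMULAE          # FORMULAE holds the function names openpyxl knows
cell_type = sheet["A1"].data_type     # "f" marks a formula cell
stored = sheet["A1"].value            # the formula text, not a computed number

print(is_known, cell_type, stored)
```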

4.3.2 Move cells

Use the move_range() method to move cells

sheet.move_range(cell_range, rows=0, cols=0, translate=False)

wb = openpyxl.load_workbook('test.xlsx')
sheet = wb.active

# Move the cells in the range A1:C2 down one row and right two columns
sheet.move_range('A1:C2', rows=1, cols=2)
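
The same move can be checked on an in-memory sheet. In this sketch the numbers are made up; after the move the block sits in C2:E3 and the old cells are empty.

```python
from openpyxl import Workbook

wb = Workbook()
sheet = wb.active
sheet.append([1, 2, 3])
sheet.append([4, 5, 6])

sheet.move_range("A1:C2", rows=1, cols=2)   # block now occupies C2:E3

moved = [[cell.value for cell in row] for row in sheet["C2:E3"]]
old_corner = sheet["A1"].value              # the original cells are left empty

print(moved, old_corner)
```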

4.3.3 Freeze pane

When a spreadsheet holds too much data to show on one screen, freezing the header rows or columns is very helpful.

Set the freeze_panes property to a cell coordinate string to freeze all rows above that cell and all columns to its left. The row and column that contain the cell itself are not frozen.

# Freeze row 1
sheet.freeze_panes = 'A2'
# Freeze column A
sheet.freeze_panes = 'B1'
# Freeze columns A and B, and row 1
sheet.freeze_panes = 'C2'
# Freeze nothing
sheet.freeze_panes = 'A1'
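
The property can also be read back, which gives a simple way to check the current setting. A minimal sketch on a new in-memory workbook:

```python
from openpyxl import Workbook

wb = Workbook()
sheet = wb.active

sheet.freeze_panes = "B2"          # freezes row 1 and column A
frozen_at = sheet.freeze_panes

sheet.freeze_panes = "A1"          # nothing lies above or left of A1, so nothing is frozen
unfrozen = sheet.freeze_panes

print(frozen_at, unfrozen)
```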

5 Cell style modification

Setting the font style of certain cells, rows, or columns can help you emphasize key areas of the spreadsheet.

For this you need the Font class in openpyxl.styles.

Note: Once a style object (font, fill, border, alignment, etc.) has been assigned to a cell, its properties cannot be changed through the cell. To change a style, create a new instance and assign it again.

5.1 Get the cell font style

import openpyxl
wb = openpyxl.load_workbook('test.xlsx')
sheet = wb.active
cell = sheet['A1']
# Show all font properties of the cell
cell.font
# Get the font object
font = cell.font
# Get the font name
font.name
# Get the font size
font.size
# Whether the font is bold
font.bold
# Whether the font is italic
font.italic

5.2 Set the cell font style

The Font class sets the font style of a cell. The following are some commonly used parameters.

# Import what is needed
from openpyxl.styles import Font
from openpyxl import load_workbook
# Load an Excel spreadsheet
wb = load_workbook('test.xlsx')
sheet = wb.active
cell = sheet['A1']
cell.value = 'CSDN'

# Define a font style: 宋体 (SimSun), size 20, bold, italic, red
font = Font(name='宋体', size=20, bold=True, italic=True, color='FF0000')
# Apply the font style to the selected cell
cell.font = font
wb.save('test.xlsx')

name: string type, the name of the font to use.

size: positive number, the font size.

bold: Boolean type, True means bold.

italic: Boolean type, True means italic.

color: string type, the hexadecimal (HEX) color code, such as 'FFFFFF' for white.
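
Because the font attached to a cell cannot be edited in place, changing a single property means building a new Font. The sketch below is one way to do it on an in-memory workbook, carrying over only the name and size of the old font as an illustration:

```python
from openpyxl import Workbook
from openpyxl.styles import Font

wb = Workbook()
cell = wb.active["A1"]
cell.value = "CSDN"
old_font = cell.font

# Changing an attribute of the font attached to a cell is rejected.
try:
    cell.font.bold = True
    rejected = False
except AttributeError:
    rejected = True

# Build a new Font that reuses the old settings and assign it instead.
cell.font = Font(name=old_font.name, size=old_font.size, bold=True)

print(rejected, cell.font.bold)
```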

5.3 Set cell format classification

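You can set the format category of a cell either through a named style or through its number format.

# Set the cell category to percent with a built-in named style (openpyxl uses English style names)
cell.style = 'Percent'
cell.style = 'Normal'

# General cell
cell.number_format = 'General'
# Percent cell
cell.number_format = '0.00%'
# Scientific notation
cell.number_format = '0.00E+00'

Excel number formats supported by openpyxl: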

https://openpyxl.readthedocs.io/en/stable/_modules/openpyxl/styles/numbers.html?highlight=openpyxl.styles.numbers
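
The format strings are also available as constants in openpyxl.styles.numbers. A small sketch on an in-memory cell (the value 0.256 is arbitrary):

```python
from openpyxl import Workbook
from openpyxl.styles.numbers import FORMAT_PERCENTAGE_00

wb = Workbook()
cell = wb.active["A1"]
cell.value = 0.256

default_format = cell.number_format         # new cells start as "General"
cell.number_format = FORMAT_PERCENTAGE_00   # the same string as '0.00%'
percent_format = cell.number_format

print(default_format, percent_format)
```

Only the display format changes; the stored value stays 0.256.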

5.4 Cell Alignment Format Setting

# Import the required class
from openpyxl.styles import Alignment
# Define the cell alignment style
alignment = Alignment(horizontal='center', vertical='center', wrap_text=True)

# Apply the style
cell.alignment = alignment

Alignment accepts several parameters:

horizontal: string type, horizontal alignment, matching the alignment options in Excel. Possible values:

"general", "left", "center", "right", "fill", "justify", "centerContinuous", "distributed"

vertical: string type, vertical alignment, matching the alignment options in Excel. Possible values:

"top", "center", "bottom", "justify", "distributed"

wrap_text: Boolean type, whether to wrap text automatically.

textRotation: integer type, the text rotation angle in degrees, up to 180.

Official documentation link:

https://openpyxl.readthedocs.io/en/latest/_modules/openpyxl/styles/alignment.html

5.5 Set the color fill style

5.5.1 Pure color fill

# Import the required class
from openpyxl.styles import PatternFill
# Define the color fill style
pattern_fill = PatternFill(fill_type='solid', fgColor='00B0F0')

# Apply the style
cell.fill = pattern_fill

fill_type: string type, the fill pattern. The most common value is solid.

fgColor: string type, the hexadecimal (HEX) color code.

Note: The official documentation states that if fill_type is not specified, the other parameters have no effect.
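
A common use is to highlight a header row. The sketch below applies the same fill to every cell of row 1 of an in-memory sheet (the header names are made up):

```python
from openpyxl import Workbook
from openpyxl.styles import PatternFill

wb = Workbook()
sheet = wb.active
sheet.append(["id", "name", "score"])

header_fill = PatternFill(fill_type="solid", fgColor="00B0F0")
for cell in sheet[1]:
    cell.fill = header_fill

applied_type = sheet["B1"].fill.fill_type
applied_color = sheet["B1"].fill.fgColor.rgb   # stored with a leading alpha channel

print(applied_type, applied_color)
```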

5.5.2 Gradient color fill

# Import the required class
from openpyxl.styles import GradientFill
# Define the color fill style
gradient_fill = GradientFill(stop=('FFFFFF', '99CCFF', '000000'))

# Apply the style
cell.fill = gradient_fill

stop: a tuple of color codes running from the start color to the end color, also in hexadecimal (HEX) format.

5.6 Set border style

# Import the required classes
from openpyxl.styles import Side, Border
# Define the border line style
side = Side(style='thin', color='000000')
border = Border(left=side, right=side, top=side, bottom=side)

# Apply the style
cell.border = border

Side sets the format of a single border line.

style: the line style. The available styles are:

'dashDot', 'dashDotDot', 'dashed', 'dotted', 'double',
'hair', 'medium', 'mediumDashDot', 'mediumDashDotDot',
'mediumDashed', 'slantDashDot', 'thick', 'thin'

color: the hexadecimal (HEX) color code.

Border sets which edges of the cell the border lines apply to. The commonly used ones are top, bottom, left, and right.
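
Styles are set per cell, so outlining a range means looping over its cells. A minimal sketch on an in-memory sheet (the range A1:C3 is arbitrary):

```python
from openpyxl import Workbook
from openpyxl.styles import Border, Side

wb = Workbook()
sheet = wb.active

side = Side(style="thin", color="000000")
border = Border(left=side, right=side, top=side, bottom=side)

# Give every cell in the range the same four-sided border.
for row in sheet["A1:C3"]:
    for cell in row:
        cell.border = border

left_style = sheet["B2"].border.left.style
print(left_style)
```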


Origin blog.csdn.net/weixin_45609519/article/details/107933694