Table of Contents of Series Articles
1. Use of MultiIndex in pandas
Article directory
- Table of Contents of Series Articles
- Preface
- 1. What is MultiIndex?
- 2. Usage steps
- 1. Import the library
- 2. Use MultiIndex for the data to be reviewed and indexed
- Summary
Preface
This post records the data processing for a project that involved tables with multi-level headers. pandas happens to provide a composite (hierarchical) index for exactly this case: MultiIndex.
1. What is MultiIndex?
MultiIndex is the pandas class for hierarchical (multi-level) indexes. It can be used for row labels as well as column labels.
Constructor parameters:
(
cls,
levels=None,
codes=None,
sortorder=None,
names=None,
dtype=None,
copy=False,
name=None,
verify_integrity: bool = True,
):
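As a minimal sketch of how these parameters fit together, the snippet below builds a small two-level index directly from levels and codes. The labels "A", "B", "x", "y" and the names "group" and "field" are made up for illustration and do not come from the project data.

```python
import pandas as pd

# Illustrative labels only: one list of labels per level.
levels = [["A", "B"], ["x", "y"]]

# codes[k][i] is the position inside levels[k] of the label used by entry i.
codes = [[0, 0, 1, 1], [0, 1, 0, 1]]

# names is optional and labels each level.
mi = pd.MultiIndex(levels=levels, codes=codes, names=["group", "field"])

print(mi.tolist())  # one tuple of labels per entry
print(mi.nlevels)   # number of levels in the index
```

Each entry of the index is the tuple of labels that the codes select from each level, so all code lists must have the same length.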
2. Usage steps
1. Import the library
The code is as follows:
import pandas as pd
2. Use MultiIndex for the data to be reviewed and indexed
The code is as follows:
# Assign a two-level header: level 0 is the group, level 1 is the sub-column
new_nor_sum1_df.columns = pd.MultiIndex(
    levels=[['发薪公司', '纳税地', '中国籍', '非中国籍', '合计'], ['', '人数', '当月所得税']],
    codes=[[0, 1, 2, 2, 3, 3, 4, 4],
           [0, 0, 1, 2, 1, 2, 1, 2]])
Simulated project data is used here. Translated into English, the two arguments are:
levels: [['Payroll Company', 'Tax Place', 'Chinese Nationality', 'Non-Chinese Nationality', 'Total'], ['', 'Number of people', 'Income tax for the month']]
codes: [[0, 1, 2, 2, 3, 3, 4, 4], [0, 0, 1, 2, 1, 2, 1, 2]]
levels and codes together determine how the labels of the multi-level index are combined.
The first list in levels, ['Payroll Company', 'Tax Place', 'Chinese Nationality', 'Non-Chinese Nationality', 'Total'], corresponds to the first list in codes, [0, 1, 2, 2, 3, 3, 4, 4].
In these codes, 0 is the position of 'Payroll Company', 1 the position of 'Tax Place', 2 the position of 'Chinese Nationality', 3 the position of 'Non-Chinese Nationality', and 4 the position of 'Total'.
The second list in levels, ['', 'Number of people', 'Income tax for the month'], corresponds to the second list in codes, [0, 0, 1, 2, 1, 2, 1, 2]. In these codes, 0 is the position of '', 1 the position of 'Number of people', and 2 the position of 'Income tax for the month'. Each column header is the pair of labels selected at the same position in both code lists, so the third column is ('Chinese Nationality', 'Number of people').
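The mapping described above can be checked with a short sketch. It uses the English labels, and the one-row DataFrame is hypothetical sample data standing in for new_nor_sum1_df, whose contents are not shown in this article.

```python
import pandas as pd

levels = [
    ["Payroll Company", "Tax Place", "Chinese Nationality",
     "Non-Chinese Nationality", "Total"],
    ["", "Number of people", "Income tax for the month"],
]
codes = [
    [0, 1, 2, 2, 3, 3, 4, 4],
    [0, 0, 1, 2, 1, 2, 1, 2],
]

# Hypothetical one-row table with eight columns (assumed sample values).
df = pd.DataFrame([["Company A", "Shanghai", 10, 5000.0, 2, 1200.0, 12, 6200.0]])
df.columns = pd.MultiIndex(levels=levels, codes=codes)

# Rebuild the header by hand: pair the labels that each code position selects.
manual = [(levels[0][a], levels[1][b]) for a, b in zip(codes[0], codes[1])]

# The hand-built pairs match the header pandas produced.
print(df.columns.tolist() == manual)
for column in df.columns:
    print(column)
```

The first two columns pair their top-level label with the empty string, which is why they appear with no sub-header.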
[Figure: the DataFrame with its two-level header, as shown in the debugger]
[Figure: the same table in the generated Excel file]
Summary
MultiIndex makes it convenient to handle DataFrame data with multi-level headers. After processing the data and exporting it to Excel, the openpyxl module can be used to adjust the cell formatting so that the table looks good. Data processing really can be this simple.
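As one illustration of that convenience, here is a sketch of column selection on such a header. The two rows of figures are invented sample data, not the project's real numbers.

```python
import pandas as pd

columns = pd.MultiIndex(
    levels=[
        ["Payroll Company", "Tax Place", "Chinese Nationality",
         "Non-Chinese Nationality", "Total"],
        ["", "Number of people", "Income tax for the month"],
    ],
    codes=[
        [0, 1, 2, 2, 3, 3, 4, 4],
        [0, 0, 1, 2, 1, 2, 1, 2],
    ],
)

# Assumed sample rows: company, place, then (count, tax) for each group.
rows = [
    ["Company A", "Shanghai", 10, 5000.0, 2, 1200.0, 12, 6200.0],
    ["Company B", "Beijing", 5, 2500.0, 1, 600.0, 6, 3100.0],
]
df = pd.DataFrame(rows, columns=columns)

# Selecting a top-level label returns all of its sub-columns as a DataFrame.
chinese = df["Chinese Nationality"]
print(chinese)

# Selecting a full (level 0, level 1) tuple returns a single column as a Series.
total_tax = df[("Total", "Income tax for the month")]
print(total_tax.sum())
```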