Practical notes on using MultiIndex in pandas

Table of Contents of Series Articles

1. Use of MultiIndex in pandas

Preface

This is a record of the data processing for a certain project. The data had multi-level headers, and pandas happens to provide a hierarchical (multi-level) index for exactly this case.

MultiIndex method

1. What is MultiIndex?

MultiIndex is the pandas class for hierarchical (multi-level) indexes. It can be used for both the row index and the column headers of a DataFrame.

Constructor parameters:

(
    cls,
    levels=None,
    codes=None,
    sortorder=None,
    names=None,
    dtype=None,
    copy=False,
    name=None,
    verify_integrity: bool = True,
):
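As a minimal sketch of how the two main parameters fit together, the example below builds a small MultiIndex from levels and codes. The labels and the level names "outer" and "inner" are made up for illustration and do not come from the project.

```python
import pandas as pd

# levels: the unique labels available at each level of the index.
# codes: for every entry, the position of its label inside the matching levels list.
idx = pd.MultiIndex(
    levels=[["a", "b"], ["x", "y"]],
    codes=[[0, 0, 1, 1], [0, 1, 0, 1]],
    names=["outer", "inner"],  # hypothetical level names, for illustration only
)

print(idx.tolist())  # [('a', 'x'), ('a', 'y'), ('b', 'x'), ('b', 'y')]
print(idx.nlevels)   # 2
```

Each entry of the index is a tuple with one label per level, so the two lists in codes must have the same length.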

2. Usage steps

1. Import the library

The code is as follows:

import pandas as pd

2. Apply MultiIndex to the columns of the data that needs multi-level headers

The code is as follows:

new_nor_sum1_df.columns = pd.MultiIndex(
    levels=[['发薪公司', '纳税地', '中国籍', '非中国籍', '合计'], ['', '人数', '当月所得税']],
    codes=[[0, 1, 2, 2, 3, 3, 4, 4],
           [0, 0, 1, 2, 1, 2, 1, 2]])
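The assignment above can be tried in a self-contained form with the sketch below. The original new_nor_sum1_df is not shown in this article, so the DataFrame df and its single row of values are made up as a stand-in with the same eight columns.

```python
import pandas as pd

# Hypothetical stand-in for new_nor_sum1_df: 8 flat columns, one made-up row.
df = pd.DataFrame([["Company A", "Shanghai", 10, 5000.0, 2, 1800.0, 12, 6800.0]])

# Replace the flat column labels with a two-level header.
df.columns = pd.MultiIndex(
    levels=[["发薪公司", "纳税地", "中国籍", "非中国籍", "合计"],
            ["", "人数", "当月所得税"]],
    codes=[[0, 1, 2, 2, 3, 3, 4, 4],
           [0, 0, 1, 2, 1, 2, 1, 2]],
)

# Selecting a top-level label returns the sub-columns grouped under it.
chinese = df["中国籍"]
print(list(chinese.columns))               # ['人数', '当月所得税']

# A full (top, sub) tuple selects a single column.
print(df[("合计", "当月所得税")].iloc[0])   # 6800.0
```

The number of entries in each codes list has to match the number of columns in the DataFrame, which is eight here.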

The data used here is simulated data from a project.

levels (translated into English): [['Payroll Company', 'Tax Place', 'Chinese Nationality', 'Non-Chinese Nationality', 'Total'], ['', 'Number of people', 'Income tax for the month']]

codes: [[0, 1, 2, 2, 3, 3, 4, 4], [0, 0, 1, 2, 1, 2, 1, 2]]

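levels and codes together define the multi-level index: levels holds the unique labels of each level, and codes gives, for every column, the position of its label within the corresponding levels list.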

The first list in levels, ['Payroll Company', 'Tax Place', 'Chinese Nationality', 'Non-Chinese Nationality', 'Total'], corresponds to the first list in codes, [0, 1, 2, 2, 3, 3, 4, 4].

In these codes, 0 is the position of 'Payroll Company', 1 is the position of 'Tax Place', 2 is the position of 'Chinese Nationality', 3 is the position of 'Non-Chinese Nationality', and 4 is the position of 'Total'.

The second list in levels, ['', 'Number of people', 'Income tax for the month'], corresponds to the second list in codes, [0, 0, 1, 2, 1, 2, 1, 2]. In these codes, 0 is the position of '', 1 is the position of 'Number of people', and 2 is the position of 'Income tax for the month'.

Reading the two code lists together, column by column, gives the eight headers: the first two columns have an empty second-level label, and each of the remaining three groups spans a 'Number of people' column and an 'Income tax for the month' column.
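The lookup described above can be sketched in a few lines of plain Python. The example uses the English translations of the labels, and the variable names columns and mi are chosen only for illustration. It pairs the codes by hand and then checks the result against what pandas builds from the same levels and codes.

```python
import pandas as pd

levels = [
    ["Payroll Company", "Tax Place", "Chinese Nationality",
     "Non-Chinese Nationality", "Total"],
    ["", "Number of people", "Income tax for the month"],
]
codes = [[0, 1, 2, 2, 3, 3, 4, 4],
         [0, 0, 1, 2, 1, 2, 1, 2]]

# Look each code up in its own level, then pair the results column by column.
columns = [
    (levels[0][top], levels[1][sub])
    for top, sub in zip(codes[0], codes[1])
]

for col in columns:
    print(col)

# pandas performs the same lookup when it builds the index.
mi = pd.MultiIndex(levels=levels, codes=codes)
print(columns == mi.tolist())  # True
```

For instance, the fourth column has codes 2 and 2, so its header is ('Chinese Nationality', 'Income tax for the month').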

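Result when debugging the code: [screenshot of the DataFrame with its two-level column header]

Result in the actual Excel file: [screenshot of the exported worksheet]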


Summary

MultiIndex makes it convenient to process DataFrame data with multi-level headers. After processing the data and writing it to Excel, the openpyxl module can be used to adjust the cell formatting so that the table looks good. With that, the data processing is straightforward.

Origin blog.csdn.net/Smile_Lai/article/details/125913941