Symmetric Matrix Transformation

Converting between a table and a three-column format

Suppose we have a table in a format similar to the following:

          north  sky  superior  Heavy  stone  open  bear  Tang  Qin
north         0   75        33     45    166    67    52    69    3
sky          75    0        46     21    214    70    64   221   70
superior     33   46         0     55     31     2     0     2    1
Heavy        45   21        55      0     29     0     0     0    0
stone       166  214        31     29      0     8     8     9    5
open         67   70         2      0      8     0     0     5    0
bear         52   64         0      0      8     0     0    12    0
Tang         69  221         2      0      9     5    12     0   19
Qin           3   70         1      0      5     0     0    19    0

We want to convert it into a three-column form:

source target value
north sky 75
north superior 33
north Heavy 45
north stone 166

So how do we do it?

Use the pandas library to read the Excel file and convert it into three columns:

import pandas as pd

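# Read the Excel file, using the first column as the row labels
df = pd.read_excel('网络.xlsx', index_col=0)

# Mask zero distance values as NaN, then drop rows and columns that are entirely empty
df = df[df != 0].dropna(how='all').dropna(axis=1, how='all')

# Convert the table into three-column form: one [row, column, value] entry per non-empty cell
triplets = []
for row in df.index:
    for col in df.columns:
        if not pd.isna(df.loc[row, col]):
            triplets.append([row, col, df.loc[row, col]])

# Save the three-column data to an Excel file
# (the column names mean: source city, target city, distance value)
triplets_df = pd.DataFrame(triplets, columns=['起始城市', '目标城市', '距离值'])
triplets_df.to_excel('network.xlsx', index=False)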

This code reads an Excel file called "网络.xlsx" and converts it to the three-column format. It masks entries whose distance value is 0, then appends every remaining combination of row city, column city, and distance value to a list as a separate row. Finally, it converts this list into a new pandas DataFrame, whose columns 起始城市, 目标城市, and 距离值 mean source city, target city, and distance value, and saves it to an Excel file named "network.xlsx". Note that we set index=False to avoid writing the DataFrame index to the Excel file. Because the matrix is symmetric, each pair of cities appears twice in the output, once in each direction.
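The nested loop can also be written without explicit iteration. The following is a minimal sketch, not the method used above: it assumes a small in-memory DataFrame in place of the Excel file, uses the English column names source, target, and value from the example table, and keeps only the cells above the diagonal so that each pair of cities appears once.

```python
import numpy as np
import pandas as pd

# Small in-memory stand-in for the Excel matrix (values taken from the example table)
labels = ["north", "sky", "superior"]
matrix = pd.DataFrame(
    [[0, 75, 33],
     [75, 0, 46],
     [33, 46, 0]],
    index=labels,
    columns=labels,
)

# Boolean mask that is True only strictly above the diagonal,
# so each symmetric pair is kept once and the diagonal is dropped
upper = np.triu(np.ones(matrix.shape, dtype=bool), k=1)
masked = matrix.where(upper)

# stack() turns the columns into a second index level;
# dropna() removes the masked cells explicitly
triplets_df = masked.stack().dropna().reset_index()
triplets_df.columns = ["source", "target", "value"]

# Drop pairs whose value is 0, as the loop version does
triplets_df = triplets_df[triplets_df["value"] != 0].reset_index(drop=True)

print(triplets_df)
```

If both directions of every pair are wanted, as in the loop version, the mask step can be skipped and only the zero filter applied.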

How do we do the reverse and convert the three-column format back into a symmetric matrix?

Using Python and the pandas library, you can convert three-column data in an Excel file back into a table format.

Here is some sample code:

import pandas as pd

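# Read the three-column data
df = pd.read_excel('三列式.xlsx')

# Convert the three-column data to table format
# (rows: source city, columns: target city, cells: distance value)
pivot_table = df.pivot_table(index='起始城市', columns='目标城市', values='距离值')

# Save the table as an Excel file
pivot_table.to_excel('网络.xlsx')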

This code reads an Excel file called "三列式.xlsx" and converts it to a table format. It uses the source city and target city as the index and column labels, and the distance value as the cell value. Finally, it saves the table to an Excel file called "网络.xlsx".

Note that if the three-column data contains duplicate city pairs, the pivot_table method has to combine the duplicate values. By default, pivot_table averages them, but you can use the aggfunc parameter to specify another aggregation function, such as min, max, or sum.
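As a small illustration of the difference, the sketch below builds hypothetical three-column data in which the pair (north, sky) appears twice. The values 25 and the English column names are made up for the example.

```python
import pandas as pd

# Hypothetical three-column data: the pair (north, sky) is listed twice
df = pd.DataFrame({
    "source": ["north", "north", "north"],
    "target": ["sky", "sky", "stone"],
    "value": [75, 25, 166],
})

# Default behaviour: duplicate pairs are averaged
mean_table = df.pivot_table(index="source", columns="target", values="value")

# Explicit aggfunc: duplicate pairs are summed instead
sum_table = df.pivot_table(index="source", columns="target", values="value", aggfunc="sum")

print(mean_table)
print(sum_table)
```

Here the (north, sky) cell holds the average of 75 and 25 in the first table and their sum in the second, while the (north, stone) cell is 166 in both.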

However, the generated table contains many null values. How can we fill them with 0? This can be done directly in Excel, but how do we do it in code?

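# When converting the three-column data to table format, missing values can be filled with 0 using the fillna() method.
import pandas as pd

# Read the three-column data
df = pd.read_excel('三列式.xlsx')

# Convert the three-column data to table format and fill missing values with 0
pivot_table = df.pivot_table(index='起始城市', columns='目标城市', values='距离值').fillna(0)

# Save the table as an Excel file
pivot_table.to_excel('网络.xlsx')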

This code reads an Excel file called "三列式.xlsx" and converts it to a table format. It uses the source city and target city as the index and column labels, and the distance value as the cell value. It then fills the missing values with 0 using the fillna() method. Finally, it saves the table to an Excel file called "网络.xlsx".

Note that if some city pairs are missing from the three-column data, pivot_table leaves the corresponding cells as NaN. Therefore, before saving the table to the Excel file, use the fillna() method to replace NaN with 0 so that the matrix contains zeros rather than empty cells.
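If the three-column data lists each pair in only one direction, the pivoted table is not yet symmetric, and it may not even be square. One possible way to finish the conversion is sketched below. It assumes hypothetical in-memory data with English column names, and it assumes that no pair is listed in both directions (otherwise the mirrored value would be doubled). It also uses the fill_value parameter of pivot_table as an alternative to calling fillna() afterwards.

```python
import pandas as pd

# Hypothetical three-column data that lists each pair in one direction only
edges = pd.DataFrame({
    "source": ["north", "north", "sky"],
    "target": ["sky", "superior", "superior"],
    "value": [75, 33, 46],
})

# Pivot to a table; fill_value=0 fills the missing cells during the pivot
half = edges.pivot_table(index="source", columns="target", values="value", fill_value=0)

# Use the same set of labels on both axes so that the table is square
labels = sorted(set(edges["source"]) | set(edges["target"]))
half = half.reindex(index=labels, columns=labels, fill_value=0)

# Mirror the values across the diagonal by adding the transpose
symmetric = half + half.T

print(symmetric)
```

The result has the same labels on rows and columns, zeros on the diagonal, and the same value at (north, sky) and (sky, north).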

If you need the data and code, follow my WeChat account: Jdaystudy

Origin blog.csdn.net/weixin_43886163/article/details/129405189