Do you know how to implement "split one row into multiple rows" and "merge multiple rows into one row" in Python?

Reader questions

Readers raised two questions today: how to split one row into multiple rows, and how to merge multiple rows into one row. Friends in the group chat have apparently already solved both with Power Query, but how would you do it in Python? Read on.

Splitting one row into multiple rows

For the question above, here are two approaches to choose from; the simpler one is of course preferable. Each uses some handy techniques that are worth learning.

1) Method one

The code below touches on several important points. I only outline the steps here; how each function works in detail is left for you to study on your own.

  • Usage of the DataFrame.melt() method;
  • The expand=True parameter in Series.str.split("/", expand=True);
  • Sorting rows by a text column with DataFrame.sort_values();
  • Usage of Python's built-in enumerate() function.
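
import pandas as pd

# Read the data
df = pd.read_excel("test1.xlsx", sheet_name="Sheet1")
# Split one column into several columns; this assignment requires the split
# to produce exactly three columns
df[["类型1", "类型2", "类型3"]] = df["电影类型"].str.split("/", expand=True)
# Select the columns we need
df_final = df[["电影名", "类型1", "类型2", "类型3"]]
# Unpivot the type columns into rows
df_final = df_final.melt(id_vars=["电影名"], value_name="类型")
# Keep the two columns of interest and sort by "电影名"
df_final = df_final[["电影名", "类型"]].sort_values(by="电影名")
# Drop the rows whose "类型" is missing. Collect the labels first and drop once:
# dropping inside the loop would shift positions and remove the wrong rows.
missing = [df_final.index[i] for i, value in enumerate(df_final["类型"]) if pd.isna(value)]
df_final = df_final.drop(missing)
df_final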

The result contains one row per movie and type, sorted by 电影名.
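Since test1.xlsx is not available here, the same split-then-melt idea can be tried on a small in-memory table. This is only a sketch: the sample rows are made up, the 类型N column names are generated from however many pieces the split produces, and dropna() is swapped in for the enumerate() loop as a shorter way to remove the padding rows.

```python
import pandas as pd

# Hypothetical sample data standing in for Sheet1 of test1.xlsx.
df = pd.DataFrame({
    "电影名": ["A", "B"],
    "电影类型": ["drama/crime", "comedy"],
})

# expand=True returns a DataFrame with one column per piece;
# rows with fewer pieces are padded with missing values.
parts = df["电影类型"].str.split("/", expand=True)
parts.columns = [f"类型{i + 1}" for i in parts.columns]

long_df = (
    pd.concat([df[["电影名"]], parts], axis=1)
    .melt(id_vars=["电影名"], value_name="类型")   # stack the 类型N columns into one
    .dropna(subset=["类型"])                       # remove the padding rows
    .sort_values(by="电影名", kind="mergesort")    # stable sort keeps the type order
    .loc[:, ["电影名", "类型"]]
    .reset_index(drop=True)
)
print(long_df)
```

Because the column names are derived from the split result, this variant does not depend on every movie having at most three types.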

2) Method two

The method above is admittedly long-winded, but I had no alternative at the time: my pandas installation was version 0.23.4, which does not have explode(). pandas 0.25 added the DataFrame.explode() method, which turns each element of a list-like cell into a row of its own.

  • Usage of the DataFrame.explode() method.

import pandas as pd

# Read the data
df = pd.read_excel("test1.xlsx", sheet_name="Sheet1")
# Split each cell into a list; note that expand=True is not needed here
df["type"] = df["电影类型"].str.split("/")
# Explode the list column directly
df.explode("type")

In the result, each movie appears once per type in the new type column, and rows that came from the same movie share the same index value.
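To try explode() without the Excel file, here is a minimal sketch on made-up data (the sample rows are assumptions, and pandas 0.25 or later is required). It also drops the original column and resets the index, two optional tidy-up steps that the code above leaves out.

```python
import pandas as pd

# Hypothetical sample data standing in for Sheet1 of test1.xlsx.
df = pd.DataFrame({
    "电影名": ["A", "B"],
    "电影类型": ["drama/crime", "comedy"],
})

exploded = (
    df.assign(type=df["电影类型"].str.split("/"))  # each cell becomes a list
    .explode("type")                               # one row per list element
    .drop(columns="电影类型")                      # the unsplit column is no longer needed
    .reset_index(drop=True)                        # explode() repeats the original index
)
print(exploded)
```

Resetting the index matters if later code selects rows by label, since explode() would otherwise leave duplicate labels behind.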

Merging multiple rows into one row

Nothing special is needed here. If you understand how pandas groups rows and applies an aggregation function to each group, this problem is easy to solve.

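import pandas as pd

# Read the data
df = pd.read_excel("test1.xlsx", sheet_name="Sheet2")

# Aggregation function: join all values of one group with commas
def func(series):
    return ",".join(series.values)

# Group by "电影名" and apply the function to each remaining column
df = df.groupby(by="电影名").agg(func).reset_index()
df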

The result has a single row per movie, with that movie's values joined by commas.
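The grouping step can also be tried on an in-memory table. In this sketch the sample rows and the 类型 column name are assumptions about what Sheet2 looks like, and the bound method ",".join is passed to agg() directly instead of a named helper function.

```python
import pandas as pd

# Hypothetical sample data standing in for Sheet2 of test1.xlsx.
df = pd.DataFrame({
    "电影名": ["A", "A", "B"],
    "类型": ["drama", "crime", "comedy"],
})

# Group by movie and join each group's values with commas.
# sort=False keeps the groups in order of first appearance.
merged = (
    df.groupby("电影名", sort=False)["类型"]
    .agg(",".join)
    .reset_index()
)
print(merged)
```

Selecting the 类型 column before aggregating makes it explicit which values are joined; this only works when the column holds strings.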

Origin blog.csdn.net/weixin_41261833/article/details/108135992