Datanovice :
I have a dataframe as follows:
import pandas as pd

df = pd.DataFrame({'Product': ['A'],
                   'Size': ["['XL','L','S','M']"],
                   'Color': ["['Blue','Red','Green']"]})
print(df)
  Product                Size                   Color
0       A  ['XL','L','S','M']  ['Blue','Red','Green']
I need to transform the frame for an ingestion system which only accepts the following format:
target_df = pd.DataFrame({'Description': ['Product','Color','Color','Color','Size','Size','Size','Size'],
                          'Agg': ['A','Blue','Green','Red','XL','L','S','M']})
  Description    Agg
0     Product      A
1       Color   Blue
2       Color  Green
3       Color    Red
4        Size     XL
5        Size      L
6        Size      S
7        Size      M
I've tried every combination of explode, groupby and even iterrows, but I can't get it to line up. I have thousands of Products. With a few groupby and explode calls I can stack the columns, but then I end up with duplicate Product names, which I need to avoid. The order is important too.
Any help is much appreciated.
Quang Hoang :
Here's a solution without eval:
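(df.T[0].str.strip('[]')               # row 0 as a Series; drop the surrounding brackets
    .str.split(',', expand=True)       # one column per list element
    .stack().str.strip("'")            # back to one long Series; drop the quotes
    .reset_index(level=1, drop=True)   # discard the element position
    .rename_axis(index='Description')  # former column names become 'Description'
    .reset_index(name='Agg')
)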
Output:
  Description    Agg
0     Product      A
1        Size     XL
2        Size      L
3        Size      S
4        Size      M
5       Color   Blue
6       Color    Red
7       Color  Green
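The snippet above reads only the first row (df.T[0]) and lists Size before Color, whereas the question mentions thousands of Products and a target with Color first. As a rough sketch of one way to cover several rows, the version below swaps in ast.literal_eval to parse the list-like strings and a plain loop to build the rows. It is not the approach from the answer, and the names to_long and list_cols, as well as product 'B', are made up for illustration.

```python
import ast

import pandas as pd

# Hypothetical sample: product 'B' is invented to show the multi-row case.
df = pd.DataFrame({'Product': ['A', 'B'],
                   'Size': ["['XL','L','S','M']", "['S','M']"],
                   'Color': ["['Blue','Red','Green']", "['Black']"]})


def to_long(frame, list_cols):
    """Return one (Description, Agg) row per product name and per list element.

    list_cols gives the list-like columns in the order they should appear
    under each product (an assumption about the desired layout).
    """
    rows = []
    for record in frame.to_dict('records'):
        # Each product name appears exactly once, ahead of its attributes.
        rows.append(('Product', record['Product']))
        for col in list_cols:
            # literal_eval parses Python literals only; it does not execute code.
            for item in ast.literal_eval(record[col]):
                rows.append((col, item))
    return pd.DataFrame(rows, columns=['Description', 'Agg'])


long_df = to_long(df, list_cols=['Color', 'Size'])
print(long_df)
```

Each Product row is followed by its Color rows and then its Size rows, and the elements keep the order they have inside each string. If the strings are not always valid Python literals, literal_eval raises an error, so the string-splitting approach may be the safer choice for messy data.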