Practical pandas tips

1 Splitting rows

A common requirement is to split one column into multiple columns on a specified delimiter. The requirement here is different: split one row into multiple rows on the specified delimiter.

Example:

df =     A       B
      0  a       f
      1  b;c     h;j
      2  d       k
      3  e       l

The goal is to split it into:

df =     A       B
      0  a       f
      1  b       h
      1  c       j
      2  d       k
      3  e       l
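For orientation, the target shape can also be produced in one step with DataFrame.explode. This is not the method used in the walk-through that follows; it is a minimal sketch for comparison, assuming pandas 1.3 or later (where explode accepts a list of columns) and assuming every row has the same number of pieces in A and B.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b;c', 'd', 'e'], 'B': ['f', 'h;j', 'k', 'l']})

# Turn each cell into a list of pieces, e.g. 'b;c' -> ['b', 'c'].
lists = df.apply(lambda col: col.str.split(';'))

# Explode both columns together; the lists in a row must have equal length.
result = lists.explode(['A', 'B'])
print(result)
```

The original row labels are kept, so label 1 appears twice in the result.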

 

1.1 Processing column A

The implementation proceeds as follows. First, build the DataFrame:

import pandas as pd

df = pd.DataFrame({'A': ['a', 'b;c', 'd', 'e'], 'B': ['f', 'h;j', 'k', 'l']})
df
     A      B
0    a      f
1    b;c    h;j
2    d      k
3    e      l

Split column A on ";" and expand the result into a DataFrame. The None entries come from expand=True, which pads rows that have fewer pieces:

df_a = df['A'].str.split(';', expand=True)
df_a

    0    1
0    a    None
1    b    c
2    d    None
3    e    None
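The effect of the expand argument is easier to see side by side. The sketch below uses a throwaway Series `s` (a name chosen for illustration) and compares the default result with the expanded one.

```python
import pandas as pd

s = pd.Series(['a', 'b;c', 'd', 'e'])

as_lists = s.str.split(';')               # Series whose values are lists
as_frame = s.str.split(';', expand=True)  # DataFrame, one column per piece

print(type(as_lists).__name__, as_lists.tolist())
print(type(as_frame).__name__, as_frame.shape)
```

Without expand=True each cell holds a list, which cannot be stacked into separate rows the way the columns of a DataFrame can.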

Stack df_a into a Series. As the output shows, the None placeholders no longer appear:

df_a = df_a.stack()
df_a

0  0    a
1  0    b
   1    c
2  0    d
3  0    e
dtype: object
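The stacked result has a two-level index: the original row label, then the position of the piece within that row. The sketch below rebuilds the expanded frame by hand (the name `wide` is illustrative) and adds an explicit dropna(), on the assumption that it is safer not to rely on stack() discarding the padding in every pandas version.

```python
import pandas as pd

wide = pd.DataFrame({0: ['a', 'b', 'd', 'e'], 1: [None, 'c', None, None]})

# dropna() is explicit so the sketch does not depend on whether
# stack() itself removes missing values.
stacked = wide.stack().dropna()

print(stacked.index.nlevels)
print(stacked.index.tolist())
```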

Reset the index, dropping the inner level (level=1) so that only the original row labels remain:

df_a = df_a.reset_index(level=1, drop=True)
df_a

0    a
1    b
1    c
2    d
3    e
dtype: object
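After this step the index contains repeated labels, which is what later lets the pieces line up with their original row. A small sketch with a hand-built MultiIndex (the values mirror the output above) shows the effect:

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([(0, 0), (1, 0), (1, 1), (2, 0), (3, 0)])
stacked = pd.Series(['a', 'b', 'c', 'd', 'e'], index=idx)

# Drop the inner level; the outer level becomes the whole index.
flat = stacked.reset_index(level=1, drop=True)

print(flat.index.tolist())
print(flat.index.is_unique)
print(flat.loc[1].tolist())   # both pieces that came from row 1
```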

Rename the Series so that the later merge produces a labeled column (joining an unnamed Series to a DataFrame directly would fail):

df_a.rename('A_split', inplace=True)
df_a

0    a
1    b
1    c
2    d
3    e
Name: A_split, dtype: object
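The name matters most when a Series is joined to a DataFrame directly, because DataFrame.join rejects a Series without a name. The sketch below uses illustrative names (`pieces`, `named`) and also shows that rename without inplace=True returns a new Series.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b;c', 'd', 'e']})
pieces = pd.Series(['a', 'b', 'c', 'd', 'e'], index=[0, 1, 1, 2, 3])

try:
    df.join(pieces)            # pieces.name is None here
    join_error = None
except ValueError as exc:
    join_error = str(exc)
print(join_error)

named = pieces.rename('A_split')   # new Series; pieces itself is unchanged
joined = df.join(named)
print(joined)
```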

1.2 Processing column B

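Process column B the same way as column A, then rename the result:

df_b = df['B'].str.split(';', expand=True).stack().reset_index(level=1, drop=True)
df_b.rename('B_split', inplace=True)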
df_b

0    f
1    h
1    j
2    k
3    l
Name: B_split, dtype: object
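Since columns A and B go through identical steps, the steps can be collected in a function. The helper below is hypothetical (its name, the `sep` parameter and the `'<column>_split'` naming are assumptions made for illustration) and it adds an explicit dropna() rather than relying on stack() to discard the padding.

```python
import pandas as pd

def split_to_rows(frame, column, sep=';'):
    """Hypothetical helper: split one column into a Series named '<column>_split'."""
    pieces = frame[column].str.split(sep, expand=True)
    # Explicit dropna() removes the padding added by expand=True.
    pieces = pieces.stack().dropna()
    pieces = pieces.reset_index(level=1, drop=True)
    return pieces.rename(f'{column}_split')

df = pd.DataFrame({'A': ['a', 'b;c', 'd', 'e'], 'B': ['f', 'h;j', 'k', 'l']})
df_a = split_to_rows(df, 'A')
df_b = split_to_rows(df, 'B')
print(df_b)
```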

1.3 Merging A_split and B_split

Once both columns are processed, concatenate the two Series column-wise:

concat_a_b = pd.concat([df_a, df_b], axis=1)
concat_a_b

    A_split    B_split
0    a          f
1    b          h
1    c          j
2    d          k
3    e          l
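This concatenation works here because both Series carry exactly the same index. With repeated labels pandas has no way to align two different indexes, so a cautious version can check for equality first. The sketch below rebuilds the two Series by hand; the error message is an illustrative choice.

```python
import pandas as pd

df_a = pd.Series(['a', 'b', 'c', 'd', 'e'], index=[0, 1, 1, 2, 3], name='A_split')
df_b = pd.Series(['f', 'h', 'j', 'k', 'l'], index=[0, 1, 1, 2, 3], name='B_split')

# Only concatenate when the two indexes are identical, i.e. when every row
# was split into the same number of pieces in both columns.
if df_a.index.equals(df_b.index):
    concat_a_b = pd.concat([df_a, df_b], axis=1)
else:
    raise ValueError('columns were split into different numbers of pieces')
print(concat_a_b)
```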

1.4 Joining with the original data

Join the processed data to the original data on the index. Row 1 is repeated because its label occurs twice in concat_a_b:

df = df.join(concat_a_b, how='inner')
df

     A      B      A_split    B_split
0    a      f      a          f
1    b;c    h;j    b          h
1    b;c    h;j    c          j
2    d      k      d          k
3    e      l      e          l

This gives the desired result: the split values are in A_split and B_split, one row per piece.
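The joined frame still carries the unsplit columns. To arrive at the two-column layout shown at the start of the section, one option is to drop them and reuse their names. The sketch below starts from a hand-built copy of the joined frame (the names `joined` and `tidy` are illustrative).

```python
import pandas as pd

joined = pd.DataFrame(
    {'A': ['a', 'b;c', 'b;c', 'd', 'e'],
     'B': ['f', 'h;j', 'h;j', 'k', 'l'],
     'A_split': ['a', 'b', 'c', 'd', 'e'],
     'B_split': ['f', 'h', 'j', 'k', 'l']},
    index=[0, 1, 1, 2, 3],
)

# Drop the unsplit columns, then reuse their names for the split ones.
tidy = (
    joined.drop(columns=['A', 'B'])
          .rename(columns={'A_split': 'A', 'B_split': 'B'})
)
print(tidy)
```

If unique row labels are needed afterwards, tidy.reset_index(drop=True) replaces the repeated labels with 0 to 4.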

 

Origin www.cnblogs.com/strivepy/p/11704595.html