Summarizing 2 columns into one based on a third index column

user171558 :

I have the following dataframe:

  first_char second_char type
1          a           b  1/1
2          a           b  0/1
3          a           b  0/1
4          c           d  1/1
5          c           d  0/1
6          c           d  0/0

I would like to combine these columns into one, like this:

1       bb
2       ab
3       ab
4       dd
5       cd
6       cc

The type column contains the indices, separated by a forward slash, into the first_char and second_char columns (0 selects first_char, 1 selects second_char).

jezrael :

To avoid looping with apply, index the two character columns with NumPy arrays of positions:

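import numpy as np

# Split 'type' into integer column positions, one row of indices per DataFrame row
ind = df['type'].str.split('/', expand=True).astype(int).to_numpy()
arr2 = df[['first_char','second_char']].to_numpy()

# Pick arr2[i, ind[i, j]] for every row i, then concatenate the picked strings
df['new'] = arr2[np.arange(ind.shape[0])[:,None], ind].sum(1)
print(df)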
  first_char second_char type new
1          a           b  1/1  bb
2          a           b  0/1  ab
3          a           b  0/1  ab
4          c           d  1/1  dd
5          c           d  0/1  cd
6          c           d  0/0  cc
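
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the question's sample, the names rows and picked are introduced only for illustration, and the final concatenation uses "".join instead of .sum(1), which is a small variation on the line above rather than the exact same call.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (index starts at 1).
df = pd.DataFrame(
    {
        "first_char": ["a", "a", "a", "c", "c", "c"],
        "second_char": ["b", "b", "b", "d", "d", "d"],
        "type": ["1/1", "0/1", "0/1", "1/1", "0/1", "0/0"],
    },
    index=range(1, 7),
)

# One row of column positions per DataFrame row, e.g. "0/1" -> [0, 1].
ind = df["type"].str.split("/", expand=True).astype(int).to_numpy()
arr2 = df[["first_char", "second_char"]].to_numpy(dtype=object)

# rows has shape (6, 1) and broadcasts against ind with shape (6, 2),
# so picked[i, j] == arr2[i, ind[i, j]].
rows = np.arange(len(df))[:, None]
picked = arr2[rows, ind]

# Join the two picked characters of each row into one string
# (illustrative alternative to .sum(1) on the object array).
df["new"] = ["".join(pair) for pair in picked]
print(df["new"].tolist())
# ['bb', 'ab', 'ab', 'dd', 'cd', 'cc']
```

The column of row numbers is what makes the lookup per-row: without it, arr2[:, ind] would apply every index pair to every row instead of pairing each row with its own indices.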
