stud_eco :
I have a pandas DataFrame with a column holding the surname and first name of several tennis players, like the following one:
| Player |
|---------------------|
0 | 'Roddick Andy' |
1 | 'Federer Roger' |
2 | 'Tsonga Jo Wilfred' |
I want to keep the full surname and reduce the first name, and the middle name if there is one, to initials. So the pandas column should look like the following one:
| Player |
|-------------------|
0 | 'Roddick A.' |
1 | 'Federer R.' |
2 | 'Tsonga J.W.' | N.B. J.W. with no space
Does anyone have suggestions? Thanks!
Quang Hoang :
Here's an approach with str.extractall and groupby:
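(df.Player
   # one row per match; for a middle name, Surname is empty and Name holds it
   .str.extractall(r'(?P<Surname>\w*)\s(?P<Name>\w*)')
   .groupby(level=0)
   .agg({'Surname': 'first',
         # first letter of each name plus a dot, concatenated without spaces
         'Name': lambda x: x.str[0].add('.').sum()
        })
   .agg(' '.join, axis=1)
)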
Output:
0 Roddick A.
1 Federer R.
2 Tsonga J.W.
dtype: object
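For comparison, the same result can probably be reached without a regex. The following is only a minimal sketch, not the method from the answer above: it assumes the surname is always the first whitespace-separated word and everything after it is a first or middle name. The helper name `abbreviate` is hypothetical, and the sample frame is rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd


def abbreviate(full_name: str) -> str:
    """Keep the first word (assumed to be the surname) and turn the rest into initials."""
    parts = full_name.split()
    if len(parts) < 2:
        # Nothing to abbreviate: empty string or surname only.
        return full_name
    surname, given = parts[0], parts[1:]
    # Join the initials with no space between them, e.g. "J.W."
    initials = "".join(f"{name[0]}." for name in given)
    return f"{surname} {initials}"


# Sample data copied from the question.
df = pd.DataFrame({"Player": ["Roddick Andy", "Federer Roger", "Tsonga Jo Wilfred"]})
df["Player"] = df["Player"].map(abbreviate)
print(df)
```

Like the regex version, this would treat a multi-word surname such as "Del Potro Juan Martin" as surname "Del" followed by three given names, so such rows would need separate handling.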