Bruno Mello :
Suppose I have the following pandas series:
x = pd.Series(['box abcd', 'abcd box abcd', 'abcd box', 'abcdboxabcd'])
I want to remove every occurrence of the word box (but not the substring box when it sits inside a longer word). I have done it like this:
x.apply(lambda x: ' '.join([w for w in x.split(' ') if w != 'box']))
This gives me what I expected:
0 abcd
1 abcd abcd
2 abcd
3 abcdboxabcd
dtype: object
I would like to know if there is a way to do this using regex, for instance:
x.str.replace(regex, '')
where regex is a pattern that matches only the whole word box. I have searched a lot about regex but can't find an answer. Is this possible, or is there no such regex?
Quang Hoang :
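You want \b, which matches a word boundary, and then strip the extra spaces (pass regex=True explicitly, since pandas 2.0 and later treat the pattern as a literal string by default):
x.str.replace(r'\b(\s?box\s?)\b', ' ', regex=True).str.strip()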
Output:
0 abcd
1 abcd abcd
2 abcd
3 abcdboxabcd
dtype: object
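As a variation on the same idea, here is a minimal self-contained sketch that removes the word first and tidies the whitespace in a separate step. The names word, pattern and result are only illustrative, and re.escape is an added precaution in case the word to remove contains regex metacharacters; none of this is required by the answer above.

```python
import re

import pandas as pd

x = pd.Series(['box abcd', 'abcd box abcd', 'abcd box', 'abcdboxabcd'])

# Illustrative variable: the whole word to remove.
word = 'box'

# \b on both sides means "box" only matches as a standalone word.
pattern = rf'\b{re.escape(word)}\b'

result = (
    x.str.replace(pattern, '', regex=True)  # drop the whole word only
     .str.replace(r'\s+', ' ', regex=True)  # collapse runs of whitespace left behind
     .str.strip()                           # trim leading and trailing spaces
)

print(result.tolist())
```

Note that \b treats any non-word character as a boundary, so box would also be removed from a string like 'box-abcd', whereas the split(' ') approach in the question only removes tokens delimited by spaces. Which behaviour is right depends on your data.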