Python: Replacing multiple specific words from a list with re.sub

Rohan Gupta :

I have the following string and a list, changewords. I would like to replace each '{word from list} \n' with '{word from list}:'. I don't want to replace every instance of '\n'.

string = "Foo \n value of something \n Bar \n Another value \n"
changewords = ["Foo", "Bar"]

Desired Output:

'Foo: value of something \n Bar: Another value \n'

I have tried the following:

for i in changewords:
    tem = re.sub(f'{i} \n', f'{i}:', string)
tem
Output: 'Foo \n value of something \n Bar: Another value \n'

and

changewords2 = '|'.join(changewords)
tem = re.sub(f'{changewords2} \n', f'{changewords2}:', string)
tem
Output: 'Foo|Bar: \n value of something \n Foo|Bar: Another value \n'

How can I get my desired output?

Todd :

Using replacement string:

A slightly more elegant way of doing it is this one-liner:

re.sub(rf"({'|'.join(changewords)}) \n", r"\1:", string, flags=re.I)

demo:

>>> import re
>>> string = "Foo \n value of something \n Bar \n Another value \n"
>>> changewords = ['Foo', 'Bar', 'Baz', 'qux']
>>> 
>>> re.sub(rf"({'|'.join(changewords)}) \n", r"\1:", string, flags=re.I)
'Foo: value of something \n Bar: Another value \n'

The flags argument (here re.I) makes the match case-insensitive; leave it out if the words must match exactly. The replacement string can put any text around \1, such as a colon or a comma.
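As a small illustration of both points, the sketch below uses a made-up input (text is not from the question) whose words differ in case from the list, and a second replacement string chosen only for demonstration:

```python
import re

# Hypothetical input: the words differ in case from the entries in changewords.
text = "foo \n lower case \n BAR \n upper case \n"
changewords = ["Foo", "Bar"]
pattern = rf"({'|'.join(changewords)}) \n"

# Without the flag nothing matches, so the text comes back unchanged.
unchanged = re.sub(pattern, r"\1:", text)

# With re.I the words match, and \1 keeps the case found in the text.
with_colon = re.sub(pattern, r"\1:", text, flags=re.I)

# Any text can surround \1 in the replacement string.
with_brackets = re.sub(pattern, r"<\1>,", text, flags=re.I)

print(repr(unchanged))      # 'foo \n lower case \n BAR \n upper case \n'
print(repr(with_colon))     # 'foo: lower case \n BAR: upper case \n'
print(repr(with_brackets))  # '<foo>, lower case \n <BAR>, upper case \n'
```

Note that \1 inserts the text as it appeared in the input, not the spelling used in changewords.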

Worth noting: a Python string literal can carry more than one prefix. For instance, r and f can be combined, as in rf"my raw formatted string", and the order of the two prefixes doesn't matter (fr"..." is equivalent).
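A quick check of the combined prefix, with name as an illustrative variable:

```python
name = "Foo"

# Both prefix orders give the same string: the braces are interpolated (f)
# and the backslash is kept as a literal character (r).
a = rf"({name}) \n"
b = fr"({name}) \n"

print(a == b)   # True
print(a)        # (Foo) \n
print(len(a))   # 8, because the backslash and the n are two separate characters
```

Because the backslash survives as-is, it is the regex engine, not Python, that later interprets \n as a newline.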

Within the expression in re.sub(expr, repl, string), you can specify groups. A group is made by placing parentheses () around part of the pattern.

Groups can then be referenced in the replacement string, repl, with a backslash followed by the group's number, counting opening parentheses from the left; the first group is referred to by \1.

In a call such as re.sub(r"(A|B|C) \n", r"\1:", string), \1 in the replacement string refers to whatever the first group, (A|B|C), matched.
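Putting the pieces together on the question's data, a minimal sketch might look like this. The use of re.escape is an addition, not part of the one-liner above; it only matters if a word in the list contains regex metacharacters such as . or +.

```python
import re

string = "Foo \n value of something \n Bar \n Another value \n"
changewords = ["Foo", "Bar"]

# re.escape is an added safeguard (an assumption beyond the original answer)
# for words that contain characters with special meaning in a regex.
pattern = rf"({'|'.join(map(re.escape, changewords))}) \n"

# \1 is the text captured by the first (and only) group.
result = re.sub(pattern, r"\1:", string)

# \g<1> is an equivalent spelling, useful when a digit follows the reference.
same = re.sub(pattern, r"\g<1>:", string)

print(repr(result))  # 'Foo: value of something \n Bar: Another value \n'
```

The parentheses also fix the second attempt in the question: without them, Foo|Bar \n means "Foo" or "Bar \n", because | has the lowest precedence in a regular expression.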

Using replacement function:

Suppose you want to replace words in the target string with other words from a dictionary. For instance, you want 'Bar' to be replaced with 'Hank' and 'Foo' with 'Bernard'. This can be done with a replacement function instead of a replacement string:

>>> repl_dict = {'Foo':'Bernard', 'Bar':'Hank'}
>>> 
>>> expr = rf"({'|'.join(repl_dict.keys())}) \n"   # Becomes '(Foo|Bar) \\n'
>>>
>>> func = lambda mo: f"{repl_dict[mo.group(1)]}:"
>>> 
>>> re.sub(expr, func, string, flags=re.I)
'Bernard: value of something \n Hank: Another value \n'
>>> 

This could be another one-liner, but I broke it up for clarity...

The lambda takes the match object, mo, that re.sub passes to it and extracts the text of the first group. The first group is the part of the regular expression enclosed in (), here (Foo|Bar).

The replacement function references this first group with mo.group(1), just as the replacement string referenced it with \1 in the previous example.

The replacement function then looks that text up in the dict and returns the final replacement string for the match.
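One caveat: with flags=re.I, a match such as 'foo' or 'BAR' is not a key of repl_dict, so the lambda would raise KeyError. The sketch below is one possible way around that, using a named function and a lowercased copy of the dict. The names lookup and replace_word, and the mixed-case input, are illustrative and not from the original.

```python
import re

# Hypothetical input whose case differs from the dict keys.
string = "foo \n value of something \n BAR \n Another value \n"
repl_dict = {"Foo": "Bernard", "Bar": "Hank"}

# Lowercased keys, so a case-insensitive match can still be looked up.
lookup = {key.lower(): value for key, value in repl_dict.items()}
expr = rf"({'|'.join(map(re.escape, repl_dict))}) \n"

def replace_word(mo):
    """Return the replacement for the word captured in group 1, plus a colon."""
    return f"{lookup[mo.group(1).lower()]}:"

result = re.sub(expr, replace_word, string, flags=re.I)
print(repr(result))  # 'Bernard: value of something \n Hank: Another value \n'
```

If the words should only match in their exact case, dropping flags=re.I and indexing repl_dict directly is simpler.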
