Python regular expressions: using a callback function as the replacement in re.sub

While setting up a certificate authority with openssl, the certificates I was working on needed to contain Chinese characters, so I chose UTF-8. However, in openssl's index.txt file every Chinese character is stored as its UTF-8 bytes written out in "\x??" form, where ?? ranges from 00 to FF. That is very hard to read, so I wrote a small Python script to decode the file for viewing.
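To make the "\x??" form concrete, here is a minimal sketch of how a UTF-8 string ends up looking that way. The function name escape_utf8_bytes and the sample text are made up for illustration, and the choice to escape only bytes of 0x80 and above, with uppercase hex digits, is an assumption rather than a description of openssl's exact rules.

```python
def escape_utf8_bytes(text):
    """Write every non-ASCII UTF-8 byte of text as a literal backslash-x escape."""
    out = []
    for b in text.encode("utf8"):
        if b < 0x80:
            # Plain ASCII bytes are kept as they are.
            out.append(chr(b))
        else:
            # Non-ASCII bytes become a backslash, 'x', and two hex digits.
            out.append("\\x{:02X}".format(b))
    return "".join(out)


# Hypothetical sample value; each Chinese character is three bytes in UTF-8.
print(escape_utf8_bytes("CN=中文"))
```

Each Chinese character turns into three escapes, which is why the file is so hard to read by eye.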

First, the code:

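# coding=utf8
r"""Convert the \x?? escapes in a PEM (text format) certificate file to UTF-8
so that the content is displayed correctly. Written for Python 3.
"""
import sys
import re

if len(sys.argv) == 1:
    print("Usage: " + sys.argv[0] + " <file>")
    sys.exit()
# Read as Latin-1 so that every byte in the file maps to exactly one character.
with open(sys.argv[1], "r", encoding="latin-1") as f:
    s = f.read()
# Replace each \xHH escape with the character whose code point is that byte value.
s2 = re.sub(r"\\x([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), s)
# s2 now holds one character per byte: turn it back into bytes, then decode as UTF-8.
print(s2.encode("latin-1").decode("utf8"))
print("\n\n-------------")
print("Program finished.")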


Most of the time, a regular-expression replacement is a simple one: the matched text is swapped for a fixed string. Here, however, each \x?? has to be turned into the actual character it encodes, and passing a string as the second argument of re.sub cannot do that. The Python manual's entry for re.sub points out that the second argument can also be a function. In the manual's own words:

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string. For example:

>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
>>> re.sub(r'\sAND\s', ' & ', 'Baked Beans And Spam', flags=re.IGNORECASE)
'Baked Beans & Spam'
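The same idea can be applied to the \x?? problem with a named callback instead of a lambda. This is only a sketch under a few assumptions: it works on bytes rather than text, so no intermediate encoding step is needed, and the names HEX_ESCAPE, hex_to_byte and decode_hex_escapes, as well as the sample input, are invented for illustration.

```python
import re

# A literal backslash, 'x', and exactly two hex digits (captured as group 1).
HEX_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def hex_to_byte(match):
    # Called once per match; group(1) holds the two hex digits, e.g. b"E4".
    return bytes([int(match.group(1), 16)])


def decode_hex_escapes(data):
    # Replace every escape with the single byte it stands for,
    # then decode the whole result as UTF-8.
    return HEX_ESCAPE.sub(hex_to_byte, data).decode("utf8")


# Hypothetical sample input; subject ends up holding the decoded text.
subject = decode_hex_escapes(b"CN=\\xE4\\xB8\\xAD\\xE6\\x96\\x87")
```

Because the callback receives the match object, it can compute the replacement from the captured hex digits, which a fixed replacement string cannot do.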


That's it, job done.

Reproduced from: https://my.oschina.net/kivensoft/blog/549359

Origin blog.csdn.net/weixin_33997389/article/details/92058667