Regular expressions and the re module

Regular expression: a string built from special characters according to certain logical rules, used to filter specific content out of another string. Regular expressions are not unique to Python; they are an independent technology used across programming languages. In Python, you use them through the re module. Typical application scenarios include web crawlers and data analysis.

Regular expressions use specially structured characters to express matching logic, which lets us simplify redundant code. Compare the following two figures.

 

Rules for matching strings with regular expressions:

Character set []: the characters inside the square brackets are in an "or" relationship.
[0-9] matches any digit from 0 to 9. Because the characters inside the brackets are alternatives, the match succeeds as soon as a character in the target string is any one of the digits 0 to 9. It can also be written as [0123456789].

[a-z] matches lowercase letters

[A-Z] matches uppercase letters

[0-9a-fA-F] matches digits and the letters a to f in either case; it can be used to validate hexadecimal characters
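The character sets above can be tried out with a minimal sketch like the following. The sample string is made up for illustration, and re.findall is used only to list each matching character.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Room 42, key A9f"

digits = re.findall("[0-9]", text)            # every single digit
lowercase = re.findall("[a-z]", text)         # every lowercase letter
uppercase = re.findall("[A-Z]", text)         # every uppercase letter
hex_chars = re.findall("[0-9a-fA-F]", text)   # every hexadecimal character

print(digits)      # ['4', '2', '9']
print(lowercase)   # ['o', 'o', 'm', 'k', 'e', 'y', 'f']
print(uppercase)   # ['R', 'A']
print(hex_chars)   # ['4', '2', 'e', 'A', '9', 'f']
```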

 

Metacharacters:

^ and $ used together give an exact match: the text must consist of exactly what is written between ^ and $. For example, with ^ab$ the matched content must be exactly ab.

^[and] means the string must begin with one of the characters inside the brackets (here a, n, or d); [^and] matches any single character except the ones inside the brackets.
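A short sketch of these anchors and of negation inside brackets, using made-up input strings:

```python
import re

# ^ab$ only matches when the whole string is exactly "ab".
exact = re.findall("^ab$", "ab")
not_exact = re.findall("^ab$", "abc")

# ^[and] requires the first character to be one of a, n, d.
starts = re.findall("^[and]", "dog")

# [^and] matches any single character other than a, n, d.
others = re.findall("[^and]", "band")

print(exact)      # ['ab']
print(not_exact)  # []
print(starts)     # ['d']
print(others)     # ['b']
```

Note that ^ means "start of string" outside brackets but "not" when it is the first character inside brackets.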

abc|ab: when writing alternatives with |, always put the longer one first and the shorter one after it. If two alternatives start the same way and the shorter one comes first, the shorter one matches the beginning of the text and the match stops there; the long string is effectively cut in two, the remainder is discarded as unmatched, and the longer alternative placed at the back never gets to match.
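The effect of alternative order can be seen in a two-line experiment (the input "abc" is just an illustrative string):

```python
import re

# Shorter alternative first: "ab" wins and the trailing "c" is left unmatched.
short_first = re.findall("ab|abc", "abc")

# Longer alternative first: the full "abc" is matched.
long_first = re.findall("abc|ab", "abc")

print(short_first)  # ['ab']
print(long_first)   # ['abc']
```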

       


Regular expression matching is greedy by default (it matches as much as possible). Adding ? after a quantifier turns the greedy match into a non-greedy (lazy) match.
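As a rough illustration of greedy versus lazy matching, assuming a made-up string with two bracketed tags:

```python
import re

text = "<a><b>"

# Greedy: .* grabs as much as possible, so one match spans both tags.
greedy = re.findall("<.*>", text)

# Lazy: .*? stops at the first possible ">", giving one match per tag.
lazy = re.findall("<.*?>", text)

print(greedy)  # ['<a><b>']
print(lazy)    # ['<a>', '<b>']
```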

A quantifier must come immediately after a regular expression symbol, and it applies only to the one symbol directly in front of it.

Grouping: when several regular expression symbols need to be repeated or otherwise operated on as a whole, put them in a group. In regular expression syntax a group is written with parentheses ().
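The difference between a quantifier on a single symbol and a quantifier on a group can be sketched as follows. re.fullmatch is used here only so that the whole string has to match; the inputs are illustrative.

```python
import re

# In "ab+", the + applies only to "b": one "a" followed by one or more "b".
single = re.fullmatch("ab+", "abbb")
repeated_plain = re.fullmatch("ab+", "abab")

# In "(ab)+", the + applies to the whole group "ab".
grouped = re.fullmatch("(ab)+", "abab")

print(single is not None)          # True
print(repeated_plain is not None)  # False
print(grouped is not None)         # True
```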

Regular expression examples:

 

 

    

Commonly used methods in the re module:

To use regular expressions in Python you must go through the re module, or through a method that supports regular expression syntax.

import re (to use the re module, import it first)

Built-in methods of the re module:

re.findall (returns the matched values directly)

re.search (when there is a match it returns an object; you still need to call .group() to get the value)

re.match (same as search)

split

sub

subn

compile

finditer

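1) re.findall. Form: findall('regular expression', 'string to match'). It finds everything in the string that matches the regular expression and returns a list whose elements are the matched results.

2) re.search. Form: search('regular expression', 'string to match'). If there is a match, it does not return the matched result directly; it returns an object, and you must call group to see the matched result. If nothing matches, it returns None, and calling group on it raises an error. search only looks for one match: as soon as it finds a result it stops and does not look any further.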
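The contrast between findall and search might look like this in practice. The sample text is invented, and the None check is one common way to avoid the error described above.

```python
import re

text = "order 12 and order 345"

# findall returns every match as a list of strings.
all_numbers = re.findall("[0-9]+", text)

# search returns a match object for the first match only.
first = re.search("[0-9]+", text)
first_number = first.group()

# When nothing matches, search returns None, so check before calling group().
missing = re.search("[0-9]+", "no digits here")
missing_number = missing.group() if missing is not None else None

print(all_numbers)     # ['12', '345']
print(first_number)    # 12
print(missing_number)  # None
```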

 

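3) re.match

1. match only matches at the beginning of the string.
2. If the beginning of the string does not satisfy the pattern, it also returns None, and calling group raises an error.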
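A small sketch comparing match with search on the same made-up inputs:

```python
import re

# The digits are at the start, so match succeeds.
at_start = re.match("[0-9]+", "123abc")

# The digits are not at the start, so match returns None...
not_at_start = re.match("[0-9]+", "abc123")

# ...while search still finds them later in the string.
found_later = re.search("[0-9]+", "abc123")

print(at_start.group())     # 123
print(not_at_start)         # None
print(found_later.group())  # 123
```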

 

split: in the re module, split cuts the string at every place the regular expression matches; the result of the cut is a list.
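For instance, a minimal sketch with illustrative inputs:

```python
import re

# Cut wherever one or more digits appear; the digits themselves are removed.
by_digits = re.split("[0-9]+", "a1b22c")

# Cut on either a comma or a semicolon.
by_separators = re.split("[,;]", "x,y;z")

print(by_digits)      # ['a', 'b', 'c']
print(by_separators)  # ['x', 'y', 'z']
```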

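sub (uses a regular expression to replace content in a string, for example its digits, and returns the whole new string). Form: re.sub('regular expression', 'new content', 'string to modify', n), where n is the maximum number of replacements to make (for example, how many digits to replace).

subn: returns a tuple; the second element of the tuple is the number of replacements made.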
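A brief sketch of sub and subn on a made-up string. The replacement limit is passed here as the keyword argument count, which is the name the re module gives to n.

```python
import re

text = "a1b2c3"

# Replace every digit with "#".
all_replaced = re.sub("[0-9]", "#", text)

# Replace only the first digit.
one_replaced = re.sub("[0-9]", "#", text, count=1)

# subn returns (new_string, number_of_replacements).
result = re.subn("[0-9]", "#", text)

print(all_replaced)  # a#b#c#
print(one_replaced)  # a#b2c3
print(result)        # ('a#b#c#', 3)
```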

 

 

 

compile 

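Calling findall on the object returned by re.compile gives a list of matches. With a pattern that we restrict to exactly three repeated digits, the string is cut into a list of three-digit pieces.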
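The original example is not shown, so the following is only a sketch under the assumption that the pattern was something like [0-9]{3}; the variable name three_digits and the inputs are hypothetical.

```python
import re

# Compile once, then reuse the compiled pattern object.
three_digits = re.compile("[0-9]{3}")

chunks = three_digits.findall("123456789")
partial = three_digits.findall("12345")   # the leftover "45" is too short to match

print(chunks)   # ['123', '456', '789']
print(partial)  # ['123']
```

Compiling is mostly a convenience when the same pattern is used many times; the compiled object offers the same methods (findall, search, match, and so on) without repeating the pattern string.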

 

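finditer (returns an iterator): re.finditer('regular expression', 'string').__next__().group() gives the first match. Once all the values have been taken out, calling __next__() again raises an error (StopIteration).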
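A minimal sketch of stepping through a finditer iterator, using the built-in next() (which calls __next__() for you) and an invented input string:

```python
import re

matches = re.finditer("[0-9]+", "a1b22c333")

# Take the first match object by hand.
first = next(matches).group()

# Consume the remaining match objects.
rest = [m.group() for m in matches]

# The iterator is now empty, so another next() raises StopIteration.
try:
    next(matches)
    exhausted = False
except StopIteration:
    exhausted = True

print(first)      # 1
print(rest)       # ['22', '333']
print(exhausted)  # True
```

In everyday code a plain for loop over re.finditer(...) avoids having to handle StopIteration at all.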

You can also give a regular expression group an alias (a named group).
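Assuming the alias refers to Python's named-group syntax (?P<name>...), a sketch could look like this. The group names year and month and the input string are made up for illustration.

```python
import re

# (?P<name>...) gives a group a name that can be used instead of its number.
pattern = "(?P<year>[0-9]{4})-(?P<month>[0-9]{2})"
found = re.search(pattern, "date: 2019-07")

year = found.group("year")
month = found.group("month")
whole = found.group()        # the entire match

print(year)   # 2019
print(month)  # 07
print(whole)  # 2019-07
```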

 

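Grouping has a special effect in Python that has nothing to do with regular expressions themselves: findall gives priority to the contents of the group and returns those. If you want the whole match instead, cancel the group's priority (for example with a non-capturing group, (?:...)).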
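This behaviour can be checked with a short experiment. The host names are placeholders, and the non-capturing form (?:...) is one way to cancel the group priority.

```python
import re

text = "www.example.com"

# With a capturing group, findall returns only what the group matched.
group_only = re.findall(r"www\.(example|sample)\.com", text)

# With a non-capturing group, findall returns the whole match.
whole_match = re.findall(r"www\.(?:example|sample)\.com", text)

print(group_only)   # ['example']
print(whole_match)  # ['www.example.com']
```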

 


Origin www.cnblogs.com/zhangchaocoming/p/11204459.html