5-2

A regular expression is a single pattern string that describes and matches a set of strings conforming to a certain syntax. In Python, regular expressions are provided by the re module. Both the pattern and the string being searched can be Unicode strings or ordinary 8-bit strings, but the two cannot be mixed: if the pattern is a Unicode string, the string being searched must be a Unicode string as well.
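The rule about not mixing the two kinds of strings can be checked with a short sketch. It assumes Python 3, where Unicode strings are str objects and 8-bit strings are bytes objects; the sample text "python" is only an illustration.

```python
import re

# A str pattern searches str text; a bytes pattern searches bytes data.
text_match = re.search("py", "python")
bytes_match = re.search(b"py", b"python")

# Mixing a str pattern with bytes data raises TypeError.
try:
    re.search("py", b"python")
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(text_match.group(), bytes_match.group(), mixed_ok)
```

If bytes data has to be searched with a pattern written as text, one option is to decode the data first (or encode the pattern) so that both sides are of the same kind.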

Regular expressions use the backslash "\" to introduce special sequences and to escape characters that have a special meaning in a pattern, which can cause some trouble. For example, to match a literal backslash, the regular expression itself must be "\\", and in an ordinary string literal each of those backslashes has to be escaped again, giving "\\\\" (every two backslashes in the literal become one "\" in the regular expression). In a Python raw string, "\" is not treated as an escape character, so the same pattern can be written as r"\\", and r"\n" is the two characters "\" and "n" rather than a newline.
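The difference between ordinary and raw string literals can be seen in a small sketch. The variable names and the sample path are made up for illustration.

```python
import re

# Both literals spell the same two-character regex: backslash, backslash.
same_pattern = "\\\\" == r"\\"

# r"\n" is two characters (backslash, n); "\n" is one newline character.
raw_len = len(r"\n")
plain_len = len("\n")

# The pattern r"\\" matches one literal backslash in the searched text.
path = "C:\\temp"  # the seven characters C:\temp
found = re.findall(r"\\", path)

print(same_pattern, raw_len, plain_len, found)
```

Writing patterns as raw strings keeps the literal identical to the regular expression that the re module actually sees.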

'^' : Matches the beginning of the string. In MULTILINE mode, it also matches at the beginning of each line (immediately after each newline).

'$' : Matches the end of the string, or just before the newline at the end of the string; in MULTILINE mode, it also matches at the end of each line.

'*' : Matches 0 or more repetitions of the preceding expression. For example, ab* matches 'a', 'ab', 'abbbb', and so on.

'+' : Matches 1 or more repetitions of the preceding expression. For example, ab+ matches 'ab' and 'abbbb', but not 'a'.

'?' : Matches 0 or 1 repetitions of the preceding expression. For example, ab? matches 'a' or 'ab', but cannot match the whole of 'abbbb'.
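The anchors and repetition characters described above can be tried out with a minimal sketch. The sample text and list are invented for illustration, and re.fullmatch is used for the repetition examples so that each pattern has to cover the entire string.

```python
import re

text = "first line\nsecond line\n"

# Without MULTILINE, ^ matches only at the start of the string.
starts_default = re.findall(r"^\w+", text)
# With MULTILINE, ^ also matches right after each newline.
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# $ matches at the end of the string or just before a trailing newline.
ends_default = re.findall(r"\w+$", text)
# With MULTILINE, $ also matches at the end of every line.
ends_multiline = re.findall(r"\w+$", text, re.MULTILINE)

samples = ["a", "ab", "abbbb"]
# fullmatch succeeds only if the pattern matches the whole string.
star = [s for s in samples if re.fullmatch(r"ab*", s)]
plus = [s for s in samples if re.fullmatch(r"ab+", s)]
optional = [s for s in samples if re.fullmatch(r"ab?", s)]

print(starts_default, starts_multiline)
print(ends_default, ends_multiline)
print(star, plus, optional)
```

With re.search instead of re.fullmatch, ab? would still find the leading 'ab' inside 'abbbb', so the statement that 'abbbb' does not match applies to matching the whole string.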
