## Regular expressions
A regular expression is a string that describes a pattern in text. Python comes with a regular expression module, `re`. Through this module, you can find, extract, and replace text that follows a pattern.
In program development, regular expressions are a common way to let the computer pick the required information out of a large piece of text.
Using a regular expression takes three steps:
1. Find the pattern.
2. Express the pattern with regular expression symbols.
3. Extract the information.
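
The three steps can be sketched with Python's built-in `re` module. This is a minimal illustration, not a complete workflow: the sample text is made up, and the pattern uses only the period symbol described in the next part of this section.

```python
import re

# Hypothetical sample text, made up for this illustration.
text = "users: kingname, kinabcme, kin123me"

# Step 1: find the pattern. Every name starts with "kin", ends with "me",
# and has exactly three characters in between.
# Step 2: express that pattern with regular expression symbols.
pattern = r"kin...me"

# Step 3: extract the information. findall returns every non-overlapping match.
names = re.findall(pattern, text)
print(names)  # ['kingname', 'kinabcme', 'kin123me']
```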
Basic symbols of regular expressions
1. The period "."
A period stands for any single character except the newline character, including but not limited to English letters, digits, Chinese characters, and English or Chinese punctuation marks. For example, consider the following strings:
kingname
kinabcme
kin123me
kin我是谁me
kin嗨你好me
kin"m"me
In each of these strings the first three characters are "kin" and the last two are "me", with three characters in between, so they can all be written as the regular expression kin...me. The number of dots equals the number of characters between "kin" and "me".
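
As a quick check, the pattern can be tried against the strings listed above. The sketch below assumes `re.fullmatch`, which succeeds only when the whole string fits the pattern. The two extra strings at the end are made up to show what does not match.

```python
import re

samples = ["kingname", "kinabcme", "kin123me", "kin我是谁me", "kin嗨你好me", 'kin"m"me']

# Each "." stands for exactly one character, so "..." needs exactly three.
results = [re.fullmatch(r"kin...me", s) is not None for s in samples]
print(results)  # [True, True, True, True, True, True]

# Two characters in the middle are too few, and "." never matches a newline
# (unless the re.DOTALL flag is passed, which is not done here).
too_short = re.fullmatch(r"kin...me", "kinabme") is not None
with_newline = re.fullmatch(r"kin...me", "kin\nabme") is not None
print(too_short, with_newline)  # False False
```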
2. The asterisk "*"
An asterisk means that the subexpression before it (an ordinary character, or one or more other regular expression symbols) appears 0 to infinitely many times.
For example, take the following strings, in which the character 哈 ("ha") is repeated a different number of times after 如果快乐你就笑 ("if you are happy, laugh"):
如果快乐你就笑哈
如果快乐你就笑
如果快乐你就笑哈哈
如果快乐你就笑哈哈哈哈哈哈哈哈哈
In these strings the character 哈 appears repeatedly, so with an asterisk they can all be written as:
如果快乐你就笑哈*
Since the asterisk allows the character before it to appear 0 times, the string 如果快乐你就笑, which contains no 哈 at all, still satisfies this regular expression.
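
A minimal sketch of the asterisk in Python follows. It assumes the example strings are written with the single character 哈 ("ha") after 如果快乐你就笑 ("if you are happy, laugh"), so that the asterisk repeats exactly one character. The last string is made up to show a non-match.

```python
import re

pattern = "如果快乐你就笑哈*"  # "哈*" means the character 哈 repeated 0 or more times

samples = [
    "如果快乐你就笑",          # zero 哈
    "如果快乐你就笑哈",        # one 哈
    "如果快乐你就笑哈哈哈哈",  # four 哈
]
results = [re.fullmatch(pattern, s) is not None for s in samples]
print(results)  # [True, True, True]

# The asterisk only repeats 哈, so a different trailing character does not match.
other_char = re.fullmatch(pattern, "如果快乐你就笑嘻") is not None
print(other_char)  # False
```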
Since the asterisk applies to whatever comes before it, what happens if that is a period? For example, the following regular expression:
如.*哈
It means that any number of characters other than newlines (including none at all) may appear between "如" and "哈".
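
The behavior of a period followed by an asterisk can be checked with a short sketch. The test strings here are chosen for illustration only.

```python
import re

pattern = "如.*哈"

# Nothing in between is allowed, because "*" permits zero repetitions.
empty_middle = re.fullmatch(pattern, "如哈") is not None
# Any run of non-newline characters in between is allowed.
long_middle = re.fullmatch(pattern, "如果快乐你就笑哈") is not None
# A newline in between is not matched by ".", so the match fails.
newline_middle = re.fullmatch(pattern, "如果\n快乐哈") is not None

print(empty_middle, long_middle, newline_middle)  # True True False
```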
3. The question mark "?"
The question mark means that the subexpression before it appears 0 times or 1 time. Note that it must be the ASCII (half-width) question mark "?", not the full-width Chinese question mark "？".
For example, consider the following two strings:
笑起来。
笑起来哈。
Because there is either zero or one "哈" between "来" and "。", both strings can be expressed by the following regular expression:
笑起来哈?。
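
A small sketch of the question mark in Python. The third string, with two 哈, is added here only to show that "?" allows at most one occurrence.

```python
import re

pattern = "笑起来哈?。"  # "哈?" means 哈 appears 0 times or 1 time

zero_ha = re.fullmatch(pattern, "笑起来。") is not None
one_ha = re.fullmatch(pattern, "笑起来哈。") is not None
two_ha = re.fullmatch(pattern, "笑起来哈哈。") is not None

print(zero_ha, one_ha, two_ha)  # True True False
```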
The most important use of the question mark is together with "." and "*" to form ".*?". This combination is the one used most often when extracting information. For example, all of the following strings:
如哈
如果快乐哈
如果快乐你就笑哈
如果你知道1+1=2那么请计算地球的半径哈
can all be represented by "如.*?哈".
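
The sketch below checks the four strings against this pattern. It also shows why the question mark matters: in Python's `re` module, ".*?" matches as few characters as possible, while ".*" matches as many as possible. The string `text` is made up to make that difference visible.

```python
import re

samples = ["如哈", "如果快乐哈", "如果快乐你就笑哈", "如果你知道1+1=2那么请计算地球的半径哈"]

# All four strings fit the pattern as a whole.
results = [re.fullmatch("如.*?哈", s) is not None for s in samples]
print(results)  # [True, True, True, True]

# Hypothetical text containing two separate 如...哈 segments.
text = "如A哈如B哈"

# ".*?" stops at the first 哈 it can, so each segment is found on its own.
lazy = re.findall("如.*?哈", text)
# ".*" runs to the last 哈 it can, so both segments are swallowed as one match.
greedy = re.findall("如.*哈", text)

print(lazy)    # ['如A哈', '如B哈']
print(greedy)  # ['如A哈如B哈']
```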