1. Online tool for testing regular expressions: https://regex101.com/
2. Common syntax:
Ordinary characters written in a regular expression match themselves directly.
For example, if you want to find every occurrence of test in a piece of text, the regular expression is very simple: just enter test.
Chinese characters work the same way: to find a Chinese character, write it directly in the regular expression.
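As a minimal sketch of literal matching in Python, the snippet below uses the standard re module to find every occurrence of test. The sample strings and variable names are made up for illustration and are not part of the original example.

```python
import re

# Hypothetical sample text, made up for illustration.
text = "test one, another test, and a final test"

# Ordinary characters match themselves, so the pattern is just the word.
matches = re.findall(r"test", text)
print(matches)  # ['test', 'test', 'test']

# Chinese characters are matched literally in the same way.
chinese_matches = re.findall(r"色", "绿色 橙色")
print(chinese_matches)  # ['色', '色']
```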
Some characters, however, are special. They are called metacharacters (meta-characters).
When they appear in a regular expression string they do not match themselves directly but carry a special meaning. The metacharacters include the following:
^ $ . * + ? = ! : | \ / ( ) [ ] { }
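Their meanings are described briefly below. (In Python's re module, = ! : are special only inside group constructs such as (?=...), (?!...) and (?:...), and / has no special meaning; it is a pattern delimiter in languages such as JavaScript.)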
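To illustrate the difference between a metacharacter and an ordinary character, here is a small sketch using the dot. Putting a backslash in front of a metacharacter removes its special meaning, which is standard behavior in Python's re module; the sample string is an assumption made up for this illustration.

```python
import re

# Hypothetical sample text, made up for illustration.
sample = "a.c abc a-c"

# Unescaped, the dot is a metacharacter and matches any single character.
any_char = re.findall(r"a.c", sample)
print(any_char)  # ['a.c', 'abc', 'a-c']

# Escaped with a backslash, the dot matches only a literal dot.
literal_dot = re.findall(r"a\.c", sample)
print(literal_dot)  # ['a.c']

# re.escape adds the backslashes automatically.
escaped = re.escape("a.c")
print(escaped)  # a\.c
```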
Dot - matches any character
. matches any single character except a newline.
For example, suppose you want to pick out all the colors from the following text:
苹果是绿色的
橙子是橙色的
香蕉是黄色的
乌鸦是黑色的
(In English: the apple is green, the orange is orange, the banana is yellow, the crow is black.)
That is, you want every word that ends with 色 ("color") together with the one character in front of it.
You can write the regular expression .色
Here the dot stands for any one character; note that it is exactly one character.
Taken together, .色 means: any single character followed by the character 色, as one string.
Once the expression is correct, you can use it in Python code, as shown below:

import re

content = '''苹果是绿色的
橙子是橙色的
香蕉是黄色的
乌鸦是黑色的'''

# Compile the pattern: any one character followed by 色
p = re.compile(r'.色')
for one in p.findall(content):
    print(one)

# The results are as follows:
# 绿色
# 橙色
# 黄色
# 黑色
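The statement that the dot matches any character except a newline can be checked with a short sketch. The two-line sample string is an assumption made up for illustration, and re.DOTALL is a standard flag of Python's re module that the original text does not discuss; it is shown only for contrast.

```python
import re

# Hypothetical two-line sample, made up for illustration.
text = "ab\ncd"

# The dot does not match the newline between 'b' and 'c',
# so only pairs of characters within one line are found.
pairs = re.findall(r"..", text)
print(pairs)  # ['ab', 'cd']

# With the re.DOTALL flag the dot matches the newline as well.
pairs_dotall = re.findall(r"..", text, re.DOTALL)
print(pairs_dotall)  # ['ab', '\nc']
```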
Asterisk - repeats the match any number of times
* matches the preceding subexpression any number of times, including zero.
For example, suppose you want to select, from each line of the following text, the part from the comma onward, including the comma itself. Note that the comma here is the Chinese (full-width) comma ，
苹果，是绿色的
橙子，是橙色的
香蕉，是黄色的
乌鸦，是黑色的
猴子，
You can write the regular expression ，.*
The * comes immediately after the dot, so it means any character may appear any number of times. The whole expression therefore matches the comma and every character after it on the line.
Note the last line in particular: there are no characters after the comma following 猴子 (monkey), but * allows zero occurrences, so the expression still matches there.
Once the expression is correct, you can use it in Python code, as shown below:

import re

content = '''苹果，是绿色的
橙子，是橙色的
香蕉，是黄色的
乌鸦，是黑色的
猴子，'''

# Compile the pattern: a Chinese comma followed by any characters
p = re.compile(r'，.*')
for one in p.findall(content):
    print(one)

# The results are as follows:
# ，是绿色的
# ，是橙色的
# ，是黄色的
# ，是黑色的
# ，
Note that .* is very common in regular expressions; it matches any character any number of times.
Of course, * does not have to be preceded by a dot. It can follow other characters as well, such as
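A small sketch of how .* behaves on its own, using Python's re module. The sample strings are assumptions made up for illustration.

```python
import re

# '.*' may match zero characters, so it matches even an empty string.
empty = re.fullmatch(r".*", "")
print(empty is not None)  # True

# It takes every character up to the end of the line,
# but stops at the newline because the dot does not match it.
first_line = re.match(r".*", "abc\ndef").group()
print(first_line)  # abc
```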