Common metacharacters
- Example: hi, what are you doing? candy?
\b
matches the beginning or end of a word
How do we match exactly hi? We could use \bhi, but that pattern also matches the start of words such as him, high, and hidden, so it does not match hi exactly. Instead we use \bhi\b, which reads as: match a whole word that starts with h and ends with i, with nothing in between.
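The difference between the two patterns can be checked with a short sketch using Python's re module. The word list is taken from the text above; the variable names are only illustrative.

```python
import re

words = ["hi", "him", "high", "hidden"]

# \bhi only anchors the start of the word, so every word here contains a match.
prefix_hits = [w for w in words if re.search(r"\bhi", w)]

# \bhi\b anchors both ends, so only the standalone word "hi" matches.
exact_hits = [w for w in words if re.search(r"\bhi\b", w)]

print(prefix_hits)  # ['hi', 'him', 'high', 'hidden']
print(exact_hits)   # ['hi']
```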
.
matches any character except newline
The example above matches hi exactly. Now a new requirement is added: match hi and the candy that follows it. My first thought was \bhi\b.\bcandy\b, but this pattern has a limitation: the . matches exactly one character, so it matches strings such as hi,candy or hi candy but not hi, what are you doing? candy?, where many characters sit between the two words. (It does not match hi1candy either: 1 is a word character, so there is no word boundary after hi.)
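As a quick check of the single-character limitation, here is a small Python sketch. The test strings are illustrative. Note that hi1candy does not match: 1 is a word character, so \b finds no boundary between hi and 1.

```python
import re

pattern = re.compile(r"\bhi\b.\bcandy\b")

# Exactly one non-word character between the two words: matches.
one_char = bool(pattern.search("hi,candy"))

# A digit is a word character, so the \b after "hi" fails.
digit_between = bool(pattern.search("hi1candy"))

# More than one character between the words: a single . is not enough.
long_gap = bool(pattern.search("hi, what are you doing? candy?"))

print(one_char, digit_between, long_gap)  # True False False
```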
<span id="jump"></span>
To remove the limit of a single arbitrary character, we can combine . with * and write \bhi\b.*\bcandy\b. The expression reads: match the word hi, then any characters other than a newline, repeated any number of times (including zero), then the word candy.
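A minimal Python sketch of the combined pattern, run against the example sentence from the top of this section. The second input is an assumed extra case showing that . does not cross a newline by default.

```python
import re

pattern = re.compile(r"\bhi\b.*\bcandy\b")

text = "hi, what are you doing? candy?"
m = pattern.search(text)
matched = m.group(0) if m else None
print(matched)  # hi, what are you doing? candy

# . does not match a newline unless re.DOTALL is set, so this finds nothing.
across_lines = pattern.search("hi\ncandy")
print(across_lines)  # None
```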
\d
matches a digit
Now we need to match mobile phone numbers that start with 15, such as 15XXXXXXXXX. Using \d, we can write 15\d\d\d\d\d\d\d\d\d. That is cumbersome, so we shorten it to 15\d{9}, which means: match 15 followed by exactly 9 digits.
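The long and short forms should behave identically. The sketch below compares them in Python on a few made-up sample numbers.

```python
import re

long_form = re.compile(r"15\d\d\d\d\d\d\d\d\d")
short_form = re.compile(r"15\d{9}")

# Illustrative inputs: a valid 11-digit number, one digit short, wrong prefix.
samples = ["15123456789", "1512345678", "16123456789"]

long_results = [bool(long_form.search(s)) for s in samples]
short_results = [bool(short_form.search(s)) for s in samples]

print(long_results)   # [True, False, False]
print(short_results)  # [True, False, False]
```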
^
Matches the beginning of the string, $ matches the end of the string.
Although the expression above matches mobile phone numbers starting with 15, it also matches input such as as15111111111sd, because nothing restricts what comes before or after the number. We can improve the expression to ^15\d{9}$.
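The effect of the anchors can be seen in a short Python sketch using the input from the text. The variable names are illustrative.

```python
import re

unanchored = re.compile(r"15\d{9}")
anchored = re.compile(r"^15\d{9}$")

noisy = "as15111111111sd"
clean = "15111111111"

# Without anchors the number is found inside the surrounding junk.
unanchored_noisy = bool(unanchored.search(noisy))

# With ^ and $ the whole string must be the number.
anchored_noisy = bool(anchored.search(noisy))
anchored_clean = bool(anchored.search(clean))

print(unanchored_noisy, anchored_noisy, anchored_clean)  # True False True
```

In Python specifically, $ also matches just before a trailing newline; re.fullmatch is a stricter alternative when that matters.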
Escaping metacharacters
When we want to match a metacharacter literally, we escape it with \. For example, to match www.baidu.com we use www\.baidu\.com, and to match a drive path such as D:\User we use D:\\User. The same applies to the other metacharacters: to match a literal *, write \*, and so on.
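The following Python sketch shows why escaping matters. The lookalike string is a made-up input, and re.escape is a standard-library helper that adds the backslashes for you.

```python
import re

unescaped = re.compile(r"www.baidu.com")
escaped = re.compile(r"www\.baidu\.com")

# Hypothetical input where the dots are replaced by other characters.
lookalike = "wwwXbaiduYcom"

unescaped_hit = bool(unescaped.search(lookalike))     # . matches any character
escaped_hit = bool(escaped.search(lookalike))         # \. matches only a literal dot
real_hit = bool(escaped.search("www.baidu.com"))

# The pattern D:\\User matches the literal text D:\User.
path_hit = bool(re.search(r"D:\\User", "D:\\User"))

# re.escape escapes metacharacters automatically.
escaped_star = re.escape("a*b")

print(unescaped_hit, escaped_hit, real_hit, path_hit)  # True False True True
print(escaped_star)  # a\*b
```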
Quantifiers
*
The preceding item may repeat zero or more times in a row, as long as the whole expression still matches.
Example, as used earlier: \bhi\b.*\bcandy\b
+
The preceding item must repeat one or more times in a row, as long as the whole expression still matches.
Using the example hi, what are you doing? candy?, suppose we want to match text that starts with the word hi and ends with the word candy. We can write ^\bhi\b.+\bcandy\b$. Because of the $ anchor, this matches hi, what are you doing? candy but not the original sentence with its trailing ?, since candy must be the last thing in the string. It also cannot match hicandy, because + requires at least one character between hi and candy. The same quantifier works for digits. For example, to match 15 followed by at least one digit, write 15\d+.
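A Python sketch of the + quantifier, assuming the anchored pattern ^\bhi\b.+\bcandy\b$. Because of $, the example sentence only matches once its trailing ? is removed. The 15\d* pattern is added here only for contrast with 15\d+.

```python
import re

phrase = re.compile(r"^\bhi\b.+\bcandy\b$")

ends_with_candy = bool(phrase.search("hi, what are you doing? candy"))
trailing_mark = bool(phrase.search("hi, what are you doing? candy?"))
no_gap = bool(phrase.search("hicandy"))

print(ends_with_candy, trailing_mark, no_gap)  # True False False

# + needs at least one digit after 15; * accepts zero.
plus_bare = re.search(r"15\d+", "15")
star_bare = re.search(r"15\d*", "15")
plus_digits = re.search(r"15\d+", "15123")

print(plus_bare)             # None
print(star_bare.group(0))    # 15
print(plus_digits.group(0))  # 15123
```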