Regular Expression Metacharacters

Metacharacter
Description
\
Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a newline. The sequence "\\" matches "\", "\\n" matches a backslash followed by "n", and "\(" matches "(". This corresponds to the "escape character" found in many programming languages.
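The escaping rules above can be sketched in Python. This assumes Python's built-in re module rather than the VBScript/JScript RegExp object the table describes, so engine details may differ. Raw strings (r"...") keep Python from interpreting the backslashes before the regex engine sees them.

```python
import re

# Each pattern is a raw string, so the regex engine receives the backslashes.
newline_hits = re.findall(r"\n", "a\nb")      # "\n" matches the newline character
backslash_hits = re.findall(r"\\", "a\\b")    # "\\" matches one literal backslash
paren_hits = re.findall(r"\(", "f(x)")        # "\(" matches a literal "("
literal_bn = re.findall(r"\\n", "a\\nb")      # "\\n" matches a backslash followed by "n"

print(newline_hits, backslash_hits, paren_hits, literal_bn)
```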
^
Matches the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after "\n" or "\r".
$
Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before "\n" or "\r".
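A small sketch of how ^ and $ change under multiline matching. It assumes Python's re module, where the re.MULTILINE flag plays the role of the Multiline property; note that Python treats only "\n" as a line break here, not "\r".

```python
import re

text = "first line\nsecond line"

# Without MULTILINE, ^ and $ anchor to the whole string only.
start_default = re.findall(r"^\w+", text)
end_default = re.findall(r"\w+$", text)

# With MULTILINE, ^ and $ also anchor at each line break.
start_multi = re.findall(r"^\w+", text, re.MULTILINE)
end_multi = re.findall(r"\w+$", text, re.MULTILINE)

print(start_default, start_multi)  # ['first'] ['first', 'second']
print(end_default, end_multi)      # ['line'] ['line', 'line']
```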
*
Matches the preceding subexpression zero or more times. For example, "zo*" matches "z", "zo", and "zoo". * is equivalent to {0,}.
+
Matches the preceding subexpression one or more times. For example, "zo+" matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?
Matches the preceding subexpression zero or one time. For example, "do(es)?" matches "do" and "does". ? is equivalent to {0,1}.
{n}
n is a non-negative integer. Matches exactly n times. For example, "o{2}" does not match the single "o" in "Bob", but matches the two o's in "food".
{n,}
n is a non-negative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob", but matches all the o's in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*".
{n,m}
m and n are non-negative integers with n <= m. Matches at least n times and at most m times. For example, applied to "fooooood", "o{1,3}" matches the first three o's as one match and the last three o's as another. "o{0,1}" is equivalent to "o?". Note that there must be no spaces between the comma and the two numbers.
?
When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match becomes non-greedy. Non-greedy mode matches as little of the searched string as possible, whereas the default greedy mode matches as much as possible. For example, for the string "oooo", "o+" matches as many o's as possible, giving ["oooo"], while "o+?" matches as few as possible, giving ["o", "o", "o", "o"].
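The quantifier examples above can be checked with a short sketch. It assumes Python's re module; re.findall returns every non-overlapping match, which makes the greedy and non-greedy results easy to compare.

```python
import re

star = re.findall(r"zo*", "z zo zoo")         # zero or more "o"
plus = re.findall(r"zo+", "z zo zoo")         # one or more "o"
exact = re.findall(r"o{2}", "Bob food")       # exactly two "o"
bounded = re.findall(r"o{1,3}", "fooooood")   # between one and three "o"

greedy = re.findall(r"o+", "oooo")            # takes as much as possible
lazy = re.findall(r"o+?", "oooo")             # takes as little as possible

print(star, plus, exact, bounded)
print(greedy, lazy)  # ['oooo'] ['o', 'o', 'o', 'o']
```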
.
Matches any single character except "\n". To match any character including "\n", use a pattern such as "[\s\S]".
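A brief sketch of the difference between "." and "[\s\S]", assuming Python's re module with its default flags:

```python
import re

text = "abc a\nc"

dot_hits = re.findall(r"a.c", text)        # "." does not match the newline
any_hits = re.findall(r"a[\s\S]c", text)   # [\s\S] matches any character, "\n" included

print(dot_hits)  # ['abc']
print(any_hits)  # ['abc', 'a\nc']
```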
(pattern)
Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript and the $0…$9 properties in JScript. To match literal parentheses, use "\(" or "\)".
(?:pattern)
Non-capturing group: matches pattern but does not capture the match or store it for later use. This is useful when combining parts of a pattern with the "or" character (|). For example, "industr(?:y|ies)" is a shorter expression than "industry|industries".
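To illustrate capturing versus non-capturing groups, here is a minimal sketch assuming Python's re module, where captured groups are read with group() and groups() instead of the SubMatches collection or $0…$9 properties. The sample strings are made up for the example.

```python
import re

# Capturing groups: group(0) is the whole match, group(1..n) the captures.
captured = re.search(r"(\d+)-(\d+)", "pages 10-20")
whole, first, second = captured.group(0), captured.group(1), captured.group(2)

# Non-capturing group: the alternation is grouped, but nothing is stored.
plain = re.search(r"industr(?:y|ies)", "heavy industries")
plain_whole, plain_groups = plain.group(0), plain.groups()

print(whole, first, second)       # 10-20 10 20
print(plain_whole, plain_groups)  # industries ()
```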
(?=pattern)
Non-capturing, positive lookahead: matches the search string at any position where a string matching pattern begins. The match is not captured for later use. For example, "Windows(?=95|98|NT|2000)" matches "Windows" in "Windows2000", but not "Windows" in "Windows3.1". Lookahead consumes no characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)
Non-capturing, negative lookahead: matches the search string at any position where a string not matching pattern begins. The match is not captured for later use. For example, "Windows(?!95|98|NT|2000)" matches "Windows" in "Windows3.1", but not "Windows" in "Windows2000".
(?<=pattern)
Non-capturing, positive lookbehind: like positive lookahead, but in the opposite direction. For example, "(?<=95|98|NT|2000)Windows" matches "Windows" in "2000Windows", but not "Windows" in "3.1Windows".
(?<!pattern)
Non-capturing, negative lookbehind: like negative lookahead, but in the opposite direction. For example, "(?<!95|98|NT|2000)Windows" matches "Windows" in "3.1Windows", but not "Windows" in "2000Windows".
Note: the lookbehind examples as written fail in some engines. Where lookbehind must be fixed-width (Python's re module, for instance), every alternative needs the same length: "(?<!95|98|NT|20)Windows" is accepted, "(?<!95|980|NT|20)Windows" reports an error, and a single item such as "(?<!2000)Windows" may be any length.
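The lookahead and lookbehind entries can be tried out with the sketch below. It assumes Python's re module, which requires fixed-width lookbehind; other engines may accept alternatives of different lengths.

```python
import re

# Lookahead: the version number is checked but not consumed.
ahead = re.search(r"Windows(?=95|98|NT|2000)", "Windows2000")
ahead_end = ahead.end()  # 7: the match stops right after "Windows"
neg_ahead = re.search(r"Windows(?!95|98|NT|2000)", "Windows3.1")
neg_ahead_miss = re.search(r"Windows(?!95|98|NT|2000)", "Windows2000")

# Lookbehind: all alternatives have width 2, so Python accepts the pattern.
neg_behind = re.search(r"(?<!95|98|NT|20)Windows", "3.1Windows")
neg_behind_miss = re.search(r"(?<!95|98|NT|20)Windows", "20Windows")

# Alternatives of different widths are rejected when the pattern is compiled.
try:
    re.compile(r"(?<!95|980|NT|20)Windows")
    mixed_width_ok = True
except re.error:
    mixed_width_ok = False

print(ahead_end, bool(neg_ahead), neg_ahead_miss, bool(neg_behind), mixed_width_ok)
```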
x|y
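Matches x or y. For example, "z|food" matches "z" or "food" (take care here: | separates whole alternatives, not single characters). "[zf]ood" matches "zood" or "food".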
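The caution about "z|food" is easier to see in a short sketch, assuming Python's re module. The third pattern uses a non-capturing group as one possible way to limit the alternation to a single character.

```python
import re

text = "zood food"

whole_alternatives = re.findall(r"z|food", text)   # either "z" or "food"
char_class = re.findall(r"[zf]ood", text)          # "z" or "f", then "ood"
grouped = re.findall(r"(?:z|f)ood", text)          # same result via a group

print(whole_alternatives)  # ['z', 'food']
print(char_class)          # ['zood', 'food']
print(grouped)             # ['zood', 'food']
```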
[xyz]
Character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain".
[^xyz]
Negated character set. Matches any character that is not enclosed. For example, "[^abc]" matches "p", "l", "i", and "n" in "plain".
[a-z]
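Character range. Matches any character in the specified range. For example, "[a-z]" matches any lowercase letter from "a" to "z".
Note: a hyphen denotes a range only when it is inside a character class and sits between two characters; at the start of the class it stands for a literal hyphen.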
[^a-z]
Negated character range. Matches any character outside the specified range. For example, "[^a-z]" matches any character that is not in the range "a" to "z".
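A minimal sketch of the character set and range entries above, assuming Python's re module. The last pattern shows the hyphen rule: placed first in the class, the hyphen is a literal character.

```python
import re

in_set = re.findall(r"[abc]", "plain")        # characters listed in the set
not_in_set = re.findall(r"[^abc]", "plain")   # characters not in the set
in_range = re.findall(r"[a-z]", "A1b2c")      # lowercase letters only
not_in_range = re.findall(r"[^a-z]", "A1b2c") # everything else
leading_hyphen = re.findall(r"[-az]", "a-z m")  # "-", "a", or "z"; no range

print(in_set, not_in_set)
print(in_range, not_in_range, leading_hyphen)
```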
\b
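Matches a word boundary, that is, the position between a word and a space. (A regular expression "matches" in two senses: it can match characters or match positions, and \b matches a position.) For example, "er\b" matches the "er" in "never" but not the "er" in "verb".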
\B
Matches a non-word boundary. "er\B" matches the "er" in "verb" but not the "er" in "never".
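Because \b and \B match positions rather than characters, the spans of the matches show the difference most clearly. A small sketch, assuming Python's re module:

```python
import re

text = "never verb"

# "er" followed by a word boundary: the one at the end of "never".
at_boundary = re.search(r"er\b", text).span()

# "er" followed by a non-boundary: the one inside "verb".
inside_word = re.search(r"er\B", text).span()

print(at_boundary)  # (3, 5)
print(inside_word)  # (7, 9)
```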
\cx
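Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. x must be in the range A-Z or a-z; otherwise c is treated as a literal "c".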
\d
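Matches a digit character. Equivalent to [0-9]. With grep, add the -P option (Perl-compatible regular expressions) to use it.
\D
Matches a non-digit character. Equivalent to [^0-9]. With grep, add the -P option (Perl-compatible regular expressions) to use it.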
\f
Matches a form feed. Equivalent to \x0c and \cL.
\n
Matches a newline. Equivalent to \x0a and \cJ.
\r
Matches a carriage return. Equivalent to \x0d and \cM.
\s
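Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].
\S
Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].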
\t
Matches a tab. Equivalent to \x09 and \cI.
\v
Matches a vertical tab. Equivalent to \x0b and \cK.
\w
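Matches any word character, including the underscore. Similar but not equivalent to "[A-Za-z0-9_]": "word" characters here are drawn from the Unicode character set.
\W
Matches any non-word character. Equivalent to "[^A-Za-z0-9_]" where \w is limited to ASCII.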
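The shorthand classes above can be compared in a short sketch. It assumes Python's re module, where \w is Unicode-aware for str patterns by default and the re.ASCII flag restricts it to [A-Za-z0-9_]; other engines may default differently.

```python
import re

digits = re.findall(r"\d", "a1 b22")
non_digits = re.findall(r"\D", "a1b")
spaces = re.findall(r"\s", "a b\tc\n")
non_spaces = re.findall(r"\S+", "a b\tc\n")

# "na\u00efve" and "caf\u00e9" contain non-ASCII letters (ï and é).
sample = "na\u00efve caf\u00e9_1"
unicode_words = re.findall(r"\w+", sample)          # Unicode word characters
ascii_words = re.findall(r"\w+", sample, re.ASCII)  # only [A-Za-z0-9_]

print(digits, non_digits, spaces, non_spaces)
print(unicode_words, ascii_words)
```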
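\xn
Matches n, where n is a hexadecimal escape value. The hexadecimal escape must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". This allows ASCII codes to be used in regular expressions.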
\num
Matches num, where num is a positive integer: a backreference to a captured match. For example, "(.)\1" matches two consecutive identical characters.
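\n
Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.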
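\nm
Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither condition holds and n and m are both octal digits (0-7), \nm matches the octal escape value nm.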
\nml
If n is an octal digit (0-3) and m and l are both octal digits (0-7), matches the octal escape value nml.
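A minimal sketch of a backreference and an octal escape, assuming Python's re module. Python resolves the ambiguity with its own rule (a three-digit octal number, or one starting with 0, is read as an octal escape), which may not match the RegExp rules described above in every case.

```python
import re

# (.)\1 : any character followed by the same character again.
doubles = [m.group(0) for m in re.finditer(r"(.)\1", "hello aab")]

# \101 is three octal digits, so Python reads it as character 0o101 ("A").
octal_a = re.search(r"\101", "ABC").group(0)

print(doubles)  # ['ll', 'aa']
print(octal_a)  # A
```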
\un
Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
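The hexadecimal (\x) and Unicode (\u) escapes from this table can be sketched as follows, assuming Python's re module, which accepts both forms in str patterns. The sample strings are made up for the example.

```python
import re

hex_a = re.findall(r"\x41", "ABA")                  # \x41 is "A"
copyright_hit = re.search(r"\u00A9", "\u00a9 2024")  # \u00A9 is the copyright sign

# \x takes exactly two hex digits, so \x041 means \x04 followed by "1".
split_escape = re.fullmatch(r"\x041", "\x04" + "1")

print(hex_a)                         # ['A', 'A']
print(copyright_hit is not None)     # True
print(split_escape is not None)      # True
```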
\p{P}
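The lowercase p stands for "property": it denotes a Unicode property and serves as the prefix in Unicode regular expressions. The "P" inside the braces is one of the seven Unicode character properties: punctuation.
The other six properties:
L: letters;
M: marks (these do not usually appear on their own);
Z: separators (such as spaces and line breaks);
S: symbols (such as math symbols and currency symbols);
N: numbers (such as Arabic numerals and Roman numerals);
C: other characters.
*Note: some languages do not support this syntax; JavaScript, for example, accepts it only when the u flag is set.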
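Python's built-in re module is one of the engines without \p{...} support. As a rough stand-in, the sketch below uses the standard unicodedata module, whose general category codes begin with the same seven letters (P, L, M, Z, S, N, C). The helper name is_punctuation is hypothetical, and this approximates \p{P} one character at a time rather than inside a pattern.

```python
import unicodedata

def is_punctuation(ch: str) -> bool:
    """Approximate \\p{P}: the general category starts with "P"."""
    return unicodedata.category(ch).startswith("P")

text = "Hi, there! (ok)"
punctuation = [ch for ch in text if is_punctuation(ch)]

# The first letter of each category code is the major class from the list above.
major_classes = {ch: unicodedata.category(ch)[0] for ch in "a1 +,"}

print(punctuation)    # [',', '!', '(', ')']
print(major_classes)  # {'a': 'L', '1': 'N', ' ': 'Z', '+': 'S', ',': 'P'}
```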
\<
\>
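Match the start (\<) and end (\>) of a word. For example, the regular expression \<the\> matches "the" in the string "for the wise", but not "the" in the string "otherwise". Note: not all software supports these metacharacters.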
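Python's re module is one of the engines without \< and \>; there an escaped "<" or ">" simply matches the literal character. A common substitute, sketched here, is \b on both sides of the word:

```python
import re

# \bthe\b : "the" with a word boundary on each side.
whole_word = re.findall(r"\bthe\b", "for the wise")
inside_word = re.findall(r"\bthe\b", "otherwise")

# In Python, \< and \> are just the literal characters "<" and ">".
literal_brackets = re.findall(r"\<the\>", "<the> the")

print(whole_word)        # ['the']
print(inside_word)       # []
print(literal_brackets)  # ['<the>']
```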
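( )
Defines the expression between ( and ) as a "group" and saves the characters matched by that expression to a temporary area (a single regular expression can save up to 9); they can be referenced with \1 through \9.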
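|
Performs a logical "or" on two match conditions. For example, the regular expression (him|her) matches "it belongs to him" and "it belongs to her", but not "it belongs to them.". Note: not all software supports this metacharacter.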
