Regular expression modifiers and metacharacters | understand at a glance!!! (Part 3)

Table of contents

1. Regular expressions - modifiers (markers)

(1) Overview

(2) g modifier

(3) i modifier

(4) m modifier

(5) s modifier

2. Regular expressions - metacharacters

(1) The following table contains the complete list of metacharacters and their behavior in the context of regular expressions:

(2) Examples

1. Next, we analyze a regular expression that matches an email address, as follows:

2. The text of the following markup is the obtained matching expression:


1. Regular expressions - modifiers (markers)

(1) Overview

Flags, also known as modifiers (or markers), specify additional matching strategies for a regular expression.

Flags are not written inside the pattern itself; they follow the closing delimiter, in this format:

/pattern/flags

The following table lists commonly used modifiers for regular expressions:

| Modifier | Meaning | Description |
| --- | --- | --- |
| i | ignore: case-insensitive | Makes matching case-insensitive: A and a are treated as the same. |
| g | global: match globally | Finds all matches instead of stopping at the first one. |
| m | multiline: match on multiple lines | Makes the boundary characters ^ and $ match the beginning and end of each line, not only the beginning and end of the entire string. |
| s | dot matches newline | By default, the dot . matches any character except the newline character \n. With the s modifier, the . also matches \n. |
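
As a rough illustration of the /pattern/flags format, the sketch below writes the same expression once as a literal and once with the RegExp constructor, then reads the flags back. The variable names are made up for this example and are not part of the original article.

```javascript
// Literal syntax: the flags follow the closing slash.
const literal = /runoob/gi;

// Constructor syntax: the flags are passed as a second string argument.
const constructed = new RegExp("runoob", "gi");

// The flags property reports which modifiers are set.
console.log(literal.flags);     // "gi"
console.log(constructed.flags); // "gi"

// Each modifier also has its own boolean property.
console.log(literal.global, literal.ignoreCase, literal.multiline, literal.dotAll);
// true true false false
```

Both forms describe the same pattern; the constructor form is mainly useful when the pattern or the flags are only known at run time.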

 (2) g modifier

The g modifier finds all occurrences of a string:

Example

Find "runoob" in a string:

var str = "Google runoob taobao runoob";
var n1 = str.match(/runoob/);  // find the first match
var n2 = str.match(/runoob/g); // find all matches
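
The g modifier also changes the shape of what match returns, which is easy to overlook. The following sketch reuses the sample string from the example; the variable names and the use of matchAll are additions for illustration.

```javascript
const text = "Google runoob taobao runoob";

// Without g: a single match, with details such as its index.
const first = text.match(/runoob/);
console.log(first[0], first.index); // runoob 7

// With g: an array of every matched substring, without index details.
const all = text.match(/runoob/g);
console.log(all); // [ 'runoob', 'runoob' ]

// matchAll (which requires the g flag) yields each match with its index.
const positions = [...text.matchAll(/runoob/g)].map((m) => m.index);
console.log(positions); // [ 7, 21 ]
```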

 (3) i modifier

The i modifier makes matching case-insensitive, for example:

Example

Find "runoob" in a string, with and without regard to case:

var str = "Google runoob taobao RUNoob";
var n1 = str.match(/runoob/g);  // case-sensitive: finds only "runoob"
var n2 = str.match(/runoob/gi); // case-insensitive: finds "runoob" and "RUNoob"

 (4) m modifier

The m modifier causes ^ and $ to match the beginning and end of each line in a text.

With g alone, ^ and $ still match only at the beginning and end of the whole string, so only the first line can match; add m to match on every line.

The following example string uses \n for line breaks:

Find "runoob" at the beginning of a line:

var str = "runoobgoogle\ntaobao\nrunoobweibo";
var n1 = str.match(/^runoob/g);  // matches once, at the start of the string
var n2 = str.match(/^runoob/gm); // multi-line match: matches at the start of lines 1 and 3
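
The example above only exercises ^. A minimal sketch of the same effect on $ follows, using the same sample string; the pattern \w+$ and the variable names are chosen here for illustration.

```javascript
const lines = "runoobgoogle\ntaobao\nrunoobweibo";

// Without m, $ matches only at the very end of the string.
const endOnly = lines.match(/\w+$/g);
console.log(endOnly); // [ 'runoobweibo' ]

// With m, $ also matches just before each line break.
const eachLineEnd = lines.match(/\w+$/gm);
console.log(eachLineEnd); // [ 'runoobgoogle', 'taobao', 'runoobweibo' ]
```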

 (5) s modifier

By default, the dot . matches any character except the newline character \n. After adding s, the . also matches \n.

An example of the s modifier follows:

Match a newline with the dot:

var str = "google\nrunoob\ntaobao";
var n1 = str.match(/google./);  // without s, the . cannot match \n, so n1 is null
var n2 = str.match(/runoob./s); // with s, the . matches \n
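
To make the difference concrete, the sketch below tests the same string with and without s. The [\s\S] alternative is a common workaround that is not mentioned in the original; it is shown here on the assumption that the s flag may be unavailable in older engines.

```javascript
const text = "google\nrunoob\ntaobao";

// Without s, the dot refuses to match the newline after "google".
const withoutS = /google./.test(text);
console.log(withoutS); // false

// With s, the dot matches the newline as well.
const withS = /google./s.test(text);
console.log(withS); // true

// A character class covering whitespace and non-whitespace matches anything.
const workaround = /google[\s\S]/.test(text);
console.log(workaround); // true
```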

2. Regular expressions - metacharacters

(1) The following table contains the complete list of metacharacters and their behavior in the context of regular expressions:

Character / Description
\

Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character "n", while '\n' matches a newline character. The sequence '\\' matches "\" and '\(' matches "(".

^

Matches the beginning of the input string. If the Multiline property of the RegExp object is set (the m modifier), ^ also matches the position after '\n' or '\r'.

$

Matches the end of the input string. If the Multiline property of the RegExp object is set (the m modifier), $ also matches the position before '\n' or '\r'.

*

Matches the preceding subexpression zero or more times. For example, 'zo*' would match "z" as well as "zoo". * is equivalent to {0,}.

+

Matches the preceding subexpression one or more times. For example, 'zo+' would match "zo" and "zoo", but not "z". + is equivalent to {1,}.

?

Matches the preceding subexpression zero or one time. For example, "do(es)?" would match either "do" or "does". ? is equivalent to {0,1}.

{n}

n is a non-negative integer. Matches exactly n times. For example, 'o{2}' would not match the 'o' in "Bob", but would match both o's in "food".

{n,}

n is a non-negative integer. Matches at least n times. For example, 'o{2,}' would not match 'o' in "Bob", but would match all o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.

{n,m}

Both m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, "o{1,3}" will match the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there can be no spaces between the comma and the two numbers.

?

When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), matching becomes non-greedy. Non-greedy mode matches as little of the search string as possible, while the default greedy mode matches as much as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all four.

.

Matches any single character except line terminators (\n, \r). To match any character including '\n', use a pattern like '(.|\n)', or add the s modifier.

(pattern)

Matches pattern and captures the match. The captured matches can be retrieved from the resulting Matches collection: through the SubMatches collection in VBScript, or through the $0…$9 properties in JScript. To match literal parenthesis characters, use '\(' or '\)'.

(?:pattern)

Matches pattern but does not capture the match; that is, it is a non-capturing match and is not stored for later use. This is useful when combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a shorter expression than 'industry|industries'.

(?=pattern)

Positive lookahead assertion. Matches the search string at any position where a string matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows(?=95|98|NT|2000)' matches "Windows" in "Windows2000", but not "Windows" in "Windows3.1". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.

(?!pattern)

Negative lookahead assertion. Matches the search string at any position where a string that does not match pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows(?!95|98|NT|2000)' matches "Windows" in "Windows3.1", but not "Windows" in "Windows2000". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.

(?<=pattern)

Positive lookbehind assertion. Similar to positive lookahead, but in the opposite direction. For example, '(?<=95|98|NT|2000)Windows' matches "Windows" in "2000Windows", but not "Windows" in "3.1Windows".

(?<!pattern)

Negative lookbehind assertion. Similar to negative lookahead, but in the opposite direction. For example, '(?<!95|98|NT|2000)Windows' matches "Windows" in "3.1Windows", but not "Windows" in "2000Windows".

x|y

Match x or y. For example, 'z|food' would match "z" or "food". '(z|f)ood' matches "zood" or "food".

[xyz]

Character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".

[^xyz]

Negated character set. Matches any character not enclosed. For example, '[^abc]' matches 'p', 'l', 'i', 'n' in "plain".

[a-z]

Character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase alphabetic character in the range 'a' through 'z'.

[^a-z]

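Negated character range. Matches any character not in the specified range. For example, '[^a-z]' matches any character that is not in the range 'a' through 'z'.

\b

Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never", but not the 'er' in "verb".

\B

Matches a non-word boundary. 'er\B' matches the 'er' in "verb", but not the 'er' in "never".

\cx

Matches the control character indicated by x. For example, \cM matches a Control-M, or carriage return, character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character.

\d

Matches a digit character. Equivalent to [0-9].

\D

Matches a non-digit character. Equivalent to [^0-9].

\f

Matches a form feed character. Equivalent to \x0c and \cL.

\n

Matches a newline character. Equivalent to \x0a and \cJ.

\r

Matches a carriage return character. Equivalent to \x0d and \cM.

\s

Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].

\S

Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].

\t

Matches a tab character. Equivalent to \x09 and \cI.

\v

Matches a vertical tab character. Equivalent to \x0b and \cK.

\w

Matches a letter, digit, or underscore. Equivalent to '[A-Za-z0-9_]'.

\W

Matches any character that is not a letter, digit, or underscore. Equivalent to '[^A-Za-z0-9_]'.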

\xn

Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.

\num

Matches num, where num is a positive integer. A backreference to a captured match. For example, '(.)\1' matches two consecutive identical characters.

\n

Identifies an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, then n is a backreference. Otherwise, if n is an octal digit (0-7), then n is an octal escape value.

\nm

Identifies an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, then nm is a backreference. If \nm is preceded by at least n captured subexpressions, then n is a backreference followed by the literal m. If neither condition holds, \nm matches the octal escape value nm, provided both n and m are octal digits (0-7).

\nml

If n is an octal digit (0-3), and both m and l are octal digits (0-7), then the octal escape value nml is matched.

\un

Matches n, where n is a Unicode character represented by four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
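
Several entries in the table can be spot-checked directly. The sketch below is a minimal illustration rather than an exhaustive test: the sample strings come from the table where possible ("balloon" is made up), the variable names are illustrative, and the lookbehind lines assume an engine with ES2018 support.

```javascript
// Greedy vs. non-greedy quantifiers.
const greedy = "oooo".match(/o+/)[0];  // "oooo"
const lazy = "oooo".match(/o+?/)[0];   // "o"

// Capturing vs. non-capturing groups: only (y|ies) stores a submatch.
const captured = "industries".match(/industr(y|ies)/);     // ["industries", "ies"]
const uncaptured = "industries".match(/industr(?:y|ies)/); // ["industries"]

// Lookahead: the version is checked but is not part of the match.
const ahead = "Windows2000".match(/Windows(?=95|98|NT|2000)/)[0]; // "Windows"
const notAhead = /Windows(?!95|98|NT|2000)/.test("Windows3.1");   // true

// Lookbehind: the same idea, looking to the left of the match.
const behind = /(?<=95|98|NT|2000)Windows/.test("2000Windows");    // true
const notBehind = /(?<!95|98|NT|2000)Windows/.test("2000Windows"); // false

// Word boundaries.
const atWordEnd = /er\b/.test("never"); // true
const midWord = /er\b/.test("verb");    // false

// Backreference: \1 repeats whatever group 1 captured.
const doubled = "balloon".match(/(.)\1/g); // ["ll", "oo"]

// Hexadecimal and Unicode escapes.
const hex = /\x41/.test("A");       // true
const unicode = /\u00A9/.test("©"); // true

console.log({ greedy, lazy, ahead, notAhead, behind, notBehind });
console.log({ atWordEnd, midWord, doubled, hex, unicode });
console.log(captured.length, uncaptured.length); // 2 1
```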

 (2) Examples

1. Next, we analyze a regular expression that matches an email address, as follows:

var str = "abcd [email protected] 1234";

var patt1 = /\b[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}\b/g;

document.write(str.match(patt1));
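
The address in the sample string above is obscured, so the sketch below runs the same pattern against a hypothetical address (user@example.com, an assumption made purely for illustration) and annotates each part of the expression. It builds the pattern with the RegExp constructor only so that each piece can carry a comment.

```javascript
// Hypothetical input; user@example.com is a made-up address for illustration.
const sample = "abcd user@example.com 1234";

const emailPattern = new RegExp(
  "\\b" +           // word boundary before the local part
  "[\\w.%+-]+" +    // local part: word characters, ., %, + or -
  "@" +             // literal at sign
  "[\\w.-]+" +      // domain labels: word characters, . or -
  "\\." +           // literal dot before the top-level domain
  "[a-zA-Z]{2,6}" + // top-level domain of 2 to 6 letters
  "\\b",            // word boundary after the address
  "g"               // global: collect every address in the string
);

const found = sample.match(emailPattern);
console.log(found); // [ 'user@example.com' ]
```

The pattern is a simplification: for instance, the {2,6} limit means an address whose top-level domain is longer than six letters would not be matched.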

2. The text of the following markup is the obtained matching expression:

 

 


Origin blog.csdn.net/wuds_158/article/details/131544182