Regular Expressions - syntax, metacharacters

Non-printing characters

\n Matches a newline. Equivalent to \x0a and \cJ.
\r Matches a carriage return. Equivalent to \x0d and \cM.
\s Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
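
The whitespace classes above can be tried out with a short sketch. It assumes Python's `re` module rather than the VBScript/JScript engine the text describes; Python's `\s` also matches some Unicode whitespace, and the sample string is made up for illustration.

```python
import re

# Hypothetical sample containing a tab, a CR+LF pair and a space.
text = "one\ttwo\r\nthree four"

# \s+ splits on any run of whitespace characters.
tokens = re.split(r"\s+", text)

# \S+ finds each run of non-whitespace characters.
words = re.findall(r"\S+", text)

# \x0a denotes the same character as \n, and \x0d the same as \r.
has_newline = re.search(r"\x0a", text) is not None
has_carriage_return = re.search(r"\x0d", text) is not None

print(tokens)
print(words)
print(has_newline, has_carriage_return)
```

Splitting on `\s+` and collecting `\S+` give the same four tokens here, which shows that the two classes are complements of each other.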

Special characters

$ Matches the end of the input string. If the RegExp object's Multiline property is set, $ also matches the position before '\n' or '\r'. To match the $ character itself, use \$.
() Mark the beginning and end of a subexpression. Subexpressions can be captured for later use. To match these characters, use \( and \).

* Matches the preceding subexpression zero or more times. To match the * character, use \*.
+ Matches the preceding subexpression one or more times. To match the + character, use \+.
. Matches any single character except the newline \n. To match ., use \..
[ Marks the start of a bracket expression. To match [, use \[.
? Matches the preceding subexpression zero or one time, or marks a quantifier as non-greedy. To match the ? character, use \?.
\ Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, 'n' matches the character 'n'. '\n' matches a newline. The sequence '\\' matches '\', and '\(' matches '('.
^ Matches the beginning of the input string, unless it is used inside a bracket expression, where it negates the character set. To match the ^ character itself, use \^.
{ Marks the start of a quantifier expression. To match {, use \{.
| Specifies a choice between two alternatives. To match |, use \|.
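
The escaping rule for special characters can be sketched as follows. This assumes Python's `re` module; the sample strings are invented for illustration, and `re.escape` is a Python convenience that the text above does not mention.

```python
import re

# Hypothetical sample text containing several special characters.
price = "Total: $5.00 (approx.)"

# Unescaped, '.' matches any character, so '5.00' also matches "5x00".
dot_is_wildcard = re.search(r"5.00", "5x00") is not None

# Escaped, '\.' matches only a literal dot.
dot_is_literal = re.search(r"5\.00", "5x00") is None

# '\$', '\(' and '\)' match the literal characters.
manual = re.search(r"\$5\.00 \(approx\.\)", price)

# re.escape produces an escaped pattern from a literal string.
automatic = re.search(re.escape("$5.00 (approx.)"), price)

print(dot_is_wildcard, dot_is_literal)
print(manual.group(0))
print(automatic.group(0))
```

Writing patterns as raw strings (`r"..."`) keeps Python from interpreting the backslashes before the regex engine sees them.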

Quantifiers

* Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+ Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
? Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches the "does" in "does" or the "do" in "doxy". ? is equivalent to {0,1}.
{n} n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,} n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m} m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the two numbers.
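
The quantifier examples above can be checked with a minimal sketch, again assuming Python's `re` module (the variable names are illustrative only):

```python
import re

samples = ("z", "zo", "zoo")

# * : zero or more, so 'zo*' matches "z", "zo" and "zoo".
star = [re.fullmatch(r"zo*", s) is not None for s in samples]

# + : one or more, so 'zo+' matches "zo" and "zoo" but not "z".
plus = [re.fullmatch(r"zo+", s) is not None for s in samples]

# ? : zero or one, so 'do(es)?' finds "does" in "does" and "do" in "doxy".
optional = [re.search(r"do(es)?", s).group(0) for s in ("does", "doxy")]

# {n} : exactly n. No match in "Bob", the two o's in "food".
exact_miss = re.search(r"o{2}", "Bob")
exact_hit = re.search(r"o{2}", "food").group(0)

# {n,} : at least n. Takes all the o's in "foooood".
at_least = re.search(r"o{2,}", "foooood").group(0)

# {n,m} : between n and m. Takes the first three o's in "fooooood".
bounded = re.search(r"o{1,3}", "fooooood").group(0)

print(star, plus, optional)
print(exact_miss, exact_hit, at_least, bounded)
```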

Anchors

^ Matches the position at the start of the input string. If the RegExp object's Multiline property is set, ^ also matches the position after '\n' or '\r'.
$ Matches the position at the end of the input string. If the RegExp object's Multiline property is set, $ also matches the position before '\n' or '\r'.
\b Matches a word boundary, that is, the position between a word and a space.
\B Matches a non-word boundary.
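
The effect of the anchors can be sketched as follows, assuming Python's `re` module. Here the `re.MULTILINE` flag plays the role of the Multiline property; note that Python treats only '\n' as a line break for this purpose, which differs from the engine described above.

```python
import re

# Hypothetical two-line input.
text = "first line\nsecond line"

# Without MULTILINE, ^ matches only at the very start of the input.
single = re.findall(r"^\w+", text)

# With MULTILINE, ^ also matches right after each newline.
multi = re.findall(r"^\w+", text, flags=re.MULTILINE)

# With MULTILINE, $ also matches right before each newline.
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)

# \b requires a word boundary after 'er'; \B requires that there is none.
boundary_pos = re.search(r"er\b", "never verb").start()
inside_pos = re.search(r"er\B", "never verb").start()

print(single, multi, line_ends)
print(boundary_pos, inside_pos)
```

The two positions differ because `er\b` finds the 'er' that ends "never", while `er\B` skips it and finds the 'er' inside "verb".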

Metacharacters

 
\
Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, 'n' matches the character "n". '\n' matches a newline. The sequence '\\' matches "\", and '\(' matches "(".
^
Matches the beginning of the input string. If the RegExp object's Multiline property is set, ^ also matches the position after '\n' or '\r'.
$
Matches the end of the input string. If the RegExp object's Multiline property is set, $ also matches the position before '\n' or '\r'.

*
Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+
Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?
Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or "does". ? is equivalent to {0,1}.
{n}
n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,}
n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m}
m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the two numbers.
?
When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of it as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
.
Matches any single character except "\n". To match any character including '\n', use a pattern such as '(.|\n)'.
(pattern)
Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0...$9 properties in JScript. To match parenthesis characters, use '\(' or '\)'.
(?:pattern)
Matches pattern but does not capture the match; that is, it is a non-capturing match that is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more concise expression than 'industry|industries'.
(?=pattern)
Positive lookahead. Matches the search string at any point where a string matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows(?=95|98|NT|2000)' matches the "Windows" in "Windows2000", but not the "Windows" in "Windows3.1". Lookaheads do not consume characters; that is, after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)
Negative lookahead. Matches the search string at any point where a string not matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, 'Windows(?!95|98|NT|2000)' matches the "Windows" in "Windows3.1", but not the "Windows" in "Windows2000". Lookaheads do not consume characters; that is, after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
x|y
Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz]
A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz]
A negated character set. Matches any character that is not enclosed. For example, '[^abc]' matches the 'p', 'l', 'i' and 'n' in "plain".
[a-z]
A character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter from 'a' to 'z'.
[^a-z]
A negated character range. Matches any character that is not in the specified range. For example, '[^a-z]' matches any character that is not in the range 'a' to 'z'.
\b
Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never", but not the 'er' in "verb".
\B
Matches a non-word boundary. 'er\B' matches the 'er' in "verb", but not the 'er' in "never".
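
Several of the constructs listed so far (non-greedy quantifiers, capturing and non-capturing groups, lookaheads, and character sets) can be compared in a short sketch. It assumes Python's `re` module and writes the patterns without the stray spaces, so the lookahead examples use "Windows2000" and "Windows3.1" as inputs.

```python
import re

# Greedy vs. non-greedy: 'o+' takes all the o's, 'o+?' takes one.
greedy = re.search(r"o+", "oooo").group(0)
lazy = re.search(r"o+?", "oooo").group(0)

# Capturing group: the text matched inside the parentheses is stored.
captured = re.search(r"industr(y|ies)", "industries").group(1)

# Non-capturing group: same grouping, but nothing is stored.
stored = re.search(r"industr(?:y|ies)", "industries").groups()

# Positive lookahead: "Windows" only if a listed version follows.
pos_hit = re.search(r"Windows(?=95|98|NT|2000)", "Windows2000")
pos_miss = re.search(r"Windows(?=95|98|NT|2000)", "Windows3.1")

# Negative lookahead: "Windows" only if none of the versions follows.
neg_hit = re.search(r"Windows(?!95|98|NT|2000)", "Windows3.1")
neg_miss = re.search(r"Windows(?!95|98|NT|2000)", "Windows2000")

# Character set and negated character set.
in_set = re.findall(r"[abc]", "plain")
not_in_set = re.findall(r"[^abc]", "plain")

print(greedy, lazy, captured, stored)
print(pos_hit.group(0), pos_hit.end(), pos_miss)
print(neg_hit.group(0), neg_miss)
print(in_set, not_in_set)
```

The positive lookahead match ends at index 7, right after "Windows", which shows that the characters checked by the lookahead are not consumed.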






\f
Matches a form feed. Equivalent to \x0c and \cL.
\n
Matches a newline. Equivalent to \x0a and \cJ.
\r
Matches a carriage return. Equivalent to \x0d and \cM.
\s
Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S
Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t
Matches a tab. Equivalent to \x09 and \cI.
\v
Matches a vertical tab. Equivalent to \x0b and \cK.
\w
Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'.
\W
Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'.
\xn
Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions.
\num
Matches num, where num is a positive integer. A backreference to a captured match. For example, '(.)\1' matches two consecutive identical characters.
\n
Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm
Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither condition holds and n and m are octal digits (0-7), \nm matches the octal escape value nm.
\nml
Matches the octal escape value nml when n is an octal digit (0-3) and m and l are octal digits (0-7).
\un
Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
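
The escape sequences and backreferences above can be exercised with a minimal sketch, assuming Python's `re` module. The `re.ASCII` flag is used so that `\w` behaves like '[A-Za-z0-9_]'; octal escapes and `\cX` control escapes are left out because Python handles them differently from the engine described here. The sample strings are made up.

```python
import re

sample = "snake_case, 42!"

# \w and \W: runs of word and non-word characters.
words = re.findall(r"\w+", sample, flags=re.ASCII)
non_words = re.findall(r"\W+", sample, flags=re.ASCII)

# \xhh: two-digit hexadecimal escape; \x41 is "A".
hex_match = re.fullmatch(r"\x41", "A") is not None

# \x041 is read as \x04 followed by a literal "1".
hex_then_digit = re.fullmatch(r"\x041", "\x04" + "1") is not None

# \x09 denotes the same character as \t.
tab_match = re.fullmatch(r"\x09", "\t") is not None

# \uhhhh: four-digit Unicode escape; \u00A9 is the copyright sign.
copyright_match = re.fullmatch(r"\u00A9", "\u00a9") is not None

# \1 refers back to group 1, so '(.)\1' finds doubled characters.
doubled = [m.group(0) for m in re.finditer(r"(.)\1", "bookkeeper")]

print(words, non_words)
print(hex_match, hex_then_digit, tab_match, copyright_match)
print(doubled)
```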

Operator Precedence

From highest to lowest precedence:
\ Escape character
(), (?:), (?=), [] Parentheses and brackets
*, +, ?, {n}, {n,}, {n,m} Quantifiers
^, $, \anymetacharacter, anycharacter Anchors and sequences (that is, position and order)
| Alternation ("or" operator)
Characters have higher precedence than the alternation operator, so 'm|food' matches "m" or "food". To match "mood" or "food", use parentheses to create a subexpression: '(m|f)ood'.
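
The precedence rules can be checked with a short sketch, assuming Python's `re` module, which applies the same ordering for these operators. The 'ab+' example is an added illustration, not taken from the text above.

```python
import re

candidates = ["m", "food", "mood"]

# 'm|food' is read as 'm' OR 'food', because a sequence of characters
# binds more tightly than alternation.
ungrouped = [s for s in candidates if re.fullmatch(r"m|food", s)]

# Parentheses restrict the alternation to the first letter.
grouped = [s for s in candidates if re.fullmatch(r"(m|f)ood", s)]

# Quantifiers bind more tightly than sequences: 'ab+' means 'a(b+)',
# not '(ab)+'.
single_a = re.fullmatch(r"ab+", "abbb") is not None
repeated_pair = re.fullmatch(r"ab+", "abab") is not None
explicit_group = re.fullmatch(r"(ab)+", "abab") is not None

print(ungrouped, grouped)
print(single_a, repeated_pair, explicit_group)
```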


Origin www.cnblogs.com/liujiuzhou/p/11547309.html