Common tools: regular expressions

1. Matching a single character

. matches any single character. (In most implementations it does not match a newline.)
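As a small illustration, here is a sketch using Python's re module (an assumption: the article does not say which regex engine it targets, and the sample strings are made up). It shows that the dot skips newlines by default and matches them only when re.DOTALL is set.

```python
import re

sample = "cat cut c\nt"

# By default "." matches any single character except a newline.
dot_default = re.findall(r"c.t", sample)
print(dot_default)   # ['cat', 'cut']

# With re.DOTALL, "." matches a newline as well.
dot_all = re.findall(r"c.t", sample, re.DOTALL)
print(dot_all)       # ['cat', 'cut', 'c\nt']
```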

2. Matching a set of characters

[] defines a set of characters; 0-9 and a-z are character ranges; ^ at the start of the set negates it.

abc[^0-9]  // starts with abc; the last character is not a digit
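The set syntax can be tried out with a short Python sketch (Python's re module is assumed here, and the test strings are invented for illustration). fullmatch is used so that the whole string has to fit the pattern.

```python
import re

# "abc" followed by exactly one character that is not a digit.
pattern = re.compile(r"abc[^0-9]")

candidates = ["abcd", "abc1", "abc_", "abc"]
set_hits = [s for s in candidates if pattern.fullmatch(s)]
print(set_hits)      # ['abcd', 'abc_']

# Ranges can be combined inside one set: a-c or 0-2.
range_hits = re.findall(r"[a-c0-2]", "a1d3c")
print(range_hits)    # ['a', '1', 'c']
```

"abc" on its own is rejected because the set must still consume one character.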

3. Metacharacters

3.1 Matching whitespace characters

Metacharacter Explanation
[\b] Backspace
\f Form feed (page break)
\n Newline
\r Carriage return
\t Tab
\v Vertical tab

3.2 Matching specific character classes

Metacharacter Explanation
\d Any digit, equivalent to [0-9]
\D Any non-digit, equivalent to [^0-9]
\w Any letter (upper or lower case), digit, or underscore, equivalent to [a-zA-Z0-9_]
\W The negation of the above, equivalent to [^a-zA-Z0-9_]
\s Any whitespace character, equivalent to [\f\n\t\v\r] (most engines also include the space character)
\S Any non-whitespace character, the negation of \s
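The classes above can be checked with a brief Python sketch (an assumption, since the article is engine-neutral). In Python 3 these classes are Unicode-aware for str patterns, so re.ASCII is passed to get the narrower ASCII equivalents listed in the table. The sample text is made up.

```python
import re

text = "id_42\tok\n"

# \d picks out the digits only.
digits = re.findall(r"\d", text, re.ASCII)
print(digits)        # ['4', '2']

# \W matches everything that is not a letter, digit, or underscore.
non_word = re.findall(r"\W", text, re.ASCII)
print(non_word)      # ['\t', '\n']

# \s+ splits on any run of whitespace (tab, newline, space, ...).
tokens = re.split(r"\s+", text.strip(), flags=re.ASCII)
print(tokens)        # ['id_42', 'ok']
```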

4. Repeated matching

  • + matches one or more occurrences of the preceding character or set
  • * matches zero or more occurrences
  • ? matches zero or one occurrence
    + and * are greedy metacharacters: they match as much text as possible. Appending ? (+?, *?) makes them lazy, so they match as little as possible.
[\w.]+@\w+\.\w+    // matches an email address
a.+c    // greedy: matches all of abcabcabcabc
  • {n} matches exactly n occurrences
  • {m,n} matches between m and n occurrences
  • {m,} matches at least m occurrences
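The difference between greedy and lazy quantifiers is easiest to see side by side. The following is a sketch in Python's re module (an assumed engine; the email address and number string are invented examples).

```python
import re

text = "abcabcabcabc"

# Greedy: .+ grabs as much as possible, so the match runs to the last "c".
greedy = re.search(r"a.+c", text).group()
print(greedy)        # abcabcabcabc

# Lazy: .+? stops at the first "c" that lets the pattern succeed.
lazy = re.search(r"a.+?c", text).group()
print(lazy)          # abc

# The email pattern from the text, applied to a whole string.
is_email = re.fullmatch(r"[\w.]+@\w+\.\w+", "john.doe@example.com") is not None
print(is_email)      # True

# {2,3} matches runs of two or three digits.
counted = re.findall(r"\d{2,3}", "1 22 333 4444")
print(counted)       # ['22', '333', '444']
```

In the last call, "4444" yields "444" and the leftover single "4" is too short to match.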

5. Position matching

  • \b matches a word boundary
  • \B matches a position that is not a word boundary
  • ^ matches the beginning of a string
  • $ matches the end of a string
^\s*\/\/.*$   // matches comment lines that begin with //
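A short Python sketch of the anchors follows (Python's re is an assumption, as are the sample strings). Two details are specific to this setup: re.MULTILINE is needed so that ^ and $ apply to every line rather than only to the whole string, and the slashes need no escaping in Python.

```python
import re

# \b keeps "cat" from matching inside longer words.
whole_words = re.findall(r"\bcat\b", "cat concat cat's category")
print(whole_words)   # ['cat', 'cat']

# \B requires the opposite: "cat" must start in the middle of a word.
inner = re.findall(r"\Bcat", "cat concat")
print(inner)         # ['cat']  (only the one inside "concat")

# Comment lines: optional leading whitespace, then "//".
source = "x = 1\n  // note\ny = 2 // trailing\n// top"
comments = re.findall(r"^\s*//.*$", source, re.MULTILINE)
print(comments)      # ['  // note', '// top']
```

The trailing comment after "y = 2" is not reported because the line does not begin with //.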

6. Subexpressions

  • () defines a subexpression
  • | is alternation: the pattern matches if either the left or the right side matches
(ab){2,}  // matches at least two consecutive occurrences of ab
(19|20)\d{2}  // matches a year starting with 19 or 20
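Both patterns can be exercised with a small Python sketch (an assumed engine, with made-up input). finditer is used instead of findall because, in Python, findall returns only the captured group when the pattern contains one.

```python
import re

# The quantifier applies to the whole group, not just to "b".
repeated = re.search(r"(ab){2,}", "xxabababyy").group()
print(repeated)      # ababab

# Alternation inside a group: a year must start with 19 or 20.
years = [m.group() for m in re.finditer(r"(19|20)\d{2}", "1987, 2024, 1800")]
print(years)         # ['1987', '2024']
```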

7. Backreferences

\1, \2, ... \n refer to the n-th subexpression; each matches the same text that the subexpression matched.

<(h[1-6])>\w*?<\/\1>    // matches an HTML heading element such as <h1>x</h1>
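To see why the backreference matters, the heading pattern can be run against a well-formed and a mismatched tag pair. This is a sketch in Python's re module (an assumption; the slash is left unescaped because Python does not require it).

```python
import re

# \1 must repeat whatever (h[1-6]) captured in the opening tag.
heading = re.compile(r"<(h[1-6])>\w*?</\1>")

matched = heading.search("<h1>x</h1>")
print(matched.group(), matched.group(1))   # <h1>x</h1> h1

# A mismatched closing tag is rejected.
mismatched = heading.search("<h1>x</h2>")
print(mismatched)                          # None
```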

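8. Replacement

// search pattern
(\d{3})(-)(\d{3})(-)(\d{4})
// replacement pattern:
// wrap the result of the first subexpression in ()
// then add a space
// then join the results of the third and fifth subexpressions with -
($1) $3-$5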
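The same replacement can be sketched in Python, with one caveat: the $1 notation is engine-specific, and Python's re.sub writes group references as \1 (or \g<1>) instead. The phone number is an invented example.

```python
import re

search = r"(\d{3})(-)(\d{3})(-)(\d{4})"

# Python spelling of "($1) $3-$5": groups 2 and 4 (the dashes) are dropped.
replacement = r"(\1) \3-\5"

formatted = re.sub(search, replacement, "313-555-1234")
print(formatted)     # (313) 555-1234
```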

8.1 Case Conversion

Metacharacter Explanation
\l Converts the next character to lowercase
\u Converts the next character to uppercase
\L Converts all characters between \L and \E to lowercase
\U Converts all characters between \U and \E to uppercase
\E Ends a \L or \U conversion
(\w)(\w{2})(\w) // search pattern
$1\U$2\E$3  // converts the second and third characters to uppercase
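Support for \U...\E depends on the engine or editor. Python's re module does not interpret these escapes in replacement strings, so the sketch below swaps in a different technique: a replacement function that uppercases the second group. The helper name upper_middle is hypothetical.

```python
import re

def upper_middle(match: re.Match) -> str:
    """Rebuild the match with group 2 uppercased (stand-in for $1\\U$2\\E$3)."""
    return match.group(1) + match.group(2).upper() + match.group(3)

converted = re.sub(r"(\w)(\w{2})(\w)", upper_middle, "test")
print(converted)     # tESt
```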

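9. Lookaround (lookahead and lookbehind)

  • ?= Lookahead: the text matched by the lookahead is not included in the result. It must be used inside a subexpression.
.+(?=:)  // matches a URL scheme, returning http rather than http:
  • ?<= Lookbehind: the text matched by the lookbehind is not included in the result. It must be used inside a subexpression.
(?<=\$)[0-9.]+  // matches an amount of money, returning 23.54 rather than $23.54
  • Negation: replace = with ! to get negative lookahead (?!) and negative lookbehind (?<!).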
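The three lookaround forms can be compared in a short Python sketch (an assumed engine; the URL, the price string, and the third pattern are invented for illustration). In each case the lookaround text is checked but left out of the match.

```python
import re

# Lookahead: match up to, but not including, the colon.
scheme = re.search(r".+(?=:)", "http://example.com").group()
print(scheme)        # http

# Lookbehind: the "$" must precede the number but is not returned.
amount = re.search(r"(?<=\$)[0-9.]+", "Total: $23.54").group()
print(amount)        # 23.54

# Negative lookbehind: whole numbers that are NOT preceded by "$".
plain_numbers = re.findall(r"(?<!\$)\b\d+\b", "$30 and 40 items")
print(plain_numbers) # ['40']
```

Because .+ is greedy, the first pattern would run to the last colon in a URL that contains more than one, for example one with a port number.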

10. Embedded conditions

  • Backreference conditions: first check whether a subexpression matched; only if it did, match the conditional content.
// (\() matches an opening parenthesis
// ?(1) if subexpression 1 matched, continue with the conditional part, here a closing parenthesis
(\()?abc(?(1)\))
  • Lookaround conditions: the subsequent content is matched only if a lookahead or lookbehind succeeds. Rarely used.
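The backreference condition can be tried in Python, whose re module supports the (?(1)...) form; not every engine does. This sketch uses fullmatch so that unbalanced parentheses are rejected rather than partially matched. The test strings are made up.

```python
import re

# If group 1 (an opening parenthesis) matched, a closing one is required.
balanced = re.compile(r"(\()?abc(?(1)\))")

results = {s: balanced.fullmatch(s) is not None
           for s in ["(abc)", "abc", "(abc", "abc)"]}
print(results)
# {'(abc)': True, 'abc': True, '(abc': False, 'abc)': False}
```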

Origin blog.csdn.net/qq_34761012/article/details/104559431