Shell study, day 18: Introduction to regular expressions

1. The use of regular expressions

Regular expressions (often abbreviated as regex, regexp, or RE in code) are a concept from computer science. They are usually used to search for and replace text that matches a certain pattern (rule). A regular expression can check whether a string contains a certain substring, replace the matched substring, or extract from a string the substrings that meet a certain condition.
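The three uses named above (checking, replacing, extracting) can be sketched with standard command-line tools. This is a minimal illustration rather than part of the original notes: it assumes a grep and sed that accept extended regular expressions (-E) and the -o option found in GNU and BSD grep. The sample line and the variable names are made up.

```shell
# Sample input line (hypothetical text, for illustration only).
line="error: disk /dev/sda1 is 91% full"

# 1. Check: does the line contain one or more digits followed by "%"?
if printf '%s\n' "$line" | grep -Eq '[0-9]+%'; then
    has_percent=yes
else
    has_percent=no
fi

# 2. Replace: substitute the first matched substring with "N%".
replaced=$(printf '%s\n' "$line" | sed -E 's/[0-9]+%/N%/')

# 3. Extract: print only the part of the line that matched.
extracted=$(printf '%s\n' "$line" | grep -Eo '[0-9]+%')

printf '%s\n' "$has_percent" "$replaced" "$extracted"
# prints:
#   yes
#   error: disk /dev/sda1 is N% full
#   91%
```

The digit in "sda1" is not touched because it is not followed by "%", so the pattern only matches "91%".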

Regular expressions are constructed in the same way as mathematical expressions: metacharacters and operators combine small expressions into larger ones. The components of a regular expression can be a single character, a set of characters, a range of characters, a choice between characters, or any combination of these. A regular expression is a text pattern composed of ordinary characters (such as the letters a to z) and special characters (called "metacharacters"). The pattern describes one or more strings to match when searching text, and it acts as a template that is compared against the string being searched.
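As a rough sketch of how small expressions combine into a larger one, the pieces can be kept in shell variables and joined into one pattern. The variable names and the sample strings are hypothetical, and grep -E (extended syntax) is assumed.

```shell
# Force ASCII ordering so that [a-z] means exactly the lowercase letters.
export LC_ALL=C

letter='[a-z]'        # a range of characters
digits='[0-9]+'       # a range with a quantifier
choice='(cat|dog)'    # a choice between two alternatives

# Combine the pieces and anchor the result at both ends of the line.
# The pattern becomes: ^(cat|dog)-[a-z][0-9]+$
pattern="^${choice}-${letter}${digits}\$"

matched=$(printf '%s\n' 'cat-a12' 'dog-b7' 'bird-c3' 'cat-12' | grep -E "$pattern")
printf '%s\n' "$matched"
# prints:
#   cat-a12
#   dog-b7
```

"bird-c3" fails the choice and "cat-12" lacks the letter after the hyphen, so neither line is printed.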

2. The composition of Shell regular expressions

(1) Ordinary characters

Ordinary characters include all printable and non-printable characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks, and some other symbols.

Character   Meaning
[abc]       Matches any one character listed in the brackets. For example, [se] matches every s and e in the string "test shell".
[^abc]      Matches any one character not listed in the brackets. For example, [^se] matches every character in the string "test shell" except s and e.
[A-Z]       [A-Z] is a range that matches any uppercase letter; [a-z] matches any lowercase letter.
[\s\S]      Matches any character. \s matches any whitespace character, including newlines; \S matches any non-whitespace character.
\w          Matches letters, digits, and the underscore. Equivalent to [A-Za-z0-9_].
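The bracket expressions in the table can be tried on the "test shell" string with grep -o, which prints each match on its own line. This is a sketch under assumptions: -o and the \w escape are GNU grep extensions (BSD grep also accepts them), and LC_ALL=C is set so that ranges follow ASCII order. The variable names are illustrative.

```shell
export LC_ALL=C
text='test shell'

# [se]: every character that is s or e, joined back into one string.
in_set=$(printf '%s\n' "$text" | grep -o '[se]' | tr -d '\n')

# [^se]: every character that is neither s nor e (the space counts too).
not_in_set=$(printf '%s\n' "$text" | grep -o '[^se]' | tr -d '\n')

# [A-Z]: uppercase letters only.
upper=$(printf '%s\n' 'Test Shell' | grep -o '[A-Z]' | tr -d '\n')

# \w+: runs of letters, digits, and underscores; the hyphen splits a run.
words=$(printf '%s\n' 'foo_1 bar-2' | grep -oE '\w+' | paste -s -d ' ' -)

printf '%s\n' "$in_set" "$not_in_set" "$upper" "$words"
# prints:
#   esse
#   tt hll
#   TS
#   foo_1 bar 2
```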

(2) Non-printing characters

Character   Meaning
\cx         Matches the control character specified by x. For example, \cM matches Ctrl-M, the carriage return character. The value of x must be one of A-Z or a-z.
\f          Matches a form feed character. Equivalent to \x0c and \cL.
\n          Matches a newline character. Equivalent to \x0a and \cJ.
\r          Matches a carriage return character. Equivalent to \x0d and \cM.
\s          Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S          Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t          Matches a tab character. Equivalent to \x09 and \cI.
\v          Matches a vertical tab character. Equivalent to \x0b and \cK.
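Escapes such as \t and \cx come from Perl-style regular expressions; POSIX grep -E does not define them. The sketch below shows some shell-side equivalents under stated assumptions: a literal tab is produced with printf and placed in the pattern, the POSIX class [[:space:]] stands in for \s, and \S is used as a GNU grep extension. Sample data and variable names are made up.

```shell
export LC_ALL=C

# Command substitution strips only trailing newlines, so the tab survives.
tab=$(printf '\t')
sample=$(printf 'name\tvalue')     # "name<TAB>value"

# Match a tab by embedding the literal character in the pattern.
if printf '%s\n' "$sample" | grep -Eq "name${tab}value"; then
    tab_found=yes
else
    tab_found=no
fi

# [[:space:]] is the portable counterpart of \s: collapse whitespace to one space.
fields=$(printf '%s\n' "$sample" | sed -E 's/[[:space:]]+/ /')

# \S+ (GNU extension): runs of non-whitespace characters; keep the first run.
first=$(printf '%s\n' "$sample" | grep -oE '\S+' | sed -n '1p')

# Carriage returns (\r) from Windows line endings can be deleted with tr.
clean=$(printf 'line\r\n' | tr -d '\r')

printf '%s\n' "$tab_found" "$fields" "$first" "$clean"
# prints:
#   yes
#   name value
#   name
#   line
```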

(3) Special characters

Character   Meaning
$           Matches the end position of the input string. To match the $ character itself, use \$.
( )         Mark the beginning and end of a subexpression. To match these characters, use \( and \).
*           Matches the preceding subexpression zero or more times. To match the * character, use \*.
+           Matches the preceding subexpression one or more times. To match the + character, use \+.
.           Matches any single character except the newline character \n. To match the . character, use \.
[           Marks the beginning of a bracket expression. To match [, use \[.
?           Matches the preceding subexpression zero or one time. To match the ? character, use \?.
\           Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, n matches the character "n", \n matches a newline, the sequence \\ matches "\", and \( matches "(".
^           Matches the start position of the input string, unless it is used inside a bracket expression, where it negates the set so that characters in the set are not matched. To match the ^ character itself, use \^.
{           Marks the beginning of a quantifier expression. To match {, use \{.
|           Indicates a choice between two items. To match |, use \|.
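The escaping rules in this table correspond to the extended syntax, so the sketch below uses grep -E. In basic syntax (plain grep) the roles are reversed for some characters: there ( and { are literal, while \( and \{ are the special forms. The sample strings and variable names are illustrative.

```shell
export LC_ALL=C

# An unescaped . matches any character, so both lines match; grep -c counts them.
any_char=$(printf '%s\n' 'abc' 'a.c' | grep -Ec 'a.c')          # 2

# An escaped \. matches only a literal dot.
literal_dot=$(printf '%s\n' 'abc' 'a.c' | grep -Ec 'a\.c')      # 1

# ^ and $ anchor the match to the start and end of the line.
anchored=$(printf '%s\n' 'shell' 'shells' 'a shell' | grep -E '^shell$')

# \$ matches a literal dollar sign; \( and \) match literal parentheses.
price=$(printf '%s\n' 'cost: $5 (approx)' | grep -Eo '\$[0-9]+')
paren=$(printf '%s\n' 'cost: $5 (approx)' | grep -Eo '\([a-z]+\)')

# | chooses between alternatives; ( ) group them into a subexpression.
either=$(printf '%s\n' 'cat' 'dog' 'bird' | grep -Ec '^(cat|dog)$')   # 2

printf '%s\n' "$any_char" "$literal_dot" "$anchored" "$price" "$paren" "$either"
# prints:
#   2
#   1
#   shell
#   $5
#   (approx)
#   2
```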

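(4) Quantifiers

A quantifier specifies how many times a given component of a regular expression must occur for the match to succeed. There are six of them: *, +, ?, {n}, {n,}, and {n,m}.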

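Character   Meaning
*           Matches the preceding subexpression zero or more times. For example, zo* matches "z" as well as "zoo". * is equivalent to {0,}.
+           Matches the preceding subexpression one or more times. For example, zo+ matches "zo" and "zoo" but not "z". + is equivalent to {1,}.
?           Matches the preceding subexpression zero or one time. For example, do(es)? matches "do", the "does" in "does", and the "do" in "doxy". ? is equivalent to {0,1}.
{n}         n is a non-negative integer. Matches exactly n times. For example, o{2} does not match the o in "Bob" but matches the two o's in "food".
{n,}        n is a non-negative integer. Matches at least n times. For example, o{2,} does not match the o in "Bob" but matches all the o's in "foooood". o{1,} is equivalent to o+, and o{0,} is equivalent to o*.
{n,m}       m and n are non-negative integers with n <= m. Matches at least n and at most m times. For example, o{1,3} matches the first three o's in "fooooood". o{0,1} is equivalent to o?. Note that there must be no space between the comma and the two numbers.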
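The examples from the quantifier table can be checked with grep -E. This is a sketch that assumes extended syntax and the -o option of GNU or BSD grep; the variable names are hypothetical. The "|| true" guards only keep a no-match exit status from aborting a script run with set -e.

```shell
export LC_ALL=C

# zo*: z followed by zero or more o's, so all three lines match.
star=$(printf '%s\n' 'z' 'zo' 'zoo' | grep -Ec '^zo*$')        # 3

# zo+: at least one o is required, so "z" no longer matches.
plus=$(printf '%s\n' 'z' 'zo' 'zoo' | grep -Ec '^zo+$')        # 2

# do(es)?: the group "es" is optional.
opt=$(printf '%s\n' 'do' 'does' 'doxy' | grep -Eo '^do(es)?' | paste -s -d ' ' -)

# o{2}: exactly two o's; "Bob" has only one, so the count there is 0.
exact=$(printf '%s\n' 'food' | grep -Eo 'o{2}')
bob=$(printf '%s\n' 'Bob' | grep -Ec 'o{2}' || true)

# o{2,}: two or more o's; the match covers the whole run of o's.
at_least=$(printf '%s\n' 'foooood' | grep -Eo 'o{2,}')

# o{1,3}: at most three o's per match; keep only the first match.
first_three=$(printf '%s\n' 'fooooood' | grep -Eo 'o{1,3}' | sed -n '1p')

printf '%s\n' "$star" "$plus" "$opt" "$exact" "$bob" "$first_three"
# prints:
#   3
#   2
#   do does do
#   oo
#   0
#   ooo
```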


Origin blog.51cto.com/13440764/2575398