Regular expression
Regular expressions are divided into basic regular expressions (BRE) and extended regular expressions (ERE). A regular expression is not a tool or a program but a standard notation for string processing: a single pattern string is used to search for and match every string that follows a given syntactic rule.
Definition of regular expression
- A regular expression is also called a regex or regexp
- Use strings to describe and match a series of strings that meet a certain rule
- The composition of regular expressions:
  - Ordinary characters: uppercase and lowercase letters, digits, punctuation marks, and some other symbols
  - Metacharacters: characters that have a special meaning in regular expressions
- Tools that support regular expressions:
  - vi editor: supports basic regular expressions; does not support extended regular expressions
  - grep: uses basic regular expressions by default; supports extended regular expressions with the -E option
  - egrep: uses extended regular expressions (equivalent to grep -E)
  - sed: uses basic regular expressions by default; supports extended regular expressions with the -E option
  - awk: uses extended regular expressions
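
The basic/extended split is a matter of syntax rather than of separate programs. The sketch below assumes a GNU or BSD grep and sed, where the -E option selects extended syntax, and shows the same pattern read both ways. The sample input and variable names are made up for illustration.

```shell
# Hypothetical sample input, used only for illustration.
sample=$(printf 'a+b\naab\nb\n')

# BRE (grep default): "+" is an ordinary character, so only the literal text "a+" matches.
bre=$(printf '%s\n' "$sample" | grep 'a+')

# ERE (grep -E): "+" is a quantifier, so any line with one or more "a" matches.
ere=$(printf '%s\n' "$sample" | grep -E 'a+')

# sed -E uses the same extended syntax in its substitution pattern.
sed_out=$(printf 'aab\n' | sed -E 's/a+/X/')

printf 'BRE matches:\n%s\n' "$bre"
printf 'ERE matches:\n%s\n' "$ere"
printf 'sed -E result: %s\n' "$sed_out"
```

With the default syntax only the line `a+b` is selected; with -E both `a+b` and `aab` are.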
Basic regular expression
- Basic regular expressions are the most commonly used part of regular expression syntax
- In addition to ordinary characters, the common metacharacters are:
Metacharacter | Effect |
---|---|
\ | Escape character; removes the special meaning of the character that follows; for example: \. matches a literal dot and \$ matches a literal dollar sign |
^ | Matches the start of a line; for example: ^word matches lines starting with word |
$ | Matches the end of a line; for example: word$ matches lines ending with word |
. | Matches any single character except \n (newline) |
* | Matches the preceding sub-expression 0 or more times |
[list] | Matches one character in the list; for example: [0-9] matches any digit |
[^list] | Matches one character not in the list; for example: [^0-9] matches any non-digit character |
\{n\} | Matches the preceding sub-expression exactly n times; for example: [0-9]\{2\} matches two digits |
\{n,\} | Matches the preceding sub-expression at least n times; for example: [0-9]\{2,\} matches two or more digits |
\{n,m\} | Matches the preceding sub-expression n to m times; for example: [a-z]\{2,3\} matches two to three lowercase letters |
[] | Character set; matches any one of the characters it contains; for example: [abc] matches the "a" in "plain" |
[n1-n2] | Character range; matches any one character within the specified range; for example: [a-z] matches any lowercase letter |
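
A few of the metacharacters in the table can be tried on a small, made-up input. This is a minimal sketch with grep in its default (basic) mode; the sample lines and variable names are illustrative only.

```shell
# Hypothetical sample lines.
lines=$(printf 'word up\nmy word\nid 7\nid 42\nid 123\n')

# ^ anchors the match at the start of the line.
starts=$(printf '%s\n' "$lines" | grep '^word')

# $ anchors the match at the end of the line.
ends=$(printf '%s\n' "$lines" | grep 'word$')

# \{2\} requires exactly two digits between the space and the end of the line.
two=$(printf '%s\n' "$lines" | grep ' [0-9]\{2\}$')

# \{2,\} requires two or more consecutive digits anywhere in the line.
two_plus=$(printf '%s\n' "$lines" | grep '[0-9]\{2,\}')

printf 'starts with word: %s\n' "$starts"
printf 'ends with word:   %s\n' "$ends"
printf 'exactly 2 digits: %s\n' "$two"
printf '2 or more digits:\n%s\n' "$two_plus"
```

The space and the `$` around `[0-9]\{2\}` matter: without them, the pattern would also match the first two digits of `123`.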
The following examples use the grep tool on the /etc/passwd file:
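grep root /etc/passwd : select lines that contain root
grep ^root /etc/passwd : select lines that start with root
grep bash$ /etc/passwd : select lines that end with bash
grep -v root /etc/passwd : select lines that do not contain root
grep 'r..d' /etc/passwd : select lines in which r and d are separated by exactly two characters
grep '[^s]bin' /etc/passwd : select lines in which bin is preceded by a character other than s
grep '^$' /etc/passwd : select empty lines; the file has none, so there is no output
grep 't[es]' /etc/passwd : select lines that contain the string te or ts
grep '0\{1,\}' /etc/passwd : select lines in which the digit 0 appears one or more times in a row
grep -e "ntp" -e "root" /etc/passwd : use the -e option to search for several patterns at once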
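
Because the contents of /etc/passwd differ between systems, the same commands are easier to check against a fixed file. The sketch below writes four made-up passwd-style lines to a temporary file (the accounts and the `passwd_sample` name are assumptions for illustration) and uses grep -c to count the matching lines.

```shell
# Create a temporary file with hypothetical passwd-style entries.
passwd_sample=$(mktemp)
cat > "$passwd_sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash
EOF

root_first=$(grep -c '^root' "$passwd_sample")          # lines starting with root
bash_last=$(grep -c 'bash$' "$passwd_sample")           # lines ending with bash
no_root=$(grep -c -v 'root' "$passwd_sample")           # lines without root
not_sbin=$(grep -c '[^s]bin' "$passwd_sample")          # bin preceded by a non-s character
either=$(grep -c -e 'ntp' -e 'root' "$passwd_sample")   # lines containing ntp or root
# grep exits with status 1 when nothing matches, hence the "|| true".
blank=$(grep -c '^$' "$passwd_sample" || true)          # empty lines

printf '%s %s %s %s %s %s\n' \
  "$root_first" "$bash_last" "$no_root" "$not_sbin" "$either" "$blank"

rm -f "$passwd_sample"
```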
"*" means any character in the wildcard. In regular expressions, it means matching the preceding sub-expression 0 or more times
grep 0* /etc/passwd :这里的0*会匹配所有内容(若是有空白行的文件,甚至包括空白行)
grep 00* /etc/passwd :这里的00* 匹配至少包含一个 0 的行(第一个 0 必须出现,第二个0可以出现0次或者多次)
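
The difference between the two patterns shows up clearly when the matching lines are counted. A minimal sketch on made-up input that includes one empty line:

```shell
# Hypothetical input: four lines, the third of which is empty.
input='uid 0\nuid 100\n\nuid 7\n'

# '0*' can match zero characters, so every line matches, including the empty one.
star_only=$(printf "$input" | grep -c '0*')

# '00*' needs one literal 0 followed by zero or more further 0s.
one_or_more=$(printf "$input" | grep -c '00*')

# The interval form '0\{1,\}' expresses the same "one or more" requirement.
interval=$(printf "$input" | grep -c '0\{1,\}')

printf "lines matching 0*: %s\n" "$star_only"
printf "lines matching 00*: %s\n" "$one_or_more"
printf "lines matching 0\\{1,\\}: %s\n" "$interval"
```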
Extended regular expression
- Extended regular expressions build on basic regular expressions by adding further metacharacters
- Common metacharacters
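Metacharacter | Effect |
---|---|
+ | Matches the preceding character or sub-expression one or more times |
? | Matches the preceding character or sub-expression zero or one time |
Pipe symbol | Alternation (or); matches either the expression on its left or the one on its right |
() | Groups several characters into a single sub-expression |
()+ | Matches one or more repetitions of the group |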
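
Each row of the table can be checked against a short word list. The sketch below uses grep -E, which accepts the same extended syntax as egrep, and made-up sample words. The patterns are anchored with ^ and $ so that each one has to match the whole line.

```shell
# Hypothetical sample words, one per line.
words=$(printf 'rt\nrot\nroot\nabab\nab\nntp\n')

# ? : the "o" is optional.
optional=$(printf '%s\n' "$words" | grep -E '^ro?t$')

# + : at least one "o" is required.
plus=$(printf '%s\n' "$words" | grep -E '^ro+t$')

# | inside () : either alternative is accepted.
alt=$(printf '%s\n' "$words" | grep -E '^(root|ntp)$')

# ()+ : the group "ab" repeated one or more times.
group=$(printf '%s\n' "$words" | grep -E '^(ab)+$')

printf 'ro?t:\n%s\n' "$optional"
printf 'ro+t:\n%s\n' "$plus"
printf '(root|ntp):\n%s\n' "$alt"
printf '(ab)+:\n%s\n' "$group"
```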
The following examples use the egrep tool on the /etc/passwd file:
egrep '0+' /etc/passwd : select lines that contain at least one 0
egrep '(root|ntp)' /etc/passwd : select lines that contain root or ntp
egrep 'ro?t' /etc/passwd : select lines that contain rt or rot
egrep -v '^$|^#' /etc/passwd : filter out empty lines and lines that start with #; the file has neither, so every line is printed
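
Filtering out empty lines and comments is more visible on a file that actually has some. The sketch below applies the same pattern to a made-up configuration snippet (the `config` contents are illustrative only), using grep -E in place of egrep, and then repeats the filter in basic syntax with two -e patterns.

```shell
# Hypothetical configuration text with comments and an empty line.
config=$(printf '# comment\n\nport=22\n# another\nuser=root\n')

# ERE: drop lines that are empty or that start with "#".
ere_kept=$(printf '%s\n' "$config" | grep -E -v '^$|^#')

# BRE equivalent: give each alternative as its own -e pattern.
bre_kept=$(printf '%s\n' "$config" | grep -v -e '^$' -e '^#')

printf '%s\n' "$ere_kept"
```

Both commands keep only the `port=22` and `user=root` lines.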