Shell script regular expressions, one of the Three Musketeers (grep, egrep)

Regular expressions in shell scripts

I. Regular expressions, one of the Three Musketeers: grep

1. Before learning regular expressions, create a throwaway configuration file to use as test input

[root@localhost ~]# vim chen.txt

#version=DEVEL
# System authorization information
auth --enableshadow --passalgo=sha512
# Use CDROM installation media
cdrom
thethethe
THE
THEASDHAS
# Use graphical install
graphical
# Run the Setup Agent on first boot
firstboot --enable
ignoredisk --only-use=sda
wood
wd
wod
woooooooood
124153
3234
342222222
faasd11
2
ZASASDNA
short
shirt

2. Find specific characters

The "-n" option displays line numbers, and "-i" makes the match case-insensitive. The "-v" option inverts the selection: to find the lines that do not contain "the", use the "-vn" options of the grep command.
After the command runs, the characters that meet the matching criteria are shown in red (when grep's color output is enabled).

[root@localhost ~]# grep -n 'the' chen.txt
6:thethethe
11:# Run the Setup Agent on first boot
[root@localhost ~]# grep -in 'the' chen.txt
6:thethethe
7:THE
8:THEASDHAS
11:# Run the Setup Agent on first boot
[root@localhost ~]# grep -vn 'the' chen.txt
1:#version=DEVEL
2:# System authorization information
3:auth --enableshadow --passalgo=sha512
4:# Use CDROM installation media
5:cdrom
7:THE
8:THEASDHAS
9:# Use graphical install
10:graphical
12:firstboot --enable
13:ignoredisk --only-use=sda
14:wood
15:wd
16:wod
17:woooooooood
18:124153
19:3234
20:342222222
21:faasd11
22:2
23:ZASASDNA
24:short
25:shirt
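
As a quick check of the three options, here is a minimal sketch that can be run without chen.txt. The four-line sample and the variable names are made up for illustration.

```shell
# Hypothetical four-line sample, not the article's chen.txt.
sample=$(printf '%s\n' 'thethethe' 'THE' '# Run the Setup Agent on first boot' 'wood')

# -n prefixes each matching line with its line number.
plain=$(printf '%s\n' "$sample" | grep -n 'the')
# -i also ignores case, so "THE" matches as well.
nocase=$(printf '%s\n' "$sample" | grep -in 'the')
# -v inverts the selection: only lines WITHOUT "the" are printed.
inverted=$(printf '%s\n' "$sample" | grep -vn 'the')

printf 'plain:\n%s\n\nnocase:\n%s\n\ninverted:\n%s\n' "$plain" "$nocase" "$inverted"
```

With "-v" the line numbers still refer to positions in the input, which is why the inverted result keeps the numbers 2 and 4.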

3. Use brackets "[]" to find a set of characters
To find both "shirt" and "short", note that the two strings share "sh" and "rt". Run the following command to find both strings at once. No matter how many characters are listed inside "[]", the set stands for exactly one character, so "[io]" matches "i" or "o".

[root@localhost ~]# grep -n 'sh[io]rt' chen.txt  # match short or shirt: the set [io] matches i or o
24:short
25:shirt
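
The point that a set stands for exactly one character can be checked with a small sketch. The word list below is hypothetical: "shart" and "shiort" are added only to show what does not match.

```shell
# Hypothetical word list; only the first two should match.
words=$(printf '%s\n' 'short' 'shirt' 'shart' 'shiort')

# [io] matches exactly ONE character, either i or o.
# "shart" fails because a is not in the set; "shiort" fails
# because two characters sit between "sh" and "rt".
matched=$(printf '%s\n' "$words" | grep -n 'sh[io]rt')

printf '%s\n' "$matched"
```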

To find lines that contain the repeated character "oo", just run the following command.

[root@localhost ~]# grep -n 'oo' chen.txt 
11:# Run the Setup Agent on first boot
12:firstboot --enable
14:wood
17:woooooooood

To find "oo" that is not preceded by "w", use the negated character set "[^]". For example, the command "grep -n '[^w]oo' test.txt" finds the strings in test.txt where "oo" is not preceded by "w".

[root@localhost ~]# grep -n '[^w]oo' chen.txt  # match oo that is not preceded by w
11:# Run the Setup Agent on first boot
12:firstboot --enable
17:woooooooood

The results still include "woooooooood", which contains "w", and this is consistent with the matching rule. The highlighted characters in the output show why: in a string such as "woood" the highlighted part is "ooo", because the "o" in front of the last "oo" is a character other than "w" and therefore satisfies the rule. "woooooooood" matches for the same reason.
To require that "oo" is not preceded by a lowercase letter, use the command "grep -n '[^a-z]oo' test.txt", where "a-z" represents the lowercase letters; uppercase letters are represented by "A-Z".

[root@localhost ~]# grep -n '[^a-z]oo' chen.txt 
19:Foofddd
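
To see exactly which characters a negated set consumes, the sketch below prints only the matched part of each line. It assumes a grep that supports the "-o" option (GNU and BSD grep do), and the three sample words are made up for illustration.

```shell
# Hypothetical sample words. LC_ALL=C keeps [a-z] limited to ASCII lowercase.
words=$(printf '%s\n' 'wood' 'woood' 'Foofddd')

# -o prints only the matched text instead of the whole line.
# "wood" has no match: the only character before "oo" is w.
# "woood" matches as "ooo": the first o is the non-w character.
not_after_w=$(printf '%s\n' "$words" | LC_ALL=C grep -o '[^w]oo')

# With [^a-z] the character before "oo" must not be lowercase,
# so only the "Foo" in "Foofddd" is left.
not_after_lower=$(printf '%s\n' "$words" | LC_ALL=C grep -o '[^a-z]oo')

printf 'not after w:\n%s\n\nnot after lowercase:\n%s\n' "$not_after_w" "$not_after_lower"
```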

Lines that contain digits can be found with the "grep -n '[0-9]' test.txt" command.

[root@localhost ~]# grep -n '[0-9]' chen.txt
3:auth --enableshadow --passalgo=sha512
20:124153
21:3234
22:342222222
23:faasd11
24:2

Find the start-of-line "^" and end-of-line "$" characters

[root@localhost ~]# grep -n '^the' chen.txt
6:thethethe

Lines that begin with a lowercase letter can be filtered with the "^[a-z]" rule.

[root@localhost ~]# grep -n '^[a-z]' chen.txt
3:auth --enableshadow --passalgo=sha512
5:cdrom
6:thethethe
10:graphical
12:firstboot --enable
13:ignoredisk --only-use=sda
14:wood
15:wd
16:wod
17:woooooooood
18:dfsjdjoooooof
23:faasd11
26:short
27:shirt

To query lines that begin with an uppercase letter:

[root@localhost ~]# grep -n '^[A-Z]' chen.txt
7:THE
8:THEASDHAS
19:Foofddd
25:ZASASDNA

To query lines that do not begin with a letter, use the "^[^a-zA-Z]" rule.

[root@localhost ~]# grep -n '^[^a-zA-Z]' chen.txt
1:#version=DEVEL
2:# System authorization information
4:# Use CDROM installation media
9:# Use graphical install
11:# Run the Setup Agent on first boot
20:124153
21:3234
22:342222222
24:2

The "^" symbol means different things inside and outside the character set "[]": inside "[]" it negates the selection, and outside "[]" it anchors the match to the start of the line. Conversely, to find lines that end with a particular character, use the "$" anchor. For example, the following command finds the lines that end with a decimal point (.). Because the decimal point is a metacharacter in regular expressions (covered later), the escape character "\" is needed here to turn the special character into an ordinary one.

[root@localhost ~]# grep -n '\.$' chen.txt
5:cdrom.
6:thethethe.
9:# Use graphical install.
10:graphical.
11:# Run the Setup Agent on first boot.

To query blank lines, run "grep -n '^$' chen.txt".
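
The anchors above can be tried together in one small sketch. The five-line sample (with one empty line) and the variable names are assumptions made for illustration.

```shell
# Hypothetical five-line sample; the fourth line is empty.
text=$(printf '%s\n' 'the start' 'not the start' 'ends with a dot.' '' '#comment')

# ^ outside brackets anchors the match to the start of the line.
starts=$(printf '%s\n' "$text" | grep -n '^the')
# \. is a literal dot and $ anchors it to the end of the line.
dot_end=$(printf '%s\n' "$text" | grep -n '\.$')
# ^$ matches a line with nothing between its start and its end.
blank=$(printf '%s\n' "$text" | grep -n '^$')
# ^ inside brackets negates the set: the first character must not be a letter.
# The empty line does not match, because the set still needs one character.
nonletter=$(printf '%s\n' "$text" | LC_ALL=C grep -n '^[^a-zA-Z]')

printf '%s\n' "$starts" "$dot_end" "$blank" "$nonletter"
```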

Find any single character "." and the repetition character "*"
In regular expressions the decimal point (.) is a metacharacter that stands for any one character. For example, run the following command to search for the string "w??d", that is, four characters in total, beginning with w and ending with d.

[root@localhost ~]# grep -n 'w..d' chen.txt
14:wood

In the results, the string "wood" matches the rule "w..d". To query oo, ooo, ooooo and so on, use the asterisk (*) metacharacter. Note that "*" repeats the preceding single character zero or more times. "o*" means zero or more "o" characters (zero meaning the empty string); because the empty string is allowed, running "grep -n 'o*' test.txt" prints every line of the text. With "oo*", the first o must exist and the second part is zero or more o, so lines containing o, oo, ooo and so on all match. Similarly, to query strings that contain two or more o characters, run "grep -n 'ooo*' test.txt".

[root@localhost ~]# grep -n 'ooo*' chen.txt
11:# Run the Setup Agent on first boot.
12:firstboot --enable
14:wood
17:woooooooood
18:dfsjdjoooooof
19:Foofddd

To query strings that begin with w, end with d, and contain at least one o in between, use the following command.

[root@localhost ~]# grep -n 'woo*d' chen.txt
14:wood
16:wod
17:woooooooood

To query strings that begin with w and end with d, with any characters (or none) in between:

[root@localhost ~]# grep -n 'w.*d' chen.txt
14:wood
15:wd
16:wod
17:woooooooood

To query lines that contain any number:

[root@localhost ~]# grep -n '[0-9][0-9]*' chen.txt
3:auth --enableshadow --passalgo=sha512
20:124153
21:3234
22:342222222
23:faasd11
24:2
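
The "zero or more" behaviour of "*" is easy to get wrong, so here is a small sketch on a made-up word list. The list and the variable names are hypothetical.

```shell
# Hypothetical word list; "xyz" contains no o at all.
words=$(printf '%s\n' 'wd' 'wod' 'wood' 'woooooooood' 'xyz')

# o* also matches the empty string, so every line counts as a match.
all_lines=$(printf '%s\n' "$words" | grep -c 'o*')
# woo*d: one mandatory o, then zero or more o, so "wd" is excluded.
one_or_more=$(printf '%s\n' "$words" | grep -n 'woo*d')
# w.*d: any characters (or none) between w and d, so "wd" is included.
anything=$(printf '%s\n' "$words" | grep -n 'w.*d')
# ooo*: two mandatory o, then zero or more o.
two_or_more=$(printf '%s\n' "$words" | grep -n 'ooo*')

printf '%s\n' "$all_lines" "$one_or_more" "$anything" "$two_or_more"
```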

Find a range of consecutive characters with "{}"
The "." and "*" metacharacters can set a repetition count from zero to unlimited, but how can the repetition be restricted to a range? For example, to find three to five consecutive o characters, use the range-limiting expression "{}" of basic regular expressions. In basic regular expressions unescaped braces are ordinary characters, so each brace must be preceded by the escape character "\", written "\{\}", for the pair to act as a repetition operator.

To query lines that contain two consecutive o characters (lines with longer runs of o also match):

[root@localhost ~]# grep -n 'o\{2\}' chen.txt
11:# Run the Setup Agent on first boot.
12:firstboot --enable
14:wood
17:woooooooood
18:dfsjdjoooooof
19:Foofddd

To query strings that begin with w, end with d, and contain 2 to 5 o characters in between:

[root@localhost ~]# grep -n 'wo\{2,5\}d' chen.txt
14:wood

To query strings that begin with w, end with d, and contain two or more o characters in between:

[root@localhost ~]# grep -n 'wo\{2,\}d' chen.txt
14:wood
17:woooooooood
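
The three interval forms can be compared side by side in a short sketch. The word list is hypothetical: "woooood" has five o characters and "woooooooood" has more than five.

```shell
# Hypothetical word list for comparing the interval forms.
words=$(printf '%s\n' 'wod' 'wood' 'woooood' 'woooooooood')

# o\{2\}: two consecutive o anywhere in the line (longer runs contain two).
two=$(printf '%s\n' "$words" | grep -n 'o\{2\}')
# wo\{2,5\}d: between two and five o, bounded by w and d.
two_to_five=$(printf '%s\n' "$words" | grep -n 'wo\{2,5\}d')
# wo\{2,\}d: two or more o, with no upper limit.
two_or_more=$(printf '%s\n' "$words" | grep -n 'wo\{2,\}d')

printf 'two:\n%s\n\ntwo to five:\n%s\n\ntwo or more:\n%s\n' "$two" "$two_to_five" "$two_or_more"
```

The upper bound in "\{2,5\}" only has an effect because the surrounding "w" and "d" pin down both ends of the run; without them, a long run of o would still contain a shorter matching run.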

II. Extended regular expressions

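To simplify commands, you can use extended regular expressions, which cover a wider range. For example, to list the lines of a file other than blank lines and lines beginning with "#" using basic regular expressions (commonly done to view the effective settings of a configuration file), run "grep -v '^$' test.txt | grep -v '^#'". This requires a pipe to search twice. With extended regular expressions, it can be simplified to "egrep -v '^$|^#' test.txt", where the pipe symbol inside the single quotes means "or".
In addition, the grep command supports only basic regular expressions by default; to use extended regular expressions, use the egrep command (equivalent to "grep -E") or awk. The awk command is covered in a later section, so here we use egrep directly. egrep is used in much the same way as grep. The egrep command searches files for a pattern: it can search for any string or symbol in one or more files, and the pattern can be a single character, a string, a word, or a sentence.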
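
The two ways of filtering out blank lines and comment lines can be compared on a made-up configuration snippet. This sketch uses "grep -E" in place of egrep, and the sample content and variable names are assumptions for illustration.

```shell
# Hypothetical configuration snippet with comments and blank lines.
conf=$(printf '%s\n' '# a comment' '' 'firstboot --enable' '' '#another' 'graphical')

# Basic regular expressions: two passes joined by a shell pipe.
two_pass=$(printf '%s\n' "$conf" | grep -v '^$' | grep -v '^#')
# Extended regular expressions: one pass, | inside the quotes means "or".
one_pass=$(printf '%s\n' "$conf" | grep -Ev '^$|^#')

printf 'two passes:\n%s\n\none pass:\n%s\n' "$two_pass" "$one_pass"
```

Both variables should hold the same two lines, which shows that the alternation does in one command what the pipeline does in two.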
The common extended regular expression metacharacters include the following:

"+" example: "+" repeats the preceding character one or more times. Running "egrep -n 'wo+d' test.txt" finds strings such as "wood", "woood", and "woooooood".

[root@localhost ~]# egrep -n 'wo+d' chen.txt
14:wood
16:wod
17:woooooooood

"?" example: "?" makes the preceding character optional (zero or one occurrence). Running "egrep -n 'bes?t' test.txt" finds the two strings "bet" and "best".

[root@localhost ~]# egrep -n 'bes?t' chen.txt
11:best
12:bet

"|" example: running "egrep -n 'of|is|on' test.txt" finds the strings "of" or "is" or "on".

[root@localhost ~]# egrep -n 'of|is|on' chen.txt
1:#version=DEVEL
2:# System authorization information
4:# Use CDROM installation media
13:# Run the Setup Agent on first boot.
15:ignoredisk --only-use=sda
20:dfsjdjoooooof
21:Foofddd

"()" example: "egrep -n 't(a|e)st' test.txt". The words "tast" and "test" share the "t" and the "st", so the differing "a" and "e" are placed inside "()" and separated by "|"; this finds the string "tast" or "test".

[root@localhost ~]# egrep -n 't(a|e)st' chen.txt
12:test
13:tast

"()+" example: "egrep -n 'A(xyz)+C' test.txt". This command finds strings that begin with "A", end with "C", and have one or more "xyz" in the middle.

[root@localhost ~]# egrep -n 'A(xyz)+C' chen.txt
14:AxyzxyzxyzC
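
The extended metacharacters above can be tried on one made-up word list. This sketch uses "grep -E" in place of egrep; the words "bert", "tost", and "AC" are added only as hypothetical non-matches.

```shell
# Hypothetical word list, one word per line (12 lines).
words=$(printf '%s\n' 'wd' 'wod' 'wood' 'bet' 'best' 'bert' \
                      'test' 'tast' 'tost' 'AC' 'AxyzC' 'AxyzxyzC')

# +  : one or more of the preceding character, so "wd" is excluded.
plus=$(printf '%s\n' "$words" | grep -En 'wo+d')
# ?  : zero or one of the preceding character.
optional=$(printf '%s\n' "$words" | grep -En 'bes?t')
# () with | : a or e between "t" and "st".
either=$(printf '%s\n' "$words" | grep -En 't(a|e)st')
# ()+ : the whole group "xyz" repeated one or more times, so "AC" is excluded.
group=$(printf '%s\n' "$words" | grep -En 'A(xyz)+C')

printf '%s\n' "$plus" "$optional" "$either" "$group"
```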

Origin blog.51cto.com/14449524/2441674