The relationship between wildcards and regular expressions

1 Overview

Wildcards and regular expressions are easily confused. This article compares the two and explains how they differ.

Wildcards are used to match file names; the shell uses them when it performs pathname expansion. Because wildcards are interpreted by the shell, commands such as find, ls, cp, and mv can be given wildcards to select files by name.

Regular expressions are built from several kinds of metacharacters: character matching, repetition counts, position anchors, and grouping. They are mainly used to match strings within text content. Most tools that filter the content of files use regular expressions; grep, sed, awk, vim, less, nginx, varnish, and others support them.

2 Wildcards

2.1 Common wildcards

Wildcards are handled by the shell, not by the command that appears in the statement (in fact, the documentation of the individual commands does not describe these wildcards). They appear only in the arguments of a command, not in the command name and not in operators. When the shell encounters a wildcard in an argument, it treats the argument as a path or file name and searches the disk for possible matches. If matches exist, the shell substitutes them for the pattern (pathname expansion); otherwise the wildcard is passed to the command as an ordinary character and the command processes it.

In short, wildcards are a form of pathname expansion implemented by the shell. After the wildcards have been processed, the shell reassembles the command line and then continues processing the reassembled command until it is executed.
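The expansion step described above can be observed directly. The following is a minimal sketch, assuming bash with its default settings and a scratch directory created with mktemp; the file names and the variable names (demo_dir, matched, unmatched) are made up for illustration.

```shell
# Work in a throwaway directory so the example touches nothing else.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.txt b.txt notes.log

# The shell replaces *.txt with the sorted list of matches before echo runs.
matched=$(echo *.txt)
echo "$matched"

# Nothing matches *.conf, so the pattern is handed to echo unchanged.
unmatched=$(echo *.conf)
echo "$unmatched"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

In both cases echo never sees a wildcard that it has to interpret: it receives either the expanded file names or the literal pattern.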

The shell provides three quoting mechanisms: single quotes, double quotes, and the backslash. They turn metacharacters or wildcards into ordinary characters with no special meaning.
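As a small illustration of the three quoting mechanisms, the sketch below compares an unquoted pattern with its quoted forms. It assumes a scratch directory created with mktemp; the variable names are hypothetical.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.txt b.txt

unquoted=$(echo *.txt)   # expanded by the shell
single=$(echo '*.txt')   # single quotes: the * stays literal
double=$(echo "*.txt")   # double quotes: the * stays literal as well
escaped=$(echo \*.txt)   # backslash: the * stays literal

printf '%s\n' "$unquoted" "$single" "$double" "$escaped"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```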

Note that wildcards look a bit like regular expressions, but they are not the same thing and should not be confused with them. A wildcard can be thought of as a special code understood by the shell.

The common wildcards, the other special characters, and the escape characters are listed in the following figure.

 
Figure 1: Wildcard symbol set

2.1 wildcard examples

2.1.1 Examples of common wildcards


1. List all files or directories in /var whose names begin with l, end with a lowercase letter, and contain at least one digit in between (the commands in this section are run from inside the directory being searched)

ll l*[[:digit:]]*[[:lower:]]

2. List the files or directories in /etc whose names begin with a digit and end with a non-digit

ll [[:digit:]]*[^[:digit:]]

ll [0-9]*[^0-9]

3. List the files or directories in /etc whose names begin with a non-letter, followed by one letter and then any characters of any length

ll [^[:alpha:]][[:alpha:]]*

4. List all files or directories in /etc whose names begin with rc, followed by a digit from 0 to 6, followed by any characters

ls -d rc[0-6]*

5. List all files or directories in /etc whose names end with .d

ls -d *.d

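6. List all files or directories in /etc whose names end with .conf and begin with m, n, r, or p

ls -ld [mnrp]*.conf

With the extra -d option, ls shows a matching directory itself instead of listing its contents.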

7. Show only the hidden files and directories in /root

ls -Ad .*

ls -d .*[[:alnum:]]

8. Show only the non-hidden directories in /etc

ls -F | grep '/$'

ls -l  | grep '^d'
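The patterns above can be tried safely without touching /etc or /var. The sketch below is one way to do that, assuming a scratch directory created with mktemp and a handful of made-up file names. It writes the negated class as [!...], the POSIX spelling of the [^...] form used in the examples above.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch l1a l2B lab 1abc 12 rc3.d rc9 mail.conf x.conf

# Begins with l, ends with a lowercase letter, has a digit in between.
p1=$(echo l*[[:digit:]]*[[:lower:]])

# Begins with a digit and ends with a non-digit.
p2=$(echo [[:digit:]]*[![:digit:]])

# Begins with rc followed by a digit from 0 to 6.
p4=$(echo rc[0-6]*)

# Ends with .conf and begins with m, n, r, or p.
p6=$(echo [mnrp]*.conf)

printf '%s\n' "$p1" "$p2" "$p4" "$p6"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```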


2.1.2 Single and double quotes


Single and double quotes act as delimiters when a string containing spaces is assigned to a variable.

[root@localhost sh]# str="Today is Monday"

[root@localhost sh]# echo $str

Today is Monday

Without single or double quotes, the shell interprets the word after the space as a command.

[root@localhost sh]# str=Today is Monday

bash: is: command not found

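The difference between single and double quotes: single quotes tell the shell to ignore all special characters, whereas double quotes ignore most of them but not these three: $ (dollar sign), \ (backslash), and ` (backquote).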

[root@localhost sh]# testvalue=100

[root@localhost sh]# echo 'The testvalue is $testvalue'

The testvalue is $testvalue

[root@localhost sh]# echo "The testvalue is $testvalue"

The testvalue is 100
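The three characters that stay special inside double quotes can be seen side by side in a short sketch. The variable names single and double are chosen only for this illustration.

```shell
testvalue=100

# Inside single quotes nothing is interpreted.
single='$testvalue \$ `echo hi`'

# Inside double quotes $ expands the variable, \$ yields a literal dollar
# sign, and the backquotes run the command.
double="$testvalue \$ `echo hi`"

printf '%s\n' "$single" "$double"
```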


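2.1.3 Backquotes


A word about the backquote: it is the key in the upper-left corner of the keyboard, shared with the tilde.

In Linux it performs command substitution, which means that the shell can insert the standard output of a command at any position in a command line.

In the example below, the shell runs the date command inside the backquotes and inserts the result into the text displayed by echo.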

[root@localhost sh]# echo The date is `date`

The date is 2011年 03月 14日 星期一 21:15:43 CST
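POSIX shells also accept the $(...) form of command substitution, which does the same job as backquotes and is easier to nest. The sketch below uses basename and dirname instead of date so that the result is predictable; the variable names are hypothetical.

```shell
# Backquote form and $(...) form produce the same result.
old_style=`basename /etc/rc.d/init.d/functions`
new_style=$(basename /etc/rc.d/init.d/functions)
echo "old: $old_style, new: $new_style"

# $(...) can be nested without extra escaping.
nested=$(basename "$(dirname /etc/rc.d/init.d/functions)")
echo "$nested"
```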


3 Regular expressions

3.1 Common regular expressions

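grep (global search regular expression (RE) and print out the line) is a powerful text search tool: it searches text using regular expressions and prints the lines that match. The Unix grep family consists of grep, egrep, and fgrep. egrep and fgrep differ only slightly from grep. egrep is an extension of grep that supports more regular expression metacharacters. fgrep stands for fixed grep or fast grep; it treats every character as a literal, that is, metacharacters in the pattern stand for themselves and lose their special meaning. Linux uses the GNU version of grep, which is more capable: the -G, -E, and -F command-line options select basic regular expressions (the default), the behaviour of egrep, and the behaviour of fgrep respectively.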
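The effect of the three options can be compared by running the same pattern against the same input. This is a minimal sketch assuming GNU grep; the sample lines and variable names are invented for the illustration.

```shell
lines=$(printf 'abc\na.c|d\nabc|d\nxyz\n')
pattern='a.c|d'

# -G (basic, the default): . is a metacharacter, | is an ordinary character.
basic=$(printf '%s\n' "$lines" | grep -c -G "$pattern")

# -E (extended, like egrep): . is a metacharacter and | means alternation.
extended=$(printf '%s\n' "$lines" | grep -c -E "$pattern")

# -F (fixed strings, like fgrep): every character is taken literally.
fixed=$(printf '%s\n' "$lines" | grep -c -F "$pattern")

echo "basic=$basic extended=$extended fixed=$fixed"
```

Counting matching lines with -c keeps the comparison compact: the same pattern selects a different number of lines under each option.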

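grep works by searching one or more files for a pattern. If the pattern contains spaces, it must be quoted; every string after the pattern is treated as a file name. The results are sent to the screen, and the contents of the original files are not affected.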

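grep is useful in shell scripts because it returns an exit status that describes the outcome of the search: 0 if the pattern was found, 1 if it was not found, and 2 if the file to be searched does not exist. These return values can be used to automate text processing.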
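A script can branch on these exit statuses. The following sketch assumes a temporary file created with mktemp; the function name describe and the sample passwd-style line are hypothetical.

```shell
sample=$(mktemp)
printf 'root:x:0:0:root:/root:/bin/bash\n' > "$sample"

# describe PATTERN FILE: report what grep's exit status means.
describe() {
    status=0
    grep -q "$1" "$2" 2>/dev/null || status=$?
    case $status in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "error" ;;
    esac
}

r_found=$(describe '^root:' "$sample")
r_missing=$(describe '^nobody:' "$sample")
r_error=$(describe 'root' "$sample.missing")   # this file does not exist
echo "$r_found / $r_missing / $r_error"

rm -f "$sample"
```

The -q option suppresses normal output, so only the exit status carries the result.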

 

 
 

3.2 Examples


1. Show the UID and default shell of the three users root, sunny, and tom

grep "^root\>\|^sunny\>\|^tom\>" /etc/passwd | cut -d: -f3,7

grep -E "^root\>|^sunny\>|^tom\>" /etc/passwd | cut -d: -f3,7

grep -E "^(root|sunny|tom)\>" /etc/passwd | cut -d : -f3,7

2. Find the lines in /etc/rc.d/init.d/functions that begin with a word (underscores allowed) followed by a pair of parentheses

grep -oE "^[[:alnum:]_]+\(\)" /etc/rc.d/init.d/functions

3. Use egrep to extract the base name of /etc/rc.d/init.d/functions

echo /etc/rc.d/init.d/functions | grep -oE "[^/]+/?$"

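The pattern matches a run of characters other than /, then an optional trailing /, then the end of the line. Only the last component of the path fits this description.

4. Use egrep to extract the directory name of the path above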

echo /etc/rc.d/init.d/functions/ | grep -oE "^/.*/\<"

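This one is less obvious. The pattern starts with /, and the trailing \< is a position anchor for the start of a word, so the match ends at the last / that is followed by the start of a word.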
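The two patterns from examples 3 and 4 can be checked against the standard basename and dirname utilities. This is a sketch assuming GNU grep (the \< anchor is a GNU extension); the variable names are hypothetical.

```shell
path=/etc/rc.d/init.d/functions

# Example 3: last path component, with an optional trailing slash.
base_re=$(echo "$path" | grep -oE '[^/]+/?$')

# Example 4: everything up to the last / that is followed by a word.
dir_re=$(echo "$path/" | grep -oE '^/.*/\<')

echo "regex:     $base_re  $dir_re"
echo "utilities: $(basename "$path")  $(dirname "$path")"
```

The regular expression for the directory keeps the trailing slash, whereas dirname drops it.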

5. From the output of last, count the number of logins by root from each host IP address

last | grep ^root | grep -oE "([0-9]{1,3}\.){3}[0-9]{1,3}" | sort | uniq -c | sort -nr

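6. Use extended regular expressions to match 0-9, 10-99, 100-199, 200-249, and 250-255 respectively

echo {0..255} | grep -oE "\<[0-9]\>" | tr "\n" " "

This extracts the matching numbers directly and then joins them onto one line.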

echo {0..255} | grep -oE "\<[1-9][0-9]\>"

echo {0..255} | grep -oE "\<1[0-9]{2}\>"

For this input the word-end anchor can also be omitted.

echo {0..255} | grep -oE "\<2[0-4][0-9]\>"

echo {0..255} | grep -oE "\<25[0-5]\>"
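One way to confirm that the five patterns together cover 0 to 255 exactly once is to count the matches for each. The sketch below assumes GNU grep and uses seq in place of the bash brace expansion so that each number sits on its own line; the function name count_matches is made up.

```shell
# count_matches PATTERN: how many numbers from 0 to 255 match the pattern.
count_matches() {
    seq 0 255 | grep -cE "$1"
}

c1=$(count_matches '\<[0-9]\>')         # 0-9
c2=$(count_matches '\<[1-9][0-9]\>')    # 10-99
c3=$(count_matches '\<1[0-9]{2}\>')     # 100-199
c4=$(count_matches '\<2[0-4][0-9]\>')   # 200-249
c5=$(count_matches '\<25[0-5]\>')       # 250-255

total=$((c1 + c2 + c3 + c4 + c5))
echo "$c1 $c2 $c3 $c4 $c5 total=$total"
```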

7. Extract IPv4 addresses

ifconfig | grep -oE "\<(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\>"

8. Show all IPv4 addresses in the output of the ifconfig command

ifconfig | grep -oE "\<([0-9]{1,3}\.){3}[0-9]{1,3}"

The following command is precise about the valid range of each octet

ifconfig | grep -oE "\<(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\>"

Note that the dot must be escaped with a backslash here.
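The range-checked octet pattern can be tested without ifconfig by feeding it a few candidate strings. The sketch below is a variation on the commands above, not a copy of them: it uses grep -x to require that the whole line is an address, which is an assumption made for this example, and the sample addresses and variable names are invented.

```shell
# One octet in the range 0-255, without leading zeros.
octet='([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])'

# Three octets each followed by an escaped dot, then a final octet.
pattern="(${octet}\.){3}${octet}"

# -x accepts a line only if the pattern matches all of it.
valid=$(printf '%s\n' 192.168.1.10 256.1.1.1 10.0.0.255 1.2.3 01.2.3.4 |
        grep -xE "$pattern")

echo "$valid"
```

Building the pattern from a single octet variable avoids repeating the long alternation and makes the escaped dot easy to spot.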

9. Take the string "welcome to magedu linux", deduplicate its characters, and sort them so that the most frequent come first

echo "welcome to magedu linux" | grep -o "[[:print:]]" | sort | uniq -c | sort -nr


4 Comparison of the differences

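To be clear, wildcards are used to match file names, that is, to find files by name, whereas regular expressions are used to match the contents of files. When output is passed through a pipe to the familiar grep command, grep is no longer matching file names: it operates on the text it receives.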
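The same characters can mean different things in the two notations. The sketch below contrasts a* as a wildcard with a* as a regular expression, assuming a scratch directory created with mktemp; the file and variable names are hypothetical.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch ab.txt b.txt

# As a wildcard, a* means "names that start with a".
glob_hits=$(echo a*)

# As a regular expression, a* means "zero or more a", so every line matches.
regex_hits=$(ls | grep -c 'a*')

# The regular expression that corresponds to the wildcard a* is ^a.*
anchored_hits=$(ls | grep '^a.*')

echo "glob: $glob_hits, regex lines: $regex_hits, anchored: $anchored_hits"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```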

Differences

Figure 3: Differences

Similarities

 




Origin www.cnblogs.com/fpcbk/p/12058302.html