The sort, uniq, and tr commands and regular expressions in shell scripts

sort command

Sorts the contents of a file line by line; depending on the options, lines are compared as text, as numbers, or as other data types.

Syntax format:

sort [options] file
cat file | sort [options]

Common options:

-f: Ignore case; by default (in the C locale) uppercase letters sort before lowercase ones
-b: Ignore leading blanks on each line
-n: Sort numerically
-r: Reverse the sort order
-u: Output only one line for each group of equal lines, much like piping the result through uniq
-t: Specify the field separator; by default, fields are separated by the transition from non-blank to blank characters (spaces or tabs)
-k: Specify the field to sort on
-o <output file>: Write the sorted result to the specified file
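
The options above can be combined. The following is a minimal sketch, assuming a small set of colon-separated name:score records; the data and the variable names are made up for illustration.

```shell
# Hypothetical sample data: name:score, separated by a colon.
data=$(printf '%s\n' 'carol:72' 'alice:90' 'bob:85' 'alice:90')

# -t sets the separator, -k picks field 2, -n compares numerically, -r reverses.
by_score=$(printf '%s\n' "$data" | sort -t ':' -k 2,2 -n -r)

# -u sorts whole lines and keeps one copy of each.
unique=$(printf '%s\n' "$data" | sort -u)

printf 'by score:\n%s\n' "$by_score"
printf 'unique:\n%s\n' "$unique"
```

Restricting the key with -k 2,2 (rather than -k 2) keeps the comparison to the second field only.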


uniq command

Reports or filters out consecutive repeated lines in a file. Because it only compares adjacent lines, it is often used together with the sort command.

Syntax format:

uniq [options] file
cat file | uniq [options]

Common options:

-c: Collapse consecutive repeated lines into one and prefix each line with the number of times it occurred
-d: Display only lines that are repeated consecutively (one copy per group)
-u: Display only lines that are not repeated consecutively
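
Since uniq only looks at adjacent lines, the input usually has to be sorted first. A small sketch, assuming a hypothetical word list in which no two equal lines are adjacent:

```shell
# Hypothetical input: duplicates exist, but none are adjacent.
words=$(printf '%s\n' apple pear apple fig pear apple)

# Without sorting, uniq finds no consecutive repeats and changes nothing.
unsorted=$(printf '%s\n' "$words" | uniq)

# After sorting, equal lines sit next to each other.
dups=$(printf '%s\n' "$words" | sort | uniq -d)     # lines that occur more than once
singles=$(printf '%s\n' "$words" | sort | uniq -u)  # lines that occur exactly once
counts=$(printf '%s\n' "$words" | sort | uniq -c)   # each line prefixed with its count

printf '%s\n' "$counts"
```

The width of the count column printed by -c varies between implementations, so scripts that parse it are safer splitting on whitespace.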


tr command

Commonly used to replace, squeeze, and delete characters read from standard input

Syntax format:

tr [options] [arguments]

Common options:

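-c: Use the complement of character set 1: characters in set 1 are kept, and all other characters (including the newline \n, unless it is listed in set 1) are replaced with character set 2.
-d: Delete all characters that belong to character set 1
-s: Squeeze each run of a repeated character into a single character; if character set 2 is given, set 1 is first replaced with set 2 and the squeeze is then applied to the characters of set 2
-t: Truncate character set 1 to the length of character set 2 before replacing; when both sets have the same length, the result is the same as without the option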

Parameters:

  • Character set 1: The original character set to be converted or deleted. A conversion also requires "Character set 2" as the target character set; a delete operation does not need "Character set 2".

  • Character set 2: The target character set to convert to.

    echo "xyz" | tr 'a-z' 'A-Z'
    echo "xyz" | tr -t 'a-z' 'A-Z'
    


echo -e "xyz\nxyzhxyz" | tr -c "xy\n" "1"


echo "gooooodhjjl" | tr -d "oj"
echo "gooooodhjjl" | tr -d 'oj'


echo "gooodeel haan" | tr -s 'oea'
echo "gooodeel haan" | tr -s 'oea' "123"


The tr and sort commands can be combined to sort the elements of an array.
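
One way this could look is sketched below. It assumes the values are held in a space-separated string (the name list and its contents are made up); with a bash array, the same pipeline could be fed from "${arr[*]}" instead.

```shell
# Hypothetical values, separated by single spaces.
list='30 4 15 8'

# tr puts each value on its own line, sort -n orders them numerically,
# and the second tr joins the lines back into one space-separated string.
sorted=$(printf '%s\n' "$list" | tr ' ' '\n' | sort -n | tr '\n' ' ')
sorted=${sorted% }   # drop the trailing space left by the final newline

echo "$sorted"
```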


Regular expression

  • Regular expressions are usually used in conditional statements to check whether a string matches a certain format

  • Regular expressions are composed of ordinary characters and metacharacters

  • Ordinary characters include uppercase and lowercase letters, digits, punctuation marks, and some other symbols

  • Metacharacters are characters with a special meaning in regular expressions. They specify how the preceding character (the one just before the metacharacter) may appear in the text being matched.

Basic regular expression

Supported tools: grep, egrep, sed, awk

Common metacharacters in basic regular expressions:

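\ : Escape character; removes the special meaning of a symbol, e.g. \!, \n, \$

^ : Matches the start of the string, e.g. ^a, ^the, ^#, ^[a-z]

$ : Matches the end of the string, e.g. word$; ^$ matches an empty line

. : Matches any single character except \n, e.g. go.d, g..d

* : Matches the preceding subexpression zero or more times, e.g. goo*d, go.*d

[list] : Matches one character from list, e.g. go[ola]d, [abc], [a-z], [a-z0-9]; [0-9] matches any single digit

[^list] : Matches one character that is not in list, e.g. [^0-9], [^A-Z0-9]; [^a-z] matches any single character that is not a lowercase letter

\{n\} : Matches the preceding subexpression exactly n times, e.g. go\{2\}d; '[0-9]\{2\}' matches two digits

\{n,\} : Matches the preceding subexpression at least n times, e.g. go\{2,\}d; '[0-9]\{2,\}' matches two or more digits

\{n,m\} : Matches the preceding subexpression n to m times, e.g. go\{2,3\}d; '[0-9]\{2,3\}' matches two to three digits

Note: with egrep and awk, write {n}, {n,}, and {n,m} without a "\" before the braces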
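
A few of these metacharacters can be tried with plain grep, which uses basic regular expressions by default. The word list below is a made-up sample, and the patterns are anchored with ^ and $ so that each one has to match the whole line.

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' gd god good goood gooood glad)

# o\{2,3\}: two or three o's between g and d (braces escaped in basic regex).
two_three=$(printf '%s\n' "$words" | grep '^go\{2,3\}d$')

# goo*d: one o followed by zero or more further o's.
star=$(printf '%s\n' "$words" | grep '^goo*d$')

# g..d: exactly two arbitrary characters between g and d.
dots=$(printf '%s\n' "$words" | grep '^g..d$')

printf '%s\n' "$two_three" "$star" "$dots"
```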


Extended regular expression

Supported tools: egrep (equivalent to grep -E), awk

Common metacharacters in extended regular expressions:

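+ : Matches the preceding subexpression one or more times, e.g. go+d matches at least one o, such as god, good, goood

? : Matches the preceding subexpression zero or one time, e.g. go?d matches gd or god

() : Treats the string inside the parentheses as a single unit, e.g. g(oo)+d matches oo as a whole one or more times, such as good, gooood

| : Matches either alternative, e.g. g(oo|la)d matches good or glad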
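
The four metacharacters can be checked against a small sample. This sketch uses grep -E, which accepts the same extended syntax as egrep; the word list is hypothetical.

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' gd god good gooood glad)

plus=$(printf '%s\n' "$words" | grep -E '^go+d$')          # at least one o
optional=$(printf '%s\n' "$words" | grep -E '^go?d$')      # zero or one o
group=$(printf '%s\n' "$words" | grep -E '^g(oo)+d$')      # "oo" repeated as a unit
either=$(printf '%s\n' "$words" | grep -E '^g(oo|la)d$')   # "oo" or "la"

printf '%s\n' "$plus" "$optional" "$group" "$either"
```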

For example:

Match mobile phone numbers starting with 173

# Starts with 173, followed by any 8 digits
"^173[0-9]{8}$"

egrep "^173[0-9]{8}$" shou.txt
grep "^173[0-9]\{8\}$" shou.txt 
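
To see which lines pass, the two commands can be run against a throwaway file. The sketch below assumes made-up numbers and a temporary file standing in for shou.txt.

```shell
# Hypothetical sample file: only the first number has 173 plus exactly 8 digits.
tmp=$(mktemp)
printf '%s\n' 17312345678 1731234567 173123456789 18312345678 > "$tmp"

# Extended syntax: braces are written without backslashes.
matched_ere=$(grep -E '^173[0-9]{8}$' "$tmp")

# Basic syntax: the same interval with escaped braces.
matched_bre=$(grep '^173[0-9]\{8\}$' "$tmp")

rm -f "$tmp"
echo "$matched_ere"
```

The ^ and $ anchors are what reject the 10-digit and 12-digit lines; without them, any line containing 173 followed by 8 digits would match.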


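Match e-mail addresses

User name and @ : ^([a-zA-Z0-9_.+-]+)@
Domain and subdomains : ([a-zA-Z0-9_.-]+)
Top-level domain (usually 2 to 5 letters) : \.([a-zA-Z]{2,5})$

Inside a bracket expression, ., + and a trailing - are literal characters, so they are written without a backslash.

egrep '^([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9_.-]+)\.([a-zA-Z]{2,5})$' you.sh
awk '/^([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9_.-]+)\.([a-zA-Z]{2,5})$/{print $0}' you.sh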
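
A quick way to try an e-mail pattern is to run it over a few known-good and known-bad lines. This is a sketch only: the addresses are invented, and the pattern places the hyphen last in each bracket expression so that POSIX tools treat it as a literal character.

```shell
# Hypothetical sample addresses: the first and third are well-formed.
tmp=$(mktemp)
printf '%s\n' \
  'user.name+tag@example.com' \
  'bad@@example.com' \
  'first-last@mail.example.org' \
  'user@example.c' > "$tmp"

# User name, @, domain, then a dot and a 2 to 5 letter top-level domain.
pattern='^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9_.-]+\.[a-zA-Z]{2,5}$'
matched=$(grep -E "$pattern" "$tmp")

rm -f "$tmp"
echo "$matched"
```

A pattern like this only checks the general shape of an address; it does not confirm that the address exists.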



Origin blog.csdn.net/shengmodizu/article/details/114745686