sort, uniq, tr, cut, and regular expressions

sort command

Sorts the contents of files line by line. Lines can be compared as text or according to other data types, such as numbers.

Syntax

sort [options] [file...]

Common options

-f: ignore case when comparing
-b: ignore leading blanks on each line
-n: sort by numeric value
-r: reverse the sort order
-u: output only one line for each group of identical lines, similar to piping the result through uniq
-t: specify the field separator; by default, fields are separated by the transition from a non-blank character to a blank one
-k: specify the sort field (key)
-o <output file>: write the sorted result to the specified file

experiment
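
The options above can be tried on a small file. This is a minimal sketch: the name:score data and the variable names are made up for illustration, and LC_ALL=C is set only so that the text ordering is predictable on any machine.

```shell
# Hypothetical sample data: name:score, colon-separated.
export LC_ALL=C   # fixed locale so the text order is predictable
data=$(mktemp)
printf '%s\n' 'tom:90' 'amy:100' 'bob:9' 'amy:100' > "$data"

# Plain sort compares whole lines as text.
as_text=$(sort "$data")
printf '%s\n' "$as_text"

# -t sets the separator, -k picks field 2, -n compares numbers, -r reverses.
by_score=$(sort -t ':' -k 2 -n -r "$data")
printf '%s\n' "$by_score"

# -u keeps one copy of identical lines; -o writes the result to a file.
sort -u -o "$data.sorted" "$data"
unique=$(cat "$data.sorted")
printf '%s\n' "$unique"

rm -f "$data" "$data.sorted"
```

Without -n the scores would be compared as text, so "9" would sort after "100".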


uniq command

Reports or omits adjacent repeated lines in a file. Because it only compares neighboring lines, it is usually used together with the sort command.

Syntax

uniq [options] [file]

Common options

-c: collapse adjacent repeated lines into one and prefix each line with the number of times it occurred
-d: display only the lines that are repeated, one line per group
-u: display only the lines that are not repeated
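
A short sketch of the three options, assuming a made-up list of fruit names. The input is sorted first because uniq only merges lines that are next to each other.

```shell
export LC_ALL=C
# Hypothetical input with repeated, non-adjacent lines.
list=$(printf '%s\n' apple banana apple cherry banana apple)

# Count occurrences, then order by count, highest first.
counts=$(printf '%s\n' "$list" | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"      # apple 3 times, banana 2, cherry 1 (counts are space-padded)

# Lines that occur more than once.
dups=$(printf '%s\n' "$list" | sort | uniq -d)
printf '%s\n' "$dups"

# Lines that occur exactly once.
singles=$(printf '%s\n' "$list" | sort | uniq -u)
printf '%s\n' "$singles"
```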

tr command

Commonly used to replace, squeeze, and delete characters read from standard input; the result is written to standard output.

Syntax

tr [options] character-set-1 [character-set-2]

Common options

-c: use the complement of character set 1, that is, keep the characters in character set 1 and replace all other characters (including the newline \n) with character set 2
-d: delete all characters belonging to character set 1
-s: squeeze each run of a repeated character into a single character; when character set 2 is given, character set 1 is first replaced by character set 2
-t: truncate character set 1 to the length of character set 2 before replacing (GNU tr); when both sets have the same length, the result is the same as without the option

Character set 1: the original characters to be converted or deleted. A conversion also needs "character set 2" as the target; a delete operation does not.
Character set 2: the target characters used for the conversion.
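
The following sketch runs each option on a short string. The sample strings and variable names are only for illustration, and LC_ALL=C is assumed so that ranges such as a-z behave predictably.

```shell
export LC_ALL=C

# Replace: map lowercase letters to uppercase.
upper=$(echo 'hello world' | tr 'a-z' 'A-Z')
echo "$upper"          # HELLO WORLD

# -d: delete every digit.
no_digits=$(echo 'a1b2c3' | tr -d '0-9')
echo "$no_digits"      # abc

# -s: squeeze runs of the listed characters into one.
squeezed=$(echo 'aaabbbccc' | tr -s 'a-c')
echo "$squeezed"       # abc

# -c: everything that is NOT a digit or a newline becomes '*'.
masked=$(echo 'a1b2c3' | tr -c '0-9\n' '*')
echo "$masked"         # *1*2*3

# -c with -d: delete everything except digits and the newline.
only_digits=$(echo 'a1b2c3' | tr -cd '0-9\n')
echo "$only_digits"    # 123
```

The newline is listed in the -c examples on purpose; otherwise it would be replaced or deleted along with the other characters.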

Delete blank lines
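
One way to do this is to squeeze runs of newlines with tr -s. The sketch below uses made-up text; note that a blank line at the very start of the input would still leave one empty line behind.

```shell
# Hypothetical text with blank lines between the entries.
text=$(printf 'line1\n\n\nline2\n\nline3\n')

# Consecutive newlines collapse into one, which removes the empty lines.
compact=$(printf '%s\n' "$text" | tr -s '\n')
printf '%s\n' "$compact"
```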
Replace the colons ":" in the PATH variable with newline characters "\n"
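
A minimal sketch; sample_path is a made-up value used so that the result is predictable, and the last command does the same with the real PATH.

```shell
# Hypothetical PATH-like value.
sample_path='/usr/local/bin:/usr/bin:/bin'

# Every ':' becomes a newline, so each directory ends up on its own line.
dirs=$(echo "$sample_path" | tr ':' '\n')
printf '%s\n' "$dirs"

# The same with the real variable.
echo "$PATH" | tr ':' '\n'
```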

Delete the "^M" characters found in files created on Windows

Linux ends a line with a single line feed ("\n"), while Windows ends a line with carriage return + line feed ("\r\n"). On Linux, the "\n" alone starts a new line, so the extra carriage return is not treated as part of the line ending and typically appears as the control character "^M" at the end of each line. On Windows, both characters are expected, in that order; if one is missing or the order is wrong, the new line is not started correctly. Deleting the "\r" characters with tr converts a Windows file to Linux line endings.
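
The conversion can be sketched as follows. The file is a temporary one created for the example, and the carriage returns are counted before and after to show the effect.

```shell
# Create a hypothetical file with Windows line endings (\r\n).
win=$(mktemp)
printf 'first\r\nsecond\r\n' > "$win"

# Count the carriage returns before the conversion.
before=$(tr -cd '\r' < "$win" | wc -c)

# Delete every carriage return and write the result to a new file.
tr -d '\r' < "$win" > "$win.unix"

after=$(tr -cd '\r' < "$win.unix" | wc -c)
converted=$(cat "$win.unix")
echo "carriage returns before: $before, after: $after"

rm -f "$win" "$win.unix"
```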

Array sort
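
A possible way to sort an array with tr and sort, assuming bash (plain POSIX sh has no arrays). The array contents are made up, and the elements are assumed to contain no spaces.

```shell
# bash is assumed here; arr and sorted are hypothetical names.
arr=(7 3 10 1 8)

# Put each element on its own line, sort numerically,
# and read the result back into a new array.
sorted=($(echo "${arr[@]}" | tr ' ' '\n' | sort -n))

echo "${sorted[@]}"    # 1 3 7 8 10
```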

cut command

Displays the selected part of each line, for example certain fields, and leaves out the rest. The file itself is not modified.

Syntax

cut [options] [file...]

Common options

-f: extract the fields with the given numbers, for example -f 1,3 or -f 2-4
-d: specify the field delimiter; "TAB" is the default
--complement: exclude the specified fields and output everything else (GNU cut)
--output-delimiter: change the delimiter used in the output (GNU cut)
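
A short sketch using a line in the style of /etc/passwd. The sample line and variable names are for illustration only, and the last two commands assume GNU cut, since --complement and --output-delimiter are GNU extensions.

```shell
# Hypothetical colon-separated record.
line='root:x:0:0:root:/root:/bin/bash'

# Field 1 only.
user=$(echo "$line" | cut -d ':' -f 1)
echo "$user"           # root

# Fields 1 and 7, joined by the input delimiter.
user_shell=$(echo "$line" | cut -d ':' -f 1,7)
echo "$user_shell"     # root:/bin/bash

# Everything except field 2 (GNU cut).
rest=$(echo "$line" | cut -d ':' -f 2 --complement)
echo "$rest"           # root:0:0:root:/root:/bin/bash

# Same fields as above, with a different delimiter in the output (GNU cut).
pretty=$(echo "$line" | cut -d ':' -f 1,7 --output-delimiter=' -> ')
echo "$pretty"         # root -> /bin/bash
```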

Regular expression

Usually used in conditional statements to check whether a string matches a certain format
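
As an illustration, a format check in a shell condition could look like the sketch below. The function name is_number is hypothetical; the check relies on grep -q, which prints nothing and reports the match through its exit status.

```shell
# Hypothetical helper: succeeds only if the argument consists of digits.
is_number() {
    printf '%s\n' "$1" | grep -q '^[0-9][0-9]*$'
}

for value in 12345 12a45; do
    if is_number "$value"; then
        echo "$value: digits only"
    else
        echo "$value: not a number"
    fi
done
```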

The composition of regular expressions

Regular expressions are composed of ordinary characters and metacharacters

Ordinary characters include uppercase and lowercase letters, digits, punctuation marks, and some other symbols.
Metacharacters are characters with a special meaning in regular expressions. They specify how the preceding character (the character before the metacharacter) may appear in the target text.

Common metacharacters of basic regular expressions: (supported tools: grep, egrep, sed, awk)

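\ : escape character, removes the special meaning of the following symbol, e.g. \!, \n, \$
^ : matches the start of the string, e.g. ^a, ^the, ^#, ^[a-z]
$ : matches the end of the string, e.g. word$; ^$ matches an empty line
. : matches any single character except \n, e.g. go.d, g..d
* : matches the preceding subexpression zero or more times, e.g. goo*d, go.*d
[list] : matches one character from list, e.g. go[ola]d, [abc], [a-z], [a-z0-9]; [0-9] matches any single digit
[^list] : matches one character that is not in list, e.g. [^0-9], [^A-Z0-9]; [^a-z] matches any single character that is not a lowercase letter
\{n\} : matches the preceding subexpression exactly n times, e.g. go\{2\}d; '[0-9]\{2\}' matches two digits
\{n,\} : matches the preceding subexpression at least n times, e.g. go\{2,\}d; '[0-9]\{2,\}' matches two or more digits
\{n,m\} : matches the preceding subexpression n to m times, e.g. go\{2,3\}d; '[0-9]\{2,3\}' matches two to three digits
Note: when egrep and awk match with {n}, {n,}, or {n,m}, no "\" is needed before the braces "{}"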
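
Some of the patterns above can be tried with grep, which uses basic regular expressions by default. This is a sketch with a made-up word list; the fixed locale is an assumption to keep ranges such as [0-9] predictable.

```shell
export LC_ALL=C
# Hypothetical test lines, including one empty line.
words=$(printf '%s\n' gd god good goood glad 'room 42' '' 'the end')

# * : "go" followed by zero or more "o", then "d".
star=$(printf '%s\n' "$words" | grep 'goo*d')          # god, good, goood

# . : any two characters between g and d.
dots=$(printf '%s\n' "$words" | grep 'g..d')           # good, glad

# \{n\} and \{n,\} : exact and minimum repetition counts.
exact=$(printf '%s\n' "$words" | grep 'go\{2\}d')      # good
atleast=$(printf '%s\n' "$words" | grep 'go\{2,\}d')   # good, goood

# ^ and $ : anchors; ^$ matches empty lines.
start=$(printf '%s\n' "$words" | grep '^the')          # the end
empty=$(printf '%s\n' "$words" | grep -c '^$')         # 1

# [list] with a repetition count: two digits in a row.
digits=$(printf '%s\n' "$words" | grep '[0-9]\{2\}')   # room 42

printf '%s\n' "$star" "$dots" "$exact" "$atleast" "$start" "$empty" "$digits"
```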
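Extended regular expression metacharacters: (supported tools: egrep, awk)
+ : matches the preceding subexpression one or more times, e.g. go+d matches at least one o, such as god, good, goood
? : matches the preceding subexpression zero or one time, e.g. go?d matches gd or god
() : treats the string inside the parentheses as a single unit, e.g. g(oo)+d matches oo as a whole one or more times, such as good, gooood
| : matches either alternative, e.g. g(oo|la)d matches good or glad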
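
The extended metacharacters can be tried in the same way. The sketch below uses grep -E, which accepts the same extended syntax as egrep; the word list is made up.

```shell
export LC_ALL=C
# Hypothetical test words.
words=$(printf '%s\n' gd god good goood gooood glad)

# + : at least one "o".
plus=$(printf '%s\n' "$words" | grep -E 'go+d')         # god, good, goood, gooood

# ? : zero or one "o".
opt=$(printf '%s\n' "$words" | grep -E 'go?d')          # gd, god

# () with + : "oo" as a unit, one or more times.
group=$(printf '%s\n' "$words" | grep -E 'g(oo)+d')     # good, gooood

# | : either "oo" or "la" between g and d.
alt=$(printf '%s\n' "$words" | grep -E 'g(oo|la)d')     # good, glad

# Braces need no backslash in extended syntax.
brace=$(printf '%s\n' "$words" | grep -E 'go{2,3}d')    # good, goood

printf '%s\n' "$plus" "$opt" "$group" "$alt" "$brace"
```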

Origin blog.csdn.net/Jun____________/article/details/114748290