Shell scripting: regular expressions and the sort, uniq, tr and cut commands

Sort command (sort)

Sorts the contents of a file line by line; lines can be compared as plain text or according to other data types such as numbers or month names

Syntax format:
sort [option] parameter
cat file | sort option

Common options:
-f: Ignore case
-b: Ignore leading blanks on each line
-M: Sort by three-letter month abbreviation (Jan, Feb, Mar and so on)
-n: Sort numerically
-r: Reverse the sort order
-u: Output only one line from each group of equal lines, similar to piping the result through uniq
-t: Specify the field separator; by default fields are separated by the transition from non-blank to blank characters (spaces or tabs)
-k: Specify the field (key) to sort on
-o <output file>: Write the sorted result to the specified file
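To make the options above concrete, here is a minimal sketch. The sample data (name:score:month records) is made up for illustration, and -M is an extension found in GNU and BSD sort rather than a POSIX option, so treat this as an assumption about your system.

```shell
# Use the C locale so the ordering does not depend on the user's language settings.
export LC_ALL=C

# Hypothetical sample data: name:score:month
data='carol:82:Mar
alice:95:Jan
bob:7:Feb
alice:95:Jan'

# -t: sets the separator, -k2,2 restricts the key to field 2,
# n compares it as a number and r reverses the order (largest first).
by_score=$(printf '%s\n' "$data" | sort -t: -k2,2nr)

# Sort on the month abbreviation in field 3 and keep one line per month (-u).
by_month=$(printf '%s\n' "$data" | sort -t: -k3,3M -u)

printf '%s\n\n%s\n' "$by_score" "$by_month"
```

Writing the key as -k2,2 rather than -k2 matters: without the end position, the key runs from field 2 to the end of the line.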

Remove duplicate lines command (uniq)

Used to report or filter out consecutive repeated lines in a file. Because uniq only compares adjacent lines, it is usually combined with the sort command so that identical lines end up next to each other

Syntax format:
uniq [option] parameter
cat file | uniq option

Commonly used options:
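-c: prefix each output line with the number of times it occurred; consecutive duplicates are merged into one line
-d: only display lines that are repeated, one copy per group
-u: only display lines that are not repeated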
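The following sketch shows why uniq is normally preceded by sort, and what each option prints. The fruit list is invented sample input.

```shell
export LC_ALL=C

# Hypothetical sample input; "apple" appears in two separate runs.
fruits='apple
apple
pear
apple
kiwi
kiwi'

# uniq alone only merges adjacent duplicates, so "apple" still appears twice.
adjacent=$(printf '%s\n' "$fruits" | uniq)

# Sorting first groups identical lines; -c then prefixes each line with its count.
counts=$(printf '%s\n' "$fruits" | sort | uniq -c)

# -d: one copy of every line that repeats; -u: only the lines that never repeat.
repeated=$(printf '%s\n' "$fruits" | sort | uniq -d)
single=$(printf '%s\n' "$fruits" | sort | uniq -u)

printf '%s\n' "$counts"
```

A common follow-up is to pipe the output of uniq -c into sort -nr to rank lines by frequency.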


Character conversion command (tr)

Commonly used to translate (replace), squeeze and delete characters read from standard input

Syntax format:
tr [options] [parameters]

Commonly used options:
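-c: use the complement of character set 1, that is, operate on every character that is not in character set 1 (including newline \n); when translating, those characters are replaced using character set 2
-d: delete all characters belonging to character set 1
-s: squeeze each run of a repeated character into a single character; if character set 2 is also given, characters are first translated from set 1 to set 2 and the result is then squeezed
-t: truncate character set 1 to the length of character set 2 before translating; when both sets have the same length, the result is the same as without the option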

Parameters:
Character set 1: The original character set to be translated or deleted. When translating, the target set must be given as "Character set 2"; when deleting, "Character set 2" is not required.
Character set 2: The target character set that characters are translated into
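A short sketch of the four typical uses of tr described above. The input strings are made up; tr always reads standard input, so each one is fed in through a pipe.

```shell
export LC_ALL=C

# Translate: map lowercase letters to uppercase.
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')

# Delete: remove every digit (no second set is needed with -d).
no_digits=$(printf 'a1b22c333\n' | tr -d '0-9')

# Squeeze: collapse each run of spaces into a single space.
squeezed=$(printf 'too    many   spaces\n' | tr -s ' ')

# Complement + squeeze: every run of non-letters becomes one newline,
# which leaves one word per line.
words=$(printf 'one, two;three\n' | tr -cs 'A-Za-z' '\n')

printf '%s\n' "$upper" "$no_digits" "$squeezed" "$words"
```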


Column extraction command (cut)

The cut command extracts selected parts of each line of its input: a range of bytes, a range of characters, or one or more delimiter-separated fields

Syntax format:
cut [option] parameter
cat file | cut option

Commonly used options:
-b: Select by bytes; only display the bytes in the specified range of each line
-c: Select by characters; only display the characters in the specified range of each line
-d: Set the field delimiter; the default is the tab character
-f: Display the specified fields; usually used together with -d
-n: Used with -b; do not split multibyte characters
--complement: Invert the selection, that is, display everything except the selected bytes, characters or fields (GNU cut)
--output-delimiter: Specify the field delimiter used in the output (GNU cut)
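As an illustration of the options above, here is a small sketch that works on a passwd-style record. The record itself is invented, and the last command assumes GNU cut, since --complement and --output-delimiter are not available in every implementation.

```shell
# Hypothetical passwd-style record with ':' as the field separator.
line='alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# Fields 1 and 7 (login name and shell).
name_shell=$(printf '%s\n' "$line" | cut -d: -f1,7)

# Characters 1 through 5 of the line.
first5=$(printf '%s\n' "$line" | cut -c1-5)

# GNU cut only (assumption): everything except field 2, joined with commas.
rest=$(printf '%s\n' "$line" | cut -d: -f2 --complement --output-delimiter=,)

printf '%s\n' "$name_shell" "$first5" "$rest"
```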


Regular expression

Usually used in conditional statements to check whether a string matches a certain format

Regular expressions are composed of ordinary characters and metacharacters.
Ordinary characters include uppercase and lowercase letters, numbers, punctuation marks and some other symbols.
Metacharacters are characters with a special meaning in regular expressions; they can be used to specify how the preceding character (that is, the character before the metacharacter) may appear in the target text

Common metacharacters of basic regular expressions: (supported tools: grep, egrep, sed, awk)
\: escape character, used to cancel the special meaning of the symbol that follows, for example: \!, \n, \$
^: matches the start of the line, for example: ^a, ^the, ^#, ^[A-Z]
$: matches the end of the line, for example: word$; ^$ matches a blank line
.: matches any single character except \n, for example: go.d, g..d
*: matches the preceding sub-expression 0 or more times, for example: goo*d, go.*d
[list]: matches one character from the list, for example: go[ola]d, [abc], [a-z], [a-z0-9]; [0-9] matches any digit
[^list]: matches any one character that is not in the list, for example: [^0-9], [^A-Z0-9]; [^a-z] matches any character that is not a lowercase letter
\{n\}: matches the preceding sub-expression exactly n times, for example: go\{2\}d; '[0-9]\{2\}' matches two digits
\{n,\}: matches the preceding sub-expression at least n times, for example: go\{2,\}d; '[0-9]\{2,\}' matches two or more digits
\{n,m\}: matches the preceding sub-expression n to m times, for example: go\{2,3\}d; '[0-9]\{2,3\}' matches two to three digits
Note: with egrep and awk, {n}, {n,} and {n,m} are written without a "\" before the braces
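The basic metacharacters above can be tried out with plain grep, which uses basic regular expressions by default. This is only a sketch; the word list is invented for the example.

```shell
export LC_ALL=C

# Hypothetical word list, one candidate per line.
words='gd
god
good
goood
gold
Good'

# 'o*' allows zero or more o's between g and d.
star=$(printf '%s\n' "$words" | grep 'go*d')

# Interval with escaped braces: exactly two o's.
two=$(printf '%s\n' "$words" | grep 'go\{2\}d')

# Anchor plus bracket expression: lines that start with an uppercase letter.
upper=$(printf '%s\n' "$words" | grep '^[A-Z]')

# '^$' matches an empty line; -c prints the number of matching lines.
blank=$(printf 'a\n\nb\n\n' | grep -c '^$')

printf '%s\n' "$star"
```

Single quotes around each pattern keep the shell from interpreting characters such as *, $ and \ before grep sees them.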

Extended regular expression metacharacters: (supported tools: egrep, awk)

+: matches the preceding sub-expression one or more times, for example: go+d matches at least one o, such as god, good, goood
?: matches the preceding sub-expression 0 or 1 time, for example: go?d matches gd or god
(): treats the string in parentheses as a single unit, for example: g(oo)+d matches oo as a whole one or more times, such as good, gooood
|: matches either of the alternatives, for example: g(oo|la)d matches good or glad
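The extended metacharacters can be tested the same way with grep -E, the modern spelling of egrep. Again this is a sketch over an invented word list; the ^ and $ anchors are added in two patterns so that the whole line has to match.

```shell
export LC_ALL=C

# Hypothetical word list, one candidate per line.
words='gd
god
good
goood
gooood
glad
gold'

# +: one or more o's.
plus=$(printf '%s\n' "$words" | grep -E 'go+d')

# ?: zero or one o; anchored so the whole line must match.
opt=$(printf '%s\n' "$words" | grep -E '^go?d$')

# (): "oo" repeated as a unit, so only an even number of o's matches.
group=$(printf '%s\n' "$words" | grep -E '^g(oo)+d$')

# |: either "oo" or "la" between g and d.
alt=$(printf '%s\n' "$words" | grep -E 'g(oo|la)d')

# Intervals need no backslashes in extended syntax: two or three o's.
interval=$(printf '%s\n' "$words" | grep -E 'go{2,3}d')

printf '%s\n' "$plus"
```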


Origin blog.csdn.net/MQ107/article/details/114746420