Detailed explanation of shell scripts (7): regular expressions, sort, uniq, tr, cut

One, the sort command: sort

  • Sorts the contents of a file line by line; lines can be compared as text, as numbers, or by a chosen field

1. Format

sort [options] [file ...]
command | sort [options]

2. Common options

Option            Description
-f                Ignore case when comparing (without it, the relative order of uppercase and lowercase letters depends on the locale; in the C locale uppercase letters sort first)
-b                Ignore leading blanks on each line
-n                Sort by numeric value
-r                Reverse the sort order
-u                Output only the first of the lines that compare equal (deduplication, similar to uniq)
-t                Specify the field separator; by default fields are separated by the transition from non-blank to blank characters
-k                Specify the sort field (key)
-o <output file>  Write the sorted result to the specified file

3. Example
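
The original screenshots are not available, so the following is only a minimal sketch of how the options above combine. The name:score data is made up for illustration, and LC_ALL=C is set so that the text ordering does not depend on the local locale.

```shell
#!/bin/sh
# Made-up sample data in the form name:score
data='carol:72
alice:90
bob:8
alice:90'

# Plain sort compares whole lines as text.
by_line=$(printf '%s\n' "$data" | LC_ALL=C sort)

# -t: sets ":" as the field separator, -k2,2 limits the key to field 2,
# -n compares the key numerically, -r reverses the order,
# -u keeps only the first line for each distinct key.
by_score=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k2,2 -n -r -u)

printf 'text order:\n%s\n\nby score, highest first:\n%s\n' "$by_line" "$by_score"
```

Without -n the scores would be compared as text, and "8" would sort after "72" and "90".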


Two, the duplicate-line removal command: uniq

  • Reports or omits consecutive repeated lines in a file. Because only adjacent duplicates are detected, it is usually combined with the sort command

1. Format

uniq [options] [input file [output file]]
sort file | uniq [options]

2. Common options

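Option  Description
-c      Prefix each output line with the number of times it occurred; consecutive repeats are collapsed into one line
-d      Show only lines that are repeated consecutively, one line per group
-u      Show only lines that are not repeated consecutively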

3. Example
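
As a rough illustration of the options above (the fruit list is invented sample data), the sketch below shows why uniq is normally fed sorted input: it only compares neighbouring lines.

```shell
#!/bin/sh
# Made-up sample data with both adjacent and non-adjacent duplicates
data='apple
apple
pear
apple
kiwi
kiwi'

# Without sorting, only neighbouring duplicates are merged,
# so "apple" still appears twice in the result.
adjacent=$(printf '%s\n' "$data" | uniq)

# After sorting, every duplicate is adjacent.
counts=$(printf '%s\n' "$data" | LC_ALL=C sort | uniq -c)   # count per line
dups=$(printf '%s\n' "$data" | LC_ALL=C sort | uniq -d)     # repeated lines only
singles=$(printf '%s\n' "$data" | LC_ALL=C sort | uniq -u)  # lines seen once

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$adjacent" "$counts" "$dups" "$singles"
```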


Three, the character conversion command: tr

  • Commonly used to replace, squeeze (compress repeats of), and delete characters read from standard input

1. Format

tr [options] character-set-1 [character-set-2]

2. Common options

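Option  Description
-c      Use the complement of character set 1: characters in character set 1 are kept, and all other characters (including the newline \n) are replaced with character set 2
-d      Delete all characters belonging to character set 1
-s      Squeeze each run of a repeated character into a single occurrence; if character set 2 is also given, character set 1 is first replaced with character set 2 and the result is then squeezed
-t      Truncate character set 1 to the length of character set 2 before replacing; when the two sets have the same length the result is the same as without options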

3. Parameters

  • Character set 1:

    • Specifies the original character set to be converted or deleted. When performing a conversion, the parameter "Character set 2" must also be given to specify the target character set. When performing a delete operation, "Character set 2" is not required
  • Character set 2:

    • Specifies the target character set of the conversion

4. Example
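
The commands below are a small sketch of the three typical uses (replace, delete, squeeze) plus the -c complement form. The input strings are invented for illustration.

```shell
#!/bin/sh
# Replace: map every lowercase letter to its uppercase counterpart.
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')

# Delete: remove every digit (no second set is needed with -d).
no_digits=$(printf 'a1b22c333\n' | tr -d '0-9')

# Squeeze: collapse each run of spaces into a single space.
squeezed=$(printf 'a   b     c\n' | tr -s ' ')

# Complement: every character that is NOT a letter, including the
# trailing newline, is replaced with "_".
masked=$(printf 'ab 12\n' | tr -c 'a-zA-Z' '_')

printf '%s\n%s\n%s\n%s\n' "$upper" "$no_digits" "$squeezed" "$masked"
```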


Four, the field extraction command: cut

  • The cut command extracts selected parts of each line, by byte position, character position, or delimited field, from a file or from standard input

1. Format

cut [options] [file ...]
command | cut [options]

2. Common options

Option              Description
-b                  Select by byte; only the bytes in the specified range of each line are displayed
-c                  Select by character; only the characters in the specified range of each line are displayed
-d                  Custom field separator, the default is the tab character
-f                  Display the specified fields, usually used with -d
-n                  Do not split multibyte characters (used with -b)
--complement        Output everything except the selected bytes, characters or fields
--output-delimiter  Specify the field separator of the output

3. Example
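
A short sketch of field and character selection follows. The input line imitates an /etc/passwd entry and is only sample data; --complement and --output-delimiter are assumed to come from GNU coreutils and may be missing in other cut implementations.

```shell
#!/bin/sh
# Sample line in /etc/passwd layout (fields separated by ":")
line='root:x:0:0:root:/root:/bin/bash'

# -d sets the separator, -f picks fields 1 and 7.
user_shell=$(printf '%s\n' "$line" | cut -d: -f1,7)

# -c selects by character position: the first four characters.
first4=$(printf '%s\n' "$line" | cut -c1-4)

# GNU extensions: keep everything EXCEPT fields 2-6 and join the
# remaining fields with a space instead of ":".
rest=$(printf '%s\n' "$line" | cut -d: -f2-6 --complement --output-delimiter=' ')

printf '%s\n%s\n%s\n' "$user_shell" "$first4" "$rest"
```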


Five, regular expressions

  • Usually used in conditional tests to check whether a string matches a certain format

  • Regular expressions are composed of ordinary characters and metacharacters

  • Ordinary characters include uppercase and lowercase letters, digits, punctuation marks and some other symbols

  • Metacharacters are characters with a special meaning in regular expressions. They can be used to specify how the preceding character (the character before the metacharacter) may appear in the target string.

1. Common metacharacters in basic regular expressions (support tools: grep, sed, egrep, awk)

Metacharacter  Description
\              Escape character, used to cancel the special meaning of a symbol, for example: \!, \n, \$, etc.
^              Match the start of the string, for example: ^a, ^the, ^#, ^[a-z]
$              Match the end of the string, for example: word$; ^$ matches an empty line
.              Match any single character except \n, for example: go.d, g..d
*              Match the preceding sub-expression 0 or more times, for example: goo*d, go.*d
[list]         Match one character in the list, for example: go[ola]d, [abc], [a-z], [a-z0-9]; [0-9] matches any digit
[^list]        Match any one character not in the list, for example: [^0-9], [^A-Z0-9]; [^a-z] matches any character that is not a lowercase letter
\{n\}          Match the preceding sub-expression exactly n times, for example: go\{2\}d; '[0-9]\{2\}' matches two digits
\{n,\}         Match the preceding sub-expression at least n times, for example: go\{2,\}d; '[0-9]\{2,\}' matches two or more digits
\{n,m\}        Match the preceding sub-expression n to m times, for example: go\{2,3\}d; '[0-9]\{2,3\}' matches two to three digits

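Note: grep and sed use basic regular expressions, where the braces must be escaped as \{n\}, \{n,\}, \{n,m\}. When egrep and awk use {n}, {n,}, {n,m} to match, there is no need to add "\" before "{}"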
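
A minimal sketch of these basic metacharacters with grep follows. The word list is invented sample data, and the last command uses grep -E (the same as egrep) to show the unescaped brace form described in the note.

```shell
#!/bin/sh
# Made-up word list, one word per line
words='gd
god
good
goood
gold
glad'

# "." matches any single character: g, two arbitrary characters, d.
dot=$(printf '%s\n' "$words" | grep '^g..d$')

# "*" repeats the preceding character 0 or more times.
star=$(printf '%s\n' "$words" | grep '^go*d$')

# Basic regular expressions need backslashes around the interval braces.
bre=$(printf '%s\n' "$words" | grep 'go\{2,3\}d')

# Extended regular expressions take the braces without backslashes.
ere=$(printf '%s\n' "$words" | grep -E 'go{2,3}d')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$dot" "$star" "$bre" "$ere"
```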

2. Extended regular expression metacharacters (support tools: egrep, awk)

Metacharacter  Description
+              Match the preceding sub-expression one or more times, for example: go+d will match at least one o, such as god, good, goood, etc.
?              Match the preceding sub-expression 0 or 1 time, for example: go?d will match gd or god
()             Treat the string in the parentheses as a single unit, for example: g(oo)+d will match "oo" as a whole one or more times, such as good, gooood, etc.
|              Match either of the alternatives ("or"), for example: g(oo|la)d will match good or glad
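
The extended metacharacters can be tried out with grep -E, which behaves like egrep. This is only a sketch on an invented word list.

```shell
#!/bin/sh
# Made-up word list, one word per line
words='gd
god
good
goood
gooood
glad'

# "+": one or more "o" between g and d.
plus=$(printf '%s\n' "$words" | grep -E '^go+d$')

# "?": zero or one "o".
opt=$(printf '%s\n' "$words" | grep -E '^go?d$')

# "()": the group "oo" as a unit, repeated one or more times.
group=$(printf '%s\n' "$words" | grep -E '^g(oo)+d$')

# "|": either "oo" or "la" between g and d.
alt=$(printf '%s\n' "$words" | grep -E '^g(oo|la)d$')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$plus" "$opt" "$group" "$alt"
```

Note that goood (three o's) is not matched by g(oo)+d, because the group only repeats in whole pairs.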

3. Example

①. First display the mobile phone numbers in the file that start with 13 or 15, and then display the landline numbers with an area code
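
The original screenshot is not available, so this is one possible sketch rather than the author's solution. It assumes mobile numbers have 11 digits and that landline numbers are written as an area code of 3 to 4 digits starting with 0, a hyphen, and 7 to 8 digits. The numbers themselves are made up.

```shell
#!/bin/sh
# Made-up sample numbers, one per line
numbers='13812345678
15987654321
18612345678
025-86543210
0571-8765432
1381234567'

# Mobile: "1", then 3 or 5, then exactly nine more digits (assumed format).
mobiles=$(printf '%s\n' "$numbers" | grep -E '^1[35][0-9]{9}$')

# Landline: 0 plus 2-3 digits of area code, "-", then 7-8 digits (assumed format).
landlines=$(printf '%s\n' "$numbers" | grep -E '^0[0-9]{2,3}-[0-9]{7,8}$')

printf 'mobile:\n%s\n\nlandline:\n%s\n' "$mobiles" "$landlines"
```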


②. Display the email addresses whose user name starts with a letter, contains at most 2 of the symbols "-" or "." in the middle, does not end with a symbol, and is at least 6 characters long
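
Counting the symbols is awkward in a single regular expression, so the sketch below uses awk to combine several checks. It is not the author's original solution: the addresses are invented, the user name is assumed to contain only letters, digits, "-" and ".", and the domain pattern is a deliberately simple assumption.

```shell
#!/bin/sh
# Made-up sample addresses, one per line
emails='alice.b-c@example.com
bob@example.com
9lives1@example.com
carol.@example.com
d.a.v.e-x@example.com
frank77@example.org'

# Split each line at "@": $1 is the user name, $2 the domain.
valid=$(printf '%s\n' "$emails" | awk -F'@' '
  NF == 2 &&
  length($1) >= 6 &&
  $1 ~ /^[A-Za-z][A-Za-z0-9.-]*[A-Za-z0-9]$/ &&
  $2 ~ /^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$/ {
    user = $1
    # gsub returns how many "-" or "." characters it removed.
    if (gsub(/[-.]/, "", user) <= 2) print
  }')

printf '%s\n' "$valid"
```

The user-name pattern enforces the first and last character rules, length() enforces the minimum of 6 characters, and the gsub count enforces the limit of 2 symbols.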



Origin blog.csdn.net/Lucien010230/article/details/114703103