The three musketeers of Linux text processing: grep, sed, and awk

grep: a tool that filters lines of text

sed: a line-oriented text editor (stream editor)

awk: a report generator (formats text for output)

First, regular expressions

1, basic regular expressions

* Matches the preceding character zero or more times (a* matches zero or more a characters; used alone it is pointless, because it also matches every line that has no a. aa* matches lines that contain at least one a)

. Matches any single character except newline

^ $ Match the beginning of a line and the end of a line

[] Matches any one of the characters listed in the brackets

\{n\} The preceding character appears exactly n times (\ is the escape character)

\{n,\} The preceding character appears at least n times

\{n,m\} The preceding character appears at least n times and at most m times
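The operators above can be tried out with grep. This is a minimal sketch on a made-up word list; the variable names are chosen only for illustration.

```shell
# Sample input: one word per line (illustrative data only).
sample=$(printf '%s\n' 'sd' 'said' 'saaid' 'soooid' 'b')

# "aa*" requires at least one a, so only lines containing an a match.
with_a=$(printf '%s\n' "$sample" | grep 'aa*')

# "^s.id$": s, exactly one arbitrary character, then id, covering the whole line.
one_char=$(printf '%s\n' "$sample" | grep '^s.id$')

# "a\{2\}": the letter a exactly twice between s and id (basic regex needs the backslashes).
two_a=$(printf '%s\n' "$sample" | grep 'sa\{2\}id')

# "[ao]\{2,\}": at least two characters in a row taken from the set a, o.
two_plus=$(printf '%s\n' "$sample" | grep '[ao]\{2,\}')

printf '%s\n' "$with_a" "$one_char" "$two_a" "$two_plus"
```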

2, extended regular expressions

The metacharacters below do not need a backslash escape

+ Matches the preceding character one or more times

? Matches the preceding character 0 or 1 times

| Alternation: matches any one of two or more branches

() Groups the enclosed characters so that they are matched as a whole
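A short sketch of the extended operators, assuming grep -E is available; the sample words and variable names are invented for the example.

```shell
sample=$(printf '%s\n' 'color' 'colour' 'colouur' 'abab' 'ab')

# "?" makes the u optional: color and colour match, colouur does not.
optional=$(printf '%s\n' "$sample" | grep -E '^colou?r$')

# "+" needs one or more u: colour and colouur match.
one_or_more=$(printf '%s\n' "$sample" | grep -E '^colou+r$')

# "()" groups ab into one unit, so {2} repeats the whole group.
grouped=$(printf '%s\n' "$sample" | grep -E '^(ab){2}$')

# "|" selects between branches inside the group.
either=$(printf '%s\n' "$sample" | grep -E '^(color|ab)$')

printf '%s\n' "$optional" "$one_or_more" "$grouped" "$either"
```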

Second, character extraction and replacement commands

1、cut

The default delimiter of the cut command is the tab character, that is, the Tab key

-f column number

-d delimiter

-c character range
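The three cut options can be sketched as follows. The input lines are made up so the example does not depend on any system file.

```shell
# Tab-separated sample (illustrative): name, uid, shell.
tsv=$(printf 'root\t0\t/bin/bash\nalice\t1000\t/bin/sh\n')

# -f 1,3 selects columns 1 and 3; tab is the default delimiter.
cols=$(printf '%s\n' "$tsv" | cut -f 1,3)

# -d changes the delimiter; here the fields are separated by colons.
third=$(printf 'root:x:0\nalice:x:1000\n' | cut -d ':' -f 3)

# -c selects by character position: characters 1 through 4 of each line.
chars=$(printf 'abcdefgh\n' | cut -c 1-4)

printf '%s\n' "$cols" "$third" "$chars"
```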

2、awk

Common options

-F specifies the field delimiter

-v assigns a value to an awk variable from the command line

1) printf format output

printf 'output type and format' output content

Output types: %ns: outputs a string. n is a number giving the field width in characters

%ni: outputs an integer. n is a number giving the field width in digits

%m.nf: outputs a floating-point number. m is the total field width and n is the number of decimal places. For example, %8.2f outputs 8 characters in total, of which 2 are decimal places and the remaining 6 hold the decimal point and the integer part.

Output is right-aligned by default

- left-aligns the output (for example %-10s)

example:

List the users whose UID is at least 1 and at most 500.

cat /etc/passwd |awk 'BEGIN{FS=":"} $3>=1&&$3<=500{printf "%-10s %-10d\n",$1,$3}'
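To make the width and alignment rules visible, here is a small sketch that runs the same kind of filter on inline passwd-style records. The accounts are made up, and the | characters are added only to show where each column ends.

```shell
# Inline passwd-style records (made-up accounts) keep the example reproducible.
records=$(printf '%s\n' 'root:x:0:0' 'daemon:x:2:2' 'alice:x:1000:1000')

# %-10s left-aligns the name in 10 columns; %5d right-aligns the UID in 5.
table=$(printf '%s\n' "$records" |
  awk 'BEGIN{FS=":"} $3>=1 && $3<=500 {printf "%-10s|%5d|\n", $1, $3}')

# %8.2f: 8 columns in total, 2 of them after the decimal point.
float=$(awk 'BEGIN{printf "[%8.2f]\n", 3.14159}')

printf '%s\n' "$table" "$float"
```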

2) awk basic use

awk 'condition1 {action1} condition2 {action2} ...' filename

For example: df -h | grep /dev/sda3 | awk '{print $5}' extracts the usage percentage of the root partition (here /dev/sda3)

3) awk conditions

BEGIN Runs at the start of the awk program, before any data has been read. The action after BEGIN is executed only once, at the beginning of the program

END Runs after awk has processed all the data, just before the program ends. The action after END is executed only once, at the end of the program

> >= < <= == !=

A ~ B tests whether string A contains a substring that matches the expression B

A !~ B tests whether string A does not contain a substring that matches the expression B

/regex/ tests whether the current line contains a match for the regular expression
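The conditions above can be combined in one program. This is a minimal sketch on invented score data; the variable names and the pass mark of 60 are assumptions for the example.

```shell
scores=$(printf '%s\n' 'alice 82 math' 'bob 57 physics' 'carol 91 math')

report=$(printf '%s\n' "$scores" | awk '
  BEGIN { print "passed:" }                      # runs once, before any input is read
  $2 >= 60 && $3 ~ /^math$/ { print $1; n++ }    # relational test plus regex match
  END   { print "total " n }                     # runs once, after the last line
')

# !~ keeps the lines whose third field does not match the expression.
not_math=$(printf '%s\n' "$scores" | awk '$3 !~ /math/ {print $1}')

printf '%s\n' "$report" "$not_math"
```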

4) awk built-in variables

$0 represents the entire line that awk is currently reading. awk reads its input line by line, and $0 holds the whole current line

$n represents the n-th field of the current line.

NF the total number of fields (columns) in the current line.

NR the number of the line awk is currently processing, counted from the start of the input.

FS the user-defined field separator. awk's default separator is whitespace (any run of spaces or tabs). To use another separator (such as ":"), set the FS variable

OFS the output field separator (a single space by default).

Example: find the most frequently used commands in the shell history

history |awk -F '[ ]+' '{print $3}'|sort|uniq -c|sort -nr|head
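The built-in variables are easier to follow on a tiny input. The sketch below uses two made-up colon-separated lines; limit is a hypothetical variable name passed in with -v.

```shell
lines=$(printf '%s\n' 'a:b:c' 'd:e')

# NR is the current line number, NF the number of fields, $NF the last field.
info=$(printf '%s\n' "$lines" | awk -F ':' '{print NR, NF, $NF}')

# Assigning to a field makes awk rebuild $0 with OFS between the fields.
joined=$(printf '%s\n' "$lines" | awk 'BEGIN{FS=":"; OFS="-"} {$1=$1; print $0}')

# -v sets an awk variable from the shell; here it limits output to the first line.
first=$(printf '%s\n' "$lines" | awk -v limit=1 'NR <= limit')

printf '%s\n' "$info" "$joined" "$first"
```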

3、sed

sed is mainly used to select, replace, delete, and add data

sed [options] '[action]' filename

Options:

-n By default sed prints all of the data to the screen. With this option, only the lines processed by a sed command are printed.

-r enables extended regular expressions in sed

-i writes the result of the sed modification directly to the file instead of printing it to the screen

Actions:

a \: append, adds one or more lines after the current line. When adding several lines, every line except the last must end with "\" to show that the data continues.

c \: change, replaces the original line with the text that follows c. When replacing with several lines, every line except the last must end with "\" to show that the data continues.

i \: insert, inserts one or more lines before the current line. When inserting several lines, every line except the last must end with "\" to show that the data continues.

d: delete, deletes the specified lines.

p: print, outputs the specified lines.

s: substitute, replaces one string with another. The format is "line-range s/old string/new string/g" (similar to the substitution command in vim)
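A few of these actions in practice, as a sketch on a three-line sample. The in-place step assumes GNU sed, where -i takes no backup suffix, and works on a scratch file created with mktemp.

```shell
text=$(printf '%s\n' 'one' 'two' 'three')

# -n with p prints only line 2 instead of the whole input.
second=$(printf '%s\n' "$text" | sed -n '2p')

# d deletes lines 2 through 3.
kept=$(printf '%s\n' "$text" | sed '2,3d')

# s replaces text; the trailing g replaces every match on the line.
swapped=$(printf '%s\n' "$text" | sed 's/o/0/g')

# a\ appends the following text after line 1.
appended=$(printf '%s\n' "$text" | sed '1a\
inserted')

# -i edits the file itself (GNU sed syntax); a scratch file keeps this safe.
tmp=$(mktemp)
printf '%s\n' "$text" > "$tmp"
sed -i '1s/one/ONE/' "$tmp"
edited=$(cat "$tmp")
rm -f "$tmp"

printf '%s\n' "$second" "$kept" "$swapped" "$appended" "$edited"
```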

Third, character processing commands

1、sort

sort [options] filename

-f: ignore case

-b: ignore leading blanks on each line

-n: sort numerically; the default is a string sort

-r: reverse the sort order

-u: remove duplicate lines, like the uniq command

-t: specify the field delimiter; by default fields are separated at the transition from non-blank to blank characters

-k n[,m]: sort by the specified field range, from field n through field m (through the end of the line by default)
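The sort options can be compared side by side on a small made-up list of name:count pairs.

```shell
data=$(printf '%s\n' 'pear:10' 'apple:9' 'fig:100' 'apple:9')

# Default string sort compares the lines character by character.
by_string=$(printf '%s\n' "$data" | sort)

# -t ':' sets the delimiter, -k 2,2 limits the key to field 2, -n compares numerically.
by_number=$(printf '%s\n' "$data" | sort -t ':' -k 2,2 -n)

# -r reverses the order and -u keeps one line per distinct key.
top=$(printf '%s\n' "$data" | sort -t ':' -k 2,2 -n -r -u)

printf '%s\n' "$by_string" "$by_number" "$top"
```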

2、uniq

uniq [options] filename

-i: ignore case
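uniq only merges adjacent duplicate lines, so it is usually fed sorted input. A minimal sketch with made-up command names; -c (prefix each line with its count) is a standard uniq option not listed above.

```shell
words=$(printf '%s\n' 'ls' 'cd' 'ls' 'LS' 'cd' 'ls')

# Sort first so duplicates are adjacent, count them, then keep the most frequent.
counts=$(printf '%s\n' "$words" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $1, $2}')

# -i ignores case, so ls and LS collapse into a single entry.
folded=$(printf '%s\n' "$words" | sort -f | uniq -i | wc -l | tr -d ' ')

printf '%s\n' "$counts" "$folded"
```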

3、wc

-l: count lines only

-w: count words only

-m: count characters only

-L: print the length of the longest line (GNU wc)
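A quick sketch of the three counters on a scratch file with known contents. The tr call only strips padding that some wc implementations add.

```shell
tmp=$(mktemp)
printf 'hello world\nhi\n' > "$tmp"

lines=$(wc -l < "$tmp" | tr -d ' ')   # two newline-terminated lines
words=$(wc -w < "$tmp" | tr -d ' ')   # hello, world, hi
chars=$(wc -m < "$tmp" | tr -d ' ')   # 12 + 3 characters, newlines included
rm -f "$tmp"

printf '%s\n' "$lines" "$words" "$chars"
```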

Fourth, conditional tests

1, tests by file type

test

-d tests whether the file exists and is a directory

-e tests whether the file exists

-f tests whether the file exists and is a regular file

-s tests whether the file exists and is not empty (true if non-empty)
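These tests are usually written with the [ ] form of test. The sketch below creates a scratch directory so the results are predictable; all file names are made up.

```shell
dir=$(mktemp -d)
: > "$dir/empty.txt"                 # zero-length file
printf 'data\n' > "$dir/full.txt"

[ -d "$dir" ] && is_dir=yes || is_dir=no
[ -e "$dir/missing.txt" ] && exists=yes || exists=no
[ -f "$dir/full.txt" ] && regular=yes || regular=no
[ -s "$dir/empty.txt" ] && nonempty=yes || nonempty=no
rm -rf "$dir"

printf '%s\n' "$is_dir" "$exists" "$regular" "$nonempty"
```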

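2, tests by file permission

-r tests whether the file has read permission

-w tests whether the file has write permission

-x tests whether the file has execute permission

-u tests whether the file has the SUID permission

-g tests whether the file has the SGID permission

-k tests whether the file has the sticky bit (SBIT)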
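A hedged sketch of the permission tests on a scratch file. It assumes the file system lets the owner set the SUID bit with chmod u+s; the variable names are illustrative.

```shell
f=$(mktemp)
chmod 600 "$f"                       # read/write for the owner, no execute bit
[ -x "$f" ] && exec_before=yes || exec_before=no
chmod u+x "$f"
[ -x "$f" ] && exec_after=yes || exec_after=no
[ -r "$f" ] && readable=yes || readable=no

[ -u "$f" ] && suid_before=yes || suid_before=no
chmod u+s "$f"                       # set the SUID bit
[ -u "$f" ] && suid_after=yes || suid_after=no
rm -f "$f"

printf '%s\n' "$exec_before" "$exec_after" "$readable" "$suid_before" "$suid_after"
```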

3, comparisons between two files

file1 -nt file2 tests whether the modification time of file1 is newer than that of file2 (true if newer)

file1 -ot file2 tests whether the modification time of file1 is older than that of file2 (true if older)

file1 -ef file2 tests whether file1 and file2 have the same inode, which means the two names refer to the same file. This is a good way to check for a hard link
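The comparison operators can be checked with a hard link and a copy in a scratch directory. The file names are made up, and touch -t is used only to force an old modification time.

```shell
dir=$(mktemp -d)
printf 'v1\n' > "$dir/a"
ln "$dir/a" "$dir/link"              # hard link: shares the inode of a
cp "$dir/a" "$dir/copy"              # copy: separate inode
touch -t 200001010000 "$dir/a"       # push the mtime of a (and link) back to 2000

[ "$dir/a" -ef "$dir/link" ] && same_link=yes || same_link=no
[ "$dir/a" -ef "$dir/copy" ] && same_copy=yes || same_copy=no
[ "$dir/copy" -nt "$dir/a" ] && copy_newer=yes || copy_newer=no
[ "$dir/a" -ot "$dir/copy" ] && a_older=yes || a_older=no
rm -rf "$dir"

printf '%s\n' "$same_link" "$same_copy" "$copy_newer" "$a_older"
```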

4, integer comparisons

-eq equal

-gt greater than

-ge greater than or equal

-lt less than

-le less than or equal
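A minimal sketch of the integer operators. used and limit are hypothetical values standing in for something like a disk usage percentage and its threshold.

```shell
used=85      # hypothetical usage percentage
limit=80     # hypothetical threshold

[ "$used" -gt "$limit" ] && over=yes || over=no
[ "$used" -le 100 ] && valid=yes || valid=no
[ "$used" -eq "$limit" ] && equal=yes || equal=no

printf '%s\n' "$over" "$valid" "$equal"
```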

5, string tests

-z tests whether the string is empty (true if empty)

-n tests whether the string is non-empty (true if non-empty)

string1 == string2 tests whether string1 and string2 are equal (true if equal)

string1 != string2 tests whether string1 and string2 are not equal (true if not equal)

6, combining multiple conditions

test1 -a test2 logical AND: the result is true only if both test1 and test2 are true

test1 -o test2 logical OR: the result is true if either test1 or test2 is true

! test logical NOT: negates the result of the original test
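The string tests and the logical operators can be sketched together. The values are made up; the single = is the POSIX spelling of the equality test, and bash also accepts ==.

```shell
name=""
user="alice"

[ -z "$name" ] && empty=yes || empty=no
[ -n "$user" ] && set_user=yes || set_user=no
[ "$user" = "alice" ] && match=yes || match=no
[ "$user" != "bob" ] && differs=yes || differs=no

# -a needs both sides to be true, -o needs at least one, ! negates.
[ -n "$user" -a -z "$name" ] && both=yes || both=no
[ -n "$name" -o -n "$user" ] && either=yes || either=no
[ ! -n "$name" ] && negated=yes || negated=no

printf '%s\n' "$empty" "$set_user" "$match" "$differs" "$both" "$either" "$negated"
```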

Fifth, flow control

1, single-branch if

if [ condition ];then

Body to execute

fi

2, two-branch if

if [ condition ];then

Executed when the condition is true

else

Executed when the condition is false

fi

3, multi-branch if

if [ condition 1 ];then

Executed when condition 1 is true

elif [ condition 2 ];then

Executed when condition 2 is true

else

Executed when none of the conditions is true

fi
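The three if forms share one shape, so a single multi-branch sketch covers them. grade is a hypothetical helper, and the thresholds 90 and 60 are invented for the example.

```shell
# grade is a hypothetical helper that maps a numeric score to a label.
grade() {
  if [ "$1" -ge 90 ]; then
    echo "excellent"
  elif [ "$1" -ge 60 ]; then
    echo "pass"
  else
    echo "fail"
  fi
}

high=$(grade 95)
mid=$(grade 70)
low=$(grade 30)
printf '%s\n' "$high" "$mid" "$low"
```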

4, multi-branch case statement

case can only test one kind of condition (whether a value matches), whereas if can test many kinds of conditions

case $variable in

value1)

Body to execute

;;

value2)

Body to execute

;;

*)

If the variable matches none of the values above, this body is executed

;;

esac
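A runnable instance of the case skeleton. answer_kind is a hypothetical helper, and the accepted replies are chosen only for illustration; the | inside a pattern lets one branch match several values.

```shell
# answer_kind is a hypothetical helper that classifies a yes/no reply.
answer_kind() {
  case "$1" in
    yes|y)
      echo "confirmed"
      ;;
    no|n)
      echo "declined"
      ;;
    *)
      echo "unknown"     # reached when no earlier pattern matches
      ;;
  esac
}

a=$(answer_kind y)
b=$(answer_kind no)
c=$(answer_kind maybe)
printf '%s\n' "$a" "$b" "$c"
```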

5, for loop

for variable in value1 value2 value3 ; do

Loop body

done

for ((initial value; loop condition; variable change)); do

Loop body

done
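Both for forms in a short sketch. The second form uses the (( )) syntax and therefore assumes bash; the variable names are illustrative.

```shell
# List form: the variable takes each word in turn.
joined=""
for name in grep sed awk; do
  joined="$joined$name "
done

# C-style form (bash): add up the integers 1 through 5.
total=0
for ((i = 1; i <= 5; i++)); do
  total=$((total + i))
done

printf '%s\n' "$joined" "$total"
```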

6, while loop

while [ condition ]; do

Loop body

done
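A while loop keeps running as long as its test succeeds, so the body has to change something the test depends on. A minimal countdown sketch with made-up variable names:

```shell
count=3
trace=""
while [ "$count" -gt 0 ]; do
  trace="$trace$count"       # record the current value
  count=$((count - 1))       # move toward the exit condition
done

printf '%s\n' "$trace" "$count"
```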
