The three musketeers of Linux text processing: grep, sed, awk
grep: text line filtering tool
sed: text line editor (stream editor)
awk: report generator (formats text output)
First, regular expressions
1, basic regular expressions
* Matches the preceding character zero or more times. (A bare "a*" is pointless: it also matches zero occurrences, so it matches every line. "aa*" matches only lines that contain at least one a.)
. Matches any single character except a newline
^ $ Match the beginning of a line and the end of a line
[] Matches any one character listed in the brackets
\ Escape character
\{n\} The preceding character appears exactly n times
\{n,\} The preceding character appears at least n times
\{n,m\} The preceding character appears at least n times and at most m times
2, extended regular expressions
These metacharacters need no backslash escape
+ Matches the preceding character one or more times
? Matches the preceding character zero or one time
| Alternation: matches any one of two or more branches
() Groups a sub-expression so that it is treated as a whole
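A minimal sketch of the basic and extended metacharacters above, using grep on made-up sample lines. The file contents and variable names are illustrative assumptions, not part of the original notes.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'b' 'ab' 'aab' 'aaab' 'color' 'colour' > "$sample"

# Basic regex: "aa*" needs at least one a, so lines without an a are skipped.
bre_star=$(grep -c 'aa*' "$sample")

# Basic regex: \{2\} means exactly two of the preceding character;
# ^ and b$ pin the run of a's so that "aaab" does not match.
bre_exact=$(grep '^a\{2\}b$' "$sample")

# Extended regex (-E): ? | and () work without backslashes.
ere_optional=$(grep -Ec '^colou?r$' "$sample")
ere_branch=$(grep -Ec '^(ab|b)$' "$sample")

echo "$bre_star $bre_exact $ere_optional $ere_branch"
rm -f "$sample"
```

Anchoring with ^ and $ matters here: without them, a\{2\} would also match inside "aaab", because any two adjacent a's satisfy the pattern.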
Second, character extraction and replacement commands
1, cut
The default delimiter of cut is the tab character, that is, the Tab key
-f field (column) number
-d delimiter
-c character range
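A short sketch of the three cut options, assuming a made-up record in the style of /etc/passwd (the record and the variable names are hypothetical):

```shell
# A hypothetical colon-delimited record.
record='alice:x:1001:1001:Alice:/home/alice:/bin/bash'

# -d sets the delimiter, -f picks fields 1 and 7.
name_and_shell=$(printf '%s\n' "$record" | cut -d ':' -f 1,7)

# -c picks a character range, here characters 1 through 5.
first_five=$(printf '%s\n' "$record" | cut -c 1-5)

echo "$name_and_shell"
echo "$first_five"
```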
2, awk
Common parameters
-F specify the field delimiter
-v assign a value to an awk variable
1) printf formatted output
printf 'output type and format' output content
Output types:
%ns: string output. n is a number giving the field width in characters
%ni: integer output. n is a number giving the field width in digits
%m.nf: floating-point output. m is the total field width and n is the number of decimal places. For example, %8.2f prints a field 8 characters wide: 2 decimal places, 1 for the decimal point, and 5 left for the integer part.
+ Always print the sign of a number (right alignment is the default and needs no flag)
- Left-align
Example:
List the users whose UID is at least 1 and at most 500.
awk 'BEGIN{FS=":"} $3>=1 && $3<=500 {printf "%-10s %-10d\n",$1,$3}' /etc/passwd
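To make the width and alignment rules concrete, here is a small sketch that formats two made-up "name:uid" records. The input data and the division by 3 are only there to produce a value with decimals.

```shell
# Two hypothetical "name:uid" records.
formatted=$(printf 'root:0\nalice:1001\n' |
  awk -F ':' '{ printf "%-8s|%6d|%8.2f\n", $1, $2, $2 / 3 }')

# %-8s  left-aligns the name in 8 columns
# %6d   right-aligns the integer in 6 columns
# %8.2f right-aligns a float in 8 columns with 2 decimal places
echo "$formatted"
```

Every output line is 24 characters wide (8 + 6 + 8 plus the two "|" separators), which is what makes printf useful for column-aligned reports.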
2) awk basic use
awk 'condition1 {action1} condition2 {action2} ...' filename
For example: df -h | grep /dev/sda3 | awk '{print $5}' extracts the usage percentage of the root partition
3) awk conditions
BEGIN Runs at the start of the awk program, before any data has been read. The action after BEGIN is executed only once, at the beginning of the program
END Runs at the end of the awk program, after all data has been processed. The action after END is executed only once, at the end of the program
> >= < <= == != Relational operators
A ~ B Determines whether string A contains a substring that matches the expression B
A !~ B Determines whether string A contains no substring that matches the expression B
/regex/ Matches lines that contain text matching the regular expression
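The conditions above can be combined in one program. The following is a sketch on made-up "name score" records; the pass mark of 60 and the variable names are assumptions chosen for illustration.

```shell
# Hypothetical "name score" records.
report=$(printf 'ann 90\nbob 55\namy 72\n' | awk '
  BEGIN     { print "passed:" }            # runs once, before any input
  $2 >= 60  { print $1; total += $2 }      # relational condition
  $1 ~ /^a/ { a_names++ }                  # regex match against one field
  END       { print "total=" total, "a_names=" a_names }  # runs once, after all input
')
echo "$report"
```

Each input line is checked against every condition in order, so a line such as "ann 90" triggers both the relational rule and the regex rule.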
4) awk built-in variables
$0 The entire line that awk is currently processing. awk reads its input line by line, and $0 holds the whole of the current line
$n The n-th field of the current line
NF The number of fields (columns) in the current line
NR The number of the line awk is currently processing, counted over all the data read so far
FS The user-defined input field delimiter. The default delimiter is any run of spaces or tabs. To use another delimiter (such as ":"), set the FS variable
OFS The output field separator (a space by default)
Example: find the most frequently used commands in the history
history | awk -F '[ ]+' '{print $3}' | sort | uniq -c | sort -nr | head
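A small sketch of the built-in variables on made-up colon-delimited input (the data is hypothetical):

```shell
# Hypothetical input with a varying number of fields per line.
numbered=$(printf 'a:b:c\nd:e\n' |
  awk 'BEGIN { FS = ":"; OFS = "-" } { print NR, NF, $1, $NF }')

# NR is the line number, NF the field count, $NF the last field;
# the commas in print are replaced by OFS on output.
echo "$numbered"
```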
3, sed
sed is mainly used to select, replace, delete, and add data
sed [options] '[action]' filename
Options:
-n By default sed prints every line to the screen. With this option, only the lines explicitly processed by a sed command (for example p) are printed.
-r Enable extended regular expressions in sed
-i Write the changes directly back to the file that was read instead of printing the result to the screen
Actions:
a\: append. Adds one or more lines after the current line. When adding several lines, every line except the last must end with "\" to show that the data continues.
c\: change. Replaces the current line with the text that follows c. When the replacement has several lines, every line except the last must end with "\" to show that the data continues.
i\: insert. Inserts one or more lines before the current line. When inserting several lines, every line except the last must end with "\" to show that the data continues.
d: delete. Deletes the specified lines.
p: print. Outputs the specified lines.
s: substitute. Replaces one string with another. The format is "line range s/old string/new string/g" (similar to the substitution command in vim)
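A sketch of the p, d, s and a\ actions on a three-line temporary file. The file contents are made up, and -i is deliberately left out so that nothing is modified in place.

```shell
# Hypothetical three-line input file.
notes=$(mktemp)
printf 'one\ntwo\nthree\n' > "$notes"

printed=$(sed -n '2p' "$notes")         # -n plus p: print only line 2
deleted=$(sed '2d' "$notes")            # d: drop line 2, print the rest
replaced=$(sed '1,2s/o/0/g' "$notes")   # s with g: every o on lines 1 to 2
appended=$(sed '1a\
added after line one' "$notes")         # a\: append a line after line 1

printf '%s\n' "$replaced"
rm -f "$notes"
```

Without -n, the p action would print line 2 twice: once by default and once because of p.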
Third, character processing commands
1, sort
sort [options] filename
-f: ignore case
-b: ignore leading blanks on each line
-n: sort numerically (the default is string order)
-r: reverse the sort order
-u: remove duplicate lines, as the uniq command does
-t: specify the field delimiter (by default, fields are separated by blanks)
-k n[,m]: sort by the given field range, from field n to field m (to the end of the line by default)
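The difference between string order and numeric order is easiest to see side by side. A sketch on made-up "name:count" records:

```shell
# Hypothetical "name:count" records.
data=$(printf 'pear:9\nfig:10\nkiwi:2\n')

# Field 2 compared as a string: "10" sorts before "2" and "9".
by_string=$(printf '%s\n' "$data" | sort -t ':' -k 2,2)

# Field 2 compared as a number, largest first.
by_count=$(printf '%s\n' "$data" | sort -t ':' -k 2,2 -n -r)

# -u drops duplicate lines while sorting.
deduped=$(printf 'b\na\nb\n' | sort -u)

printf '%s\n' "$by_count"
```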
2, uniq
uniq [options] filename
-i: ignore case
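One detail worth a sketch: uniq only collapses duplicates that sit on adjacent lines, which is why it usually follows sort. The input lines here are made up.

```shell
# Duplicates that are not adjacent survive a plain uniq.
unsorted=$(printf 'b\na\nb\n' | uniq | wc -l | tr -d ' ')

# Sorting first makes the duplicates adjacent.
sorted=$(printf 'b\na\nb\n' | sort | uniq | wc -l | tr -d ' ')

# -i treats "yes" and "YES" as the same line.
folded=$(printf 'yes\nYES\nno\n' | uniq -i | wc -l | tr -d ' ')

echo "$unsorted $sorted $folded"
```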
3, wc
-l: count lines only
-w: count words only
-m: count characters only
-L: print the length of the longest line (a GNU extension)
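A quick sketch of the three portable counters on a made-up two-line text. The tr call strips the padding that some wc implementations put around the number.

```shell
# Hypothetical two-line text.
text='hello world
hi'

lines=$(printf '%s\n' "$text" | wc -l | tr -d ' ')
words=$(printf '%s\n' "$text" | wc -w | tr -d ' ')
chars=$(printf '%s\n' "$text" | wc -m | tr -d ' ')   # newlines count as characters

echo "$lines $words $chars"
```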
Fourth, conditional tests
1, tests by file type
test (also written as [ ])
-d determine whether the file exists and is a directory
-e determine whether the file exists
-f determine whether the file exists and is a regular file
-s determine whether the file exists and is non-empty (true if non-empty)
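2, tests by file permission
-r determine whether the file is readable
-w determine whether the file is writable
-x determine whether the file is executable
-u determine whether the file has the SUID bit set
-g determine whether the file has the SGID bit set
-k determine whether the file has the sticky bit (SBIT) set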
3, comparisons between two files
file1 -nt file2 determine whether file1 was modified more recently than file2 (true if newer)
file1 -ot file2 determine whether file1 was modified earlier than file2 (true if older)
file1 -ef file2 determine whether file1 and file2 have the same inode, that is, whether they are the same file. This is a good way to check for a hard link
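A sketch of the file tests in a throwaway directory. All file names are hypothetical; the permission tests are left out because their result depends on the user running the script.

```shell
workdir=$(mktemp -d)
printf 'data\n' > "$workdir/full.txt"
: > "$workdir/empty.txt"                     # create an empty file
ln "$workdir/full.txt" "$workdir/link.txt"   # hard link to full.txt

results=""
[ -d "$workdir" ]           && results="$results dir"
[ -f "$workdir/full.txt" ]  && results="$results regular"
[ -s "$workdir/full.txt" ]  && results="$results non-empty"
[ -s "$workdir/empty.txt" ] || results="$results empty"
[ -e "$workdir/missing" ]   || results="$results missing"
[ "$workdir/full.txt" -ef "$workdir/link.txt" ] && results="$results same-inode"

echo "$results"
rm -rf "$workdir"
```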
4, integer comparisons
-eq equal
-gt greater than
-ge greater than or equal
-lt less than
-le less than or equal
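A minimal sketch of an integer comparison, assuming a made-up disk usage figure and threshold:

```shell
used=85     # hypothetical disk usage percentage
limit=80    # hypothetical alert threshold

if [ "$used" -ge "$limit" ]; then
  status="over"
else
  status="ok"
fi

exact="no"
[ "$used" -eq 85 ] && exact="yes"

echo "$status $exact"
```

These operators compare values as integers, so "085" and "85" would compare as equal with -eq.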
5, string tests
-z determine whether the string is empty (true if empty)
-n determine whether the string is non-empty (true if non-empty)
string1 == string2 determine whether the two strings are equal (true if equal)
string1 != string2 determine whether the two strings differ (true if not equal)
6, combining multiple conditions
test1 -a test2 logical AND: the result is true only if both test1 and test2 are true
test1 -o test2 logical OR: the result is true if either test1 or test2 is true
! test logical NOT: inverts the result of the test
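A sketch that combines the string tests with -a, -o and !. The variable names and values are made up, and the portable single = is used for equality.

```shell
name=""
role="admin"

[ -z "$name" ] && name="guest"                         # empty, so use a default
[ -n "$role" -a "$role" = "admin" ] && access="full"   # both tests must hold
[ "$role" != "admin" -o "$name" = "guest" ] && greeting="hello $name"   # one is enough
[ ! "$name" = "root" ] && is_root="no"                 # negation

echo "$name $access $greeting $is_root"
```

Quoting the variables matters: an unquoted empty variable disappears from the test entirely and changes its meaning. Chaining separate tests, as in [ test1 ] && [ test2 ], is a common alternative to -a and -o.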
Fifth, flow control
1, single-branch if
if [ condition ]; then
  commands
fi
2, two-branch if
if [ condition ]; then
  commands to run when the condition is true
else
  commands to run when the condition is false
fi
3, multi-branch if
if [ condition1 ]; then
  commands to run when condition1 is true
elif [ condition2 ]; then
  commands to run when condition2 is true
else
  commands to run when none of the conditions is true
fi
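The three if forms share one shape, so a single multi-branch sketch covers them. The grade function and its thresholds are hypothetical.

```shell
# Hypothetical helper that grades a numeric score.
grade() {
  if [ "$1" -ge 90 ]; then
    echo "A"
  elif [ "$1" -ge 60 ]; then
    echo "pass"
  else
    echo "fail"
  fi
}

grade 95
grade 72
grade 10
```

The branches are tried from top to bottom and only the first true one runs, so 95 yields "A" even though it also satisfies the second condition.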
4, multi-branch case statement
case can test only one kind of relationship (whether a value matches a pattern), whereas if can test many kinds of conditions
case $variable in
value1)
  commands
  ;;
value2)
  commands
  ;;
*)
  commands to run when the variable matches none of the values above
  ;;
esac
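A sketch of a case statement, assuming a hypothetical helper that classifies a yes/no style answer:

```shell
# Hypothetical helper; "|" lets one branch accept several values.
classify() {
  case "$1" in
    y|yes)
      echo "accepted"
      ;;
    n|no)
      echo "declined"
      ;;
    *)
      echo "unknown"
      ;;
  esac
}

classify yes
classify n
classify maybe
```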
5, for loop
for i in value1 value2 value3; do
  loop body
done
for ((initial value; loop condition; variable change))
do
  loop body
done
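A sketch of the list form of for, summing a few made-up values. Only the list form is shown because it runs in any POSIX shell.

```shell
# Sum a fixed list of hypothetical values.
total=0
count=0
for n in 3 5 8; do
  total=$((total + n))
  count=$((count + 1))
done

echo "sum=$total count=$count"
```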
6, while loop
while [ condition ]; do
  loop body
done
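A sketch of a while loop that counts down from an arbitrary starting value:

```shell
i=3
trace=""
while [ "$i" -gt 0 ]; do
  trace="$trace$i "
  i=$((i - 1))      # without this update the loop would never end
done

echo "${trace}liftoff"
```

The condition is checked before every pass, so the body does not run at all if the condition is false at the start.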