(23)awk

AWK is an excellent text-processing tool. Its name comes from the first letters of the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. AWK provides powerful features: pattern matching, flow control, mathematical operators, process control statements, and built-in variables and functions. It has almost all the features a complete language should have. In fact, AWK is a language of its own: the AWK programming language, which its three creators formally defined as a "pattern scanning and processing language." It lets you write short programs that read input files, sort data, process data, perform calculations on the input, and generate reports, among countless other functions.

awk command format:

awk 'pattern1 {action1} pattern2 {action2} ...' filename

 

About
awk is a powerful text analysis tool. Compared with grep for searching and sed for editing, awk is particularly strong at data analysis and report generation. Put simply, awk reads a file line by line, splits each line into fields using whitespace as the default delimiter, and then processes the resulting fields.

There are three different versions of awk: awk, nawk, and gawk. Unless stated otherwise, awk here generally refers to gawk, the GNU version of AWK.

 

Use
awk '{pattern + action}' {filenames}
Although the operations can be complex, the syntax is always the same: pattern is what awk looks for in the data, and action is the series of commands executed when a match is found. Curly braces ({}) need not always appear in a program, but they are used to group a series of instructions under a particular pattern. pattern is the regular expression to match, enclosed in slashes.
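
As a minimal sketch of the pattern and action forms described above, the following runs awk on a few made-up lines (the sample data and the shell variable names are assumptions for illustration, not part of the original article):

```shell
# Hypothetical sample data, loosely modelled on login records.
sample='root pts/1 192.168.1.100
dmtsai pts/1 192.168.1.100
root tty1'

# Pattern only: matching lines are printed whole (the default action).
pattern_only=$(printf '%s\n' "$sample" | awk '/dmtsai/')

# Action only: the action runs for every line.
action_only=$(printf '%s\n' "$sample" | awk '{ print $1 }')

# Pattern plus action: the action runs only for lines matching the pattern.
both=$(printf '%s\n' "$sample" | awk '/tty/ { print $1 }')

printf '%s\n' "$pattern_only" "$action_only" "$both"
```

Omitting the pattern applies the action to every line; omitting the action prints each matching line unchanged.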

The most basic function of the awk language is to browse files or strings and extract information from them according to specified rules; once awk has extracted the information, other text operations can be carried out. Complete awk scripts are typically used to format the information in text files.

Usually, awk processes a file one line at a time: it receives each line of the file and then executes the corresponding commands to process the text.

 

Call awk
There are three ways to call awk


1. command line
awk [-F field-separator] 'commands' input-file(s)
Here, commands is the actual awk program, and [-F field-separator] is optional. input-file(s) is the file or files to be processed.
In awk, each item on a line, delimited by the field separator, is called a field. Typically, when no -F field separator is given, the default field separator is whitespace.

2. Shell script
Put all the awk commands into a file, make the file executable, and name the awk interpreter on the first line of the script; the script can then be invoked by typing its name.
This is like a shell script, except that the first line: #!/bin/sh
is replaced with: #!/bin/awk -f (the -f option tells awk to read the program from the script file)

3. Put all the awk commands into a separate file, and then call:
awk -f awk-script-file input-file(s)
Here, the -f option loads the awk script in awk-script-file, and input-file(s) is the same as above.
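
The command-line form and the -f form can be compared with a small sketch. The temporary script file, the sample input, and the variable names below are assumptions for illustration; mktemp is assumed to be available, as it is on most Linux systems:

```shell
# Write a tiny awk program to a temporary file (hypothetical script).
script=$(mktemp)
printf '%s\n' 'BEGIN { FS = ":" }' '{ print $1 }' > "$script"

# Method 1: program text given directly on the command line.
inline=$(printf 'root:x:0\nbin:x:2\n' | awk -F: '{ print $1 }')

# Method 3: the same program loaded from a file with -f.
from_file=$(printf 'root:x:0\nbin:x:2\n' | awk -f "$script")

rm -f "$script"
printf '%s\n' "$inline" "$from_file"
```

Both invocations should print the same two names; the script file simply sets FS in a BEGIN block instead of relying on -F.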

This chapter focuses on the command line.

 

Introductory examples
Assume that last -n 5 produces the output below
(the last command lists the most recent login records)
[root@www ~]# last -n 5 <== show only the first five lines
root pts/1 192.168.1.100 Tue Feb 10 11:21 still logged in
root pts/1 192.168.1.100 Tue Feb 10 00:46 - 02:28 (01:41)
root pts/1 192.168.1.100 Mon Feb 9 11:41 - 18:30 (06:48)
dmtsai pts/1 192.168.1.100 Mon Feb 9 11:41 - 11:41 (00:00)
root tty1 Fri Sep 5 14:09 - 14:10 (00:01)
To display only the five most recently logged-in accounts:

#last -n 5 | awk '{print $1}' # print the first field of each line
root
root
root
dmtsai
root
The awk workflow is as follows: read one record, delimited by the newline character '\n', then split the record into fields using the specified field separator and fill in the fields. $0 represents the whole record (all fields), $1 represents the first field, and $n represents the n-th field. The default field separator is a space or a tab, so $1 is the logged-in user, $3 is the login IP, and so on.
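
To make the field splitting concrete, here is a small sketch using one made-up record that mixes spaces and a tab (the input line and the variable name fields are illustrative assumptions):

```shell
# One record with several spaces and a tab between fields.
# With the default separator, any run of blanks or tabs splits fields.
fields=$(printf 'root   pts/1\t192.168.1.100\n' |
    awk '{ print NF; print $1; print $3 }')

printf '%s\n' "$fields"
```

NF holds the number of fields found in the record, so the first line printed is the field count, followed by the first and third fields.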

 

To display only the accounts in /etc/passwd:

#cat /etc/passwd | awk -F ':' '{print $1}'
root
daemon
bin
sys
This is an example of awk + action: the action {print $1} is executed for every line.

-F specifies ':' as the field separator.

 

To display only the accounts in /etc/passwd and their corresponding shells, separated by a tab:

#cat /etc/passwd |awk -F ':' '{print $1"\t"$7}'
root /bin/bash
daemon /bin/sh
bin /bin/sh
sys /bin/sh

To display only the accounts in /etc/passwd and their corresponding shells, separated by commas, with the column header "name,shell" on the first line and "blue,/bin/nosh" appended as the last line:


cat /etc/passwd | awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}' # BEGIN defines the header, END defines the footer
name,shell
root,/bin/bash
daemon,/bin/sh
bin,/bin/sh
sys,/bin/sh
....
blue,/bin/nosh

The awk workflow is as follows: the BEGIN block is executed first; then the file is read. A record, delimited by the newline character \n, is read and split into fields by the specified field separator; $0 represents the whole record, $1 the first field, and $n the n-th field. Then the action for each matching pattern is executed. awk then reads the second record, and so on until all records have been read; finally the END block is executed.
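
The order of execution can be observed with a short sketch that prints NR in each phase (the two-line input and the variable name order are assumptions for illustration):

```shell
# BEGIN runs before any input is read, the main action runs once per record,
# and END runs after the last record.
order=$(printf 'a\nb\n' | awk '
    BEGIN { print "begin, NR=" NR }
          { print "record " NR ": " $0 }
    END   { print "end, NR=" NR }')

printf '%s\n' "$order"
```

NR is 0 inside BEGIN because no record has been read yet, and in END it still holds the total number of records read.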

 

Search /etc/passwd for all lines containing the keyword root

#awk -F: '/root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash
This is an example of using a pattern: only lines that match the pattern (here root) trigger the action (no action is specified, so by default the whole line is printed).

The search supports regular expressions. For example, to find lines beginning with root: awk -F: '/^root/' /etc/passwd
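
The difference between an unanchored and an anchored pattern shows up when another line merely contains root. A minimal sketch, using made-up passwd-style lines rather than the real /etc/passwd:

```shell
# Hypothetical sample: the second line contains "root" only in its home directory.
sample='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
bin:x:2:2:bin:/bin:/bin/sh'

# Unanchored: matches "root" anywhere in the line.
anywhere=$(printf '%s\n' "$sample" | awk -F: '/root/ { print $1 }')

# Anchored with ^: matches only lines that start with "root".
at_start=$(printf '%s\n' "$sample" | awk -F: '/^root/ { print $1 }')

printf '%s\n' "$anywhere" "$at_start"
```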

 

Search /etc/passwd for all lines containing the keyword root and display the corresponding shell

#awk -F: '/root/{print $7}' /etc/passwd
/bin/bash
Here the action {print $7} is specified.

 

awk built-in variables
awk has many built-in variables that hold environment information, and these variables can be changed. The most commonly used ones are listed below.


ARGC      number of command-line arguments
ARGV      array of command-line arguments
ENVIRON   array of system environment variables
FILENAME  name of the file awk is currently reading
FNR       record number within the current file
FS        input field separator, equivalent to the command-line option -F
NF        number of fields in the current record
NR        number of records read so far
OFS       output field separator
ORS       output record separator
RS        input record separator

In addition, the variable $0 refers to the entire record. $1 is the first field of the current line, $2 is the second field of the current line, and so on.
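
A short sketch of a few of these variables working together; the input lines, the choice of "-" as output separator, and the variable name report are assumptions for illustration:

```shell
# FS is set in BEGIN instead of using -F; OFS replaces the default
# space that print inserts between comma-separated arguments.
report=$(printf 'root:x:0\nbin:x:2\n' |
    awk 'BEGIN { FS = ":"; OFS = "-" } { print NR, NF, $1 }')

printf '%s\n' "$report"
```

Each output line shows the record number, the number of fields in that record, and the first field, joined by OFS.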

 

For /etc/passwd, print the file name, the line number of each line, the number of columns in each line, and the complete line content:

#awk -F ':' '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:"$0}' /etc/passwd
filename:/etc/passwd,linenumber:1,columns:7,linecontent:root:x:0:0:root:/root:/bin/bash
filename:/etc/passwd,linenumber:2,columns:7,linecontent:daemon:x:1:1:daemon:/usr/sbin:/bin/sh
filename:/etc/passwd,linenumber:3,columns:7,linecontent:bin:x:2:2:bin:/bin:/bin/sh
filename:/etc/passwd,linenumber:4,columns:7,linecontent:sys:x:3:3:sys:/dev:/bin/sh

Using printf instead of print can make the code more concise and easier to read:

awk -F ':' '{printf("filename:%s,linenumber:%s,columns:%s,linecontent:%s\n",FILENAME,NR,NF,$0)}' /etc/passwd

print and printf
awk provides two output functions: print and printf.

The arguments of print may be variables, numbers, or strings. Strings must be enclosed in double quotes, and arguments are separated by commas. Without commas, the arguments are concatenated and cannot be told apart. The comma plays the role of the output field separator, which is a space by default.

printf works much like printf in C: it takes a format string, so for complex output printf is easier to use and makes the code easier to understand.
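
The three behaviours can be seen side by side in a minimal sketch (the one-line input and the variable name out are illustrative assumptions):

```shell
out=$(printf 'root:x:0\n' | awk -F: '{
    print $1, $3                          # comma: joined by the output field separator
    print $1 $3                           # no comma: values are concatenated
    printf("%s has uid %d\n", $1, $3)     # printf: explicit format, explicit newline
}')

printf '%s\n' "$out"
```

Unlike print, printf does not append a newline on its own, which is why the format string ends with \n.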

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

awk programming
variables and assignment

In addition to awk's built-in variables, you can also define your own variables.

The following counts the number of accounts in /etc/passwd:

awk '{count++; print $0;} END{print "user count is ", count}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
......
user count is 40
count is a user-defined variable. The earlier action {} blocks contained only a single print, but print is just one statement; an action {} may contain multiple statements, separated by semicolons (;).

 

count is not initialized here. Although it defaults to 0, the proper approach is to initialize it to 0 explicitly:

awk 'BEGIN {count=0;print "[start]user count is ", count} {count=count+1;print $0;} END{print "[end]user count is ", count}' /etc/passwd
[start]user count is 0
root:x:0:0:root:/root:/bin/bash
...
[end]user count is 40
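
The default value of an uninitialized variable can be checked directly. This sketch uses a BEGIN-only program, so no input file is needed; the names x and unset_demo are assumptions for illustration:

```shell
# An unset awk variable acts as the empty string in string context
# and as 0 in numeric context.
unset_demo=$(awk 'BEGIN { print "[" x "]"; print x + 0; x++; print x }')

printf '%s\n' "$unset_demo"
```

Explicit initialization in BEGIN is still worthwhile for readability, even though the counter would start from 0 either way.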

Count the number of bytes occupied by the files in a directory:

ls -l |awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is ", size}'
[end]size is 8657198

To display the result in MB:

ls -l |awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is ", size/1024/1024,"M"}'
[end]size is 8.25889 M
Note: the total does not include the contents of subdirectories.

 

Conditional statements

awk's conditional statements are borrowed from C; they take the following forms:


if (expression) {
statement;
statement;
... ...
}

if (expression) {
statement;
} else {
statement2;
}

if (expression) {
statement1;
} else if (expression1) {
statement2;
} else {
statement3;
}
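
As a sketch of the if / else if / else form in use, the following classifies made-up accounts by UID. The sample lines, the user alice, and the 1000 threshold for regular users are assumptions for illustration (the threshold is a common convention on Linux distributions, not something the article states):

```shell
classified=$(printf 'root:x:0\ndaemon:x:1\nalice:x:1000\n' | awk -F: '{
    if ($3 == 0) {
        kind = "superuser"
    } else if ($3 < 1000) {      # assumed cutoff for system accounts
        kind = "system"
    } else {
        kind = "regular"
    }
    print $1, kind
}')

printf '%s\n' "$classified"
```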

Count the number of bytes occupied by the files in a directory, filtering out entries whose size is 4096 (usually directories):

ls -l |awk 'BEGIN {size=0;print "[start]size is ", size} {if($5!=4096){size=size+$5;}} END{print "[end]size is ", size/1024/1024,"M"}'
[end]size is 8.22339 M

Loop statements

awk's loop statements are likewise borrowed from C: it supports while, do/while, for, break, and continue, and these keywords have the same semantics as in C.
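
A minimal sketch of for, while, break, and continue operating on the fields of a record; the input lines and the variable names are assumptions for illustration:

```shell
# for with break and continue: add up fields until the word "stop",
# ignoring negative numbers.
sum=$(printf '3 -1 4 stop 5\n' | awk '{
    total = 0
    for (i = 1; i <= NF; i++) {
        if ($i == "stop") break      # leave the loop entirely
        if ($i < 0) continue         # skip this field, keep looping
        total += $i
    }
    print total
}')

# while: walk the fields from last to first and concatenate them.
reversed=$(printf 'a b c\n' | awk '{
    i = NF; out = ""
    while (i > 0) { out = out $i; i-- }
    print out
}')

printf '%s\n' "$sum" "$reversed"
```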

 

Arrays

Because array subscripts in awk may be numbers or strings, a subscript is usually called a key. Keys and values are stored internally in a hash table. Because a hash table is not stored in order, the contents of an array may not be displayed in the order you expect. Like variables, arrays are created automatically when they are used, and awk automatically determines whether a stored value is a number or a string. In general, arrays in awk are used to collect information from records: to compute sums, count words, track how many times a pattern is matched, and so on.
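
A short sketch of an array keyed by strings, counting how many accounts use each login shell. The sample lines and variable names are assumptions for illustration; because for (key in array) visits keys in no guaranteed order, the output is piped through sort:

```shell
# Hypothetical passwd-style sample data.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh'

# n is indexed by the last field (the shell); entries are created on first use.
counts=$(printf '%s\n' "$sample" |
    awk -F: '{ n[$NF]++ } END { for (s in n) print s, n[s] }' |
    LC_ALL=C sort)

printf '%s\n' "$counts"
```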

 

Show the accounts in /etc/passwd


awk -F ':' 'BEGIN {count=0;} {name[count] = $1; count++;} END{for (i = 0; i < NR; i++) print i, name[i]}' /etc/passwd
0 root
1 daemon
2 bin
3 sys
4 sync
5 games
......

Here a for loop is used to traverse the array.

 

There is a great deal more to awk programming; only simple, common usage is listed here. For more, see http://www.gnu.org/software/gawk/manual/gawk.html

Origin www.cnblogs.com/paradis/p/11360675.html