[Repost] Linux awk command description


1. awk description
       awk is a programming language for text and data processing under Linux/Unix. Its input can come from standard input, from one or more files, or from the output of other commands. It supports advanced features such as user-defined functions and dynamic regular expressions, which makes it a powerful programming tool under Linux/Unix. It can be used directly on the command line, but is more often used in scripts.
       How awk handles text and data: it scans the input line by line, from the first line to the last, looking for lines that match a given pattern, and performs the action you specify on those lines. If no action is specified, the matching lines are printed to standard output (the screen); if no pattern is specified, the action is applied to every line.
       The name awk comes from the first letters of its three authors' last names: Alfred Aho, Peter Weinberger, and Brian Kernighan.
       gawk is the GNU version of awk, which provides extensions from Bell Labs and GNU. The awk described below uses GNU's gawk as the example. On many Linux systems awk is a link to gawk, so the rest of this article simply writes awk.

2. awk command format and options
2.1. There are two forms of awk syntax
       awk [options] 'script' var=value file(s)
       awk [options] -f scriptfile var=value file(s)

2.2. Command options
(1) -F fs or --field-separator fs : Specify the input field separator. fs is a string or a regular expression, such as -F:.
(2) -v var=value or --assign var=value : Assign a value to a user-defined variable before the program starts.
(3) -f scriptfile or --file scriptfile : Read the awk program from a script file.
(4) -mf nnn and -mr nnn : Set internal limits to the value nnn. The -mf option sets the maximum number of fields and the -mr option sets the maximum record size. These two options come from the Bell Labs version of awk and are not part of standard awk; gawk accepts them but ignores them.
(5) -W compat or --compat, -W traditional or --traditional : Run awk in compatibility mode. gawk then behaves exactly like standard awk, and all gawk extensions are ignored.
(6) -W copyleft or --copyleft, -W copyright or --copyright : Print short copyright information.
(7) -W help or --help, -W usage or --usage : Print all awk options with a short description of each.
(8) -W lint or --lint : Print warnings about constructs that are dubious or not portable to other awk implementations.
(9) -W lint-old or --lint-old : Print warnings about constructs that are not portable to legacy Unix versions of awk.
(10) -W posix : Turn on compatibility mode with the following additional restrictions: \x escape sequences are not recognized; the keyword func is not recognized as a synonym for function; newline does not act as a field separator when FS is a single space; the operators ** and **= cannot be used in place of ^ and ^=; fflush is not available.
(11) -W re-interval or --re-interval : Allow interval expressions in regular expressions, such as /a{2,3}/.
(12) -W source program-text or --source program-text : Use program-text as awk source code. It can be mixed with the -f option.
(13) -W version or --version : Print version information, which is mainly useful for bug reports.
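
As a rough illustration of the three options used most often (-F, -v and -f), the sketch below runs the same one-rule program first inline and then from a script file. The colon-separated sample data, the variable name min, and the temporary script file are made up for this example.

```shell
# Hypothetical colon-separated input, not taken from a real system.
data='alice:x:1001
bob:x:42
carol:x:2000'

# -F sets the field separator; -v assigns a variable before the program runs.
inline=$(printf '%s\n' "$data" | awk -F: -v min=1000 '$3 >= min {print $1}')

# -f reads the same program from a script file instead of the command line.
script=$(mktemp)
printf '%s\n' '$3 >= min {print $1}' > "$script"
from_file=$(printf '%s\n' "$data" | awk -F: -v min=1000 -f "$script")
rm -f "$script"

printf '%s\n' "$inline"      # alice and carol, one per line
```

Both invocations should print the same two names, since only the way the program text is supplied differs.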

3. Patterns and actions
An awk script consists of patterns and actions:
              pattern {action}    for example $ awk '/root/' test, or $ awk '$3 < 100' test.
       Both parts are optional. If there is no pattern, the action is applied to every record; if there is no action, every matching record is printed. By default each input line is one record, but a different record separator can be set through the RS variable.
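
The three cases (pattern only, action only, and both) can be tried side by side. This is a small sketch on invented input; the names and numbers in data are assumptions for the example.

```shell
data='root 0
daemon 1
mysql 120'

# Pattern only: matching records are printed unchanged.
pattern_only=$(printf '%s\n' "$data" | awk '/root/')

# Action only: the action runs for every record.
action_only=$(printf '%s\n' "$data" | awk '{print $1}')

# Pattern and action: the action runs only for matching records.
both=$(printf '%s\n' "$data" | awk '$2 < 100 {print $1}')

printf '%s\n---\n%s\n---\n%s\n' "$pattern_only" "$action_only" "$both"
```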

3.1. Patterns
A pattern can be any of the following:
(1) Regular expression: uses an extended set of wildcard characters (metacharacters).
(2) Relational expression: uses the relational operators from the operator table below. The comparison can be between strings or numbers, such as $2>$1 to select the lines whose second field is greater than the first.
(3) Pattern-matching expression: uses the operators ~ (matches) and !~ (does not match).
(4) pattern, pattern: specifies a range of lines. This syntax cannot include the BEGIN and END patterns.
(5) BEGIN: lets the user specify an action that runs before the first input record is processed. Global variables are usually set here.
(6) END: lets the user specify an action that runs after the last input record has been read.
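
To see several pattern kinds working together, here is a short sketch with one rule each for BEGIN, a regular expression, a relational expression, and END. The input rows are invented for the example.

```shell
data='alpha 5 9
beta 12 3
gamma 7 7'

result=$(printf '%s\n' "$data" | awk '
BEGIN   { print "start" }            # runs before the first record
/^g/    { print "regex:", $1 }       # regular expression pattern
$2 > $3 { print "greater:", $1 }     # relational expression pattern
END     { print "records:", NR }     # runs after the last record
')
printf '%s\n' "$result"
```

Every rule is tested against every record, in the order the rules are written, so a single record can trigger more than one action.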

3.2. Actions
       An action consists of one or more commands, functions, or expressions, separated by newlines or semicolons and enclosed in curly braces. There are four main kinds:
(1) variable or array assignment
(2) output command
(3) built-in function
(4) control flow command

4. awk built-in variables
Variable       Description
$n             The nth field of the current record; fields are separated by FS.
$0             The complete input record.
ARGC           The number of command-line arguments.
ARGIND         The position of the current file on the command line (an index into ARGV, which counts from 0).
ARGV           An array containing the command-line arguments.
CONVFMT        Number conversion format (default is %.6g).
ENVIRON        Associative array of environment variables.
ERRNO          Description of the last system error.
FIELDWIDTHS    A list of field widths (space-separated).
FILENAME       The current file name.
FNR            Same as NR, but relative to the current file.
FS             Field separator (default is whitespace).
IGNORECASE     If true, matching is case-insensitive.
NF             The number of fields in the current record.
NR             The number of records read so far.
OFMT           Output format for numbers (default is %.6g).
OFS            Output field separator (default is a space).
ORS            Output record separator (default is a newline).
RLENGTH        The length of the string matched by the match function.
RS             Record separator (default is a newline).
RSTART         The starting position of the string matched by the match function.
SUBSEP         Array subscript separator (default is \034).
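
A quick way to get a feel for a few of these variables is to print them for each record. The sketch below uses only NR, NF, $NF and OFS, with two made-up input lines.

```shell
# NR is the record number, NF the field count, $NF the last field.
# Setting OFS in BEGIN changes what print puts between its arguments.
result=$(printf 'a b c\nd e\n' | awk 'BEGIN { OFS = "-" } { print NR, NF, $NF }')
printf '%s\n' "$result"
# prints:
# 1-3-c
# 2-2-e
```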

5. awk operators
Operator                   Description
= += -= *= /= %= ^= **=    Assignment
?:                         C-style conditional expression
||                         Logical OR
&&                         Logical AND
~ !~                       Matches a regular expression, does not match
< <= > >= != ==            Relational operators
(space)                    Concatenation
+ -                        Addition, subtraction
* / %                      Multiplication, division, remainder
+ - !                      Unary plus, unary minus, logical NOT
^ **                       Exponentiation
++ --                      Increment, decrement (prefix or postfix)
$                          Field reference
in                         Array membership
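
Several of the operators in the table can be exercised inside a single BEGIN block, which needs no input file. This is only a sketch; the variable names x, arr, in_k and in_z are arbitrary. It uses ^ rather than ** because ** is not accepted by every awk.

```shell
result=$(awk 'BEGIN {
    x = 7
    print x % 3                        # remainder
    print 2 ^ 10                       # exponentiation
    print "a" "b" x                    # concatenation by juxtaposition
    print (x > 5 ? "big" : "small")    # conditional expression
    arr["k"] = 1
    in_k = ("k" in arr)                # array membership, 1 if present
    in_z = ("z" in arr)                # 0 if absent
    print in_k, in_z
    x += 3; x++                        # assignment, then increment
    print x
}')
printf '%s\n' "$result"
```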

6. Records and fields
6.1. Records
       awk treats each newline-terminated line as a record.
       Record separators: the default input and output record separators are both a newline, stored in the built-in variables RS and ORS respectively.
       The $0 variable: refers to the entire record. For example, $ awk '{print $0}' test prints every record in the file test.
       The NR variable: a counter whose value is incremented by 1 each time a record is read.
       For example, $ awk '{print NR, $0}' test prints every record in the file test, preceded by its record number.
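
Records do not have to be lines. As a small sketch, setting RS to a semicolon makes awk split the made-up input below into three records even though it contains no newline at all.

```shell
# RS = ";" makes each semicolon-terminated chunk a record.
result=$(printf 'one;two;three' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
printf '%s\n' "$result"
# prints:
# 1: one
# 2: two
# 3: three
```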

6.2. Fields
       Each word in a record is called a "field"; fields are separated by spaces or tabs by default. awk keeps track of the number of fields and stores it in the built-in variable NF. For example, $ awk '{print $1, $3}' test prints the first and third whitespace-separated columns (fields) of the file test.

6.3. Field separator
       The built-in variable FS stores the input field separator. The default is spaces or tabs. The value of FS can be changed with the -F command-line option. For example, $ awk -F: '{print $1,$5}' test prints the first and fifth colon-separated columns.
       Several field separators can be used at the same time by writing them inside square brackets, such as $ awk -F'[ :\t]' '{print $1,$3}' test, which uses space, colon, and tab as separators.
       The output field separator is a space by default and is stored in OFS. For example, in $ awk -F: '{print $1,$5}' test, the comma between $1 and $5 is replaced by the value of OFS in the output.
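
The following sketch combines a bracketed input separator with a custom OFS. The input line, which mixes a colon, a space, and a tab, is invented for the example.

```shell
# The input has three different separators: colon, space, and tab.
result=$(printf 'root:x 0\tadmin\n' |
    awk -F'[ :\t]' 'BEGIN { OFS = "|" } { print $1, $4, NF }')
printf '%s\n' "$result"      # root|admin|4
```

Each single separator character ends a field, so the line splits into four fields, and the commas in print are rendered as the value of OFS.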

7. gawk-specific regular expression metacharacters

The following are specific to gawk and are not available in the Unix versions of awk.
(1) \y : matches the empty string at the beginning or end of a word.
(2) \B : matches the empty string within a word.
(3) \< : matches the empty string at the beginning of a word.
(4) \> : matches the empty string at the end of a word.
(5) \w : matches any word-constituent character (letter, digit, or underscore).
(6) \W : matches any character that is not word-constituent.
(7) \` : matches the empty string at the beginning of the string.
(8) \' : matches the empty string at the end of the string.

8. The match operator (~)
       The ~ operator tests a record or a field against a regular expression. For example, $ awk '$1 ~/^root/' test prints the lines of the file test whose first field starts with root.
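
Matching a single field rather than the whole record is the main reason to use ~ and its negation !~. A minimal sketch, on made-up passwd-like lines:

```shell
data='root:/bin/bash
nobody:/sbin/nologin
rootkit:/bin/sh'

result=$(printf '%s\n' "$data" | awk -F: '
$1 ~ /^root$/    { print "exact:", $1 }        # field 1 is exactly root
$2 !~ /nologin/  { print "login shell:", $2 }  # field 2 does not contain nologin
')
printf '%s\n' "$result"
```

Note that rootkit does not match /^root$/ because the pattern is anchored at both ends of the field.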

9. Comparison expressions
The conditional expression has the form expression1 ? expression2 : expression3.
For example: $ awk '{max = ($1 > $3) ? $1 : $3; print max}' test. If the first field is greater than the third field, $1 is assigned to max; otherwise $3 is assigned to max.
$ awk '$1 + $2 < 100' test. If the sum of the first and second fields is less than 100, the line is printed.
$ awk '$1 > 5 && $2 < 10' test. If the first field is greater than 5 and the second field is less than 10, the line is printed.

10. Range patterns
       A range pattern matches all lines from a line matching the first pattern through the next line matching the second pattern, inclusive. If the second pattern never matches, the range extends to the end of the input. For example, $ awk '/root/,/mysql/' test prints all lines from the first occurrence of root through the first following occurrence of mysql.
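
The behaviour of a range is easiest to see on a tiny input. In this sketch (the single-word lines are invented) the range opens at root, closes at mysql, then opens again at the second root and runs to the end of the input because no further mysql follows.

```shell
data='a
root
b
mysql
c
root
d'

result=$(printf '%s\n' "$data" | awk '/root/,/mysql/')
printf '%s\n' "$result"
```

The lines a and c are outside both ranges and are not printed.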

11. Examples

1. awk '/101/' file prints the lines of file that contain 101.
awk '/101/,/105/' file prints the lines from one containing 101 through the next one containing 105.
awk '$1 == 5' file prints the lines whose first field equals 5.
awk '$1 == "CT"' file prints the lines whose first field is the string CT.
awk '$2 >5 && $2<=15' file prints the lines whose second field is greater than 5 and at most 15.

2. awk '{print NR,NF,$1,$NF}' file prints the current record number, the number of fields, and the first and last fields of each line of file.
awk '/101/ {print $1,$2 + 10}' file prints the first field and the second field plus 10 for each matching line of file.
awk '/101/ {print $1$2}' file
awk '/101/ {print $1 $2}' file prints the first and second fields of each matching line of file, with no separator between the two fields.

3. df | awk '$4>1000000' reads its input through a pipe and prints the lines whose fourth field satisfies the condition.

4. awk -F "|" '{print $1}' file splits fields on the new separator "|".
awk 'BEGIN { FS="[: \t|]" }
{print $1,$2,$3}' file changes the input separator by setting FS (FS="[: \t|]").
Sep="|"
awk -F "$Sep" '{print $1}' file uses the value of the shell variable Sep as the separator.
awk -F '[ :\t|]' '{print $1}' file uses a regular expression as the separator; here space, colon, tab, and | all act as separators.
awk -F '[][]' '{print $1}' file uses a regular expression as the separator; here it stands for [ and ].

5. awk -f awkfile file applies the rules in the file awkfile in order.
cat awkfile
/101/{print "\047 Hello! \047"} -- prints ' Hello! ' when a matching line is found. \047 represents a single quote.
{print $1,$2} -- since there is no pattern, prints the first two fields of every line.

6. awk '$1 ~ /101/ {print $1}' file prints the first field of the lines (records) whose first field matches 101.

7. awk 'BEGIN { OFS="%"}
{print $1,$2}' file changes the output format by setting the output field separator (OFS="%").

8. awk 'BEGIN { max=100 ;print "max=" max}
{max=($1 >max ?$1:max); print $1,"Now max is "max}' file tracks the largest value seen so far in the first field of the file, starting from 100.
       BEGIN introduces an action performed before any line is processed.

9. awk '$1 * $2 >100 {print $1}' file prints the first field of the lines (records) where the product of the first and second fields is greater than 100.

10. awk '$1 == "Chi" {$3 = "China"; print}' file after finding a matching line, replaces the third field and then prints the line (record).
awk '{$7 %= 3; print $7}' file divides the 7th field by 3, assigns the remainder to the 7th field, and prints it.

11. awk '/tom/ {wage=$2+$3; print wage}' file after finding a matching line, assigns the sum of the second and third fields to the variable wage and prints it.

12. awk '/tom/ {count++;}
END {print "tom was found "count" times"}' file

       END introduces an action performed after all input lines have been processed.

13. awk '{gsub(/\$/,"");gsub(/,/,""); cost+=$4}
END {print "The total is $" cost>"filename"}' file
       The gsub function replaces $ and , with the empty string, and the total is then written to the file filename. Sample input:
1 2 3 $1,200.00
1 2 3 $2,300.00
1 2 3 $4,000.00
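
Running the same idea on the three sample lines above gives a quick check of the arithmetic. This sketch prints to standard output with printf instead of writing to a file, which is a simplification for the example.

```shell
data='1 2 3 $1,200.00
1 2 3 $2,300.00
1 2 3 $4,000.00'

# Strip the dollar sign and the thousands separator, then sum field 4.
total=$(printf '%s\n' "$data" | awk '
{ gsub(/\$/, ""); gsub(/,/, ""); cost += $4 }
END { printf "The total is $%.2f\n", cost }')
printf '%s\n' "$total"       # The total is $7500.00
```

Because gsub is applied to the whole record, awk re-splits the fields afterwards, so $4 is already the cleaned number when it is added to cost.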

awk '{gsub(/\$/,"");gsub(/,/,"");
if ($4>1000&&$4<2000) c1+=$4;
else if ($4>2000&&$4<3000) c2+=$4;
else if ($4>3000&&$4<4000) c3+=$4;
else c4+=$4; }
END {printf "c1=[%d];c2=[%d];c3=[%d];c4=[%d]\n",c1,c2,c3,c4}' file
       Adds the fourth field to different totals with if / else if. An exit statement can be used to stop under a certain condition; the END action is still executed.
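
The interaction between exit and END can be checked with a few numbers. The sketch below uses an invented threshold of 25 and is not the original example: it stops reading at the first value above the threshold, and END still reports what was accumulated up to that point.

```shell
result=$(printf '10\n20\n30\n40\n' | awk '
{ if ($1 > 25) exit; sum += $1 }           # stop reading at the first value over 25
END { print "sum=" sum, "last record=" NR }')
printf '%s\n' "$result"      # sum=30 last record=3
```

The fourth line is never read, which is why NR is 3 in the END action.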

awk '{gsub(/\$/,"");gsub(/,/,"");
if ($4>3000) next;
else c4+=$4; }
END { printf "c4=[%d]\n",c4}' file
       Under a certain condition, next skips the current line and processing continues with the next line.
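
Unlike exit, next only abandons the current record. In this sketch the sample lines are reordered (an assumption made for the example) so that the skipped line sits in the middle, showing that the line after it is still processed.

```shell
data='1 2 3 $1,200.00
1 2 3 $4,000.00
1 2 3 $2,300.00'

result=$(printf '%s\n' "$data" | awk '
{ gsub(/\$/, ""); gsub(/,/, "")
  if ($4 > 3000) next          # skip this record, continue with the following one
  c4 += $4 }
END { printf "c4=[%d]\n", c4 }')
printf '%s\n' "$result"      # c4=[3500]
```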

14. awk '{ print FILENAME, $0 }' file1 file2 file3>fileall
       Writes the full contents of file1, file2, and file3 to fileall, with each line prefixed by the name of the file it came from.

15. awk ' $1!=previous { close(previous); previous=$1 }
{print substr($0,index($0," ") +1)>$1}' fileall
       Splits the merged file back into the 3 original files, with contents identical to the originals.

16. awk 'BEGIN {"date"|getline d; print d}'
       Sends the output of the date command to getline through a pipe, assigns it to the variable d, and prints it.
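
The same command-to-getline pattern can be made repeatable by replacing date with a command whose output is fixed. In this sketch, echo hello and the names cmd and line are assumptions for the example; it also checks the return value of getline and closes the pipe afterwards.

```shell
result=$(awk 'BEGIN {
    cmd = "echo hello"
    if ((cmd | getline line) > 0)    # getline returns 1 when a line was read
        print "got: " line
    close(cmd)                       # close the pipe once it is no longer needed
}')
printf '%s\n' "$result"      # got: hello
```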

17. awk 'BEGIN {system("echo \"Input your name:\\c\""); getline d; print "\nYour name is",d,"\b!\n"}'
       Reads a name interactively with getline and displays it.

       awk 'BEGIN {FS=":"; while((getline < "/etc/passwd") > 0) { if($1~"050[0-9]_") print $1}}'
       Prints the user names in /etc/passwd that contain 050x_, where x is a digit.

18. awk '{ i=1;while(i<=NF) {print NF,$i;i++}}' file loops over the fields with a while statement.
       awk '{ for(i=1;i<=NF;i++) {print NF,$i}}' file loops over the fields with a for statement.

type file|awk -F "/" '
{ for(i=1;i<NF;i++)
{ if(i==NF-1) { printf "%s",$i }
else { printf "%s/",$i } }}'
       Prints the output of type up to, but not including, the last path component, that is, the directory that contains the file.

Display dates with for and if
awk 'BEGIN {
for (j=1; j<=12; j++)
{ flag=0;
for (i=1; i<=31; i++)
{
if (j==2&&i>28) flag=1;
if ((j==4||j==6||j==9||j==11)&&i>30) flag=1;
if ( flag==0) {printf "%02d%02d ",j,i}
}
}
}'

19. To use a shell variable inside an awk program, close the single quotes around it so that the shell can expand it. Inside double quotes within the awk program, $Flag is just a string.
Flag=abcd
awk '{ print "'$Flag'" }' file prints abcd for each line
awk '{print "$Flag"}' file prints $Flag for each line
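
Two other ways to hand a shell value to awk avoid the quote juggling altogether: the -v option described in section 2.2 and the ENVIRON array from section 4. The sketch below compares them with the literal-string case; the awk variable name flag is arbitrary.

```shell
Flag=abcd

# Pass the value in as an awk variable.
via_v=$(awk -v flag="$Flag" 'BEGIN { print flag }')

# Export the value for this one command and read it from ENVIRON.
via_env=$(Flag="$Flag" awk 'BEGIN { print ENVIRON["Flag"] }')

# Inside the awk program, "$Flag" is an ordinary string.
literal=$(awk 'BEGIN { print "$Flag" }')

printf '%s %s %s\n' "$via_v" "$via_env" "$literal"   # abcd abcd $Flag
```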

20. Other small examples
$ awk '/^(no|so)/' test ----- prints all lines starting with no or so.
$ awk '/^[ns]/{print $1}' test ----- if the record starts with n or s, prints its first field.
$ awk '$1 ~/[0-9][0-9]$/{print $1}' test ----- if the first field ends with two digits, prints it.
$ awk '$1 == 100 || $2 < 50' test ----- if the first field equals 100 or the second field is less than 50, prints the line.
$ awk '$1 != 10' test ----- if the first field is not equal to 10, prints the line.
$ awk '{print ($1 > 5 ? "ok "$1: "error"$1)}' test ----- if the first field is greater than 5, prints the value of the expression after the question mark; otherwise prints the value of the expression after the colon.
$ awk '/^root/,/^mysql/' test ----- prints all records from a record starting with root through the next record starting with mysql. If another record starting with root is found later, printing resumes until the next record starting with mysql, or until the end of the file.




Origin http://10.200.1.11:23101/article/api/json?id=326990654&siteId=291194637