The most commonly used "awk" skills in Linux production environments

Hello, everyone. With this post, the "most useful" series comes to an end. Remember to share and like it.

The "most useful" series:

"The most commonly used vim skills in Linux production environments"

"The most commonly used sed skills in Linux production environments"

"The most commonly used awk skills in Linux production environments"

"Advanced sed features: my little head spun until I almost fainted"

People who dare to name software after themselves must have very strong self-confidence. The Yin language, for example.

The name awk comes from the surname initials of its three creators (Aho, Weinberger, and Kernighan), all of them grandfathers around 80 years old by now. There are groups of four too, of course: the GoF of design patterns fame. But for a game lover like me, the first thing that came to mind was the Trinity, which is really a bit disappointing.

It looks a lot like C. As for why it is so famous, apart from its powerful features, let's just assume it is because the letter a sorts so early. awk is simpler than sed, and it is more like a programming language.

Print a column

The following snippets all do roughly the same thing: print the first column of a file.

#Java
System.out.println(aStr.split(" ")[0]);

#Python
print(aString.split(" ")[0])

# cut command
cut -d " " -f1   file

# awk command
awk '{print $1}' file

This is probably the most commonly used awk feature: printing a column of a file. awk is smart about splitting your data: whether the separator is spaces or tabs, the result is most likely what you want.
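
To make the default splitting concrete, here is a minimal sketch on made-up input (the sample lines are hypothetical). Without -F, awk treats any run of spaces or tabs as a single separator and ignores leading blanks, whereas cut -d " " splits on every single space.

```shell
# Hypothetical sample: an indented line and a mix of spaces and tabs.
sample=$(printf '  bob\t25   rome\nalice   30\tparis\n')

# awk default splitting: runs of blanks count as one separator,
# and leading blanks are ignored.
awk_out=$(printf '%s\n' "$sample" | awk '{print $1}')
echo "$awk_out"

# cut splits on every single space, so the indented first line
# yields an empty first field.
cut_out=$(printf '%s\n' "$sample" | cut -d " " -f1)
echo "$cut_out"
```

On the indented line awk still finds bob as the first column, while cut returns an empty string.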

For CSV files, the delimiter is a comma. awk uses the -F option to specify it. The following command prints columns 1 and 2 of a CSV file.

awk -F ","  '{print $1,$2}' file

From this we can see the basic parts of an awk command.

In most programming languages, array indexes start at 0. In awk, however, the column variables start at $1, and $0 refers to the whole original line.
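
A small sketch of that numbering, assuming a made-up one-line CSV record:

```shell
# Hypothetical one-line CSV record.
line='alice,30,paris'

# $0 is the whole line, $1 the first column, $2 the second.
out=$(printf '%s\n' "$line" | awk -F "," '{print $0; print $1; print $2}')
echo "$out"
```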

Network Status Statistics

In this section we use awk to compute some network status statistics from netstat, and look at the basic elements of the awk language along the way. The netstat output looks similar to:

Column 6 holds the state of each network connection. Here is the awk command first, followed by the statistics it produces.

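netstat -ant |
awk '
    # print the header once, before any input is read
    BEGIN { print "State", "Count" }
    # for lines starting with tcp, count each state (column 6)
    /^tcp/ { rt[$6]++ }
    # after all input, print every state with its count
    END { for (i in rt) { print i, rt[i] } }'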

The output is:

State Count
LAST_ACK 1
LISTEN 64
CLOSE_WAIT 43
ESTABLISHED 719
SYN_SENT 5
TIME_WAIT 146

The following breaks the command above down in detail. I hope it helps you grasp the essence of awk.

At first glance the command looks scary, but it is actually very simple. An awk program is not like the programs we usually write: it is divided into four parts.

1. The BEGIN block, optional. Used to set parameters, print a header, or define variables. In the command above it only prints one header line.

2. The END block, optional. Used for summary logic or for printing final output. In the command above, a simple for loop prints the contents of the array rt.

3. The pattern-matching part, also optional. It selects the lines that need to be processed. In the command above, only lines beginning with tcp match; all other lines skip the action.

4. The action part. The main body of the logic: per-line processing, statistics, printing, and so on.
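
The four parts can be tried without a live netstat. The sketch below feeds the same program a few simplified, made-up lines (the addresses and counts are hypothetical, not real netstat output). Note that the order of a for (i in rt) loop is unspecified, so the states may come out in any order.

```shell
# Hypothetical, simplified netstat-style lines (not real output).
sample='Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 10.0.0.1:22 10.0.0.9:51000 ESTABLISHED
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN
tcp 0 0 10.0.0.1:80 10.0.0.7:40000 ESTABLISHED
udp 0 0 0.0.0.0:68 0.0.0.0:*'

out=$(printf '%s\n' "$sample" | awk '
    BEGIN  { print "State", "Count" }           # part 1: runs once, before input
    /^tcp/ { rt[$6]++ }                         # parts 3 and 4: pattern and action
    END    { for (i in rt) print i, rt[i] }     # part 2: runs once, after input
')
echo "$out"
```

The header line and the udp line do not match /^tcp/, so they never reach the action.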

Important points

1. The main awk program is wrapped in single quotes ('), not double quotes.

2. awk column indexes start at 1, not 0.
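
Why the quoting matters: inside double quotes the shell expands $2 before awk ever sees it. A minimal sketch, assuming the script runs with no positional parameters (the set -- line forces that for the demo):

```shell
# With single quotes the program reaches awk unchanged.
single=$(echo "a b c" | awk '{print $2}')
echo "$single"

# With double quotes the shell expands $2 first. It is empty here,
# so awk receives {print } and prints the whole line instead.
set --            # clear positional parameters for a predictable demo
double=$(echo "a b c" | awk "{print $2}")
echo "$double"

# To pass a shell value in, keep the single quotes and use -v.
col=2
via_v=$(echo "a b c" | awk -v n="$col" '{print $n}')
echo "$via_v"
```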

Examples

Let's look at what awk can do through a few simple examples.

1. Print the records whose Recv-Q is not 0

netstat -ant | awk '$2 > 0 {print}'

2. Count external network connections, grouped by IP

netstat -ant | awk '/^tcp/{print $4}' | awk -F: '!/^:/{print $1}' | sort | uniq -c

3. Print the total physical memory usage (RSS)

top -b -n 1 | awk 'NR>7{rss+=$6}END{print rss}'

4. Filter out (remove) blank lines

awk 'NF' file

5. Print odd-numbered lines

awk 'a=!a' file

6. Print the number of lines

awk 'END{print NR}' file

Understanding these commands requires knowing some of awk's built-in variables, which we introduce next.
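
Three of the one-liners above rely on the rule that a pattern without an action prints the line whenever the pattern is true. A sketch on a made-up four-line input:

```shell
# Hypothetical four-line input; the second line is blank.
sample=$(printf 'one\n\nthree x\nfour\n')

# NF is 0 on a blank line, so the pattern is false and the line is dropped.
no_blank=$(printf '%s\n' "$sample" | awk 'NF')

# a starts out empty (false); a=!a flips it to 1, 0, 1, 0, ...
# so lines 1, 3, 5, ... are printed.
odd=$(printf '%s\n' "$sample" | awk 'a=!a')

# In END, NR holds the number of the last line read.
count=$(printf '%s\n' "$sample" | awk 'END{print NR}')

printf '%s\n---\n%s\n---\n%s\n' "$no_blank" "$odd" "$count"
```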

Built-in variables

FS

The following two commands are equivalent.

awk -F ':'  '{print $3}' file
awk 'BEGIN{FS=":"}{print $3}' file

FS, set here in the BEGIN block, is a built-in variable; it can also be specified directly on the command line with -F. If your file is separated by both , and :, FS can even specify several delimiters at the same time.

FS="[,:|]"
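
A quick check of the multi-delimiter form on a made-up record. The value is treated as a regular expression, and passing the same expression to -F should behave the same way:

```shell
# Hypothetical record mixing three delimiters.
rec='a,b:c|d'

with_fs=$(echo "$rec" | awk 'BEGIN{FS="[,:|]"}{print NF, $3}')
with_flag=$(echo "$rec" | awk -F '[,:|]' '{print NF, $3}')

echo "$with_fs"
echo "$with_flag"
```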

Other variables

OFS specifies the delimiter for output. It simplifies things when there are many columns. The following commands are similar:

awk -F ':' '{print $1,"-",$2,"-",$4}' file
awk 'BEGIN{FS=":";OFS="-"}{print $1,$2,$4}' file 
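
The two commands are similar rather than identical. A sketch on a made-up passwd-style line shows the difference: in the first, the default OFS (a space) still goes between every item, including around each "-".

```shell
# Hypothetical passwd-style record.
line='root:x:0:0'

first=$(echo "$line" | awk -F ':' '{print $1,"-",$2,"-",$4}')
second=$(echo "$line" | awk 'BEGIN{FS=":";OFS="-"}{print $1,$2,$4}')

echo "$first"    # fields joined with spaces around each "-"
echo "$second"   # fields joined by OFS only
```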

NF is the number of fields (columns). It is useful, for example, for filtering out lines whose column count does not meet a condition.

awk -F, '{if(NF==3){print}}' file

NR is the line (record) number. For example, the following two commands are roughly equivalent.

cat -n file
awk '{print NR,$0}' file

RS is the input record separator.

ORS specifies the output record separator.

FILENAME is the name of the file currently being processed, which is very useful when processing several files at once.
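
A minimal sketch of these three variables. The input string, the temporary directory, and the file names a.txt and b.txt are all made up for the demo.

```shell
# RS=";" makes each ;-separated chunk a record; ORS=" | " is written
# after every printed record, including the last one.
joined=$(printf 'a;b;c' | awk 'BEGIN{RS=";"; ORS=" | "}{print NR ":" $0}')
echo "$joined"

# FILENAME changes as awk moves from file to file; NR keeps counting.
tmp=$(mktemp -d)
printf 'x\ny\n' > "$tmp/a.txt"
printf 'z\n' > "$tmp/b.txt"
names=$(cd "$tmp" && awk '{print FILENAME ":" NR ":" $0}' a.txt b.txt)
echo "$names"
rm -r "$tmp"
```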

Programming language features

Computation

As the code above shows, awk can do some simple calculations. The language is simple, and variable types do not need to be declared explicitly.

Take rt[$6]++ above: it implicitly defines a hash (an associative array) called rt, whose key is the network state and whose value supports arithmetic operations (+ - * / %).
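
The same idea works for sums as well as counts. A sketch, assuming made-up "user bytes" lines; the result is piped through sort because the order of for (u in total) is unspecified.

```shell
# Hypothetical "user bytes" lines.
out=$(printf 'alice 10\nbob 5\nalice 7\n' |
  awk '{ total[$1] += $2 } END { for (u in total) print u, total[u] }' |
  sort)
echo "$out"
```

Neither total nor its elements are declared anywhere: an unset element starts out as 0 when used as a number.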

awk includes some built-in math functions (a limited set):

int
log
sqrt
exp
sin
cos
atan2
rand
srand
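
A few of these in action, as a small sketch. A BEGIN block needs no input file, so it works as a quick calculator; the results are truncated to integers with int to keep the output easy to compare.

```shell
out=$(awk 'BEGIN {
    print int(7.9)                    # truncates toward zero
    print sqrt(16)
    print int(log(exp(3)) + 0.5)      # log is the natural logarithm
    print int(atan2(0, -1) * 10000)   # atan2(0, -1) is pi
}')
echo "$out"
```

rand and srand are left out here because their output differs between runs.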

String Manipulation

Like other languages, awk has many built-in string functions. It works with strings all day long, so it has to be strong at it.

length(str) # get the length of a string
split(input-string,output-array,separator)
substr(input-string, location, length)
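
A sketch of the three functions on a made-up timestamp string. Positions in substr and indexes in the array filled by split both start at 1.

```shell
out=$(awk 'BEGIN {
    s = "2019-06-20 10:15:00"      # hypothetical sample value
    print length(s)                # number of characters
    print substr(s, 1, 10)         # 10 characters starting at position 1
    n = split(s, parts, /[- :]/)   # returns the number of pieces
    print n, parts[1], parts[4]
}')
echo "$out"
```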

Language features

awk is a small programming language. Here is its basic syntax; if you need more complex logic, including the time-handling functions, explore further on your own:

# logic (== compares; a single = would assign)
if(x==a){}
if(x==a){}else{}
while(x==a){break;continue;}
do{}while(x==a)
for(;;){}

# array
arr[key] = value
for(key in arr){arr[key]}
delete arr[key]

asort(arr) # simple sort (gawk extension)
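
A runnable sketch that puts the loop, branch, and array syntax together. The variable names are arbitrary; asort is left out because it is only available in gawk.

```shell
out=$(awk 'BEGIN {
    for (i = 1; i <= 5; i++) {
        if (i == 2) continue      # skip 2
        if (i == 5) break         # stop before 5
        seen[i] = i * i
    }
    delete seen[3]                # remove one element
    n = 0
    for (k in seen) n++           # count the remaining keys
    print n, seen[1], seen[4]
}')
echo "$out"
```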

It is said that awk can handle any text-processing job, because it is itself a programming language.

End

I once wrote a complex log-processing and statistics program in awk. It was a lot more pleasant to write than sed, but it was still painful. On top of that come the differences between versions such as nawk and gawk. So as soon as the business logic grows complex, I habitually turn to Python, a tool that is more concise and more comprehensive.

awk is extremely simple and convenient for text processing. The most common use is printing a column, along with some formatted output. With awk, it is enough to be fluent in the simple things and familiar with the complex ones. After all, some experts do like to write scripts this way.


Reproduced from: https://juejin.im/post/5d0b4ca1f265da1bbc6fdb9e


Origin blog.csdn.net/weixin_34072458/article/details/93177628