Linux: AWK Basics

AWK is a powerful text-processing tool and a particularly useful Linux command; it plays an especially important role in log analysis and in examining file contents.

AWK Description

Simply put, awk reads input line by line, splits each line into fields on a specified separator, and then processes the resulting fields.
Before looking at concrete commands, here is a summary of AWK's built-in variables.

Built-in variables Explanation
$0 The current record; this variable holds the contents of the entire line
$1–$n The n-th field of the current record; fields are separated by FS
FS Input field separator; the default is a space or Tab
NF Number of fields in the current record, i.e. the number of columns
NR Number of records read so far, i.e. the current line number
FNR Record number within the current file; unlike NR, it restarts from 1 for each input file
RS Input record separator; the default is a newline
OFS Output field separator; the default is a space
ORS Output record separator; the default is a newline
FILENAME Name of the current input file
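
As a quick illustration of these variables, here is a minimal sketch over two made-up input lines (not taken from the access log):

```shell
# Feed awk two fabricated lines and print the record number (NR),
# the field count (NF), the first field ($1), and the whole record ($0)
printf 'a b c\nd e f\n' | awk '{print NR, NF, $1, $0}'
# 1 3 a a b c
# 2 3 d d e f
```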

Using AWK

First, look at the site's access.log:

tail -f /home/wwwlogs/access.log
148.70.179.32 - - [15/Nov/2019:05:46:28 +0800] "POST /wp-cron.php?doing_wp_cron=1573767987.5338680744171142578125 HTTP/1.1" 200 31 "http://www.test.com.cn/wp-cron.php?doing_wp_cron=1573767987.5338680744171142578125" "WordPress/5.0.7; http://www.test.com.cn"
220.181.108.143 - - [15/Nov/2019:05:46:28 +0800] "GET / HTTP/1.1" 200 5596 "-" "Mozilla/5.0 (Linux;u;Android 4.2.2;zh-cn;) AppleWebKit/534.46 (KHTML,like Gecko) Version/5.1 Mobile Safari/10600.6.3 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
111.206.198.18 - - [15/Nov/2019:05:46:28 +0800] "GET /wp-includes/css/dist/block-library/style.min.css?ver=5.0.7 HTTP/1.1" 200 25658 "http://www.test.com.cn/" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)"

Print the list of client IPs in the access log:

awk -F" " '{print $1}' /home/wwwlogs/access.log
148.70.179.32
91.228.8.210
1.119.148.54
121.51.40.28
1.119.148.54
1.119.148.54

Prepend the file name, line number, and field count to each record:

awk -F" " '{print FILENAME"|"NR"|"NF"|"$0}' /home/wwwlogs/access.log
/home/wwwlogs/access.log|9979|12|150.109.77.71
/home/wwwlogs/access.log|9980|20|150.109.77.71
/home/wwwlogs/access.log|9981|20|150.109.77.71
/home/wwwlogs/access.log|9982|22|156.220.107.221
/home/wwwlogs/access.log|9983|22|138.204.135.251
/home/wwwlogs/access.log|9984|13|148.70.179.32
/home/wwwlogs/access.log|9985|18|148.70.243.161
/home/wwwlogs/access.log|9986|18|148.70.243.161
/home/wwwlogs/access.log|9987|18|148.70.243.161
/home/wwwlogs/access.log|9988|12|201.174.10.7
/home/wwwlogs/access.log|9989|13|148.70.179.32
/home/wwwlogs/access.log|9990|23|220.181.108.143
/home/wwwlogs/access.log|9991|31|111.206.198.18
/home/wwwlogs/access.log|10000|13|170.238.36.20

Print the HTTP status code of each request in the access log:

awk -F" " '{print $9}' /home/wwwlogs/access.log
404
404
404
301
200
200
200
301
200
301
301
200
200
301
200
404

Something a bit more complex: print the distribution of status codes, sorted by count:

awk -F" " '{print $9}' /home/wwwlogs/access.log | sort | uniq -c | sort -nr
4939 404
4497 200
 332 301
 120 499
  36 400
  32 "-"
  18 166
  16 403
   9 405
···
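
Building on this, $9 can also be used as a filter condition, for example to list only the request paths that returned 404. The two log lines below are fabricated just to keep the example self-contained; on the real server you would point awk at /home/wwwlogs/access.log instead:

```shell
# Print the request path ($7) of every record whose status code ($9) is 404
printf '%s\n' \
  '1.2.3.4 - - [15/Nov/2019:05:46:28 +0800] "GET /a HTTP/1.1" 404 31' \
  '5.6.7.8 - - [15/Nov/2019:05:46:29 +0800] "GET /b HTTP/1.1" 200 10' \
  | awk '$9 == 404 {print $7}'
# /a
```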
Look at the top IPs by access count, to analyze whether any IP is crawling the site:

awk -F" " '{print $1}' /home/wwwlogs/access.log | sort | uniq -c | sort -nr | head -50
    913 111.231.201.221
    912 140.143.147.236
    908 106.13.83.26
    906 54.179.142.122
    668 185.234.217.115
    664 148.70.179.32
    275 125.76.225.11
    240 123.151.144.37
    110 61.241.50.63
    108 101.89.19.140
    102 59.36.132.240
     69 182.254.52.17
     42 61.162.214.195
     39 183.192.179.16
     39 148.70.46.47
     38 14.18.182.223
     38 103.119.45.49
     27 58.251.121.186
     26 68.183.147.213
     26 59.36.119.227
     26 51.83.234.51
     24 144.91.94.150

Calculate the daily traffic from the access log, to estimate future bandwidth needs:

awk -F" " 'BEGIN {sum=0} {sum=sum+$10} END {print sum/1024/1024"M"}' /home/wwwlogs/access.log
38.7885M
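
To sanity-check the arithmetic, the same summing command can be run over fabricated records whose tenth field is a byte count (the input below is made up, not real log data):

```shell
# Two made-up records carrying 1048576 bytes (1 MiB) each in field 10;
# summing and dividing by 1024*1024 should therefore print 2M
printf '%s\n' \
  'a b c d e f g h i 1048576' \
  'a b c d e f g h i 1048576' \
  | awk -F" " 'BEGIN {sum=0} {sum=sum+$10} END {print sum/1024/1024"M"}'
# 2M
```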

A quick note on AWK's BEGIN and END blocks, which are easy to understand:
BEGIN {} statements run once, before any input is processed
{} statements run once for each input line
END {} statements run once, after all lines have been processed
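
The three stages can be seen in a small self-contained sketch (the numeric input here is made up):

```shell
# BEGIN runs once before any input, the middle block once per line,
# and END once after the last line; NR still holds the total line count in END
printf '1\n2\n3\n' | awk 'BEGIN {sum=0} {sum+=$1} END {print NR" lines, sum="sum}'
# 3 lines, sum=6
```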

In routine system maintenance, awk is probably the most frequently used command. For simple access-log analysis in particular, AWK can produce the desired results in most scenarios.


Origin www.cnblogs.com/feixiangmanon/p/12000517.html