1. nginx log format
Open the nginx.conf file. The default log format is as follows:
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
Field description
Let's look at the contents of the access log in detail:
223.104.41.37 - - [05/Jul/2022:13:34:20 +0800] "GET /api/book/info?bookId=123 HTTP/1.1" 200 14632 "http://www.zzz.com.cn/archive?bookId=123" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36"
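Explanation:
Remote host IP address: 223.104.41.37
Access time: 05/Jul/2022:13:34:20
Time zone: +0800
Method: GET
Resource: /api/book/info?bookId=123
Protocol: HTTP/1.1
Status code: 200
Sent bytes: 14632
Referer: http://www.zzz.com.cn/archive?bookId=123
Browser information: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36
The two hyphens after the IP address are the literal "-" from the format string and an empty $remote_user.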
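The analysis commands later in this article refer to fields by number ($1, $7, $9 and so on). As a minimal sketch of where those numbers come from, the following splits the sample line above on whitespace, which is what awk does by default. The function name show_fields is a hypothetical helper introduced only for illustration.

```shell
# Sample line copied from the access log shown above.
line='223.104.41.37 - - [05/Jul/2022:13:34:20 +0800] "GET /api/book/info?bookId=123 HTTP/1.1" 200 14632 "http://www.zzz.com.cn/archive?bookId=123" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36"'

# show_fields (hypothetical helper): print the whitespace-separated fields
# used by the analysis commands, labelled with their awk field number.
show_fields() {
    awk '{
        printf "$1  ip=%s\n", $1
        printf "$4  time=%s\n", substr($4, 2)                  # drop leading [
        printf "$5  zone=%s\n", substr($5, 1, length($5) - 1)  # drop trailing ]
        printf "$6  method=%s\n", substr($6, 2)                # drop leading quote
        printf "$7  resource=%s\n", $7
        printf "$9  status=%s\n", $9
        printf "$10 bytes=%s\n", $10
        printf "$11 referer=%s\n", $11
    }'
}

printf '%s\n' "$line" | show_fields
```

Because the user agent contains spaces, it is spread over $12 and the fields after it, so it is not convenient to address by a single field number.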
2. access.log file location
The relevant line in nginx.conf:
access_log /var/log/nginx/access.log main;
This shows that our logs are written under /var/log/nginx.
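To check the location on a particular server, the access_log directives can be pulled out of the configuration file. This is a minimal sketch: access_log_paths is a hypothetical helper name, and the path /etc/nginx/nginx.conf in the last comment is an assumption (a common default that this article does not state). It only reads the one file it is given and does not follow include directives.

```shell
# access_log_paths (hypothetical helper): print the path argument of every
# access_log directive found in the given nginx configuration file.
access_log_paths() {
    awk '$1 == "access_log" { sub(/;$/, "", $2); print $2 }' "$1"
}

# Demo on a throwaway config fragment so the sketch runs anywhere.
conf=$(mktemp)
trap 'rm -f "$conf"' EXIT
printf '%s\n' 'http {' '    access_log /var/log/nginx/access.log main;' '}' > "$conf"

access_log_paths "$conf"

# On a real server (the path is an assumption):
#   access_log_paths /etc/nginx/nginx.conf
```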
3. Log analysis:
1. Count the top 5 client IPs by number of requests
# awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -5
7093 183.152.124.55
3719 218.108.36.18
1797 115.220.140.234
1545 112.10.236.137
1141 183.228.110.80
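The pipeline above works in stages: awk extracts the IP column, sort puts identical IPs next to each other, uniq -c counts each group, sort -nr orders the groups by count in descending order, and head keeps the first rows. As a small sketch, the same pipeline can be wrapped in a function with a configurable row count. The name top_ips is hypothetical and the sample log lines are invented for the demo.

```shell
# top_ips (hypothetical helper)
#   $1 = log file, $2 = number of rows to show (defaults to 5)
top_ips() {
    awk '{print $1}' "$1" | sort | uniq -c | sort -nr | head -n "${2:-5}"
}

# Invented sample data so the sketch runs without a real access log.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
192.0.2.1 - - [18/May/2022:10:00:00 +0800] "GET /index.html HTTP/1.1" 200 100 "-" "curl"
192.0.2.2 - - [18/May/2022:10:00:01 +0800] "GET /index.html HTTP/1.1" 200 100 "-" "curl"
192.0.2.1 - - [18/May/2022:10:00:02 +0800] "GET /login.html HTTP/1.1" 200 100 "-" "curl"
EOF

top_ips "$log" 1

# On a real server:
#   top_ips /var/log/nginx/access.log 10
```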
2. Count the top client IPs for a specific day
# grep "18/May/2022" /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -5
755 112.10.236.127
358 223.94.216.200
348 116.30.149.23
283 140.243.118.204
270 183.253.242.192
# awk '/18\/May\/2022/ {print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -5
755 112.10.236.127
358 223.94.216.200
348 116.30.149.23
283 140.243.118.204
270 183.253.242.192
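Both commands give the same result. When the file is large, filtering with grep first and passing only the matching lines to awk is usually faster, since grep is optimized for plain pattern searches.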
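One caveat applies to both commands: the pattern is matched anywhere in the line, so a date string that happens to appear in a URL or referer is counted as well. The following sketch keeps the grep prefilter but then checks that the match is really in the timestamp field ($4). The name ips_on_day is hypothetical and the sample lines are invented.

```shell
# ips_on_day (hypothetical helper)
#   $1 = log file, $2 = date as written in the log, e.g. 18/May/2022
ips_on_day() {
    grep -F "$2" "$1" |
        awk -v day="$2" 'index($4, "[" day ":") == 1 {print $1}' |
        sort | uniq -c | sort -nr | head -n 5
}

# Invented sample data: the second line is from 19 May but carries the
# string 18/May/2022 in its query string.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
192.0.2.1 - - [18/May/2022:10:00:00 +0800] "GET /a HTTP/1.1" 200 10 "-" "curl"
192.0.2.2 - - [19/May/2022:10:00:00 +0800] "GET /report?d=18/May/2022 HTTP/1.1" 200 10 "-" "curl"
192.0.2.1 - - [18/May/2022:11:00:00 +0800] "GET /b HTTP/1.1" 200 10 "-" "curl"
EOF

ips_on_day "$log" 18/May/2022
```

Using index() instead of a regular expression avoids having to escape the slashes in the date.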
3. List requests for specific resources
Select the lines whose 7th field (the requested resource) ends with '.html' and print the client IP, the resource and the status code:
# awk '$7 ~ /\.html$/ {print $1,$7,$9}' /var/log/nginx/access.log
14.104.225.143 /web/common/success.html 200
219.153.191.189 /web/common/success.html 200
152.32.189.96 /mtja.html 200
152.32.189.96 /index.html 200
152.32.189.96 /login.html 200
152.32.189.96 /mindex.html 200
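Note that $7 is the full request target, so a request such as /index.html?from=home does not match /\.html$/. The following is a minimal sketch that strips the query string before testing the suffix. The name html_requests is hypothetical and the sample lines are invented.

```shell
# html_requests (hypothetical helper): print IP, path and status for
# requests whose path, without the query string, ends in .html.
html_requests() {
    awk '{
        path = $7
        sub(/\?.*/, "", path)          # drop the query string, if any
        if (path ~ /\.html$/) print $1, path, $9
    }' "$1"
}

# Invented sample data so the sketch runs without a real access log.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
192.0.2.1 - - [05/Jul/2022:13:00:00 +0800] "GET /index.html?from=home HTTP/1.1" 200 10 "-" "curl"
192.0.2.2 - - [05/Jul/2022:13:00:01 +0800] "GET /api/book/info?bookId=123 HTTP/1.1" 200 10 "-" "curl"
192.0.2.3 - - [05/Jul/2022:13:00:02 +0800] "GET /login.html HTTP/1.1" 304 10 "-" "curl"
EOF

html_requests "$log"
```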
4. Count the top 5 referer URLs
$ awk '{print $11}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -5
12133 "http://www.zzz.com.cn/translation"
7550 "http://www.zzz.com.cn/applicationAdd"
4255 "http://www.zzz.com.cn/search"
2565 "http://www.zzz.com.cn/request"
2257 "http://www.zzz.com.cn/order"
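Here $11 is the referer field, which is why each URL in the output is still wrapped in double quotes. As a sketch, the quotes can be stripped and requests without a referer (logged as "-") skipped before counting. The name top_referers is hypothetical and the sample lines are invented.

```shell
# top_referers (hypothetical helper)
#   $1 = log file, $2 = number of rows to show (defaults to 5)
top_referers() {
    awk '{
        ref = $11
        gsub(/"/, "", ref)             # remove the surrounding quotes
        if (ref != "-" && ref != "") print ref
    }' "$1" | sort | uniq -c | sort -nr | head -n "${2:-5}"
}

# Invented sample data so the sketch runs without a real access log.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
192.0.2.1 - - [05/Jul/2022:13:00:00 +0800] "GET /a HTTP/1.1" 200 10 "http://www.zzz.com.cn/search" "curl"
192.0.2.2 - - [05/Jul/2022:13:00:01 +0800] "GET /b HTTP/1.1" 200 10 "-" "curl"
192.0.2.3 - - [05/Jul/2022:13:00:02 +0800] "GET /c HTTP/1.1" 200 10 "http://www.zzz.com.cn/search" "curl"
EOF

top_referers "$log"
```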
5. Total traffic (bytes sent) for a specific day
$ grep "03/Jul/2022" /var/log/nginx/access.log | awk '{sum+=$10} END{print sum}'
54827188
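The result is in bytes, because $10 is $body_bytes_sent, which counts response bodies and not response headers. As a rough conversion, 54827188 bytes is about 52.29 MiB. The sketch below does the sum and the conversion in one pass and ignores lines where $10 is not a plain number. The name bytes_on_day is hypothetical and the sample lines are invented.

```shell
# bytes_on_day (hypothetical helper)
#   $1 = log file, $2 = date as written in the log, e.g. 03/Jul/2022
bytes_on_day() {
    grep -F "$2" "$1" |
        awk '$10 ~ /^[0-9]+$/ {sum += $10}
             END {printf "%d bytes (%.2f MiB)\n", sum, sum / 1048576}'
}

# Invented sample data: 1 MiB + 0.5 MiB on 3 July, one request on 4 July.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
192.0.2.1 - - [03/Jul/2022:09:00:00 +0800] "GET /a HTTP/1.1" 200 1048576 "-" "curl"
192.0.2.2 - - [03/Jul/2022:09:00:01 +0800] "GET /b HTTP/1.1" 200 524288 "-" "curl"
192.0.2.3 - - [04/Jul/2022:09:00:02 +0800] "GET /c HTTP/1.1" 200 100 "-" "curl"
EOF

bytes_on_day "$log" 03/Jul/2022
# prints: 1572864 bytes (1.50 MiB)
```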
6. Count requests by status code
$ awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -10
77065 200
2933 304
1519 400
148 405
106 206
65 499
9 173
5 408
2 504
2 404
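A value such as 173 is not a standard HTTP status code. One possible explanation, which is an assumption that the output alone does not confirm, is that some request lines do not have exactly three tokens (method, resource, protocol), so the whitespace-separated field numbers shift and $9 is no longer the status. The sketch below splits on double quotes instead, so the status is always the first token after the quoted request. The name status_counts is hypothetical and the sample lines are invented.

```shell
# status_counts (hypothetical helper): count status codes, locating the
# status relative to the quoted request instead of by field number.
status_counts() {
    awk -F'"' '{
        split($3, a, " ")              # $3 looks like " 200 14632 "
        if (a[1] ~ /^[0-9][0-9][0-9]$/) print a[1]
    }' "$1" | sort | uniq -c | sort -nr
}

# Invented sample data: the second request line has only two tokens, so
# with whitespace splitting its $9 would be 173 (the byte count).
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
192.0.2.1 - - [05/Jul/2022:13:00:00 +0800] "GET /a HTTP/1.1" 200 10 "-" "curl"
192.0.2.2 - - [05/Jul/2022:13:00:01 +0800] "GET /" 400 173 "-" "-"
192.0.2.3 - - [05/Jul/2022:13:00:02 +0800] "GET /d HTTP/1.1" 200 10 "-" "curl"
EOF

status_counts "$log"
```

On this sample the function reports two 200 responses and one 400, whereas awk '{print $9}' would report 173 for the second line.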