Analyzing nginx logs with shell commands

nginx log formats vary widely; the examples in this article assume the following format:

http {
# …
log_format main '[] $remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';
# …
}

The log_format directive defines the format of the log file. Everything that starts with $ is a variable, with the following meanings:

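$remote_addr and $http_x_forwarded_for : record the client's IP address. $remote_addr is the address of the peer that connected to nginx directly (possibly a proxy or load balancer), while $http_x_forwarded_for is the value of the X-Forwarded-For request header;
$remote_user : the client's user name (as supplied with HTTP Basic authentication);
$time_local : the local time of the access, including the time zone;
$request : the request line, i.e. the method, URL, and HTTP protocol version;
$status : the response status code; 200 indicates success;
$body_bytes_sent : the size of the response body sent to the client;
$http_referer : the page from which the request was linked (the Referer header);
$http_user_agent : information about the client's browser (the User-Agent header).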
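
To make the mapping between variables and log columns concrete, here is a minimal sketch that splits one log entry on the double-quote character, which is the same trick the commands later in this article rely on. The sample line uses a shortened user agent, and the output labels (request, status, and so on) are chosen only for illustration.

```shell
#!/bin/sh
# One entry in the log format described above (user agent shortened for readability).
line='[] 100.116.108.148 - - [13/Jul/2017:00:05:19 +0800] "POST /message/check HTTP/1.0" 200 89 "https://www.example.com/message/add" "Mozilla/5.0" "36.57.161.201"'

# With '"' as the field separator:
#   $2 = $request
#   $3 = " $status $body_bytes_sent "
#   $4 = $http_referer
#   $6 = $http_user_agent
#   $8 = $http_x_forwarded_for
fields=$(printf '%s\n' "$line" | awk -F '"' '{
    split($3, s, " ")   # s[1] = status, s[2] = body bytes
    printf "request=%s\nstatus=%s\nbytes=%s\nreferer=%s\nclient=%s\n", $2, s[1], s[2], $4, $8
}')

printf '%s\n' "$fields"
```

Because the request, referer, and user agent may themselves contain spaces, splitting on the quote first and on whitespace second tends to be more robust than splitting the whole line on whitespace.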

Sample log entries:
[] 100.116.108.148 - - [13/Jul/2017:00:05:19 +0800] "POST /message/check HTTP/1.0" 200 89 "https://www.example.com/message/add" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "36.57.161.201"
[] 100.109.253.3 - - [13/Jul/2017:00:12:16 +0800] "GET /statisticDaily/index HTTP/1.0" 200 37374 "https://www.example.com/statisticDaily/index" "Mozilla/5.0 (Linux; Android 5.1.1; vivo Xplay5A Build/LMY47V; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/53.0.2785.49 Mobile MQQBrowser/6.2 TBS/043305 Safari/537.36 MicroMessenger/6.5.10.1080 NetType/WIFI Language/zh_CN" "223.87.234.226"

Top 100 most requested URLs, with counts

grep -E "POST|GET" /data/logs/nginx/2017/07/13/manage.access.log | awk -F '"' '{print $2,$3}' | awk '{print $2}' | sort | uniq -c | sort -k1nr | head -100
# Output: count, request path
186405 /
148257 /home
132921 /ucenter/index
80749 /login
60431 /captcha
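
The same count can also be produced in a single awk pass, which avoids sorting every log line before counting. The following is only a sketch under stated assumptions: the five log entries are made up for illustration and written to a temporary file, and the method check is done on the request field itself rather than with grep, so a line is not matched merely because "GET" appears in its referer or user agent.

```shell
#!/bin/sh
# Hypothetical sample data; a real run would read the access log instead.
log=$(mktemp)
cat > "$log" <<'EOF'
[] 10.0.0.1 - - [13/Jul/2017:00:00:01 +0800] "GET /home HTTP/1.0" 200 10 "-" "curl" "1.1.1.1"
[] 10.0.0.1 - - [13/Jul/2017:00:00:02 +0800] "GET / HTTP/1.0" 302 0 "-" "curl" "1.1.1.1"
[] 10.0.0.1 - - [13/Jul/2017:00:00:03 +0800] "POST /login HTTP/1.0" 200 5 "-" "curl" "2.2.2.2"
[] 10.0.0.1 - - [13/Jul/2017:00:00:04 +0800] "GET /home HTTP/1.0" 200 10 "-" "curl" "2.2.2.2"
[] 10.0.0.1 - - [13/Jul/2017:00:00:05 +0800] "GET /home HTTP/1.0" 404 10 "-" "curl" "2.2.2.2"
EOF

# Split on double quotes, then split the request ("METHOD PATH PROTOCOL") on spaces
# and count each path in an associative array.
top_urls=$(awk -F '"' '{
    split($2, req, " ")
    if (req[1] == "GET" || req[1] == "POST") count[req[2]]++
} END {
    for (path in count) print count[path], path
}' "$log" | sort -k1,1nr -k2 | head -100)

printf '%s\n' "$top_urls"
rm -f "$log"
```

On the sample data this prints /home with a count of 3, followed by / and /login with a count of 1 each. Note that paths are counted exactly as logged, so the same URL with different query strings is counted separately.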

Top 100 URLs with a non-200 status code, with counts

grep -E "POST|GET" /data/logs/nginx/2017/07/13/manage.access.log | awk -F '"' '{print $2,$3}' | awk '{if ($4!=200) {print $4,$1,$2}}' | sort | uniq -c | sort -k1nr | head -100
# Output: count, status code, request method, request path
52573 302 GET /
16730 302 GET /submitlogin
16477 404 GET /apple-touch-icon-precomposed.png
15427 404 GET /apple-touch-icon.png
14408 302 GET /home

Top 100 URLs with error responses (status code 400 and above), with counts

grep -E "POST|GET" /data/logs/nginx/2017/07/13/manage.access.log | awk -F '"' '{print $2,$3}' | awk '{if ($4>=400) {print $4,$1,$2}}' | sort | uniq -c | sort -k1nr | head -100
# Output: count, status code, request method, request path
16401 404 GET /apple-touch-icon-precomposed.png
15483 404 GET /apple-touch-icon.png
6512 404 GET /apple-touch-icon-120x120-precomposed.png
5743 404 GET /apple-touch-icon-120x120.png
4118 499 POST /statisticTrade/rechargeDetail
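
One detail worth knowing when filtering on a status code threshold: awk compares as strings when one side is a quoted constant such as "400", and as numbers when it is a bare 400. For three-digit status codes both give the same answer, but the two can diverge for other values. The sketch below uses a four-digit value, chosen purely for illustration, to show the difference.

```shell
#!/bin/sh
# Illustrative value only; real status codes have three digits.
value=1000

# Quoted constant: string comparison, and "1000" sorts before "400".
string_cmp=$(printf '%s\n' "$value" | awk '{ print (($1 >= "400") ? "match" : "no match") }')

# Bare constant: numeric comparison, and 1000 >= 400.
numeric_cmp=$(printf '%s\n' "$value" | awk '{ print (($1 >= 400) ? "match" : "no match") }')

printf 'string comparison:  %s\nnumeric comparison: %s\n' "$string_cmp" "$numeric_cmp"
```

The string comparison reports no match while the numeric one reports a match, so the unquoted form is the safer habit, for example when applying the same kind of filter to the body size column.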

Top 100 IP addresses by request count

grep -E "POST|GET" /data/logs/nginx/2017/07/13/manage.access.log | awk -F '"' '{print $(NF-1)}' | sort | uniq -c | sort -k1nr | head -100
# Output: count, IP address
408982 111.127.132.32
252175 120.41.162.180
170169 61.148.196.162
168990 59.173.42.117
103752 123.116.99.75
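
The X-Forwarded-For header can hold a comma-separated chain of addresses when a request passes through more than one proxy, in which case the whole chain would be counted as a single "IP". A possible refinement, sketched here on made-up sample entries, is to keep only the left-most address, which by convention is the originating client. Keep in mind that the header is supplied by the client side and can be forged.

```shell
#!/bin/sh
# Hypothetical sample data; the second entry carries a two-address chain.
log=$(mktemp)
cat > "$log" <<'EOF'
[] 10.0.0.1 - - [13/Jul/2017:00:00:01 +0800] "GET /home HTTP/1.0" 200 10 "-" "curl" "1.1.1.1"
[] 10.0.0.1 - - [13/Jul/2017:00:00:02 +0800] "GET / HTTP/1.0" 302 0 "-" "curl" "1.1.1.1, 10.0.0.9"
[] 10.0.0.1 - - [13/Jul/2017:00:00:03 +0800] "POST /login HTTP/1.0" 200 5 "-" "curl" "2.2.2.2"
EOF

# $(NF-1) is the last quoted field ($http_x_forwarded_for);
# split it on commas and keep the first element.
top_ips=$(awk -F '"' '{
    split($(NF-1), chain, ", *")
    print chain[1]
}' "$log" | sort | uniq -c | sort -k1,1nr | awk '{print $1, $2}' | head -100)

printf '%s\n' "$top_ips"
rm -f "$log"
```

The trailing awk only normalizes the leading whitespace that uniq -c adds; on the sample data 1.1.1.1 is listed first with a count of 2.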

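uniq -c prefixes each distinct line with the number of times it occurs. It only merges adjacent duplicates, which is why the input is piped through sort first.
sort -k1nr: -k selects the column to sort by, 1 is the first column, n sorts numerically rather than as text, and r reverses the order (largest first). Strictly speaking, -k1 starts the sort key at the first column and extends it to the end of the line; -k1,1 restricts it to the first column only.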
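
A tiny example on three input lines (made up for illustration) shows why the first sort matters and what the counting pipeline produces:

```shell
#!/bin/sh
# Without the first sort, uniq sees a, b, a and finds no adjacent duplicates.
unsorted=$(printf 'a\nb\na\n' | uniq -c | awk '{print $1, $2}')

# With the first sort, the two a lines become adjacent and are merged;
# sort -k1nr then puts the highest count first.
sorted=$(printf 'a\nb\na\n' | sort | uniq -c | sort -k1nr | awk '{print $1, $2}')

printf 'without sort:\n%s\nwith sort:\n%s\n' "$unsorted" "$sorted"
```

The final awk is there only to strip the padding that uniq -c puts in front of each count.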


Reposted from blog.csdn.net/Rio520/article/details/103814625