Practical Linux text processing (1)

Extract the nns_func parameter from an nginx access log:

“epg.interface.wsyv.topway.cn” “116.45.23.194” “-” “-” “[17/Jun/2020:00:00:27 +0800]” “GET /sztw/UserIndex?nns_func=get_collect_list_v2&nns_tag=26&nns_mac=da-cf-0f-82-b4-ed&nns_mac_id=da-cf-0f-82-b4-ed&nns_version=1.5.4.STB.FTTH.STD.DVB_SKW01.Release&nns_user_agent=nn_player%2Fstd%2F1.0.0&nns_user_id=5ae4556738b290a26ea3ab5342375813&nns_buss_id=1000052.006&nns_webtoken=6c7d705031a3af71b7c0367d197097fa&nns_output_type=xml&nns_device_id=8075588024934645 HTTP/1.1” “200” “4945” “Dalvik/1.6.0 (Linux; U; Android 4.4.2; Skyworth Box Build/KOT49H)” “10.73.129.223” “0.030” “-”
“epg.interface.wsyv.topway.cn” “11637.68.164” “-” “-” “[17/Jun/2020:00:00:27 +0800]” “GET /sztw/UserIndex?nns_func=get_playlist_v2&nns_tag=26&nns_mac=de-ca-4b-94-e5-e5&nns_mac_id=de-ca-4b-94-e5-e5&nns_version=1.5.4.STB.FTTH.STD.DVB_SKW01.Release&nns_user_agent=nn_player%2Fstd%2F1.0.0&nns_user_id=5a9b68df142ae758081163836bf774aa&nns_buss_id=1000052.004&nns_webtoken=32d33efa747b6c75a94419ae9a5b67e5&nns_output_type=xml&nns_device_id=8075588025916880 HTTP/1.1” “200” “5378” “Dalvik/1.6.0 (Linux; U; Android 4.4.2; Skyworth Box Build/KOT49H)” “10.49.194.219” “0.031” “-”

Ideas:

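1) Notice that nns_func=xxx is preceded by a ? and followed by an &. Replace these special characters with spaces, use xargs to print one space-delimited token per line, then filter with grep.
2) If the method above does not work, replace ? and & with \n (newline) instead, then filter with grep.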

Commands:


1) sed -e 's@&@ @g' -e 's@?@ @g' access_log | xargs -n1 | grep "nns_func="
2) sed -e 's@&@\n@g' -e 's@?@\n@g' access_log | grep "nns_func="
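
As an alternative sketch (not the method used above), grep -o prints only the part of each line that matches the pattern, so the nns_func=... parameter can be pulled out without splitting the line first. The -o option is available in GNU, BSD and BusyBox grep but is not required by POSIX. The function name extract_nns_func and the counting step in the comment are illustrative assumptions, and the sample input is an abbreviated form of the first request shown earlier.

```shell
#!/bin/sh
# Print every nns_func=<value> token found on standard input, one per line.
# [^&[:space:]]* stops at the next '&' or at whitespace (e.g. before " HTTP/1.1").
extract_nns_func() {
    grep -o 'nns_func=[^&[:space:]]*'
}

# Usage against the real file (file name taken from the commands above):
#   extract_nns_func < access_log | sort | uniq -c | sort -rn
# which counts how often each nns_func value occurs.

# Small demonstration on an abbreviated request line from the log:
printf '%s\n' 'GET /sztw/UserIndex?nns_func=get_collect_list_v2&nns_tag=26 HTTP/1.1' |
    extract_nns_func
```

The demonstration prints nns_func=get_collect_list_v2. Because grep matches the token directly, it does not depend on how xargs handles quotes or on whether sed interprets \n in the replacement.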

Extract specific entries from a log (2020/6/6 task):

Goal: keep only the log entries from Jun 5 between 19:45:00 and 20:10:00.

Log:

Jun  3 19:59:01 kvm_10_232_2_15 systemd: Started Session 7897536 of user root.
Jun  3 19:59:01 kvm_10_232_2_15 systemd: Starting Session 7897536 of user root.
Jun  3 19:59:01 kvm_10_232_2_15 systemd: Started Session 7897537 of user root. 
Jun  4 19:59:01 kvm_10_232_2_15 systemd: Starting Session 7921253 of user root.
Jun  4 19:59:01 kvm_10_232_2_15 systemd: Started Session 7921250 of user root.
Jun  5 19:51:32 kvm_10_232_2_15 iscsid: iscsi_login_eh session reopen count ++
Jun  5 19:51:32 kvm_10_232_2_15 iscsid: resolve_address, host:127.0.0.1 port:3260 last_ip:127.0.0.1
Jun  5 19:51:32 kvm_10_232_2_15 iscsid: host is ip: 127.0.0.1

Command:


sed -e 's/://g' rizhi.txt | awk '{if ($1 == "Jun" && $2 == "5") print $0}' | awk '{if ($3 > 194500 && $3 < 201000) print $0}'

Ideas:


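The third column holds the log time. To select the entries between 19:45:00 and 20:10:00, remove the colons so that the time can be compared as a number.
The first column is the month and the second is the day. awk '{if ($1 == "Jun" && $2 == "5") print $0}' outputs the entries for Jun 5; print $0 prints the whole line.
The third column must fall between 19:45:00 and 20:10:00, so awk '{if ($3 > 194500 && $3 < 201000) print $0}' does the filtering.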
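
The pipeline above removes every colon in the line, so the printed entries lose their colons as well (iscsid: becomes iscsid and 19:51:32 becomes 195132), and the strict > and < comparisons skip entries stamped exactly 19:45:00 or 20:10:00. Below is a minimal sketch of a single-pass variant that strips colons only from a copy of the third field, prints the original line unchanged, and treats both bounds as inclusive (an assumption about the intended range). The function name filter_window and its arguments are hypothetical.

```shell
#!/bin/sh
# filter_window MONTH DAY LO HI
# Reads syslog-style lines on stdin and prints, unmodified, those whose
# first two fields match MONTH and DAY and whose time (field 3, HH:MM:SS)
# lies between LO and HI inclusive. LO and HI are given as HHMMSS numbers.
filter_window() {
    awk -v mon="$1" -v day="$2" -v lo="$3" -v hi="$4" '
        $1 == mon && $2 == day {
            t = $3              # work on a copy so $0 is left untouched
            gsub(/:/, "", t)    # 19:51:32 -> 195132
            if (t + 0 >= lo + 0 && t + 0 <= hi + 0) print
        }'
}

# Usage with the file name from the command above:
#   filter_window Jun 5 194500 201000 < rizhi.txt
```

Passing the month, day and bounds as arguments keeps the awk program itself fixed, so the same function can be reused for another day or time window.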

Origin blog.csdn.net/weixin_43010385/article/details/113064876