Sharing a small incident-response tool for analyzing access.log web logs

During incident response you often need to pull web logs such as access.log and work out exactly how the system was attacked. Since the open-source tools I tried were not very convenient, I wrote a simple log analysis tool in Python 3.

First, an introduction to the access.log file

The access.log file records every request made to the web server: each time a client visits the website, an entry is appended to access.log.

Log format

An access log entry is generally divided into seven fields (a small parsing sketch follows the list below):

1.202.114.41 - - [09/Nov/2020:11:08:23 +0800] "GET / HTTP/1.1" 404 146 "https://www.baidu.com/link?url=jBUa" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:82.0) Gecko/20100101 Firefox/82.0"
  • 1.202.114.41 is the IP address of the client that made the request.

  • [09/Nov/2020:11:08:23 +0800] is the server's local time when the request was received; +0800 means the server's time zone is 8 hours ahead of UTC.

  • GET / HTTP/1.1 is the request method, the requested path, and the protocol version.

  • 404 is the status code of the server's response. This field is very valuable: it tells you whether the request succeeded.

  • 146 is the number of bytes the server sent back to the client, not counting the response headers. If the server sent no content, the value is "-".

  • https://www.baidu.com/link?url=jBUa is the referer, i.e. the page the visitor was on before requesting this one. It is only present when the request was reached via a link from a previous page; for a freshly opened page it is empty. In this example the visitor came from Baidu, i.e. clicked a link in Baidu's results.

  • "Mozilla/5.0 xx" is the User-Agent string of the client's browser.

Log entries from common web attacks

From the web log you can judge whether a request is malicious or legitimate. Each of the request paths below carries dangerous characters:

172.16.2.1 - - [09/Feb/2023:17:57:02 +0800] "GET /sqli-labs-master/Less-1/?id=1' order by 3 -- HTTP/1.1" 200 721 (SQL injection)

172.16.2.1 - - [09/Feb/2023:17:57:18 +0800] "GET /sqli-labs-master/Less-1/?id=1' and sleep(5) -- HTTP/1.1" 200 670 (SQL injection)

172.16.2.1 - - [09/Feb/2023:18:01:19 +0800] "GET /sqli-labs-master/Less-1/?id=<script>alert(11)</script> HTTP/1.1" 200 670 (XSS attack)

...

Regular expressions can be used to check each request for signs of attack, and the fingerprint library finger can be extended at any time as needed.

log_tool.py

import re, os, argparse
from urllib.parse import unquote
from colorama import init, Fore
init(autoreset=True)

# Fingerprint library: maps each attack category to a regex of suspicious keywords.
# Extend it at any time as needed.
finger = {
    "command execution": r"/dev/tcp|call_user_func|preg_replace|proc_popen|popen|passthru|shell_exec|exec|/bin/bash|call_user_func_array|assert|eval|fputs|fopen|base64_decode|wget|curl.*ifs|uname|think.*invokefunction|whoami|ifconfig|ip add|echo|net user|phpinfo|jndi:|rmi:|\$\{",
    "SQL injection": r"sleep|union|concat|information_schema|table_name|extractvalue|updatexml|order by|sqlmap|md5\(",
    "XSS attack": r"<script|img src=|imgsrc=|document\.domain|prompt|alert\(|confirm\(|javascript:|onerror|onclick",
    "webshell connection": r"shell\.asp|shell\.jsp|shell\.jspx|shell\.php|cs\.php|tomcatwar\.jsp",
    "sensitive file access": r"\.ssh/id_dsa|\.\./|\.\.|/etc/passwd|\.bash_profile|db\.sqlite|/win\.ini|wp-config\.php|\.htaccess|\?pwd|heapdump|/\.git"
}

# One result bucket per attack category; findings are collected here.
data_list = {
    'command execution': [],
    'SQL injection': [],
    'XSS attack': [],
    'webshell connection': [],
    'sensitive file access': []
}

def get_parser():
    logo = r"""

      ______  _____________   ____  
     /  ___/ /  ___/\_  __ \_/ ___\ 
     \___ \  \___ \  |  | \/\  \___ 
    /____  >/____  > |__|    \___  >
         \/      \/              \/ 

                            Author: 山山而川
                            Blog  : https://chenchena.blog.csdn.net/?type=lately
    """
    parser = argparse.ArgumentParser(usage='python log_tool.py <logfile>')
    print(logo)
    print("Analyzing the log, please wait..." + "\n")
    p = parser.add_argument_group('log_tool.py arguments')
    p.add_argument("logName", type=str, help="path to the .log log file")
    args = parser.parse_args()
    return args

def extract(filename):
    with open(filename, 'r', encoding='utf-8') as logfile:
        for line in logfile:                               # walk through every log entry
            line = unquote(line.rstrip("\n"), 'utf-8')     # drop the newline and URL-decode
            for k, v in finger.items():                    # test each fingerprint category
                result = re.search(v, line, re.I)
                if result:
                    data = line + Fore.RED + "  matched fingerprint [%s]" % result.group()
                    data_list[k].append(data)              # file it under the matching category
                    break                                  # stop at the first category that matches

    # Write the findings next to the input file as <logname>_result.txt.
    outfileName = filename.rsplit(".", 1)[0] + "_result.txt"
    if os.path.exists(outfileName):
        os.remove(outfileName)
    ansi = re.compile(r'\x1b\[[0-9;]*m')                   # strip colour codes before writing to disk
    for attack_name, attack_record in data_list.items():
        if attack_record:
            output = 'Suspected "%s":' % attack_name
            print(Fore.YELLOW + output)
            with open(outfileName, 'a', encoding='utf-8') as f:
                f.write(output + "\n")
                for record in attack_record:
                    if '" 200 ' in record:                 # the status code follows the quoted request line
                        print(record + Fore.GREEN + " response code 200")
                        f.write(ansi.sub('', record) + " response code 200\n")
                    else:
                        print(record)
                        f.write(ansi.sub('', record) + "\n")
                f.write("\n")
            print("")
if __name__ == '__main__':
    filename = get_parser().logName
    extract(filename)

By default the results are printed to the terminal and also saved next to the log file as <logname>_result.txt.
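
As mentioned above, the fingerprint library finger can be extended at any time. A minimal sketch of adding a category; the category name and pattern below are purely illustrative and not part of the original tool:

# Illustrative only: register a hypothetical extra category before calling extract().
finger["directory scanning"] = r"/\.svn|\.bak\b|\.swp|/backup\.zip|/phpmyadmin"
data_list["directory scanning"] = []   # give the new category its own result bucket

After that, run the tool as usual, e.g. python log_tool.py access.log, and hits for the new category are printed and written to the result file just like the built-in ones.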

Origin blog.csdn.net/qq_44159028/article/details/129396219