Webshell: malicious code detection

Static detection

Static detection extracts features from known samples, writes them into a rule base, and scans files against that rule base. Rule-based scanning is fast, but both missed detections and false positives are common, and the many deformed and obfuscated variants of ordinary one-line Webshell trojans make this worse.

YARA rules

        $eval = /(<\?php|[;{}])[ \t]*@?(eval|preg_replace|system|assert|passthru|(pcntl_)?exec|shell_exec|call_user_func(_array)?)\s*\(/ nocase  // ;eval( <- this is dodgy
        $eval_comment = /(eval|preg_replace|system|assert|passthru|(pcntl_)?exec|shell_exec|call_user_func(_array)?)\/\*[^\*]*\*\/\(/ nocase  // eval/*lol*/( <- this is dodgy
        $b374k = "'ev'.'al'"
        $align = /(\$\w+=[^;]*)*;\$\w+=@?\$\w+\(/  //b374k
        $weevely3 = /\$\w=\$[a-zA-Z]\('',\$\w\);\$\w\(\);/  // weevely3 launcher
        $c99_launcher = /;\$\w+\(\$\w+(,\s?\$\w+)+\);/  // http://bartblaze.blogspot.fr/2015/03/c99shell-not-dead.html
        $nano = /\$[a-z0-9-_]+\[[^]]+\]\(/ //https://github.com/UltimateHackers/nano
        $ninja = /base64_decode[^;]+getallheaders/ //https://github.com/UltimateHackers/nano
        $variable_variable = /\${\$[0-9a-zA-z]+}/
        $too_many_chr = /(chr\([\d]+\)\.){8}/  // concatenation of more than eight `chr()`
        $concat = /(\$[^\n\r]+\.){5}/  // concatenation of more than 5 words
        $concat_with_spaces = /(\$[^\n\r]+\. ){5}/  // concatenation of more than 5 words, with spaces
        $var_as_func = /\$_(GET|POST|COOKIE|REQUEST|SERVER)\s*\[[^\]]+\]\s*\(/
        $comment = /\/\*([^*]|\*[^\/])*\*\/\s*\(/  // eval /* comment */ (php_code)
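
To illustrate how such rule strings behave on real input, the following is a minimal sketch that ports three of them ($eval, $too_many_chr and $var_as_func) to Python's `re` module. It is not how YARA itself evaluates rules: the names `PATTERNS` and `scan` are hypothetical, YARA's `nocase` modifier is approximated with `re.IGNORECASE`, and only patterns whose syntax is common to both regex dialects were chosen.

```python
import re

# A small subset of the rule strings above, ported to Python's `re` syntax.
# YARA's `nocase` modifier is approximated with re.IGNORECASE (assumption).
PATTERNS = {
    "eval": re.compile(
        r"(<\?php|[;{}])[ \t]*@?(eval|preg_replace|system|assert|passthru|"
        r"(pcntl_)?exec|shell_exec|call_user_func(_array)?)\s*\(",
        re.IGNORECASE,
    ),
    "too_many_chr": re.compile(r"(chr\([\d]+\)\.){8}"),
    "var_as_func": re.compile(
        r"\$_(GET|POST|COOKIE|REQUEST|SERVER)\s*\[[^\]]+\]\s*\("
    ),
}


def scan(source):
    """Return the sorted names of all patterns that match anywhere in `source`."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(source))


if __name__ == "__main__":
    print(scan("<?php @eval($_POST['x']); ?>"))      # classic one-line trojan
    print(scan("<?php $_GET['f']($_GET['a']); ?>"))  # request value used as a function
    print(scan("<?php echo 'hello'; ?>"))            # benign page
```

A benign script that legitimately calls `system(` or `exec(` would also match the first pattern, which is the false-positive problem described above.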

Log Analysis

Log analysis detects suspicious exploit behavior from log data by extracting how a specific IP accessed the application during a specific period of time. Features include:

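① Entropy of the submitted data (POST/GET)
② Access frequency of the URI
③ Presence or absence of the Referer field in the request headers
④ Frequency of each key in the submitted data (POST/GET)
⑤ Number of pages associated with each key in the request data (POST/GET)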
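
The five features above can be computed from parsed log records roughly as follows. This is a sketch under stated assumptions: the record layout (a dict with `uri`, `params` and `referer` keys) and the function names `shannon_entropy` and `log_features` are hypothetical, and the records are assumed to be already filtered to one IP and one time window.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(text):
    """Shannon entropy of `text` in bits per character (0.0 for empty input)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def log_features(records):
    """Aggregate the five log features over a list of parsed requests.

    Each record is a dict with hypothetical keys: "uri" (str),
    "params" (dict of submitted GET/POST keys to string values)
    and "referer" (str or None).
    """
    uri_freq = Counter(r["uri"] for r in records)
    key_freq = Counter(k for r in records for k in r["params"])
    key_pages = defaultdict(set)
    for r in records:
        for k in r["params"]:
            key_pages[k].add(r["uri"])
    return {
        # (1) entropy of the submitted values, one number per request
        "param_entropy": [
            shannon_entropy("".join(r["params"].values())) for r in records
        ],
        # (2) how often each URI was requested
        "uri_freq": dict(uri_freq),
        # (3) number of requests without a Referer header
        "missing_referer": sum(1 for r in records if not r["referer"]),
        # (4) how often each parameter key appears
        "key_freq": dict(key_freq),
        # (5) number of distinct pages each key was submitted to
        "pages_per_key": {k: len(v) for k, v in key_pages.items()},
    }
```

A key that carries high-entropy values, is sent without a Referer, and is only ever submitted to a single page is the kind of combination these features are meant to surface.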

Dynamic detection

Analyze the Webshell at the execution layer by hooking suspicious functions.

Feature dimensions:

  • Text semantics (n-gram / TF-IDF / word2vec / CNN / RNN)
  • Statistical features (information entropy / index of coincidence / longest word / compression ratio)
  • Historical data features (the "distance" between a single file and the other files in the same directory, measured by time written to disk / creating process / file type / code style / permissions)
  • Opcode-layer features (instructions / call chain / textual features of parameters)
  • Dynamic features (file reads and writes / network connections; sandboxed or out-of-band execution can be used to resolve encoded and obfuscated cases)
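
As a small illustration of the text-semantics dimension, the sketch below builds word-level n-gram TF-IDF vectors in pure Python. It is a simplification, not the feature pipeline of any particular tool: the tokenizer (identifier-like words only), the weighting `tf * log(N / df)` and the names `ngrams` and `tfidf` are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def ngrams(source, n=2):
    """Split source into identifier-like tokens and return word-level n-grams."""
    tokens = re.findall(r"[A-Za-z_]\w*", source)
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(documents, n=2):
    """Return one {ngram: weight} dict per document, using tf * log(N / df)."""
    grams = [Counter(ngrams(doc, n)) for doc in documents]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(g for counts in grams for g in counts)
    total_docs = len(documents)
    vectors = []
    for counts in grams:
        length = sum(counts.values()) or 1  # avoid division by zero
        vectors.append({
            g: (c / length) * math.log(total_docs / df[g])
            for g, c in counts.items()
        })
    return vectors


if __name__ == "__main__":
    docs = [
        "<?php eval($_POST['x']);",
        "<?php echo $title;",
        "<?php echo $body;",
    ]
    for vec in tfidf(docs):
        print(vec)
```

N-grams shared by many files (such as `php echo` here) receive a low weight, while rare ones (such as `php eval`) stand out; the resulting vectors can then be fed to a classifier.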

Text-based file attributes

  • File creation time
  • File modification time
  • File permissions
  • File owner
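
The attributes above can be read with Python's standard library, for example. This is a minimal sketch; the function name `file_attributes` and the returned keys are hypothetical. Note that `st_ctime` is the inode-change time on Unix and has only been treated as the creation time on Windows, so a true creation time is not available everywhere.

```python
import os
import stat


def file_attributes(path):
    """Collect basic file attributes for one file as a dict."""
    st = os.stat(path)
    return {
        # Inode-change time on Unix; historically creation time on Windows.
        "ctime": st.st_ctime,
        # Last modification time, in seconds since the epoch.
        "mtime": st.st_mtime,
        # Permissions rendered like `ls -l`, e.g. "-rw-r--r--".
        "mode": stat.filemode(st.st_mode),
        # Numeric owner id (always 0 on Windows).
        "uid": st.st_uid,
    }
```

A script whose modification time, owner or permissions differ sharply from its neighbors in the same directory is a candidate for closer inspection.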

Statistics

  • Index of coincidence (IC) of the file
  • Information entropy of the file
  • Longest word in the file
  • Compression ratio of the file
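
The four statistics above can be sketched as follows. The definitions are common ones but are assumptions here: a "word" is taken to be any run of non-whitespace bytes, and the compression ratio is defined as zlib-compressed size divided by original size (other tools may use a different compressor or invert the ratio).

```python
import math
import re
import zlib
from collections import Counter


def index_of_coincidence(data):
    """Probability that two bytes drawn without replacement are equal."""
    n = len(data)
    if n < 2:
        return 0.0
    counts = Counter(data)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def entropy(data):
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def longest_word(data):
    """Length of the longest run of non-whitespace bytes."""
    return max((len(w) for w in re.split(rb"\s+", data)), default=0)


def compression_ratio(data):
    """Compressed size divided by original size (assumed definition)."""
    if not data:
        return 0.0
    return len(zlib.compress(data)) / len(data)


if __name__ == "__main__":
    sample = b"<?php eval(base64_decode('ZWNobyAxOw==')); ?>"
    print(index_of_coincidence(sample), entropy(sample),
          longest_word(sample), compression_ratio(sample))
```

Encoded or encrypted payloads tend to show high entropy, a low index of coincidence, very long "words" and poor compressibility compared with hand-written code.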

Projects

https://github.com/nbs-system/php-malware-finder
https://github.com/404notf0und/AI-for-Security-Learning

References

[1] https://www.cdxy.me/?p=788
[2] http://www.cnetsec.com/article/22593.html
[3] https://www.s0nnet.com/archives/fshell-feature-1

Origin www.cnblogs.com/17bdw/p/11920974.html