Malicious code detection: notes on the papers read so far

Static Detection Technology:

  Advantages: provides a safer and faster way to test a sample, since the code is not executed.

  Cons: vulnerable to packers and code obfuscation; anti-disassembly techniques can also defeat the disassembler, which renders static methods ineffective.

  Main methods:

    n-gram byte features for detecting malicious executable files (an n-gram is a sequence of n adjacent elements, which can be bytes, instructions, or other information about the software's functionality);

    Mining structural features of Windows executable files;

    Visualizing malware binaries as grayscale images and classifying them with image processing techniques;

    Extracting structural information from a malware program as an attributed function call graph. Kong et al. disassemble the sample, generate the call graph from the assembly code, and extract six types of features for each function. For each feature type they learn a discriminant distance metric for clustering, and the final model combines the results with learned weights. Like other static methods, this approach is susceptible to packers and obfuscation, and anti-disassembly techniques can invalidate it.
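The first and third items above can be illustrated with a few lines of Python. This is only a minimal sketch: the function names `byte_ngrams` and `to_grayscale_rows`, the image width of 4, and the toy byte string are assumptions made for illustration, and a real detector would work on whole files and feed the result to a classifier.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count every run of n adjacent bytes in data."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def to_grayscale_rows(data: bytes, width: int = 4) -> list:
    """Lay the bytes out row by row; each byte is one pixel intensity (0-255).

    The last row is padded with zeros so that every row has the same width.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Toy input, not a real executable.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x4D, 0x5A])
grams = byte_ngrams(sample, n=2)
image = to_grayscale_rows(sample, width=4)

print(grams.most_common(1))
print(image)
```

The n-gram counts can be used directly as a sparse feature vector, while the row layout is the form an image classifier would consume.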

Dynamic detection technology:

  Advantages: observes the behavior a sample actually exhibits at run time, including API calls that a hidden import table conceals from static analysis.

  Cons: the sample has to be executed to collect features, which is costly; malware that detects a virtual machine environment can evade analysis; and packed or obfuscated samples usually still need to be unpacked and normalized before analysis.

  Main methods:

    Opcode n-grams;

    API call sequences. Nair et al. [18] dynamically analyzed the API call sequences of malicious code and its variants and extracted the short API call sequences they share as a signature, which makes it possible to detect similar malicious code. Chen and Fu [19] obtained API call sequences by dynamic analysis, traversed them to collect short sequences of the same length, and converted these short sequences into vectors that serve as the malicious code signature. Firdausi et al. [20] proposed a dynamic, behavior-based detection method that monitors the system calls of a sample and converts the system call report into a vector space model. They extracted features in two ways: (1) a binary value of 1 or 0; (2) the frequency of each system call. Several machine learning classifiers were applied to both feature sets, and the frequency representation performed slightly better. Ahmed et al. mined spatio-temporal information from dynamic API call sequences. Spatial information refers to statistics of API call parameters and return values, including mean, variance, entropy, minimum, and maximum. Temporal information refers to the transition probabilities of the API call sequence. The dynamic methods above use fixed-length system call sequences. It is difficult to determine a reasonable length for the short sequences, and even an optimal length loses a large amount of semantic information, namely that carried by sequences of other lengths.
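The fixed-length short sequences and the two feature encodings (binary and frequency) can be sketched as follows. This is an illustrative sketch rather than the procedure of any cited paper: the helper names `short_sequences` and `vectorize`, the window length of 2, and the two hand-written traces are all assumptions.

```python
from collections import Counter


def short_sequences(calls, length):
    """Slide a window of the given length over an API call trace."""
    return [tuple(calls[i:i + length]) for i in range(len(calls) - length + 1)]


def vectorize(traces, length=2):
    """Return (vocabulary, binary vectors, frequency vectors) for a list of traces."""
    counts = [Counter(short_sequences(trace, length)) for trace in traces]
    vocabulary = sorted(set().union(*counts))
    frequency = [[count[seq] for seq in vocabulary] for count in counts]
    binary = [[1 if value else 0 for value in row] for row in frequency]
    return vocabulary, binary, frequency


# Hypothetical traces written by hand for illustration.
trace_a = ["CreateFileW", "WriteFile", "CloseHandle", "CreateFileW", "WriteFile"]
trace_b = ["RegOpenKeyExW", "RegSetValueExW", "CloseHandle"]

vocabulary, binary, frequency = vectorize([trace_a, trace_b], length=2)
for seq, b, f in zip(vocabulary, binary[0], frequency[0]):
    print(seq, b, f)
```

Changing `length` shows the trade-off noted above: a longer window keeps more ordering information but makes the vocabulary grow quickly and the vectors sparser.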

    Malicious code has to rely on the API functions provided by the operating system to do its work, so its API call sequence can represent its behavior and the semantic information involved. API call sequences are either static or dynamic. A static call sequence is obtained without running the program, from the file's import table or from the disassembled file. A dynamic call sequence is obtained by running the program in a virtual machine and using techniques such as debugging to record the API calls through which it interacts with the system. Using this API call information as features for a classification algorithm has reached high accuracy. However, malicious code can hide API calls from the import table, so not all API call information is available and detection based on static API sequences is not very effective. Even with a hidden import table, the malicious code still has to interact with the operating system API at run time to carry out its function. Zhang M [4] therefore ran suspicious files in a virtual machine to obtain their dynamic API sequences and used the distance to the API sequences of normal files as the detection feature. Dynamic API call sequence techniques have to run the malicious code to obtain features, which causes considerable overhead; they are powerless against viruses that can detect a virtual machine environment; and some malicious code uses behavior-level obfuscation, which makes dynamic extraction of the API call sequence fail.
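To make the idea of a distance between a suspicious file's API sequence and a normal file's sequence concrete, here is a small sketch. The notes do not say which distance Zhang M [4] used, so Levenshtein edit distance is assumed here purely for illustration; the function names and both traces are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of API names."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(
                previous[j] + 1,         # delete x
                current[j - 1] + 1,      # insert y
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def normalized_distance(a, b):
    """Edit distance scaled to [0, 1] by the longer sequence length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


# Hypothetical traces: one from a normal file, one from a suspicious file.
normal = ["CreateFileW", "ReadFile", "CloseHandle"]
suspicious = ["CreateFileW", "WriteFile", "WriteFile", "CloseHandle"]

print(edit_distance(normal, suspicious))
print(normalized_distance(normal, suspicious))
```

The normalized value can then be used as a single feature, or compared against a threshold chosen on labeled data.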

    Graphs (malicious code is detected from a program's control flow graph, data flow graph, system call graph, or function call graph, using data mining, machine learning, and similarity measures). Karbalaie et al. [26] proposed a malicious code detection method based on mining system call graphs; the method first converts system call sequences into graphs.

    A function call graph is a static description of the calling relationships between the functions of a program: nodes represent functions and edges represent call relationships. Since what a program does is largely determined by the library and system functions it calls, the function call graph provides an effective static approximation of the program's actual behavior. As a structured representation it is robust to local modifications of the source code or binary code. Such call graphs can be generated by IDA Pro, a mature interactive disassembler.

    An FCG captures the calling relationships of a program, where each vertex represents a local function. Each local function is first converted to an intermediate language, and six types of attributes are then extracted: opcode (frequency of each opcode), API (number of calls to each library API function), memory (number of memory reads and writes performed in the function), I/O (number of I/O reads and writes), register (number of reads and writes for each register), and flag (number of times each flag is changed). Each attribute type is represented as a feature vector associated with the local function.
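The structure described above, a call graph whose vertices carry per-function attributes, can be sketched in plain Python. This is a simplified illustration and not the method of Kong et al.: the `listing` format is hypothetical, no intermediate language is involved, and only two of the six attribute types (opcode and API) are computed.

```python
from collections import Counter

# Hypothetical disassembly: function name -> list of (opcode, operand) pairs.
listing = {
    "main": [("push", "ebp"), ("call", "decode"), ("call", "CreateFileW"), ("ret", "")],
    "decode": [("xor", "eax"), ("call", "decode"), ("call", "WriteFile"), ("ret", "")],
}


def build_call_graph(listing):
    """Return (edges, attributes) for the local functions in the listing."""
    edges = {name: set() for name in listing}
    attributes = {}
    for name, instructions in listing.items():
        opcodes = Counter(op for op, _ in instructions)
        api_calls = Counter()
        for op, target in instructions:
            if op != "call":
                continue
            if target in listing:
                edges[name].add(target)   # call to another local function
            else:
                api_calls[target] += 1    # assumed to be a library/API call
        attributes[name] = {"opcode": opcodes, "api": api_calls}
    return edges, attributes


edges, attributes = build_call_graph(listing)
print(edges)
print(attributes["main"]["opcode"])
```

Keeping one counter per attribute type mirrors the description above, where each type yields its own feature vector for a vertex.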

Hybrid Detection

Biological Immune-Based Detection

 

 

 

Improve classification accuracy and reduce false negatives (FN) without increasing the load.

 

2014, Drebin: Effective and Explainable Detection of Android Malware in Your Pocket:

Collects a large number of features and classifies samples with machine learning.

CCS(CA)-2014-Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs:

Semantics-based; uses behavior graphs.

Fan et al. - 2016 - Frequent Subgraph Based Familial Classification of Android Malware:

TDSC(JA)-2016-MADAM Effective and Efficient Behavior-based Android Malware Detection and Prevention:

Read a

DroidDetector: Android Malware Characterization and Detection Using Deep Learning

 

Smali is an intermediate language that sits between Java code and Dalvik bytecode. Android apps are commonly written in Java; the compiled code is converted into a dex file, which the Dalvik virtual machine executes. Smali was introduced to make Dalvik bytecode easier to understand and modify during Android reverse engineering. The relationship among Java code, Smali code, and Dalvik bytecode is similar to the relationship among C code, assembly code, and machine code. Smali syntax is simple and easy to read, and repackaged apps are usually modified at the Smali level as well. The detailed syntax rules are not covered here; they are easy to look up. The smali and baksmali tools convert between Smali code and dex files.
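Because Smali is plain text, API calls can be pulled out of disassembled code with a simple pattern match. The sketch below assumes the text has already been produced by baksmali and only handles `invoke-*` lines; the regular expression, the function name `extract_api_calls`, and the hand-written snippet are illustrative assumptions, not a full Smali parser.

```python
import re

# Matches lines such as:
#   invoke-virtual {v0, v1}, Lsome/Class;->method(Args)Ret
INVOKE = re.compile(r"^\s*invoke-[\w/]+\s+\{[^}]*\},\s*(L[^;]+;)->([^\s(]+)\(")


def extract_api_calls(smali_text):
    """Return (class, method) pairs for every invoke instruction, in order."""
    calls = []
    for line in smali_text.splitlines():
        match = INVOKE.match(line)
        if match:
            calls.append((match.group(1), match.group(2)))
    return calls


# Hand-written snippet for illustration only.
smali_text = """
.method public onCreate()V
    new-instance v0, Ljava/lang/StringBuilder;
    invoke-direct {v0}, Ljava/lang/StringBuilder;-><init>()V
    invoke-static {v1, v2}, Landroid/util/Log;->d(Ljava/lang/String;Ljava/lang/String;)I
    return-void
.end method
"""

calls = extract_api_calls(smali_text)
print(calls)
```

The resulting list is a static API call sequence, so it can be fed into the same sequence or graph features discussed earlier in these notes.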

 

Origin www.cnblogs.com/yvlian/p/11432376.html