Using the disassembler IDA to view the assembly context of a crashing instruction when analyzing C++ software exceptions

Table of contents

1. Overview

2. How to use IDA to open and view the assembly code of the binary file

3. Find the location of the assembly instruction that crashed in IDA

3.1. How to find the assembly instruction that caused the exception in IDA

3.2. Examples

4. Reading the assembly code context requires a certain basic assembly knowledge

5. Finally


When analyzing an abnormal crash of C++ software, you may need to use IDA to view the assembly code of the exe or dll binary file to help locate the problem. This article discusses how to use IDA to view that assembly code.

1. Overview

When we open a dump file in Windbg to analyze an exception, we first check the assembly instruction that crashed and the values in the relevant registers, and then check the function call stack of the thread where the exception occurred. If necessary, we also inspect the local variables of the functions on the call stack, or the data members of the C++ class objects involved, to assist the analysis.

However, in a small number of scenarios this analysis is not enough to locate the problem. We then need to use IDA to view the surrounding assembly code and analyze it together with the C++ source code. For example, the following two scenarios call for looking at the assembly context:

1) The C++ code line number in the function call stack displayed in Windbg does not match the latest code

The software version that crashed may have been released months or even years ago, so the line numbers displayed in Windbg refer to the cpp files as they were at that time. The latest cpp files have changed a lot since then, so the line numbers do not match the latest code at all. In this case, use IDA to view the assembly context in the module where the exception occurred and work out which line of code triggered it, usually by comparing against the latest code to find the corresponding line.

2) There are multiple function calls on the C++ code line indicated in Windbg where the crash occurs, and it is difficult to directly determine which function call is causing the problem

When the line of C++ code that Windbg reports for the crash contains several function calls (for example, several conditions combined in an if statement), it is hard to tell directly which call is at fault. Checking the assembly code shows which call triggered the problem. Consider the following if condition:

if (pContainer->IsVisible() && GetTargetImplPtr()->IsReady() && pDataProcImpl->IsBuildFinish)
{
    // code omitted
}
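Because && evaluates its operands left to right and stops at the first false one, the calls in such a condition normally appear in the assembly as separate call instructions in source order, which is what lets the crashing address be matched to one particular call. The sketch below only illustrates that evaluation order. The Probe type and EvaluateChain function are hypothetical helpers invented for this illustration and are not part of the original code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the objects in the condition above: each call
// records its name so the evaluation order can be observed.
struct Probe {
    std::vector<std::string>* trace;
    const char* name;
    bool result;

    bool Check() const {
        assert(trace != nullptr);
        trace->push_back(name);
        return result;
    }
};

// Evaluates a && b && c the same way the if statement does and returns the
// names of the calls that actually ran, in the order they ran.
std::vector<std::string> EvaluateChain(bool a, bool b, bool c) {
    std::vector<std::string> trace;
    Probe p1{&trace, "IsVisible", a};
    Probe p2{&trace, "IsReady", b};
    Probe p3{&trace, "IsBuildFinish", c};
    if (p1.Check() && p2.Check() && p3.Check()) {
        // body omitted
    }
    return trace;
}
```

If the second call returns false, the third call never runs, so a crash inside the third call implies that the first two conditions were true.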

For a detailed theoretical explanation of using IDA to view assembly code when troubleshooting C++ software exceptions, please refer to my earlier article: Use IDA to view the context of assembly code to assist in the troubleshooting of C++ software exceptions
https://blog.csdn.net/chenlycly/article/details/128942626

2. How to use IDA to open and view the assembly code of the binary file

After IDA is installed, double-click to start the program, and the following prompt box pops up:

Click "New", and a dialog pops up for choosing the file to open:

Browse to the target file and open it. Alternatively, click Cancel and drag the file directly into IDA. When opening a file, IDA asks how the file should be loaded:

Windows binary files use the PE file format, so the default PE option can be selected.

Next, a prompt box asks whether to load the pdb file:

Choose Yes. Note that the pdb file must be placed in the same directory as the target binary file in advance, so that IDA can find and load it when opening the binary file. With the symbols from the pdb file, the assembly code in IDA shows concrete function names and variable names, as well as a large number of comments.

After the binary file is opened, IDA defaults to the Graph view mode (which displays the relationships between code blocks), as follows:

To switch to the assembly listing, right-click and choose Text view from the context menu.

To jump to a specific function, click Jump-->Jump to function in the menu bar:

A list of all functions in the current module pops up. Click the Search button at the bottom of the window:

Enter the name of the target function, and once it is found, double-click the entry to jump to its assembly code:

You can also press the shortcut key g to jump directly to the assembly code at a specified address:

3. Find the location of the assembly instruction that crashed in IDA

In Windbg you can see the assembly instruction where the exception occurred and the module that contains it. Find the binary file corresponding to that module and open it with IDA to view the module's assembly code.

3.1. How to find the assembly instruction that caused the exception in IDA

In Windbg you can see the address of the crashing assembly instruction (a code segment address). With this address you can find the corresponding location in the assembly code opened in IDA, examine the context of the instruction there, and compare it with the C++ source code to analyze the problem further.

The address of the crashing instruction displayed in Windbg is the actual runtime address, which differs from the static default address displayed in IDA. When the main program starts, the dll modules it depends on are loaded into the process address space and each module is given its own address range, so the instructions in each module get their actual code segment addresses at runtime.
Here we need to distinguish code segment addresses from data segment addresses:

Memory for variables defined in the code is allocated in data memory, so a variable's address is a data segment address. The address of a binary (assembly) instruction is a code segment address.

Although the runtime address of the crashing instruction differs from the static default address displayed in IDA, the position of the instruction within its module is fixed; that is, the offset of the instruction relative to the start of the module never changes. So first compute, in Windbg, the offset of the crashing instruction relative to the start address of its module, and then add this offset to the module's default start address displayed in IDA. The result is the address of the instruction in IDA. Go to this address, and you will see the instruction that crashed.
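The offset calculation described above can be sketched as a small helper. This is only an illustration of the arithmetic; the function name ToIdaAddress and its parameter names are assumptions made for this sketch and do not come from Windbg or IDA.

```cpp
#include <cassert>
#include <cstdint>

// Translates a runtime instruction address (as shown by Windbg) into the
// address IDA displays for the same instruction.
//   crash_addr:   runtime address of the crashing instruction
//   runtime_base: start address of the module at runtime (from Windbg)
//   ida_base:     static default start address of the module shown in IDA
std::uint64_t ToIdaAddress(std::uint64_t crash_addr,
                           std::uint64_t runtime_base,
                           std::uint64_t ida_base) {
    // The crashing address must not be below the module's start address.
    assert(crash_addr >= runtime_base);
    const std::uint64_t offset = crash_addr - runtime_base;  // fixed within the module
    return ida_base + offset;
}
```

The helper uses 64-bit integers so the same arithmetic applies to both 32-bit and 64-bit modules.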

3.2. Examples

Let's use a concrete example to show how to find the crashing instruction in the module's assembly code opened in IDA. I deliberately wrote the following test code, which triggers an exception:

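// pInfo is deliberately left NULL so that the member access below crashes.
SHELLEXECUTEINFO* pInfo = NULL;

CString strTip;
// Reading pInfo->cbSize dereferences the null pointer: access violation.
strTip.Format(_T("cbSize: %d"), pInfo->cbSize);

::MessageBox(NULL, strTip, _T("Tip"), MB_OK);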

The code defines a structure pointer variable pInfo and initializes it to NULL. No valid structure address is ever assigned to the pointer, and the null pointer is used directly to access the structure member cbSize. This reads memory at a very small address (at or near 0), which triggers a memory access violation.
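Why the accessed address is so small can be shown with a little arithmetic: a member access reads from the pointer value plus the member's offset inside the structure, so with a null pointer the address is just the offset. The sketch below uses DemoInfo, a simplified hypothetical stand-in that is not the real SHELLEXECUTEINFO definition, and the helper names are likewise invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified, hypothetical stand-in for illustration only. It is not the
// real SHELLEXECUTEINFO; it merely starts with a 32-bit size field.
struct DemoInfo {
    std::uint32_t cbSize;
    std::uint32_t fMask;
    void* hwnd;
};

// Address that a member read touches: pointer value plus member offset.
std::uintptr_t AccessedAddress(std::uintptr_t base, std::size_t member_offset) {
    assert(member_offset <= UINTPTR_MAX - base);  // guard against wrap-around
    return base + member_offset;
}

// With a null pointer (base 0) the accessed address equals the offset.
std::uintptr_t NullAccessAddressOfCbSize() {
    return AccessedAddress(0, offsetof(DemoInfo, cbSize));
}

std::uintptr_t NullAccessAddressOfFMask() {
    return AccessedAddress(0, offsetof(DemoInfo, fMask));
}
```

A faulting address of 0 or a few bytes above it is therefore a strong hint that a member was accessed through a null structure or object pointer.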

When the program runs the above code, it crashes and generates a dump file. Open the dump file with Windbg and configure the program's pdb file path in Windbg. After opening it, you can see the assembly instruction that crashed, the values in each register at that time, and the exception code and exception type, as follows:

First, the Windbg output shows that the exception is an Access violation (memory access violation). The instruction where the exception occurred is mov ecx,dword ptr [eax] (the address of the instruction is 0x00eb3787), and it is located in the function CTestDlgDlg::OnBnClickedButton1 of the TestDlg module.

Next, let's demonstrate how to find this instruction in IDA. The crashing instruction is located in the TestDlg module, so open the TestDlg.exe binary file with IDA to view the module's assembly code. We first calculate the offset of the crashing instruction relative to its module. The address of the instruction is 0x00eb3787. Use the lm command to view the start address (code segment address) of the TestDlg module, as follows:

The start address of the TestDlg module is 0x00ea0000, so the offset of the crashing instruction relative to the start address of the TestDlg module is:

0x00eb3787 - 0x00ea0000 = 0x13787

Then, in IDA, scroll to the top of the assembly code to see the static default start address of the TestDlg module, as follows:

The static default start address of the TestDlg module is 0x400000, and the addresses of all assembly instructions in the module are laid out relative to this base value. Therefore, the address at which IDA displays the crashing instruction is:

0x00eb3787 - 0x00ea0000 + 0x00400000 = 0x00413787

Then press the g shortcut key, enter 413787 in the pop-up box, and click OK to go to the location of the crashing instruction, as follows:

We can now view the context of the crashing instruction and analyze the problem further by combining the comments in IDA with the C++ source code.

4. Reading the assembly code context requires a certain basic assembly knowledge

Reading the context of assembly code requires some basic assembly knowledge: the purpose of the common registers, the commonly used assembly instructions, the stack layout of function calls, and how C++ virtual function calls are implemented in assembly (the two-level addressing performed when calling a virtual function), and so on. Here is a brief note on the purpose of the commonly used registers:

In x86 assembly, EAX is mainly used to store the return value of a function call. When a C++ member function is called, the ECX register is used to pass the address of the C++ object. ESI is the source address register and EDI is the destination address register; they are mainly used in the string operation instructions that copy memory, such as in the assembly implementation of memcpy.
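The two-level addressing of a virtual function call mentioned above can be pictured with a hand-written emulation: one read fetches the table pointer from the object, a second read fetches the function address from the table, and then an indirect call is made with the object address passed along. The sketch below is not what any particular compiler emits. All names in it (Shape, ShapeVTable, CallArea and so on) are hypothetical, and real compilers generate the table and the hidden pointer themselves.

```cpp
#include <cassert>

// Hand-written emulation of virtual dispatch; all names are hypothetical.
struct Shape;

struct ShapeVTable {
    int (*Area)(const Shape* self);  // slot 0 of the table
};

struct Shape {
    const ShapeVTable* vptr;  // plays the role of the hidden table pointer
    int width;
    int height;
};

int RectArea(const Shape* self) { return self->width * self->height; }

const ShapeVTable kRectVTable = {&RectArea};

// Performs the two memory reads that precede the indirect call.
int CallArea(const Shape* obj) {
    assert(obj != nullptr);
    const ShapeVTable* table = obj->vptr;   // first read: object -> table
    int (*fn)(const Shape*) = table->Area;  // second read: table slot -> function
    return fn(obj);                         // indirect call, object passed as "this"
}

// Builds a rectangle and calls its Area through the emulated table.
int DemoRectArea(int w, int h) {
    Shape rect{&kRectVTable, w, h};
    return CallArea(&rect);
}
```

If the object pointer is null or the object has already been destroyed, it is one of these two reads, or the indirect call itself, that typically faults, which is why recognizing this pattern in the assembly context is useful.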

I won't go into detail here about the basic assembly knowledge needed to analyze C++ software exceptions. You can refer to my previous article:

Summary of assembly knowledge needed to analyze C++ software exceptions

5. Finally

Reading assembly code directly is relatively difficult unless you have strong assembly language knowledge and disassembly skills. In actual work we generally read the assembly code side by side with the C++ code, using the comments in the assembly context as a guide, which is much easier than reading the assembly code on its own. Also note that in Release builds the compiler optimizes the code heavily, so it is hard to map the C++ code one-to-one onto the optimized assembly code.

When analyzing C++ software exceptions, we use IDA only in a simple way: we open the exe or dll binary file to view the assembly code in it (IDA disassembles the binary machine code in the file into assembly code) to assist in analyzing the problem. This article does not describe the functions of IDA in detail. Interested readers can read the classic IDA book "IDA Pro Definitive Guide".

Assembly code is equivalent to binary machine code: it is a mnemonic form of machine code and is far more readable. What the CPU executes is binary machine code, which corresponds to executing the assembly code, so by viewing the assembly code you can see the specific execution details of the program.

Origin blog.csdn.net/chenlycly/article/details/132158574#comments_28048319