Troubleshooting a crash with Windbg by analyzing the dump file that the system generated automatically, located through the system application log

Table of contents

1. Try to attach Windbg to the target process for dynamic debugging, but Windbg does not catch the exception

2. Find the dump file that the system generated automatically when the program failed, via the system application log

2.1. How to open the application log

2.2. Find the dump file automatically generated by the system in the application log

3. Use Windbg to statically analyze the dump file

3.1. Find the pdb files of the modules in the function call stack, and set the pdb file path in Windbg

3.2. Check the detailed function call stack and analyze it against the C++ source code

4. Summary


       Recently, a user reported that our software crashed frequently on a Win10 computer: the program would exit abruptly and the system would pop up a prompt saying that the program had stopped working. We tried attaching Windbg to the target process for analysis, but Windbg did not catch the exception. We then checked the application log in the system, found the dump file that the system had generated automatically, and analyzed that dump file to troubleshoot the problem. This article walks through the complete troubleshooting process.

1. Try to attach Windbg to the target process for dynamic debugging, but Windbg does not catch the exception

       According to the user's feedback, our software crashes frequently when running on their Win10 computers: the program exits abruptly and the system pops up a prompt that the program has stopped working, as shown below:

The exception capture module built into the program did not catch the crash, so it did not generate a dump file containing the exception context.

       Since no dump file was generated, we tried attaching Windbg to the target process for dynamic debugging to see whether Windbg could catch the exception. We asked the user to attach Windbg to the program right after starting it, and then to reproduce the problem by following the same steps as before. Normally, during dynamic debugging with Windbg, if the program raises an exception, Windbg catches it and breaks in, so that the problem can be analyzed.

       This problem behaved differently. When it reproduced, Windbg did not catch anything, and the system immediately popped up the prompt that the program had stopped working. While the prompt window was showing, Windbg was also frozen and could not be operated. This was the first time I had encountered such a situation, and it was very strange.

2. Find the dump file that the system generated automatically when the program failed, via the system application log

        When analyzing software crashes day to day, we rely almost entirely on Windbg: either we use it to statically analyze a dump file, or we attach it to the process for dynamic debugging. For this problem, however, neither approach was available, because no dump file had been generated and dynamic debugging did not catch the exception. That made the problem harder. Then it occurred to me that we could look in the system's application log for clues.

2.1. How to open the application log

       We connected to the user's computer with remote control software. On the Win10 desktop, right-click "This PC" and click the "Manage" item in the context menu, as shown below:

After the Computer Management window opens, expand the Event Viewer node under the System Tools node, then expand the Windows Logs node beneath it, and click the Application node. The system logs related to applications are then displayed on the right, as shown below:

        When a program raises an exception, the system usually detects it and automatically writes a related log record. So, based on the time at which the program failed, look for the log record with the matching time in the application log list. The problem occurred at about 14:55 on 2023/07/28, and the log record for that time was found in the application log list, as shown above.

2.2. Find the dump file automatically generated by the system in the application log

        Click the record at that time point, and the description of the problem appears in the details pane below:

Fault bucket, type 0
Event Name: BEX
Response: Not available
Cab Id: 0

Problem signature:
P1: XXXXXXX.exe (program name anonymized here)
P2: 7.0.0.3
P3: 648a1cec
P4: ucrtbase.dll
P5: 10.0.10586.0
P6: 5632d166
P7: 00083472
P8: c0000409
P9: 00000005
P10: 

Attached files:
C:\Users\Kvs\AppData\Local\Temp\WERF6BC.tmp.WERInternalMetadata.xml
C:\Users\Kvs\AppData\Local\Temp\WER53E1.tmp.appcompat.txt
C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_XXXXXXX.exe_a57edd93ecda26986177f6523e6d19d6506eb2_d05b1441_cab_15ff59bc\memory.hdmp
C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_XXXXXXX.exe_a57edd93ecda26986177f6523e6d19d6506eb2_d05b1441_cab_15ff59bc\triagedump.dmp
WERGenerationLog.txt

These files may be available here:
C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_XXXXXXX.exe_a57edd93ecda26986177f6523e6d19d6506eb2_d05b1441_cab_15ff59bc

Analysis symbol: 
Rechecking for solution: 0
Report Id: 08632728-2da3-493f-a30c-ffd617076fc3
Report Status: 96
Hashed bucket: 

In the log description, you can first see the fault information (event name BEX, exception code c0000409 in P8, and the faulting module ucrtbase.dll in P4), and then the full path of the dump file that the system generated automatically:

C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_XXXXXXX.exe_a57edd93ecda26986177f6523e6d19d6506eb2_d05b1441_cab_15ff59bc\triagedump.dmp

Then copy the dump file out of this path. The system should have generated it automatically when it detected the program exception, so it should contain the exception context at the moment of the failure. Statically analyzing this dump file with Windbg should therefore let us locate the problem.
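The hexadecimal fields in the problem signature above can be read before the dump is even opened. The following is a minimal C++ sketch that maps P8 and P9 to symbolic names. The helper names are hypothetical, the tables list only a few values taken from the Windows SDK headers (ntstatus.h and winnt.h), and it assumes that P9 carries the fast-fail subcode, as it commonly does for BEX reports with exception code c0000409.

```cpp
#include <cstdint>
#include <string>

// Parse a hexadecimal signature field such as "c0000409".
std::uint32_t parse_hex_field(const std::string& field) {
    return static_cast<std::uint32_t>(std::stoul(field, nullptr, 16));
}

// Map a few well-known NTSTATUS exception codes to their symbolic names.
// The list is intentionally incomplete.
std::string exception_code_name(std::uint32_t code) {
    switch (code) {
        case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
        case 0xC00000FDu: return "STATUS_STACK_OVERFLOW";
        case 0xC0000409u: return "STATUS_STACK_BUFFER_OVERRUN";
        default:          return "unknown";
    }
}

// Map a few fast-fail subcodes (FAST_FAIL_* in winnt.h) to their names.
// Assumption: the value comes from P9 of a c0000409 report.
std::string fast_fail_name(std::uint32_t code) {
    switch (code) {
        case 2:  return "FAST_FAIL_STACK_COOKIE_CHECK_FAILURE";
        case 5:  return "FAST_FAIL_INVALID_ARG";
        case 7:  return "FAST_FAIL_FATAL_APP_EXIT";
        default: return "unknown";
    }
}
```

For the log above, exception_code_name(parse_hex_field("c0000409")) yields STATUS_STACK_BUFFER_OVERRUN and fast_fail_name(parse_hex_field("00000005")) yields FAST_FAIL_INVALID_ARG. Under the stated assumption, this suggests an invalid-argument failure inside ucrtbase.dll rather than an ordinary access violation.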

3. Use Windbg to statically analyze the dump file

        Open the dump file in Windbg. First enter the .ecxr command to switch to the context of the thread in which the exception occurred, and then enter the kn command to view that thread's function call stack, as shown below:

The specific function names and code line numbers cannot be seen in the call stack, because the pdb symbol files of the modules in the call stack have not been loaded.

3.1. Find the pdb files of the modules in the function call stack, and set the pdb file path in Windbg

        The current function call stack involves three of our modules: xxlogdll.dll, xxxpsdll.dll and microblogdll.dll, so we need the pdb symbol files of these three modules. Use the lm command to view the timestamp of each module's binary file, and use that timestamp to find the matching pdb symbol file on the file server. Taking microblogdll.dll as an example, the command to view the module information is lm vm microblogdll*, and the printed module information is as follows:

The figure shows that the timestamp (build time) of the current microblogdll.dll file is 09:56:21 on June 13, 2023. Locate the folder for that build time on the file server, and take the pdb symbol file from it.
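The module timestamp that lm reports is normally a 32-bit count of seconds since 1970-01-01 UTC, stored in the binary at link time. The same kind of value appears in hexadecimal as P3 (application timestamp) and P6 (module timestamp) in the application log. The following is a minimal C++ sketch that decodes such a field. The function name is hypothetical, and it assumes the binary was not built with a reproducible-build option that replaces the timestamp with a hash.

```cpp
#include <cstdint>
#include <ctime>
#include <string>

// Convert a hexadecimal link timestamp (seconds since 1970-01-01 UTC)
// into a "YYYY-MM-DD HH:MM:SS" string in UTC.
std::string link_timestamp_to_utc(const std::string& hex_field) {
    const std::time_t seconds =
        static_cast<std::time_t>(std::stoul(hex_field, nullptr, 16));
    const std::tm* utc = std::gmtime(&seconds);
    if (utc == nullptr) {
        return "";  // value could not be converted
    }
    char text[32] = {0};
    std::strftime(text, sizeof(text), "%Y-%m-%d %H:%M:%S", utc);
    return text;
}
```

For example, link_timestamp_to_utc("648a1cec") decodes the P3 value from the log to 2023-06-14 20:02:52 UTC. Windbg typically displays module timestamps in the local time zone of the debugging machine, so allow for the time zone offset when comparing against folder names on the file server.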

        Find the pdb symbol files of the three modules above, copy them to the desktop folder C:\Users\Administrator\Desktop\pdbdir, and then set the pdb file path in Windbg. The path is set as follows:

C:\Users\Administrator\Desktop\pdbdir;srv*f:\mss0616*http://msdl.microsoft.com/download/symbols

Here, C:\Users\Administrator\Desktop\pdbdir is the folder that holds the pdb symbol files of our business libraries.

srv*f:\mss0616*http://msdl.microsoft.com/download/symbols tells Windbg to download the pdb files of the Microsoft system libraries online: http://msdl.microsoft.com/download/symbols is the address of the symbol download server, and f:\mss0616 is the local folder in which the downloaded pdb files are stored.

        We need the online symbol server for the Windows system libraries because the function call stack above contains the system module ucrtbase.dll, and we want to see the specific function calls inside that module. Being able to see the specific function calls in system modules often makes the analysis easier.
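To make the structure of the symbol path explicit, here is a small C++ sketch that splits a path of this form into its elements. It is only an illustration of the layout described above, not how Windbg itself parses the string. The type and function names are hypothetical, and it recognizes only plain folders and the three-field form srv*<local folder>*<server address>.

```cpp
#include <string>
#include <vector>

struct SymbolPathEntry {
    bool is_symbol_server;   // true for "srv*folder*address" elements
    std::string local_path;  // plain folder, or local folder for downloads
    std::string server_url;  // empty for plain folders
};

// Split text on a separator character, keeping empty fields.
std::vector<std::string> split(const std::string& text, char sep) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = text.find(sep, start);
        if (pos == std::string::npos) {
            parts.push_back(text.substr(start));
            break;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    return parts;
}

// Elements are separated by ';'. A symbol server element has the fields
// "srv", local folder, and server address, separated by '*'.
std::vector<SymbolPathEntry> parse_symbol_path(const std::string& path) {
    std::vector<SymbolPathEntry> entries;
    for (const std::string& element : split(path, ';')) {
        if (element.empty()) {
            continue;
        }
        const std::vector<std::string> fields = split(element, '*');
        if (fields.size() == 3 && (fields[0] == "srv" || fields[0] == "SRV")) {
            entries.push_back({true, fields[1], fields[2]});
        } else {
            entries.push_back({false, element, ""});
        }
    }
    return entries;
}
```

Applied to the path above, the sketch returns two entries: the plain folder pdbdir, and a symbol server entry whose local folder is f:\mss0616 and whose server address is http://msdl.microsoft.com/download/symbols.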

3.2. Check the detailed function call stack and analyze it against the C++ source code

        After the pdb files are loaded, the complete function call stack is visible, including the specific function names and the C++ source line numbers, as shown below:

The system library function at the top of the stack, ucrtbase!_invalid_parameter, indicates that an invalid parameter was detected in the program and that this invalid parameter triggered the exception. Moving up the call stack, the invalid parameter was detected while the C function vsnprintf was formatting data. The fault is certainly not in the system function vsnprintf itself. Further up are the log printing interfaces, and above them is the upper-level function that calls the log printing interface, lohttp::CloHttp::OnHttpStackCb. The relevant code fragment is as follows:

In the code above, the MiCROBLOG_LOG interface is called to print the string held in a char buffer as a log line. The format specifier used is %s, and the argument p passed for it is of the matching type, so the call itself looks correct. If something is wrong, it is probably the data in the memory that p points to, which causes the exception when the underlying function formats it.

       This code fragment belongs to a module maintained by the component group, so we sent the function call stack and the related information to our colleagues in that group for further investigation. There is also a simple workaround: comment out the line that calls the MiCROBLOG_LOG interface in the lohttp::CloHttp::OnHttpStackCb function. That only sidesteps the crash, though, and it is still necessary to find out why the call fails.
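The root cause was not yet established at this point, so the following is only a defensive C++ sketch of how a received buffer can be formatted for logging without trusting its content. It is not the implementation of MiCROBLOG_LOG. The function name and the 200-byte cap are hypothetical. The sketch relies on two standard properties of the printf family: %.*s reads at most the given number of bytes, so the buffer does not need a terminating null character, and data passed as an argument is never interpreted as a format string.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Format a received buffer for logging without assuming that it is
// null-terminated. "%.*s" reads at most `precision` bytes from `data`,
// and the payload is passed as an argument, never as the format string.
std::string format_payload_for_log(const char* data, std::size_t length) {
    if (data == nullptr) {
        return "payload: (null)";
    }
    char line[256];
    // Cap the logged length (arbitrary limit chosen for this sketch).
    const int precision = length > 200 ? 200 : static_cast<int>(length);
    std::snprintf(line, sizeof(line), "payload: %.*s", precision, data);
    return line;
}
```

With this approach, a payload that contains percent signs, such as URL-encoded HTTP data, is logged literally. If any layer of a logging chain passes already formatted text on as a format string, such data can be misread as format specifiers, which is one possible way to end up in an invalid parameter report.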

4. Summary

       This kind of problem is relatively rare, so I have recorded it in detail here. Normally, when Windbg is attached to a process for dynamic debugging and the program raises an exception, Windbg catches it and breaks in. In this case, the system prompt saying that the program had stopped working popped up directly, and Windbg did not catch anything. When you run into this situation, check the related records in the system's application log. You may find the dump file, containing the exception context, that the system generated automatically when the program failed, and you can then statically analyze that dump file with Windbg to locate the problem.

Origin blog.csdn.net/chenlycly/article/details/132024253