[Problem record] On Ubuntu 22.04, a program reports "Segmentation fault (core dumped)": how to use the core file and the GDB debugger to track it down

Table of contents

Environment

Problem description

Approach

Cause analysis

Solution

Extra knowledge


Environment

  • VMware® Workstation 16 Pro (Version: 16.1.2 build-17966106)
  • ubuntu-22.04.2-desktop-amd64

Problem description

  • When I ran a server program built for million-level concurrency, it printed "Segmentation fault (core dumped)" and exited abnormally.

Approach

  • First, determine where core dump files are written and what size limit applies to them. Then use a debugger such as GDB to analyze the core dump and its stack trace, and fix the code that caused the segmentation fault.

Cause Analysis

1. What is a segmentation fault?

  • A segmentation fault is a common program error that occurs when a program accesses memory it is not allowed to access, such as an unmapped address or a read-only page it tries to write. The kernel responds by sending the process the SIGSEGV signal, whose default action is to terminate the process and, if core dumps are enabled, write a core dump.
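One way to see the signal from the shell's side is the exit status: a shell reports a child killed by signal N as 128 + N, and SIGSEGV is signal 11 on Linux, hence 139. The sketch below does not crash a real program. It assumes a POSIX sh and simply has a child shell send SIGSEGV to itself; the variable names are illustrative.

```shell
#!/bin/sh
# The child shell disables core dumps for itself (so this demo leaves no
# core file behind) and then sends SIGSEGV to its own PID.
status=0
sh -c 'ulimit -c 0; kill -s SEGV $$' 2>/dev/null || status=$?

# A status above 128 means "terminated by signal (status - 128)".
signal=$((status - 128))
signal_name=$(kill -l "$signal")

echo "exit status: $status -> signal $signal (SIG$signal_name)"
```

The same status is what a shell sees after a real program crashes with a segmentation fault.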

2. Possible causes of a segmentation fault

  • Invalid memory access: the most common cause is dereferencing an invalid address, such as a NULL, uninitialized, or dangling pointer, or reading or writing past the end of a buffer or array. The kernel raises a segmentation fault when the program touches memory it is not permitted to access.
  • Invalid jump or instruction: a corrupted function pointer or return address can send execution to memory that is not executable, which also produces a segmentation fault. (An illegal instruction in otherwise valid memory normally raises SIGILL rather than SIGSEGV.)
  • Dynamic memory management errors: with malloc or new, freeing the same block twice or using memory after it has been freed corrupts the heap and can lead to a segfault. A memory leak does not crash a program by itself, but once malloc fails and returns NULL, using the unchecked result does.
  • Stack overflow: if the program's stack grows beyond its allowed size, for example through infinite recursion or very large local variables, a segmentation fault occurs.
  • Library or dependency issues: segmentation faults can also be caused by broken libraries, incompatible library versions, or misuse of a library's API.
  • Hardware problems: although rare, hardware failures such as faulty RAM can also cause a program to report a segmentation fault and generate a core dump.

3. Where is the core dumped?

  • When a program segfaults and core dumps are enabled, the kernel writes a core dump file, by default named core or core.<PID>, which contains the process's memory image and other state at the moment of the crash. By default this file is written to the process's current working directory.
  • My core file, however, was not generated in the program's working directory; see the solution below.

Solution

1. Check the core dump file generation settings of the operating system

  • Use the command ulimit -a to view the current core file size limit along with the other resource limits.
  • Look for the "core file size" field in the output. A value of 0 means core file generation is currently disabled. Raising this limit enables core dump files.
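The same check can be scripted. This sketch reads the soft and hard limits separately (ulimit -S -c and ulimit -H -c), since ulimit -a reports the soft values; the variable names are illustrative.

```shell
#!/bin/sh
soft_core=$(ulimit -S -c)   # limit currently in effect for this shell
hard_core=$(ulimit -H -c)   # ceiling the soft limit may be raised to

echo "core file size: soft=$soft_core hard=$hard_core"

# A soft limit of 0 is what disables core files.
if [ "$soft_core" = "0" ]; then
    core_dumps_enabled=no
else
    core_dumps_enabled=yes
fi
echo "core dumps enabled in this shell: $core_dumps_enabled"
```

If the hard limit is also 0, an unprivileged user cannot raise the soft limit in that session and the system-level configuration has to change first.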

2. Change the "core file size" limit to enable core dump file generation

  • The core file size limit can be set to unlimited with the command ulimit -c unlimited. However, limits set with ulimit apply only to the current shell process and the programs started from it, that is, the current session. Once the terminal window is closed, the setting is lost, so this change is not permanent. ( not recommended )
  • To change the core file size limit permanently at the system level, edit /etc/security/limits.conf and add or modify the following two lines:
    • *    soft    core    unlimited
    • *    hard    core    unlimited
  • Restart the virtual machine so that the new limits are loaded ( restart command: sudo reboot )
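If the setting is only needed for a single run, a middle ground between the two options above is to raise the limit inside a subshell, so the rest of the session is untouched. A minimal sketch: it raises the soft limit to whatever the hard limit allows (which may be less than unlimited), and ./your_program is a placeholder, not a file from this article.

```shell
#!/bin/sh
hard_core=$(ulimit -H -c)
before=$(ulimit -S -c)

# Command substitution runs in a subshell, so the raised limit is scoped to it.
# The soft limit can always be raised up to the hard limit without root.
inside=$(ulimit -S -c "$hard_core" && ulimit -S -c)

after=$(ulimit -S -c)
echo "soft limit before: $before, inside subshell: $inside, after: $after"

# Typical use with a real program (placeholder name):
#   ( ulimit -S -c "$(ulimit -H -c)"; ./your_program )
```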

3. Test it directly

  • I did not want to run the server program again because it takes too long, so I wrote a small test example instead.
  • After running the test example, no core file was generated in the project directory.
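The original test listing is not reproduced here. As a stand-in, the sketch below writes a tiny C program to a temporary directory, compiles it with gcc -g, and runs it. It is an assumption, not the author's code: it writes through a NULL pointer so that the crash is deterministic, whereas the pointer in the author's test was left uninitialized. The file name segv_test.c is hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Minimal crashing program: a write through a NULL pointer.
cat > "$workdir/segv_test.c" <<'EOF'
#include <stddef.h>

int main(void)
{
    int *p = NULL;   /* points at no valid object */
    *p = 42;         /* invalid write: the kernel delivers SIGSEGV here */
    return 0;
}
EOF

if command -v gcc >/dev/null 2>&1; then
    # -g keeps debug symbols so GDB can later map the crash to a source line.
    gcc -g -O0 -o "$workdir/segv_test" "$workdir/segv_test.c"
    status=0
    (cd "$workdir" && ./segv_test) || status=$?
    echo "exit status: $status"   # 128 + 11 (SIGSEGV) when the crash occurs
else
    status=skipped
    echo "gcc not found; source left in $workdir/segv_test.c"
fi
```

Whether a core file appears next to the binary afterwards depends on the core file size limit and on kernel.core_pattern.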

4. Determine the generation path of the core file

  • After some searching, I found that the Linux kernel has a parameter, kernel.core_pattern, which specifies the file name and path pattern used when a core dump file is generated. It is exposed as the file /proc/sys/kernel/core_pattern, and the sysctl command can be used to inspect and change it.
  • Use the command sysctl kernel.core_pattern to view the current core dump pattern. On my system it printed the following line:
    • kernel.core_pattern = |/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E
    • What does the above line mean? The explanation is as follows:
      • |/usr/share/apport/apport: The leading | tells the kernel to pipe the core dump to a program instead of writing a file. Here that program is apport, a tool for collecting and reporting crashes. When a process is killed by SIGSEGV or a similar signal, the kernel starts the handler named in kernel.core_pattern and feeds it the core dump on standard input.
      • %p: PID of the dumped process, as seen in the process's own PID namespace.
      • %s: Number of the signal that caused the dump.
      • %c: Core file size soft limit of the crashing process.
      • %d: Dump mode, the same value that prctl(2) PR_GET_DUMPABLE returns.
      • %P: PID of the dumped process, as seen in the initial PID namespace.
      • %u: Numeric real user ID (UID) of the dumped process.
      • %g: Numeric real group ID (GID) of the dumped process.
      • %E: Path name of the executable, with each slash (/) replaced by an exclamation mark (!).
    • In short, /usr/share/apport/apport is Ubuntu's tool for collecting information about crashes and generating the corresponding error reports, so the core dump is handed to it rather than written to the working directory.
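How the kernel treats the value depends on its first character: a leading | pipes the dump to a program, a leading / is an absolute path template, and anything else is a file name relative to the crashed process's working directory. The helper below is a small sketch of that rule; classify_core_pattern is a hypothetical name.

```shell
#!/bin/sh
# Print how the kernel will interpret a given core_pattern value.
classify_core_pattern() {
    case "$1" in
        '|'*) echo "pipe" ;;       # dump is piped to a handler program
        /*)   echo "absolute" ;;   # written under a fixed directory
        *)    echo "relative" ;;   # written relative to the process's cwd
    esac
}

# Classify the live setting when procfs exposes it.
if [ -r /proc/sys/kernel/core_pattern ]; then
    current=$(cat /proc/sys/kernel/core_pattern)
    echo "current pattern: $current ($(classify_core_pattern "$current"))"
fi
```

On a stock Ubuntu desktop with apport active, the live setting falls into the "pipe" case, which explains why no core file shows up next to the program.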

5. Modify the generation path of the core file

  • The pattern can be changed with the command sudo sysctl -w kernel.core_pattern=<path_to_directory>/core. Make sure <path_to_directory> is a valid, writable directory. To write the core file to the crashing program's current working directory instead, use a bare file name, for example:
    • sudo sysctl -w kernel.core_pattern=core
  • Run the test program again. This time a core dump file named core.<PID> is generated in the current directory. (The .<PID> suffix is appended when kernel.core_uses_pid is non-zero and the pattern contains no %p.)
  • Note: a kernel.core_pattern value changed this way takes effect only for the running system and is not permanent. After a reboot it is reset to the default value.
  • For more detail, see the last section, "Extra knowledge".

6. Use the debugging tool gdb to load and analyze the core file

  • Build with debug information: remember to pass the -g option when compiling with gcc, so that GDB can map addresses in the core file back to source lines and variable names.
  • Load the core file: start gdb with both the executable and the core file so that the core is loaded into the debugging session.
    • gdb <path to executable file> <path to core file>
  • View the stack trace: once gdb has started, use the bt command (short for backtrace) to view the stack trace, which shows the chain of function calls at the time of the crash.
    • (gdb) bt
  • Check variable values: use the print command with a variable name to see the value it had at the time of the crash.
    • (gdb) print variable_name
  • Jump to a specific frame: use the frame command to move between stack frames and inspect a particular one. Frames are numbered from 0, where frame 0 is the innermost frame (the function that was executing when the program crashed) and higher numbers are its callers.
    • (gdb) frame frame_number
  • Analyze the cause: the backtrace and variable values together help locate the cause of the crash. Typically frame 0 gives the location of the crash; if it lies inside a library function, walk up to the first frame in your own code.
  • In my test, the analysis showed that the pointer P was dereferenced (*P) without ever being initialized.
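The same inspection can also be run non-interactively, which is handy for saving a crash report to a file. A sketch using GDB's -batch and -ex options; the function name summarize_core and the example paths are placeholders.

```shell
#!/bin/sh
# Usage: summarize_core <executable> <core file>
summarize_core() {
    if [ "$#" -ne 2 ]; then
        echo "usage: summarize_core <executable> <core file>" >&2
        return 2
    fi
    # -batch exits after the commands finish; each -ex runs one GDB command in order.
    gdb -batch \
        -ex 'bt' \
        -ex 'frame 0' \
        -ex 'info locals' \
        "$1" "$2"
}

# Example (placeholder paths):
#   summarize_core ./segv_test ./core.12345 > crash_report.txt
```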

Extra knowledge

1. Can write permission be added to /proc/sys/kernel/core_pattern?

  • The answer is no. By default the owner (root) has read and write permission, while the group and other users have read permission only.
  • Write permission cannot be added to /proc/sys/kernel/core_pattern directly. This is because the /proc directory and the files under it belong to a virtual file system (procfs) that exposes kernel and process information. Their permissions and ownership are controlled by the kernel rather than stored on disk like those of ordinary files.
  • The permissions of files and directories under /proc are fixed by the kernel, and users are not allowed to modify them directly. This protects the integrity and consistency of the information and prevents unauthorized changes to kernel and process state.
  • Therefore, there is no way to add write permission to /proc/sys/kernel/core_pattern, or otherwise change its permissions, with chmod or similar commands. Trying a command like the following produces an error:
    • sudo chmod +w /proc/sys/kernel/core_pattern
  • The error message is similar to "Operation not permitted".

2. Why are changes to /proc/sys/kernel/core_pattern reset to default after a system reboot?

  • This is because the files under /proc/sys/ are not stored on disk. They are generated by the kernel at startup, and their values come from kernel defaults or other system settings. After a reboot, each file holds its default value again, or a value applied by a configuration file or service during boot.

3. How can the change to /proc/sys/kernel/core_pattern be made permanent? (This part has problems)

  • Edit the /etc/sysctl.conf file: ( This method did not work well for me. After each system restart, I still had to run sudo sysctl -p before /proc/sys/kernel/core_pattern showed the new value. A likely reason on Ubuntu is that the apport service writes its own pattern when it starts during boot, replacing the value loaded from /etc/sysctl.conf. )
    • Edit /etc/sysctl.conf and add the core dump pattern setting as follows:
      • kernel.core_pattern = core
    • After saving and exiting, reload the configuration with the following command so that the new core dump pattern takes effect:
      • sudo sysctl -p
  • Create a system startup script: write a script that sets the core dump pattern to the desired value at boot, place it in an appropriate location such as the /etc/init.d/ directory, and register it to run at system startup. ( Tested to no avail )
    • Create the startup script file in the chosen directory:
      • sudo vim /etc/init.d/my_startup_script.sh
    • Write the startup script: add the following content, then save and exit.
      • #!/bin/bash
      • echo "core" > /proc/sys/kernel/core_pattern
      • exit 0
    • Give the script execute permission with the following command:
      • sudo chmod +x /etc/init.d/my_startup_script.sh
    • Register the startup script so that it runs when the system starts:
      • sudo update-rc.d my_startup_script.sh defaults
    • To stop the script from running at startup, use the following command (for reference):
      • sudo update-rc.d -f my_startup_script.sh remove
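A hedged alternative on systemd-based Ubuntu releases is a drop-in file under /etc/sysctl.d/, combined with disabling apport so that it does not overwrite the value at boot. The sketch below only stages the file in a temporary directory and prints it. The file name 60-core-pattern.conf is hypothetical, and the install commands are left as comments because they need root and were not part of the original tests.

```shell
#!/bin/sh
dropin_name="60-core-pattern.conf"   # hypothetical file name
staging_dir=$(mktemp -d)
dropin_path="$staging_dir/$dropin_name"

# Same "key = value" syntax as /etc/sysctl.conf.
printf 'kernel.core_pattern = core\n' > "$dropin_path"
cat "$dropin_path"

# To apply (requires root):
#   sudo install -m 0644 "$dropin_path" /etc/sysctl.d/
#   sudo systemctl disable --now apport.service   # assumption: apport is what resets the value
#   sudo sysctl --system                          # reload all sysctl configuration files
```

After the next reboot, sysctl kernel.core_pattern shows whether the value survived.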

Notice

  • Changing the permissions or content of /proc/sys/kernel/core_pattern is a sensitive operation that may affect the stability and security of the system. Be careful, and make sure you understand the impact of every change you make.

Origin blog.csdn.net/weixin_43729127/article/details/131856080