Notes on a .so that cannot be unloaded after dlclose

Problem Description

We have a plug-in-like feature that loads a .so with dlopen. To upgrade the .so, the program first calls dlclose and then calls dlopen again, so the .so can be replaced without restarting the program. This had always worked well, until one odd .so turned up: after it was upgraded, some functions still ran the implementation from the old .so instead of the one in the new .so.
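The upgrade flow described above can be sketched roughly as follows. This is a minimal illustration, not the original program's code: the struct plugin type, the plugin_reload function, and the assumption that the entry point has the signature int (*)(void) are all hypothetical.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical plug-in record: the dlopen handle plus one resolved entry point. */
struct plugin {
    void *handle;
    int (*entry)(void);
};

/* Close the old copy (if any), then open `path` again and resolve `symbol`.
 * Returns 0 on success, -1 on failure. */
int plugin_reload(struct plugin *p, const char *path, const char *symbol)
{
    if (p->handle != NULL) {
        /* dlclose only drops a reference; whether the library is really
         * unmapped is up to the dynamic loader. */
        if (dlclose(p->handle) != 0)
            fprintf(stderr, "dlclose: %s\n", dlerror());
        p->handle = NULL;
        p->entry = NULL;
    }

    p->handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (p->handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* POSIX idiom for storing dlsym's void * result in a function pointer. */
    *(void **)&p->entry = dlsym(p->handle, symbol);
    if (p->entry == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(p->handle);
        p->handle = NULL;
        return -1;
    }
    return 0;
}
```

Note that a successful return here does not prove that the new file was mapped. If the old copy was never unloaded, dlopen can hand back the copy that is already resident, which matches the symptom described above. On older glibc versions the program must be linked with -ldl.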

Identifying the Problem

Use the info sharedlibrary command in gdb to list the loaded shared objects, then use lsof to see which file the process has mapped:

(gdb) info sharedlibrary 
0x00007fff2132a2c0  0x00007fff213bc5f8  Yes         /xxxx/xxx/libxxx.so
root@probe:~# lsof |grep libxxx.so
nginx     3391245                              root  mem       REG              253,0    8604544    5250054 /xxx/xxx/libxxx.so

This output shows that the .so currently in use has inode 5250054. However, running stat on the file currently on disk reports inode 5377891:

root@probe:~# stat /xxx/xxx/libxxx.so
  File: /apisec/modules/component/sensitive_data/libs/libdi_rechk.so
  Size: 8604544         Blocks: 16808      IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 5377891     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2022-10-21 17:01:57.404283135 +0800
Modify: 2022-10-21 15:48:29.000000000 +0800
Change: 2022-10-21 17:00:23.423822609 +0800
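A process can also make this comparison itself by reading /proc/self/maps and calling stat on the mapped path. The sketch below is Linux-specific and assumes the usual maps line layout (address, permissions, offset, device, inode, path). The function names find_mapping and mapping_is_stale are hypothetical, and only the inode is compared, not the device number.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Find the first file-backed mapping whose path contains `needle`.
 * On success stores its inode and path and returns 0; returns -1 otherwise. */
int find_mapping(const char *needle, unsigned long *inode,
                 char *path, size_t path_len)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[4352], file[4096];
    int found = -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long ino;
        /* Skip address, perms, offset and device; keep inode and path. */
        if (sscanf(line, "%*s %*s %*s %*s %lu %4095[^\n]", &ino, file) != 2)
            continue;                 /* anonymous mapping: no path field */
        if (ino == 0 || strstr(file, needle) == NULL)
            continue;                 /* [heap], [stack], or another file */
        *inode = ino;
        snprintf(path, path_len, "%s", file);
        found = 0;
        break;
    }
    fclose(maps);
    return found;
}

/* Returns 1 if the mapped file differs from the file now on disk (or the
 * file is gone), 0 if the inodes match, -1 if nothing matching is mapped. */
int mapping_is_stale(const char *needle)
{
    unsigned long ino;
    char path[4096];
    if (find_mapping(needle, &ino, path, sizeof path) != 0)
        return -1;

    /* The kernel appends " (deleted)" when the mapped file was unlinked. */
    char *suffix = strstr(path, " (deleted)");
    if (suffix != NULL)
        *suffix = '\0';

    struct stat st;
    if (stat(path, &st) != 0)
        return 1;
    return (unsigned long)st.st_ino != ino;
}
```

Calling mapping_is_stale with the library's file name after an upgrade would report 1 in the situation shown by lsof and stat above.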

The two inode numbers differ (5250054 in lsof, 5377891 in stat), so the process is not using the file currently on disk: it still maps the old .so, which has since been deleted. Yet the program had already called dlclose and then dlopen again. Searching online turned up the suggestion to check whether the library carries the NODELETE flag, so check with readelf:

root@probe:~# readelf -d libxxx.so |grep NODELETE
 0x000000006ffffffb (FLAGS_1)            Flags: NODELETE

If a .so has this flag, the dynamic loader has been told not to unload the library, so the .so stays loaded in the process even after dlclose is called.
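The readelf check can also be done in code, for example to warn before loading a plug-in that can never be unloaded. The following is a minimal sketch that assumes a 64-bit ELF file in the host's byte order. The function name elf_has_nodelete is hypothetical; the constants DT_FLAGS_1 (0x6ffffffb, the tag shown in the readelf output) and DF_1_NODELETE come from <elf.h>.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the file's dynamic section sets DF_1_NODELETE, 0 if it does
 * not (or has no dynamic section), -1 on I/O error or unsupported format. */
int elf_has_nodelete(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    Elf64_Ehdr eh;
    int result = -1;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        goto out;

    result = 0;
    for (int i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            result = -1;
            goto out;
        }
        if (ph.p_type != PT_DYNAMIC)
            continue;

        /* Walk the dynamic entries until DT_NULL or the end of the segment. */
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0) {
            result = -1;
            goto out;
        }
        for (Elf64_Xword n = 0; n < ph.p_filesz / sizeof(Elf64_Dyn); n++) {
            Elf64_Dyn d;
            if (fread(&d, sizeof d, 1, f) != 1) {
                result = -1;
                goto out;
            }
            if (d.d_tag == DT_NULL)
                break;
            if (d.d_tag == DT_FLAGS_1 && (d.d_un.d_val & DF_1_NODELETE)) {
                result = 1;
                goto out;
            }
        }
        break;  /* only one PT_DYNAMIC segment is expected */
    }
out:
    fclose(f);
    return result;
}
```

A loader could call elf_has_nodelete on the plug-in path before dlopen and log a warning when it returns 1, since such a library cannot be hot-upgraded with dlclose followed by dlopen.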

Manual Testing

Write a small test file, test.c, with the following code:

#include <stdio.h>

int test()
{
        printf("this is test function.\n");
        return 0;
}

Compile it with the following command:

gcc -fpic -shared -o libtest.so test.c

Then use readelf to check whether the flag is present:

readelf -d libtest.so |grep NODELETE

There is no output.
Recompile with the following command and check again with readelf:

gcc -fpic -shared -znodelete -o libtest.so test.c
readelf -d libtest.so |grep NODELETE
 0x000000006ffffffb (FLAGS_1)            Flags: NODELETE

As shown, after adding the -znodelete option, the compiled .so carries the NODELETE flag.
-znodelete is an option of the linker, ld; ld --help describes what it does. With this option, the library stays resident in the process and is not removed after dlclose is called. Besides nodelete, there are other -z options such as nodlopen.
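The effect on dlclose can be observed at run time with the RTLD_NOLOAD flag of dlopen, which only returns a handle if the library is already resident. The sketch below is one way to do it, assuming glibc; the function name stays_resident_after_dlclose is hypothetical.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Open `path`, close it again, then probe whether it is still resident.
 * Returns 1 if still loaded after dlclose, 0 if it was unloaded,
 * -1 if it could not be opened at all.
 * A result of 1 can also mean that something else in the process still
 * holds a reference to the library, not only that NODELETE is set. */
int stays_resident_after_dlclose(const char *path)
{
    void *h = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (h == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }
    dlclose(h);

    /* RTLD_NOLOAD: do not load the object; succeed only if already loaded. */
    void *probe = dlopen(path, RTLD_NOLOAD | RTLD_NOW);
    if (probe == NULL)
        return 0;

    dlclose(probe);  /* drop the reference taken by the probe itself */
    return 1;
}
```

Run against the two builds of libtest.so from this section (for example with the path ./libtest.so), the build with -znodelete would be expected to report 1 and the plain build 0, provided nothing else in the process keeps the library loaded.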

Why was this .so built with the -znodelete option?

The developer of the .so told me that he had not added this option by hand. His .so is written in Go and built as a C shared library with go build -buildmode=c-shared.
Searching further, I found that shared objects generated by Go do not currently support dlclose, and the Go toolchain handles this by adding the -znodelete option (https://github.com/golang/go/commit/bd7de94d7fe8a0ba7742e90b1d6a09baa468bb58).
As of October 24, 2022, dlclose support in Go is still an open issue on GitHub: https://github.com/golang/go/issues/11100

References

https://stackoverflow.com/questions/45967961/how-to-unload-all-the-dependent-shared-libraries-from-a-process
https://www.coder.work/article/1518255
https://docs.oracle.com/cd/E19683-01/816-0210/6m6nb7mcs/index.html

Origin blog.csdn.net/EmptyStupid/article/details/127494816