Dynamic library content analysis
1. Source files for the dynamic libraries
The basic idea is:
- First, write two C files, each defining several functions and variables, and compile each of them into a dynamic library;
- Then write a third C file that implements the main function and calls the functions from the dynamic libraries built in the first step;
- Finally, analyze the symbol tables of the resulting executable and of the dynamic libraries.
1.1 The first C file: basic.c
This C file defines four functions with different parameter lists, five static variables (one at file scope and one inside each function), and one global variable. Since we only care about the symbol table and other binary contents, the function bodies are left essentially empty.
/*************************************************************************
> File Name: basic.c
> Author: Toney Sun
> Mail: [email protected]
> Created Time: Mon 20 Apr 2020 09:50:51
************************************************************************/
#include<stdio.h>
int basic_func=4;
static char *Author="Toney Sun";
void func1()
{
int tmp_var;
static char *Mail="[email protected]";
}
void func2(int x)
{
static char *Mail="[email protected]";
}
int func3(char *a)
{
static char *Mail="[email protected]";
}
char * func4(int x, int y)
{
static char *Mail="[email protected]";
}
1.2 The second C file: demo.c
demo.c defines a structure type udphdr and an enumeration type Date, three global variables (udp1, today, and iphdr1), and three functions: fun5, fun6, and fun7, which call func1, func2, and func3 from basic.c.
/*************************************************************************
> File Name: demo.c
> Author: Toney Sun
> Mail: [email protected]
> Created Time: Sun 19 Apr 2020 22:33:39
************************************************************************/
#include<stdio.h>
struct udphdr{
short dstport;
short srcport;
short checksum;
short length;
};
enum Date{
Monday,
Tuesday,
Wensday,
Thursday,
Friday,
Saturday,
Sunday,
};
struct udphdr udp1;
enum Date today = Monday;
int iphdr1=10;
extern void func1();
extern void func2(int x);
extern int func3(char *);
extern char * func4(int x, int y);
int fun5(int a)
{
struct udphdr udp2;
func1();
printf("aaaaaaaaaaa\n");
}
int fun6(char *a)
{
static struct udphdr udp1;
func2(10);
printf("aaaaaaaaaaa\n");
}
int fun7(int a, char *b)
{
func3("test");
printf("aaaaaaaaaaa\n");
}
1.3 The third C file: main.c
main.c implements the main function and references functions and global variables defined in the other C files, so that we can observe how the different kinds of functions and variables appear in the symbol table.
/*************************************************************************
> File Name: main.c
> Author: Toney Sun
> Mail: [email protected]
> Created Time: Mon 20 Apr 2020 09:44:38
************************************************************************/
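#include<stdio.h>
/* Same layout as struct udphdr in demo.c; needed to access the members of udp1 */
struct udphdr{
short dstport;
short srcport;
short checksum;
short length;
};
extern void func1();
extern int fun5(int a);
extern int fun6(char *a);
extern int fun7(int a, char *b);
/* Both variables are defined in demo.c (libdemo.so) */
extern struct udphdr udp1;
extern int iphdr1;
int myAge=25;
/* static, matching the 'd' entry for mail in the nm output of a.out */
static char *mail="[email protected]";
int show()
{
printf("Author: Toney Sun\n");
}
int main(int argc, char **argv)
{
int a=10;
int b=11;
fun5(a);
fun6("aaaaa");
fun7(a, "Toney Sun");
show();
udp1.srcport=4500;
iphdr1=10;
return 0;
}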
2. Dynamic library compilation
- Use gcc to compile basic.c into libbasic.so;
- Use gcc to compile demo.c into libdemo.so;
- Use gcc to compile main.c and link it against the two dynamic libraries above, producing a.out.
**Note:** We are not trying to run a.out; we only want to inspect the contents (symbol tables) of the three binaries above.
toney@ubantu$ gcc -shared -fpic -o libdemo.so demo.c
toney@ubantu$ gcc -shared -fpic -o libbasic.so basic.c
toney@ubantu$ gcc main.c -L./ -ldemo -lbasic
toney@ubantu$ ls -l
total 35
-rwxrwxrwx 1 root root 8552 4月 20 10:18 a.out
-rwxrwxrwx 1 root root 454 4月 20 09:52 basic.c
-rwxrwxrwx 1 root root 763 4月 20 09:49 demo.c
-rwxrwxrwx 1 root root 8016 4月 20 09:43 demo.so
-rwxrwxrwx 1 root root 7528 4月 20 10:17 libbasic.so
-rwxrwxrwx 1 root root 8128 4月 20 10:17 libdemo.so
-rwxrwxrwx 1 root root 846 4月 20 10:18 main.c
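Note that the order of the libraries on the link line matters: putting -lbasic before -ldemo fails. A likely explanation is that the toolchain links with --as-needed by default (as Ubuntu's gcc does), so libbasic.so is dropped because nothing has referenced it yet when it is scanned, and the references from libdemo.so to func1, func2, and func3 are then left unresolved: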
toney@ubantu$ gcc main.c -L./ -lbasic -ldemo
.//libdemo.so: undefined reference to `func3'
.//libdemo.so: undefined reference to `func1'
.//libdemo.so: undefined reference to `func2'
collect2: error: ld returned 1 exit status
toney@ubantu$ gcc main.c -L./ -ldemo -lbasic
toney@ubantu$
3. Binary content analysis
3.1 libbasic.so analysis
3.1.1 Summary of basic.c content
No. | Function or variable | Kind |
---|---|---|
1 | void func1() | Function defined in this file |
2 | void func2(int x) | Function defined in this file |
3 | int func3(char *a) | Function defined in this file |
4 | char * func4(int x, int y) | Function defined in this file |
5 | int basic_func; | Global variable |
6 | static char *Author; | File-scope static variable |
7 | static char *Mail; | Local static variable in func1 |
8 | static char *Mail; | Local static variable in func2 |
9 | static char *Mail; | Local static variable in func3 |
10 | static char *Mail; | Local static variable in func4 |
3.1.2 libbasic.so symbol table
- Use the nm tool to view the symbol table (other tools such as objdump or readelf work as well):
toney@ubantu$ nm libbasic.so
0000000000201028 d Author ==================file-scope static variable=================
0000000000201020 D basic_func ====================global variable==================
0000000000201050 B __bss_start
0000000000201050 b completed.7698
w __cxa_finalize
00000000000005a0 t deregister_tm_clones
0000000000000630 t __do_global_dtors_aux
0000000000200e88 t __do_global_dtors_aux_fini_array_entry
0000000000201018 d __dso_handle
0000000000200e90 d _DYNAMIC
0000000000201050 D _edata
0000000000201058 B _end
00000000000006a4 T _fini
0000000000000670 t frame_dummy
0000000000200e80 t __frame_dummy_init_array_entry
00000000000007e8 r __FRAME_END__
000000000000067a T func1 ===================defined function===================
0000000000000681 T func2 ===================defined function===================
000000000000068b T func3 ===================defined function===================
0000000000000696 T func4 ===================defined function===================
0000000000201000 d _GLOBAL_OFFSET_TABLE_
w __gmon_start__
00000000000006d0 r __GNU_EH_FRAME_HDR
0000000000000568 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
0000000000201030 d Mail.2252 ================local static variable===================
0000000000201038 d Mail.2256 ================local static variable===================
0000000000201040 d Mail.2260 ================local static variable===================
0000000000201048 d Mail.2265 ================local static variable===================
00000000000005e0 t register_tm_clones
0000000000201050 d __TMC_END__
- Use nm -Da to view the dynamic symbol table, which contains only the symbols this dynamic library exports or imports:
toney@ubantu$ nm -Da libbasic.so
0000000000201020 D basic_func ----------------1----------------
0000000000201050 B __bss_start
w __cxa_finalize
0000000000201050 D _edata
0000000000201058 B _end
00000000000006a4 T _fini
000000000000067a T func1 ----------------2----------------
0000000000000681 T func2 ----------------3----------------
000000000000068b T func3 ----------------4----------------
0000000000000696 T func4 ----------------5----------------
w __gmon_start__
0000000000000568 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
3.1.3 Summary
- Global variables defined in this file are marked with 'D';
- File-scope static variables defined in this file are marked with 'd';
- Functions defined in this file are marked with 'T';
- Static variables with the same name defined in different functions get distinct symbols in the symbol table (Mail.2252, Mail.2256, and so on), so they cannot be confused with one another.
- From the above we can infer that symbols marked with a lowercase 'd' cannot be referenced from other files. This is consistent with 'd' marking static variables, and with the nm -Da output, which lists none of them.
3.2 libdemo.so analysis
3.2.1 Summary of demo.c content
No. | Function or variable | Kind |
---|---|---|
1 | struct udphdr udp1; | Global structure variable |
2 | static struct udphdr udp1; | Local static structure variable in fun6 |
3 | struct udphdr udp2; | Local variable in fun5 |
4 | enum Date today | Global enumeration variable |
5 | int iphdr1; | Global variable |
6 | extern void func1 (); | External function declaration |
7 | extern void func2(int x); | External function declaration |
8 | extern int func3(char *); | External function declaration |
9 | extern char * func4 (int x, int y); | External function declaration |
10 | int fun5(int a) | Function defined in this file |
11 | int fun6(char *a) | Function defined in this file |
12 | int fun7(int a, char *b) | Function defined in this file |
3.2.2 libdemo.so symbol table
Similarly, we use the nm tool to view the symbol table of this dynamic library:
toney@ubantu$ nm libdemo.so
0000000000201044 B __bss_start
0000000000201048 b completed.7698
w __cxa_finalize@@GLIBC_2.2.5
00000000000006b0 t deregister_tm_clones
0000000000000740 t __do_global_dtors_aux
0000000000200e18 t __do_global_dtors_aux_fini_array_entry
0000000000201038 d __dso_handle
0000000000200e20 d _DYNAMIC
0000000000201044 D _edata
0000000000201068 B _end
0000000000000800 T _fini
0000000000000780 t frame_dummy
0000000000200e10 t __frame_dummy_init_array_entry
0000000000000908 r __FRAME_END__
000000000000078a T fun5 =============================
00000000000007ae T fun6 =============================
00000000000007d3 T fun7 =============================
U func1 =============================
U func2 =============================
U func3 =============================
0000000000201000 d _GLOBAL_OFFSET_TABLE_
w __gmon_start__
000000000000081c r __GNU_EH_FRAME_HDR
0000000000000638 T _init
0000000000201040 D iphdr1 =============================
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
U puts@@GLIBC_2.2.5
00000000000006f0 t register_tm_clones
0000000000201048 d __TMC_END__
0000000000201050 B today =============================
0000000000201060 B udp1 =============================
0000000000201058 b udp1.2278 =============================
3.2.3 Summary
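- Variables placed in the .bss section are marked with 'B' or 'b'. What decides this is the initializer, not the type: udp1 has no initializer and today is initialized to Monday (0), so with GCC's default settings both end up in .bss, while iphdr1 is initialized to 10 and is marked 'D'.
- Global variables in .bss are marked with 'B'
- Local static variables in .bss (udp1.2278, defined in fun6) are marked with 'b'
- Local variables are not displayed in the symbol table
- Functions defined in this file are marked with 'T'
- Functions referenced from other files are marked with 'U'. func4 is declared but never called, so it does not appear.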
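3.3 Analysis of the executable a.out
3.3.1 Summary of main.c content
No. | Function or variable | Kind |
---|---|---|
1 | extern void func1(); | External function declaration |
2 | extern int fun5(int a); | External function declaration |
3 | extern int fun6(char *a); | External function declaration |
4 | extern int fun7(int a, char *b); | External function declaration |
5 | extern struct udphdr udp1; | External structure variable declaration |
6 | extern int iphdr1; | External basic-type variable declaration |
7 | int myAge=25; | Global variable defined in this file |
8 | static char *mail="[email protected]"; | File-scope static variable defined in this file |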
3.3.2 a.out symbol table
Again, use the nm tool to view it:
toney@ubantu$ nm a.out
0000000000201020 B __bss_start
0000000000201030 b completed.7698
w __cxa_finalize@@GLIBC_2.2.5
0000000000201000 D __data_start
0000000000201000 W data_start
00000000000007a0 t deregister_tm_clones
0000000000000830 t __do_global_dtors_aux
0000000000200d88 t __do_global_dtors_aux_fini_array_entry
0000000000201008 D __dso_handle
0000000000200d90 d _DYNAMIC
0000000000201020 D _edata
0000000000201038 B _end
0000000000000974 T _fini
0000000000000870 t frame_dummy
0000000000200d80 t __frame_dummy_init_array_entry
0000000000000b2c r __FRAME_END__
U fun5 ============1==============
U fun6 ============2==============
U fun7 ============3==============
0000000000200fa0 d _GLOBAL_OFFSET_TABLE_
w __gmon_start__
00000000000009c0 r __GNU_EH_FRAME_HDR
00000000000006f8 T _init
0000000000200d88 t __init_array_end
0000000000200d80 t __init_array_start
0000000000000980 R _IO_stdin_used
0000000000201020 B iphdr1 ============4=============
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
0000000000000970 T __libc_csu_fini
0000000000000900 T __libc_csu_init
U __libc_start_main@@GLIBC_2.2.5
0000000000201018 d mail ============5==============
000000000000088d T main ============6==============
0000000000201010 D myAge ============7==============
U puts@@GLIBC_2.2.5
00000000000007e0 t register_tm_clones
000000000000087a T show ============8==============
0000000000000770 T _start
0000000000201020 D __TMC_END__
0000000000201028 B udp1 ============9==============
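3.3.3 Summary
- Referenced external functions are marked with 'U'. func1 is declared but never called in main.c, so it does not appear.
- Global variables defined in this file are marked with 'D'
- File-scope static variables are marked with 'd'
- Referenced external global variables (of both simple and compound types) are marked with 'B'. This is most likely the effect of copy relocations: the linker reserves space for them in the executable's .bss, and the dynamic loader copies the initial values in at startup.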
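4. Summary
The common variable and function markers in the symbol table are summarized below:
No. | Marker | Meaning |
---|---|---|
1 | T | Function defined in this file and visible to other files |
2 | t | Local symbol in the code section, such as a static function (not analyzed further in this article) |
3 | D | Global variable with a nonzero initializer (.data section) |
4 | d | Static variable with a nonzero initializer (file-scope or local static) |
5 | B | Global variable that is uninitialized or zero-initialized (.bss section); in an executable, also an external global variable referenced from a dynamic library |
6 | b | Static variable that is uninitialized or zero-initialized (file-scope or local static) |
7 | U | Referenced external function |
8 | Local variables do not appear in the symbol table. | |
… | … | … |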