I recently wrote Aspect, a pure Swift library modeled on the well-known Aspects, which is built on runtime method swizzling. I had heard that fishhook can dynamically rebind C functions, so I studied it a little. To understand fishhook, however, you need to understand Mach-O, which has long been a blind spot for me, so this time I took a while to digest it. Apple's source code can be seen here.
Introduction to Mach-O
Mach-O, short for Mach object file format, is a file format for executables, object code, shared libraries, dynamically loaded code, and core dumps. It is an alternative to the a.out format, offering greater extensibility and faster access to symbol table information.
/*
* Constants for the filetype field of the mach_header
*/
#define MH_OBJECT 0x1 /* relocatable object file */
#define MH_EXECUTE 0x2 /* demand paged executable file */
#define MH_FVMLIB 0x3 /* fixed VM shared library file */
#define MH_CORE 0x4 /* core file */
#define MH_PRELOAD 0x5 /* preloaded executable file */
#define MH_DYLIB 0x6 /* dynamically bound shared library */
#define MH_DYLINKER 0x7 /* dynamic link editor */
#define MH_BUNDLE 0x8 /* dynamically bound bundle file */
#define MH_DYLIB_STUB 0x9 /* shared library stub for static */
/* linking only, no section contents */
#define MH_DSYM 0xa /* companion file with only debug */
/* sections */
#define MH_KEXT_BUNDLE 0xb /* x86_64 kexts */
As the constants show, Mach-O covers a variety of file types. The common ones are:
- MH_EXECUTE: executable files
- MH_OBJECT: object files
  - .o files (object files)
  - .a static libraries, which are really a collection of N .o files
- MH_DYLIB: dynamic libraries
  - .dylib
  - .framework
- MH_DYLINKER: the dynamic linker
- MH_DSYM: debug symbol files, used to analyze app crash information
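To make the mapping between the numeric filetype values and the names above concrete, here is a minimal sketch. The helper name macho_filetype_name is hypothetical; the numeric values are the MH_* constants quoted earlier.

```c
#include <stdint.h>

/* Hypothetical helper: maps the filetype field of a mach_header to the
 * name of the matching MH_* constant. The values are copied from the
 * constants quoted above. */
const char *macho_filetype_name(uint32_t filetype)
{
    switch (filetype) {
    case 0x1: return "MH_OBJECT";      /* relocatable object file */
    case 0x2: return "MH_EXECUTE";     /* executable */
    case 0x3: return "MH_FVMLIB";
    case 0x4: return "MH_CORE";
    case 0x5: return "MH_PRELOAD";
    case 0x6: return "MH_DYLIB";       /* dynamic library */
    case 0x7: return "MH_DYLINKER";    /* dynamic linker */
    case 0x8: return "MH_BUNDLE";
    case 0x9: return "MH_DYLIB_STUB";
    case 0xa: return "MH_DSYM";        /* debug symbols */
    case 0xb: return "MH_KEXT_BUNDLE";
    default:  return "unknown";
    }
}
```

For example, macho_filetype_name(0x6) yields "MH_DYLIB", which is the value you would expect to find in the header of a .dylib.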
From a C file to an executable
I highly recommend this article on Mach-O files; the example below is taken from it.
- Create the C file test.c:
  int main(){ return 0; }
- Compile it, which generates the object file test.o:
  clang -c test.c
- Inspect the result with the file command. test.o is a Mach-O object file:
  file test.o
  test.o: Mach-O 64-bit object x86_64
- Link the object file test.o with clang, which produces the executable a.out:
  clang test.o
- Run the executable:
  ./a.out
- Link the object file test.o into an executable named test1:
  clang -o test1 test.o
- Compile and link the source file test.c in one step into the executable test2:
  clang -o test2 test.c
Mach-O structure
As the figure shows, a Mach-O file consists of three parts:
- Mach header: describes the CPU architecture, the file type, and the load commands.
- Load commands: describe how the data in the file is organized. Each kind of load command is represented by its own data structure.
- Data: the data of each segment is stored here. A segment is similar in concept to a segment in an ELF file. Each segment contains zero or more sections, which hold the actual code and data. The data area also carries the symbol table, the dynamic symbol table, and similar information.
**Using MachOView to verify the file structure of this example:**
A quick look at a Mach-O executable shows the following parts:
- Header: Mach64 Header
- Load commands: Load Commands
- Text segment: __TEXT
- Data segment: __DATA
- Dynamic loader info: Dynamic Loader Info
- Function start table: Function Starts
- Symbol table: Symbol Table
- Dynamic symbol table: Dynamic Symbol Table
- String table: String Table
mach_header_64
/*
* The 64-bit mach header appears at the very beginning of object files for
* 64-bit architectures.
*/
struct mach_header_64 {
uint32_t magic; /* mach magic number identifier */
cpu_type_t cputype; /* cpu specifier */
cpu_subtype_t cpusubtype; /* machine specifier */
uint32_t filetype; /* type of file */
uint32_t ncmds; /* number of load commands */
uint32_t sizeofcmds; /* the size of all the load commands */
uint32_t flags; /* flags */
uint32_t reserved; /* reserved */
};
- magic: the magic number, which lets the system loader quickly determine whether the file is 32-bit or 64-bit.
- cputype: identifies the CPU architecture, such as ARM, x86_64, or i386. This field ensures that the system only runs binaries that match the current architecture.
- cpusubtype: the specific CPU type, which distinguishes processor versions within an architecture, such as armv7, armv7s, or arm64e.
- filetype: the Mach-O file type (executable, library, core dump, kernel extension, dSYM file, dynamic library).
- ncmds: the number of load commands. Each load command describes one piece of the file's layout or linkage, for example how a segment is loaded.
- sizeofcmds: the total size in bytes of all the load commands.
- flags: flag bits describing features of the binary, mainly related to loading and linking.
- reserved: reserved field.
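The role of the magic field can be sketched in a few lines of C. This is only an illustration: the struct repeats the mach_header_64 layout quoted above with fixed-width types so that it compiles without the Apple headers, and the sketch_ and macho_ names are hypothetical. The four magic values are the ones defined in <mach-o/loader.h>.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Magic values as defined in <mach-o/loader.h>. */
#define SKETCH_MH_MAGIC    0xfeedfaceu /* 32-bit, host byte order */
#define SKETCH_MH_CIGAM    0xcefaedfeu /* 32-bit, byte-swapped */
#define SKETCH_MH_MAGIC_64 0xfeedfacfu /* 64-bit, host byte order */
#define SKETCH_MH_CIGAM_64 0xcffaedfeu /* 64-bit, byte-swapped */

/* Same field order as mach_header_64, written with fixed-width types. */
struct sketch_mach_header_64 {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;
    uint32_t sizeofcmds;
    uint32_t flags;
    uint32_t reserved;
};

/* Returns 32 or 64 for a recognised Mach-O magic, 0 for anything else. */
int macho_word_size(uint32_t magic)
{
    switch (magic) {
    case SKETCH_MH_MAGIC:
    case SKETCH_MH_CIGAM:
        return 32;
    case SKETCH_MH_MAGIC_64:
    case SKETCH_MH_CIGAM_64:
        return 64;
    default:
        return 0;
    }
}

/* Copies a 64-bit header out of a byte buffer. Returns 0 on success and
 * -1 if the buffer is too short or does not start with a host-order
 * 64-bit magic. Byte-swapped files are not handled in this sketch. */
int macho_read_header_64(const void *buf, size_t len,
                         struct sketch_mach_header_64 *out)
{
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    return out->magic == SKETCH_MH_MAGIC_64 ? 0 : -1;
}
```

Once the header has been read, the load commands start at offset sizeof(struct sketch_mach_header_64), which is 32 bytes, and occupy the next sizeofcmds bytes.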
Load commands
The load commands directly follow the mach_header. The total size of all the commands is given by the sizeofcmds field of the mach_header. Every load command must begin with the two fields cmd and cmdsize. The cmd field holds a constant identifying the command type, and each command type has its own structure. The cmdsize field is the size in bytes of that load command structure plus anything that follows it as part of the load command (section structures, strings, and so on). To advance to the next load command, add cmdsize to the offset or pointer of the current one. On 32-bit architectures cmdsize must be a multiple of 4 bytes, and on 64-bit architectures a multiple of 8 bytes (these are the maximum alignment of any load command). Padding bytes must be zero. All tables in the object file must follow these rules so that the file can be memory mapped; otherwise pointers into these tables will work poorly or not at all on some machines. With all padding zeroed, identical objects compare equal byte for byte.
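The rule "add cmdsize to get to the next command" can be sketched as a small loop. This is an illustration only, assuming a 64-bit file (so cmdsize must be a multiple of 8) and a buffer that starts at the first load command; the sketch_load_command struct and the function name are hypothetical stand-ins for the real definitions in <mach-o/loader.h>.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The two fields that every load command starts with. */
struct sketch_load_command {
    uint32_t cmd;     /* type of load command */
    uint32_t cmdsize; /* total size of this command in bytes */
};

/* Walks ncmds load commands stored in cmds[0 .. sizeofcmds) and counts
 * those whose type equals wanted_cmd. Returns -1 if a command is
 * truncated, has a cmdsize that is not a multiple of 8, or runs past
 * the end of the load command area. */
int macho_count_commands(const uint8_t *cmds, uint32_t sizeofcmds,
                         uint32_t ncmds, uint32_t wanted_cmd)
{
    uint32_t offset = 0; /* invariant: offset <= sizeofcmds */
    int count = 0;

    for (uint32_t i = 0; i < ncmds; i++) {
        struct sketch_load_command lc;

        if (sizeofcmds - offset < sizeof lc)
            return -1;
        memcpy(&lc, cmds + offset, sizeof lc);

        if (lc.cmdsize < sizeof lc || lc.cmdsize % 8 != 0 ||
            lc.cmdsize > sizeofcmds - offset)
            return -1;

        if (lc.cmd == wanted_cmd)
            count++;

        offset += lc.cmdsize; /* advance to the next load command */
    }
    return count;
}
```

The bounds checks matter because cmdsize comes from the file itself; a parser that trusts it blindly can be walked off the end of the buffer by a malformed binary.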
/*
* The segment load command indicates that a part of this file is to be
* mapped into the task's address space. The size of this segment in memory,
* vmsize, maybe equal to or larger than the amount to map from this file,
* filesize. The file is mapped starting at fileoff to the beginning of
* the segment in memory, vmaddr. The rest of the memory of the segment,
* if any, is allocated zero fill on demand. The segment's maximum virtual
* memory protection and initial virtual memory protection are specified
* by the maxprot and initprot fields. If the segment has sections then the
* section structures directly follow the segment command and their size is
* reflected in cmdsize.
*/
struct segment_command { /* for 32-bit architectures */
uint32_t cmd; /* LC_SEGMENT */
uint32_t cmdsize; /* includes sizeof section structs */
char segname[16]; /* segment name */
uint32_t vmaddr; /* memory address of this segment */
uint32_t vmsize; /* memory size of this segment */
uint32_t fileoff; /* file offset of this segment */
uint32_t filesize; /* amount to map from the file */
vm_prot_t maxprot; /* maximum VM protection */
vm_prot_t initprot; /* initial VM protection */
uint32_t nsects; /* number of sections in segment */
uint32_t flags; /* flags */
};
- cmd: the type of the load command. Load commands are consumed by the kernel loader or the dynamic linker. Common types are:
  - LC_SEGMENT: a segment load command; the segment is mapped into the process's address space. The 64-bit counterpart is LC_SEGMENT_64.
  - LC_LOAD_DYLIB: a dynamic library that must be loaded, represented by the dylib_command structure.
  - LC_MAIN: records the location of the executable's main() function, represented by the entry_point_command structure.
  - LC_CODE_SIGNATURE: describes the code signature of the Mach-O file. It is part of the link-edit information and is represented by the linkedit_data_command structure.
- cmdsize: the size of the load command.
- segname[16]: the segment name, 16 bytes.
- vmaddr: the starting virtual memory address of the segment.
- vmsize: the size of the segment in virtual memory.
- fileoff: the offset of the segment in the file.
- filesize: the size of the segment in the file.
- maxprot: the maximum memory protection of the segment's pages (1 = r, 2 = w, 4 = x).
- initprot: the initial memory protection of the segment's pages.
- nsects: the number of sections contained in the segment.
- flags: flag bits.
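As a small illustration of the maxprot/initprot and vmsize/filesize fields, the sketch below decodes a protection value into the familiar "rwx" notation and computes how much of a segment is zero-filled. The bit values are those of VM_PROT_READ (0x1), VM_PROT_WRITE (0x2), and VM_PROT_EXECUTE (0x4) from <mach/vm_prot.h>; the SKETCH_ macros and function names are hypothetical.

```c
#include <stdint.h>

/* VM protection bits, with the values used in <mach/vm_prot.h>. */
#define SKETCH_VM_PROT_READ    0x1
#define SKETCH_VM_PROT_WRITE   0x2
#define SKETCH_VM_PROT_EXECUTE 0x4

/* Writes a three-character string such as "r-x" into out, followed by
 * a terminating NUL. */
void macho_prot_string(int32_t prot, char out[4])
{
    out[0] = (prot & SKETCH_VM_PROT_READ)    ? 'r' : '-';
    out[1] = (prot & SKETCH_VM_PROT_WRITE)   ? 'w' : '-';
    out[2] = (prot & SKETCH_VM_PROT_EXECUTE) ? 'x' : '-';
    out[3] = '\0';
}

/* Number of bytes of a segment that have no backing in the file and are
 * therefore zero-filled on demand (vmsize may exceed filesize). */
uint64_t macho_zero_fill_size(uint64_t vmsize, uint64_t filesize)
{
    return vmsize > filesize ? vmsize - filesize : 0;
}
```

A typical __TEXT segment has initprot 5, which decodes to "r-x", while a typical __DATA segment has initprot 3, which decodes to "rw-".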
Section data
Segments (mainly __TEXT and __DATA) can be further divided into sections. The reason for the segment -> section organization is that all sections within one segment share the same memory protection, so they do not each have to be aligned to the memory page size, which saves memory. The segment as a whole is what is exposed externally: when the program is loaded, it is mapped into virtual memory as one complete unit, which gives better memory alignment.
struct section_64 { /* for 64-bit architectures */
char sectname[16]; /* name of this section */
char segname[16]; /* segment this section goes in */
uint64_t addr; /* memory address of this section */
uint64_t size; /* size in bytes of this section */
uint32_t offset; /* file offset of this section */
uint32_t align; /* section alignment (power of 2) */
uint32_t reloff; /* file offset of relocation entries */
uint32_t nreloc; /* number of relocation entries */
uint32_t flags; /* flags (section type and attributes)*/
uint32_t reserved1; /* reserved (for offset or index) */
uint32_t reserved2; /* reserved (for count or sizeof) */
uint32_t reserved3; /* reserved */
};
- sectname[16]: the section name, such as __text or __stubs. The __text section holds the main program code.
- segname[16]: the segment this section belongs to, such as __TEXT.
- addr: the starting address of the section in memory.
- size: the size of the section in bytes.
- offset: the file offset of the section.
- align: the section alignment, stored as a power of 2.
- reloff: the file offset of the relocation entries.
- nreloc: the number of relocation entries.
- flags: the section type and attributes.
- reserved1: reserved field 1 (for offset or index).
- reserved2: reserved field 2 (for count or sizeof).
- reserved3: reserved field 3.
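Two of these fields are easy to misread, so here is a short sketch of how they might be handled. It assumes that a name filling all 16 bytes of sectname or segname carries no terminating NUL, and that align stores the exponent rather than the byte count; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Copies a fixed 16-byte name field (sectname or segname) into a buffer
 * that is always NUL-terminated. Shorter names are already padded with
 * zero bytes, so the result is an ordinary C string either way. */
void macho_copy_name(const char name[16], char out[17])
{
    memcpy(out, name, 16);
    out[16] = '\0';
}

/* Converts the align field, which stores log2 of the alignment, into a
 * byte count. Returns 0 for exponents that do not fit in 64 bits. */
uint64_t macho_section_alignment(uint32_t align)
{
    return align < 64 ? (uint64_t)1 << align : 0;
}
```

For example, an align value of 4 means the section is aligned to 16 bytes, and printing sectname directly with %s is only safe after a copy like the one above.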
Segment names are two underscores followed by uppercase letters (such as __TEXT), and section names are two underscores followed by lowercase letters (such as __text).
The sections each segment may contain are listed below:
- __TEXT segment
__text, __cstring, __picsymbol_stub, __symbol_stub, __const, __literal4, __literal8;
- __DATA segment
__data, __la_symbol_ptr, __nl_symbol_ptr, __dyld, __const, __mod_init_func, __mod_term_func, __bss, __common;
- __IMPORT segment
__jump_table, __pointers;
Here, __text in the __TEXT segment is the actual code, and __data in the __DATA segment is the actual initialized data.
That covers the Mach-O file format. If you are interested in how a program goes from loading to execution, see "The Mach-O file format and the process from loading to execution" and "An interesting exploration of Mach-O: the loading process", both of which are very detailed.
The Mach-O file structure in detail
Mach-O files
The Mach-O file format and the process from loading to execution
iOS reverse engineering: Mach-O file basics (1)
Exploring the Mach-O file format
Reproduced from: https://juejin.im/post/5d060880f265da1b860885d7