KSCrash is a crash-capturing framework for the iOS platform. I recently read some of its source code, and in the KSDynamicLinker
file there is a function whose code is as follows:
/** Get the segment base address of the specified image.
 *
 * This is required for any symtab command offsets.
 *
 * @param idx The image index.
 * @return The image's base address, or 0 if none was found.
 */
static uintptr_t segmentBaseOfImageIndex(const uint32_t idx)
{
    const struct mach_header* header = _dyld_get_image_header(idx);
    // Look for a segment command and return the file image address.
    uintptr_t cmdPtr = firstCmdAfterHeader(header);
    if(cmdPtr == 0)
    {
        return 0;
    }
    for(uint32_t i = 0;i < header->ncmds; i++)
    {
        const struct load_command* loadCmd = (struct load_command*)cmdPtr;
        if(loadCmd->cmd == LC_SEGMENT)
        {
            const struct segment_command* segmentCmd = (struct segment_command*)cmdPtr;
            if(strcmp(segmentCmd->segname, SEG_LINKEDIT) == 0)
            {
                return segmentCmd->vmaddr - segmentCmd->fileoff;
            }
        }
        else if(loadCmd->cmd == LC_SEGMENT_64)
        {
            const struct segment_command_64* segmentCmd = (struct segment_command_64*)cmdPtr;
            if(strcmp(segmentCmd->segname, SEG_LINKEDIT) == 0)
            {
                return (uintptr_t)(segmentCmd->vmaddr - segmentCmd->fileoff);
            }
        }
        cmdPtr += loadCmd->cmdsize;
    }
    return 0;
}
The function is called like this:
const uintptr_t segmentBase = segmentBaseOfImageIndex(idx) + imageVMAddrSlide;
0 The confusion
An image contains more than one segment. The idx parameter
is the index of an image, so if the return value is a segment base, which segment is it the base of?
Some would say that the comment reads "The image's base address, or 0 if none was found", so the function returns the image base. But in principle vmaddr - fileoff
does not give us the image base (explained later).
At the call site, the slide introduced by ASLR is added and the result is assigned to segmentBase.
fishhook contains a similar line of code:
uintptr_t linkedit_base = (uintptr_t)slide + linkedit_segment->vmaddr - linkedit_segment->fileoff;
Setting aside the slide introduced by ASLR, this is the same vmaddr - fileoff expression as above, and here the variable is named linkedit_base.
So what exactly do KSCrash's segmentBase and fishhook's linkedit_base mean? If they referred to the real address of the __LINKEDIT segment in memory, the value should be vmaddr + the ASLR slide instead.
While searching for information I read many blog posts and other material. On this point they either say nothing, mention it only in passing, or get it wrong. Some believe the value is the base address of the __LINKEDIT segment in memory; others believe it is the base address of the current image in memory.
1 Unveiling the answer
1.1 Prerequisites
Before we can understand what this value really is, we need some background knowledge.
- Mach-O file structure
- Virtual Memory
- ASLR
Below is a brief introduction to the Mach-O file format.
Mach-O
A process is the result of loading an executable file into memory, and Mach-O is the executable file format of the macOS platform.
A Mach-O file is divided into three regions: Header, Load commands, and Data. The commands in the Load commands region direct how the binary data is set up and loaded. Below are the few commands we care about on 32-bit platforms:
Command | Corresponding data structure | Description |
---|---|---|
LC_SEGMENT | segment_command | Defines a segment of the file. When the Mach-O file is loaded, this segment is mapped into the corresponding address space. Note that segment_command has a segname field, which is used to find a specific segment by name. |
LC_SYMTAB | symtab_command | Describes the symbol table of the file. symtab_command contains the file offset of the symbol table, the number of symbols, the file offset of the string table, and the size of the string table. |
The definition of segment_command is as follows:
struct segment_command { /* for 32-bit architectures */
    uint32_t  cmd;         /* LC_SEGMENT */
    uint32_t  cmdsize;     /* includes sizeof section structs */
    char      segname[16]; /* segment name */
    uint32_t  vmaddr;      /* memory address of this segment */
    uint32_t  vmsize;      /* memory size of this segment */
    uint32_t  fileoff;     /* file offset of this segment */
    uint32_t  filesize;    /* amount to map from the file */
    vm_prot_t maxprot;     /* maximum VM protection */
    vm_prot_t initprot;    /* initial VM protection */
    uint32_t  nsects;      /* number of sections in segment */
    uint32_t  flags;       /* flags */
};
For each segment, setting up the process's virtual memory means loading the corresponding content into memory: filesize bytes starting at offset fileoff in the Mach-O file are mapped to virtual address vmaddr, where the segment occupies vmsize bytes. **Note that for some segments vmsize may be greater than filesize, for example __DATA and __LINKEDIT.**
In the discussion that follows, the segment we care about is the one whose segname is __LINKEDIT. The __LINKEDIT segment is used by dyld and contains the symbol table, the string table, and other data.
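The mapping rule just described can be sketched in portable C. This is only an illustration, not kernel or dyld code: `demo_segment` and `demo_file_offset_to_vmaddr` are hypothetical names, the struct mirrors only the four fields discussed here, and ASLR is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for segment_command:
 * only the fields needed for the mapping rule are kept. */
struct demo_segment {
    uint64_t vmaddr;   /* memory address of this segment */
    uint64_t vmsize;   /* memory size of this segment */
    uint64_t fileoff;  /* file offset of this segment */
    uint64_t filesize; /* amount to map from the file */
};

/* Translate a file offset that lies inside the segment's file-backed
 * range into the virtual address it is mapped to (no ASLR slide).
 * Returns false if the offset is outside [fileoff, fileoff + filesize). */
bool demo_file_offset_to_vmaddr(const struct demo_segment *seg,
                                uint64_t file_offset,
                                uint64_t *out_vmaddr)
{
    if (file_offset < seg->fileoff ||
        file_offset - seg->fileoff >= seg->filesize) {
        return false;
    }
    /* The distance from the start of the segment is the same
     * in the file and in virtual memory. */
    *out_vmaddr = seg->vmaddr + (file_offset - seg->fileoff);
    return true;
}
```

The key point is the last computation: a byte keeps the same distance from the start of its segment whether that distance is measured in the file or in virtual memory.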
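Finding a segment by its segname amounts to walking the load commands and advancing by cmdsize each time. The sketch below shows that walk on simplified stand-in structures so that it compiles on any platform; `demo_load_command`, `demo_segment_command`, `DEMO_LC_SEGMENT`, and `demo_find_segment` are all hypothetical names, not the real `<mach-o/loader.h>` definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_LC_SEGMENT 0x1u /* hypothetical command tag for this sketch */

/* Hypothetical common prefix shared by every load command. */
struct demo_load_command {
    uint32_t cmd;
    uint32_t cmdsize;
};

/* Hypothetical, trimmed-down segment command. */
struct demo_segment_command {
    uint32_t cmd;
    uint32_t cmdsize;
    char     segname[16];
    uint64_t vmaddr;
    uint64_t fileoff;
};

/* Walk ncmds load commands starting at cmds and copy the first segment
 * command named `name` into *out. Returns false if none is found.
 * The sketch trusts cmdsize and does not bounds-check the buffer. */
bool demo_find_segment(const uint8_t *cmds, uint32_t ncmds,
                       const char *name,
                       struct demo_segment_command *out)
{
    const uint8_t *p = cmds;
    for (uint32_t i = 0; i < ncmds; i++) {
        struct demo_load_command lc;
        memcpy(&lc, p, sizeof lc);
        if (lc.cmd == DEMO_LC_SEGMENT) {
            struct demo_segment_command seg;
            memcpy(&seg, p, sizeof seg);
            /* segname is a fixed 16-byte field, so compare at most 16 bytes. */
            if (strncmp(seg.segname, name, sizeof seg.segname) == 0) {
                *out = seg;
                return true;
            }
        }
        p += lc.cmdsize; /* step to the next load command */
    }
    return false;
}
```

Calling `demo_find_segment(cmds, ncmds, "__LINKEDIT", &seg)` on such a buffer yields the vmaddr and fileoff values that the rest of this article works with.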
The definition of symtab_command is as follows:
struct symtab_command {
    uint32_t cmd;     /* LC_SYMTAB */
    uint32_t cmdsize; /* sizeof(struct symtab_command) */
    uint32_t symoff;  /* symbol table offset */
    uint32_t nsyms;   /* number of symbol table entries */
    uint32_t stroff;  /* string table offset */
    uint32_t strsize; /* string table size in bytes */
};
In symtab_command, symoff is the offset of the symbol table in the Mach-O file, and stroff is the offset of the string table in the Mach-O file.
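Because symoff and stroff are measured from the start of the file, they can be applied directly to a buffer that holds the raw file contents. The sketch below assumes exactly that situation; `demo_symtab` and `demo_string_at` are hypothetical names, and the struct mirrors only the fields shown above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the symtab_command fields discussed above. */
struct demo_symtab {
    uint32_t symoff;  /* symbol table offset */
    uint32_t nsyms;   /* number of symbol table entries */
    uint32_t stroff;  /* string table offset */
    uint32_t strsize; /* string table size in bytes */
};

/* Return the string that starts at index strx of the string table.
 * `file` points at byte 0 of the Mach-O file contents held in memory.
 * Returns NULL if strx is outside the table or if no terminating NUL
 * is found before the end of the table. */
const char *demo_string_at(const uint8_t *file,
                           const struct demo_symtab *st,
                           uint32_t strx)
{
    if (strx >= st->strsize) {
        return NULL;
    }
    const char *start = (const char *)(file + st->stroff + strx);
    if (memchr(start, '\0', st->strsize - strx) == NULL) {
        return NULL;
    }
    return start;
}
```

For a loaded image the file is not mapped as one contiguous buffer, so `file` cannot simply be the start of the image. Finding the right base to add these offsets to is the subject of the next section.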
1.2 Revealing the secret
We can open a Mach-O file with MachOView and look at LC_SEGMENT (__LINKEDIT) and LC_SYMTAB. For reasons of space, no screenshots are included here. Note, however, that the symbol table and the string table are located inside the __LINKEDIT segment of the Mach-O file, which confirms the description of the __LINKEDIT segment above.
Let's work backwards from the address of the symbol table in virtual memory to see what the so-called segmentBase and linkedit_base really are.
Let's ignore ASLR for now. In the figure, the dark gray background represents virtual memory; the __TEXT and __DATA segments do not concern us and are not shown.
sym_vmaddr is the address of the symbol table in virtual memory. The offset of the symbol table within the __LINKEDIT segment in virtual memory, sym_vmaddr - vmaddr, is equal to its offset within the segment in the Mach-O file, symoff - fileoff.
That is, sym_vmaddr - vmaddr = symoff - fileoff.
Moving vmaddr to the right-hand side gives sym_vmaddr = symoff - fileoff + vmaddr.
Notice anything?
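Continuing the derivation:
subtracting the symbol table offset symoff from both sides gives sym_vmaddr - symoff = vmaddr - fileoff (Formula 1).
Adding the ASLR slide to the right-hand side of Formula 1 gives vmaddr - fileoff + slide, which is exactly the so-called segmentBase and linkedit_base.
In other words, this value is neither the address of the __LINKEDIT segment nor a base address in its own right: it is the value to which a file offset such as symoff or stroff is added to obtain the corresponding address in memory.
At this point, the truth is out.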
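Formula 1 can be checked with concrete numbers. The two helpers below are hypothetical and exist only for this check: `demo_linkedit_base` mirrors the fishhook expression, while `demo_symtab_addr_direct` computes the symbol table address from first principles.

```c
#include <stdint.h>

/* slide + vmaddr - fileoff: the value KSCrash calls segmentBase and
 * fishhook calls linkedit_base. */
uint64_t demo_linkedit_base(uint64_t slide, uint64_t vmaddr, uint64_t fileoff)
{
    return slide + vmaddr - fileoff;
}

/* Address of the symbol table in memory, computed directly: the start of
 * the slid __LINKEDIT segment plus the table's offset inside the segment. */
uint64_t demo_symtab_addr_direct(uint64_t slide, uint64_t vmaddr,
                                 uint64_t fileoff, uint64_t symoff)
{
    return (vmaddr + slide) + (symoff - fileoff);
}
```

Take made-up values (not from a real binary): slide = 0x4000, vmaddr = 0x10000C000, fileoff = 0x8000, symoff = 0x8400. Then `demo_linkedit_base` gives 0x100008000, and adding symoff gives 0x100010400, the same address `demo_symtab_addr_direct` produces. The base itself differs from vmaddr + slide (0x100010000), which is why it should not be read as the address of the __LINKEDIT segment.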
References
Mac OS X and iOS Internals: To the Apple's Core
Computer Systems: A Programmer's Perspective
Mach-O File Format