DEX file parsing --- 1, dex header parsing


One, the dex file

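    A dex file is the executable file format of the Android platform. In order, it consists of the header, the string, type, method prototype, field, method and class definition index tables, the data section and the link data. The file format can be summarized in the picture below:
Figure: dex file format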
    The dex header is normally 0x70 bytes in size. It contains the file identifier, the version number, the checksum, the SHA-1 signature, and the counts and offset addresses of the strings, types, method prototypes, fields, methods and class definitions. As shown below:
Figure: dex header
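
As a rough illustration of how the checksum and SHA-1 signature in the header relate to the rest of the file, the sketch below recomputes both values. It relies on the published dex format, in which the checksum is an Adler-32 over everything after the checksum field and the signature is a SHA-1 over everything after the signature field. The function name verify_dex and its return value are assumptions made for this example and are not part of the original code.

```python
import hashlib
import struct
import zlib


def verify_dex(data):
    """Return (checksum_ok, signature_ok) for the raw bytes of a dex file.

    Hypothetical helper, for illustration only.
    """
    # checksum: little-endian 4-byte integer at 0x08,
    # Adler-32 of everything from offset 0x0C to the end of the file
    stored_checksum = struct.unpack_from('<I', data, 0x08)[0]
    checksum_ok = (zlib.adler32(data[0x0C:]) & 0xFFFFFFFF) == stored_checksum

    # signature: 20 raw bytes at 0x0C,
    # SHA-1 of everything from offset 0x20 to the end of the file
    stored_signature = data[0x0C:0x20]
    signature_ok = hashlib.sha1(data[0x20:]).digest() == stored_signature
    return checksum_ok, signature_ok
```

It could be called as verify_dex(open(path, 'rb').read()), where path points to any dex file. A file modified after it was built would be expected to fail both checks.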


Two, the fields of the dex header

    The header of each dex file contains the following fields:

  1. magic: the file identifier and version number of the dex file, offset 0x00, a length of 8 bytes
  2. checksum: dex file checksum (Adler-32), offset 0x08, a length of 4 bytes
  3. signature: dex SHA-1 signature, offset 0x0C, a length of 20 bytes
  4. file_size: dex file size, offset 0x20, a length of 4 bytes
  5. header_size: dex header size, offset 0x24, a length of 4 bytes, usually 0x70
  6. endian_tag: indicates whether the byte order of the dex file is swapped, offset 0x28, a length of 4 bytes; the value is the constant 0x12345678, which normally appears in the file as the little-endian bytes 78 56 34 12
  7. link_size: size of the link section of the dex file, 0 indicates that the file is not statically linked, offset 0x2C, a length of 4 bytes
  8. link_off: offset of the link section of the dex file, offset 0x30, a length of 4 bytes
  9. map_off: offset of the map data of the dex file, offset 0x34, a length of 4 bytes
  10. string_ids_size: number of strings in the dex file, offset 0x38, a length of 4 bytes
  11. string_ids_off: start offset of the strings in the dex file, offset 0x3C, a length of 4 bytes
  12. type_ids_size: number of types in the dex file, offset 0x40, a length of 4 bytes
  13. type_ids_off: offset of the types in the dex file, offset 0x44, a length of 4 bytes
  14. proto_ids_size: number of method prototypes in the dex file, offset 0x48, a length of 4 bytes
  15. proto_ids_off: offset of the method prototypes in the dex file, offset 0x4C, a length of 4 bytes
  16. field_ids_size: number of fields in the dex file, offset 0x50, a length of 4 bytes
  17. field_ids_off: offset of the fields in the dex file, offset 0x54, a length of 4 bytes
  18. method_ids_size: number of methods in the dex file, offset 0x58, a length of 4 bytes
  19. method_ids_off: offset of the methods in the dex file, offset 0x5C, a length of 4 bytes
  20. class_defs_size: number of class definitions in the dex file, offset 0x60, a length of 4 bytes
  21. class_defs_off: offset of the class definitions in the dex file, offset 0x64, a length of 4 bytes
  22. data_size: size of the data section of the dex file, offset 0x68, a length of 4 bytes
  23. data_off: offset of the data section of the dex file, offset 0x6C, a length of 4 bytes
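
The field layout above maps directly onto a fixed-size record, so the whole header can also be decoded in a single call. The following is a minimal sketch using Python's struct module, assuming the usual little-endian layout. The names HEADER_FORMAT, FIELD_NAMES and unpack_header are made up for this example, and the field names follow the dex format documentation.

```python
import struct

# '<' = little-endian without padding; 8s = magic, I = checksum,
# 20s = signature, then twenty 4-byte unsigned integers
# (file_size through data_off).
HEADER_FORMAT = '<8sI20s20I'

FIELD_NAMES = (
    'magic', 'checksum', 'signature',
    'file_size', 'header_size', 'endian_tag',
    'link_size', 'link_off', 'map_off',
    'string_ids_size', 'string_ids_off',
    'type_ids_size', 'type_ids_off',
    'proto_ids_size', 'proto_ids_off',
    'field_ids_size', 'field_ids_off',
    'method_ids_size', 'method_ids_off',
    'class_defs_size', 'class_defs_off',
    'data_size', 'data_off',
)


def unpack_header(data):
    """Decode the first 0x70 bytes of a dex file into a dict (illustrative helper)."""
    values = struct.unpack_from(HEADER_FORMAT, data, 0)
    return dict(zip(FIELD_NAMES, values))
```

struct.calcsize(HEADER_FORMAT) evaluates to 112, that is 0x70, which matches the usual value of header_size. A reader could compare the decoded endian_tag with 0x12345678 and header_size with 0x70 as a quick sanity check before trusting the other fields.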

Three, how the dex header parsing code works (Python)

    Open the dex file in binary mode with the open function, move the file pointer with the seek function, then read the number of bytes that belongs to the field. For example, the magic starts at f.seek(0x00), and the version number is read with f.seek(0x04) followed by f.read(4). Multi-byte integers are stored in little-endian order, so their bytes are reversed before being converted to a number. After that the value only needs to be printed. The dex header is fairly simple and does not involve any special encoding, so parsing it is straightforward. The specific code can be seen below or on github, and a picture of the code running is attached below:
Figure: the code running
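
The seek-then-read pattern described above can be sketched on an in-memory buffer as follows. The helper name read_u32 and the sample bytes are assumptions for illustration, and int.from_bytes with 'little' is used here in place of reversing the bytes by hand.

```python
import io


def read_u32(f, offset):
    """Seek to offset and read one 4-byte little-endian unsigned integer."""
    f.seek(offset)
    raw = f.read(4)
    if len(raw) != 4:
        raise ValueError('unexpected end of file at offset ' + hex(offset))
    return int.from_bytes(raw, 'little')


# A made-up 0x28-byte buffer whose header_size field (offset 0x24) holds 0x70.
sample = io.BytesIO(bytes(0x24) + bytes([0x70, 0x00, 0x00, 0x00]))
header_size = read_u32(sample, 0x24)
print(hex(header_size))  # prints 0x70
```

The same helper works on a file object returned by open(path, 'rb'), because both support seek and read.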


Four, dex header parsing code (Python implementation)

import binascii


def read_hex(f, offset, length, reverse=False):
    # Seek to the offset, read `length` bytes and return them as a hex string.
    # reverse=True flips the bytes first, because dex integers are little-endian.
    f.seek(offset)
    data = bytearray(f.read(length))
    if reverse:
        data.reverse()
    return str(binascii.b2a_hex(bytes(data)), encoding='utf-8')


def read_uint(f, offset):
    # Read a 4-byte little-endian unsigned integer at the offset.
    return int(read_hex(f, offset, 4, reverse=True), 16)


def print_size(label, value):
    # Sizes are printed in decimal, e.g. "File size: 1234 byte"
    print(label + ': ', end='')
    print(value, end='')
    print(' byte')


def print_offset(label, value):
    # Offsets are printed in hex, e.g. "Map data offset: 0x1f4"
    print(label + ': ', end='')
    print(hex(value))


def print_count(label, value):
    # Counts are printed in decimal and hex, e.g. "Number of strings: 16(0x10)"
    print(label + ': ', end='')
    print(value, end='')
    print('(' + hex(value) + ')')


def parserHeader(f):
    # magic: 4-byte file identifier followed by a 4-byte version.
    # These fields, the checksum, the signature and the endian tag
    # are printed as raw bytes in file order.
    print('File identifier: ' + read_hex(f, 0x00, 4))
    print('File version: ' + read_hex(f, 0x04, 4))
    print('Checksum: ' + read_hex(f, 0x08, 4))
    print('SHA-1 signature: ' + read_hex(f, 0x0c, 20))

    print_size('File size', read_uint(f, 0x20))
    print_size('Header size', read_uint(f, 0x24))
    print('Endian tag: ' + read_hex(f, 0x28, 4))

    print_size('Link section size', read_uint(f, 0x2c))
    print_offset('Link section offset', read_uint(f, 0x30))
    print_offset('Map data offset', read_uint(f, 0x34))

    print_count('Number of strings', read_uint(f, 0x38))
    print_offset('String offset', read_uint(f, 0x3c))
    print_count('Number of types', read_uint(f, 0x40))
    print_offset('Type offset', read_uint(f, 0x44))
    print_count('Number of method prototypes', read_uint(f, 0x48))
    print_offset('Method prototype offset', read_uint(f, 0x4c))
    print_count('Number of fields', read_uint(f, 0x50))
    print_offset('Field offset', read_uint(f, 0x54))
    print_count('Number of methods', read_uint(f, 0x58))
    print_offset('Method offset', read_uint(f, 0x5c))
    print_count('Number of class definitions', read_uint(f, 0x60))
    print_offset('Class definition offset', read_uint(f, 0x64))

    print_count('Data section size', read_uint(f, 0x68))
    print_offset('Data section offset', read_uint(f, 0x6c))


if __name__ == '__main__':
    # The with statement closes the file even if parsing raises an error.
    with open("C:\\Users\\admin\\Desktop\\android_nx\\classes.dex", 'rb') as f:
        parserHeader(f)

Five, related links

  Reference links

  The author's github link (download of the related attachments): https://github.com/windy-purple/parserDex

  PS: Some pictures come from the internet and will be deleted on request if they infringe any rights

Origin www.cnblogs.com/aWxvdmVseXc0/p/11879093.html