An in-depth analysis of the underlying principles of Swift objects

Introduction to Swift compilation
  • For setting up a Swift source build environment and the build process itself, please refer to my previous blog post: Swift source code compilation environment setup and compilation process;
  • Create a new Swift project, create a YDWTeacher class in main.swift, and use the default initializer to create an instance object and assign it to t, as follows:
	class YDWTeacher {
		var age: Int = 18
		var name: String = "YDW"
	}
	let t = YDWTeacher()
  • Then view the abstract syntax tree in the terminal: swiftc -dump-ast main.swift, as follows:

[Figure: AST printed by swiftc -dump-ast main.swift]

  • The next question is what this initializer actually does. To answer it, we introduce SIL (Swift Intermediate Language);

  • Whether an iOS app is written in OC or Swift, the code is ultimately compiled through LLVM into .o object files (machine code), which are then linked into an executable, as shown below:
    [Figure: OC and Swift compilation through LLVM]

  • It is easy to see that:

    • OC is compiled into IR by the clang compiler, and then into .o object files (i.e. machine code);
    • Swift is compiled into IR by the swiftc compiler, and then into .o object files;
  • Next, let's look at the steps a Swift file goes through during compilation:
    [Figure: compilation steps of a Swift file]

  • SIL (Swift Intermediate Language) is the intermediate representation used during Swift compilation, mainly for further analysis and optimization of Swift code. As shown in the figure below, SIL sits between the AST and LLVM IR:

[Figure: SIL between the AST and LLVM IR in the Swift compilation pipeline]

  • The difference from OC is that Swift generates a high-level SIL before LLVM IR; the compiler Swift uses is swiftc, rather than the clang used for OC.
  • Use the swiftc -h terminal command to see what swiftc can do:

[Figure: output of swiftc -h]

  • Description of the main options:
    • -dump-ast Parse and type-check input file(s) and dump AST(s)
    • -dump-parse Parse input file(s) without type checking and dump AST(s)
    • -dump-pcm Dump debugging information about precompiled Clang modules
    • -dump-scope-maps expanded-or-list-of-line:column
      Parse and type-check input file(s) and dump the scope map(s)
    • -dump-type-info Output YAML dump of fixed-size types from all imported modules
    • -dump-type-refinement-contexts
      Type-check input file(s) and dump type refinement contexts(s)
    • -emit-assembly Emit assembly file(s) (-S)
    • -emit-bc Emit LLVM bitcode (.bc) file(s)
    • -emit-executable Emit a linked executable
    • -emit-imported-modules Emit a list of the imported modules
    • -emit-ir Emit LLVM IR file(s)
    • -emit-library Emit a linked library (a dynamic library by default)
    • -emit-object Emit object file(s) (-c)
    • -emit-pcm Emit a precompiled Clang module from a module map
    • -emit-sibgen Emit serialized AST + raw SIL file(s) (.sib)
    • -emit-sib Emit serialized AST + canonical SIL file(s) (.sib)
    • -emit-silgen Emit raw SIL file(s)
    • -emit-sil Emit canonical SIL file(s)
    • -index-file Produce index data for a source file
    • -parse Parse input file(s)
    • -print-ast Parse and type-check input file(s) and pretty-print AST(s)
    • -resolve-imports Parse and resolve imports in input file(s)
    • -typecheck Parse and type-check input file(s)
SIL
1. What is SIL?
  • SIL relies on Swift's type system and declarations, so SIL syntax is an extension of Swift syntax. A .sil file is a Swift source file with SIL definitions added;
  • There are no implicit imports in a .sil file. To use the Swift or Builtin standard components, you must import them explicitly;
  • A SIL function consists of one or more basic blocks. A basic block is a linear sequence of instructions, and the last instruction in each block transfers control to another block or returns from the function;
  • If you want to explore SIL in detail, please refer to: 2015 LLVM Developers' Meeting
2. Analyzing the main function in SIL
  • After viewing the abstract syntax tree, run swiftc -emit-sil main.swift >> ./main.sil && code main.sil in the terminal to generate the main.sil file;
  • Open the SIL file with VSCode:
// main
// `@main` marks the entry function of main.swift; identifier names in SIL are prefixed with `@`
sil @main : $@convention(c) (Int32, UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>) -> Int32 {
// `%0`, `%1`, ... are called registers in SIL. They behave like constants: once assigned they cannot be modified, so every new value gets the next number. Note that these are virtual registers, unlike the real machine registers shown by `register read`
bb0(%0 : $Int32, %1 : $UnsafeMutablePointer<Optional<UnsafeMutablePointer<Int8>>>):
// `alloc_global`: creates a global variable, i.e. `t` in the source code
  alloc_global @$s4main1tAA10YDWTeacherCvp        // id: %2
// `global_addr`: gets the address of the global variable and assigns it to register %3
  %3 = global_addr @$s4main1tAA10YDWTeacherCvp : $*YDWTeacher // user: %7
// `metatype`: gets the metadata of `YDWTeacher` and assigns it to %4
  %4 = metatype $@thick YDWTeacher.Type           // user: %6
// Assigns the function address of `__allocating_init` to %5
  // function_ref YDWTeacher.__allocating_init()
  %5 = function_ref @$s4main10YDWTeacherCACycfC : $@convention(method) (@thick YDWTeacher.Type) -> @owned YDWTeacher // user: %6
// `apply`: calls `__allocating_init` to create the instance and assigns the result to %6
  %6 = apply %5(%4) : $@convention(method) (@thick YDWTeacher.Type) -> @owned YDWTeacher // user: %7
// Stores the value of %6 to %3, i.e. the address of the global variable obtained above
  store %6 to %3 : $*YDWTeacher                   // id: %7
// Builds an `Int32` with value 0 and returns it
  %8 = integer_literal $Builtin.Int32, 0          // user: %9
  %9 = struct $Int32 (%8 : $Builtin.Int32)        // user: %10
  return %9 : $Int32                              // id: %10
} // end sil function 'main'
  • Analysis:
    • @main identifies the entry function of the current main.swift; identifier names in SIL are prefixed with @;
    • %0, %1, ... are called registers in SIL. They can be understood as constants in everyday development: once assigned, they cannot be modified, so each further value gets the next number. These registers are virtual; when the program finally runs on the machine, real registers are used;
    • alloc_global: creates a global variable;
    • global_addr: gets the address of the global variable and assigns it to %3;
    • metatype gets the metadata of YDWTeacher and assigns it to %4, and function_ref assigns the function address of __allocating_init to %5;
    • apply calls __allocating_init and assigns the returned value to %6;
    • store writes the value of %6 to %3 (that is, the address of the global variable just created);
    • Finally, an Int32 with value 0 is constructed and returned;
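The metatype, function_ref and apply steps have a rough source-level counterpart: obtaining the type object and calling an initializer on it. The following is a minimal sketch under that reading, not the compiler's actual lowering. The required init() is an addition made here (it is not in the article's class) so that the initializer can be called through a metatype value, and the names meta and teacher are illustrative.

```swift
// Stand-in for the article's class. `required init()` is an assumption added
// for this sketch so the initializer can be invoked through a metatype value.
class YDWTeacher {
    var age: Int = 18
    var name: String = "YDW"
    required init() {}
}

// Source-level analogue of `metatype $@thick YDWTeacher.Type`: the type object.
let meta: YDWTeacher.Type = YDWTeacher.self

// Source-level analogue of `function_ref` + `apply`: call the allocating
// initializer on the metatype and keep the returned instance.
let teacher = meta.init()

print(teacher.age, teacher.name)
```

Writing YDWTeacher() directly is the usual spelling; going through the metatype only makes the two inputs of the SIL apply (the type object and the initializer) visible in source form.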
  • Note: the code command is configured in .zshrc as follows, so that a file can be opened with a chosen editor from the terminal:
$ open ~/.zshrc
# ****** Add the following aliases
alias subl='/Applications/SublimeText.app/Contents/SharedSupport/bin/subl'
alias code='/Applications/Visual\ Studio\ Code.app/Contents/Resources/app/bin/code'

# ****** Usage
$ code main.sil

# For SIL syntax highlighting, install the plugin: VSCode SIL
  • The symbol names in the SIL file are mangled and can be restored with xcrun swift-demangle. Taking s4main1tAA10YDWTeacherCvp as an example, the result is as follows:
	xcrun swift-demangle s4main1tAA10YDWTeacherCvp
	$s4main1tAA10YDWTeacherCvp ---> main.t : main.YDWTeacher
  • Search for s4main10YDWTeacherCACycfC in the SIL file. Its implementation mainly allocates memory and initializes the variables:
    • alloc_ref: creates an instance of YDWTeacher on the heap; the reference count of the new instance is 1;
    • It then calls the init method;
	// ********* Code in the main entry function *********
	%5 = function_ref @$s4main10YDWTeacherCACycfC : $@convention(method) (@thick YDWTeacher.Type) -> @owned YDWTeacher 
	
	// s4main10YDWTeacherCACycfC is in fact __allocating_init()
	// YDWTeacher.__allocating_init()
	sil hidden [exact_self_class] @$s4main10YDWTeacherCACycfC : $@convention(method) (@thick YDWTeacher.Type) -> @owned YDWTeacher {
	// %0 "$metatype"
	bb0(%0 : $@thick YDWTeacher.Type):
	// Allocate memory on the heap
	%1 = alloc_ref $YDWTeacher                      // user: %3
	// function_ref YDWTeacher.init(), which initializes the new instance
	%2 = function_ref @$s4main10YDWTeacherCACycfc : $@convention(method) (@owned YDWTeacher) -> @owned YDWTeacher // user: %3
	// Call init and return the initialized instance
	%3 = apply %2(%1) : $@convention(method) (@owned YDWTeacher) -> @owned YDWTeacher // user: %4
	return %3 : $YDWTeacher                         // id: %4
	} // end sil function '$s4main10YDWTeacherCACycfC'
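The two phases of __allocating_init (first obtain storage, then run init on it) can be illustrated at source level with manual memory management. This is only an analogy: it uses a hypothetical value type TeacherFields and the standard UnsafeMutablePointer API, not the runtime's alloc_ref, and no reference counting is involved.

```swift
// Hypothetical value type standing in for the stored properties of YDWTeacher.
struct TeacherFields {
    var age: Int = 18
    var name: String = "YDW"
}

// Phase 1: reserve raw, uninitialized storage (the role alloc_ref plays).
let storage = UnsafeMutablePointer<TeacherFields>.allocate(capacity: 1)

// Phase 2: run the initializer into that storage (the role of the init call).
storage.initialize(to: TeacherFields())

// Read the values back while the storage is still valid.
let initializedAge = storage.pointee.age
let initializedName = storage.pointee.name
print(initializedAge, initializedName)

// Manual storage must be torn down explicitly; a class instance would instead
// be released automatically when its reference count drops to zero.
storage.deinitialize(count: 1)
storage.deallocate()
```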
Symbolic breakpoint debugging
  • Set a symbolic breakpoint on __allocating_init in the TestSwift project;

[Figure: symbolic breakpoint on __allocating_init]

  • Run the project; the breakpoint shows that __allocating_init internally calls swift_allocObject;

[Figure: call to swift_allocObject inside __allocating_init]

Source code analysis
  • In VSCode, search the Swift source for the _swift_allocObject_ function and add a breakpoint on it. Then write the following code (you can also copy it) in the REPL (an interactive command line, similar to Python's, where code can be typed directly), as shown below:

[Figure: breakpoint on _swift_allocObject_ and the code entered in the REPL]

  • Then initialize an instance object t and press Enter:

[Figure: debugger stopped in _swift_allocObject_ after creating t]

  • The Locals pane shows that requiredSize is the size of the memory to allocate and requiredAlignmentMask is the alignment mask, i.e. the required alignment minus one (7 for 8-byte alignment). As in OC, objects in Swift are 8-byte aligned: the size must be a multiple of 8, and any shortfall is padded automatically. The purpose is to trade space for time and make memory access more efficient;
static HeapObject *_swift_allocObject_(HeapMetadata const *metadata,
                                       size_t requiredSize,
                                       size_t requiredAlignmentMask) {
  assert(isAlignmentMask(requiredAlignmentMask));
  auto object = reinterpret_cast<HeapObject *>(
      swift_slowAlloc(requiredSize, requiredAlignmentMask));

  // NOTE: this relies on the C++17 guaranteed semantics of no null-pointer
  // check on the placement new allocator which we have observed on Windows,
  // Linux, and macOS.
  new (object) HeapObject(metadata);

  // If leak tracking is enabled, start tracking this object.
  SWIFT_LEAKS_START_TRACKING_OBJECT(object);

  SWIFT_RT_TRACK_INVOCATION(object, swift_allocObject);

  return object;
}
  • As the source code above shows, swift_allocObject mainly does the following:
    • Allocates memory through swift_slowAlloc, honoring the requested alignment;
    • Initializes the object header in place with placement new: new (object) HeapObject(metadata);
    • Returns a HeapObject *, so the memory layout of the instance begins with the memory layout of HeapObject;
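How an alignment mask relates to a padded size can be sketched in a few lines of Swift. This is a minimal illustration rather than runtime code: the function roundUp and its parameter names are hypothetical, and the sketch assumes the mask is the alignment minus one (7 for 8-byte alignment).

```swift
/// Rounds `size` up to the next multiple of `mask + 1`.
/// `mask` must be a power of two minus one, e.g. 7 for 8-byte alignment.
func roundUp(_ size: Int, alignmentMask mask: Int) -> Int {
    precondition(mask >= 0 && (mask & (mask + 1)) == 0,
                 "mask must be a power of two minus one")
    // Adding the mask pushes any partially filled unit over the boundary;
    // clearing the low bits then snaps the value back to that boundary.
    return (size + mask) & ~mask
}

print(roundUp(33, alignmentMask: 7))  // 40: padded up to a multiple of 8
print(roundUp(40, alignmentMask: 7))  // 40: already a multiple of 8
```

Using a mask instead of the alignment itself lets the rounding be done with one addition and one bitwise AND, with no division.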
  • Step into swift_slowAlloc. It mainly allocates size bytes on the heap through malloc (malloc_zone_malloc on Apple platforms, or AlignedAlloc when the requested alignment is larger than malloc guarantees) and returns the address of that memory, which is used to store the instance:
void *swift::swift_slowAlloc(size_t size, size_t alignMask) {
  void *p;
  // This check also forces "default" alignment to use AlignedAlloc.
  if (alignMask <= MALLOC_ALIGN_MASK) {
#if defined(__APPLE__)
    p = malloc_zone_malloc(DEFAULT_ZONE(), size);
#else
    // Allocate `size` bytes on the heap to store the instance
    p = malloc(size);
#endif
  } else {
    size_t alignment = (alignMask == ~(size_t(0)))
                           ? _swift_MinAllocationAlignment
                           : alignMask + 1;
    p = AlignedAlloc(size, alignment);
  }
  if (!p) swift::crash("Could not allocate memory.");
  return p;
}
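The branch in swift_slowAlloc can be restated as a small decision function. The sketch below only models which path is taken and allocates nothing. The values 15 and 16 are assumptions chosen for illustration (they stand in for MALLOC_ALIGN_MASK and _swift_MinAllocationAlignment, whose real values depend on the platform), and AllocPath and allocationPath are hypothetical names.

```swift
// Assumed stand-ins for the runtime constants; real values are platform-specific.
let mallocAlignMask: UInt = 15          // models MALLOC_ALIGN_MASK
let minAllocationAlignment: UInt = 16   // models _swift_MinAllocationAlignment

/// Which allocation path the modelled swift_slowAlloc would take.
enum AllocPath: Equatable {
    case plainMalloc          // malloc / malloc_zone_malloc already aligns enough
    case aligned(UInt)        // AlignedAlloc with an explicit alignment
}

func allocationPath(alignMask: UInt) -> AllocPath {
    // Small masks are satisfied by malloc's natural alignment.
    if alignMask <= mallocAlignMask {
        return .plainMalloc
    }
    // An all-ones mask means "default alignment"; otherwise alignment = mask + 1.
    let alignment = (alignMask == UInt.max) ? minAllocationAlignment : alignMask + 1
    return .aligned(alignment)
}

print(allocationPath(alignMask: 7))         // plainMalloc
print(allocationPath(alignMask: 31))        // aligned(32)
print(allocationPath(alignMask: UInt.max))  // aligned(16)
```

With the mask of 7 seen for YDWTeacher in the debugger, the model takes the plain malloc path, which matches the allocation chain described in this article.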
  • Now look at the HeapObject constructor. It takes the metadata pointer and initializes two members, metadata and refCounts:
struct HeapObject {
  /// This is always a valid pointer to a metadata object.
  HeapMetadata const *metadata;

  // Expands to the reference-count member: InlineRefCounts refCounts
  SWIFT_HEAPOBJECT_NON_OBJC_MEMBERS;

#ifndef __swift__
  HeapObject() = default;

  // Initialize a HeapObject header as appropriate for a newly-allocated object.
  constexpr HeapObject(HeapMetadata const *newMetadata) 
    : metadata(newMetadata)
    , refCounts(InlineRefCounts::Initialized)
  { }
  
  // Initialize a HeapObject header for an immortal object
  constexpr HeapObject(HeapMetadata const *newMetadata,
                       InlineRefCounts::Immortal_t immortal)
  : metadata(newMetadata)
  , refCounts(InlineRefCounts::Immortal)
  { }

  // ... (remaining members omitted)
#endif // __swift__
};
  • Analysis:
    • metadata is of type HeapMetadata const *, a pointer, so it occupies 8 bytes;
    • refCounts is the reference count. Its type is InlineRefCounts, an alias of the RefCounts class template instantiated with InlineRefCountBits, and it occupies 8 bytes. Swift manages object lifetimes with ARC (automatic reference counting);
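The 8 + 8 byte arithmetic can be checked with a Swift struct that mirrors the two header fields. HeapObjectHeader is a hypothetical mirror written for this sketch (it is not a type exported by the runtime), UInt64 merely stands in for InlineRefCounts, and the sizes in the comments assume a 64-bit platform.

```swift
// Hypothetical mirror of the C++ HeapObject header, for size arithmetic only.
struct HeapObjectHeader {
    var metadata: UnsafeRawPointer   // models `HeapMetadata const *metadata`
    var refCounts: UInt64            // models `InlineRefCounts refCounts`
}

// On a 64-bit platform: 8 bytes for the pointer + 8 bytes for the counter.
print(MemoryLayout<UnsafeRawPointer>.size)   // 8
print(MemoryLayout<UInt64>.size)             // 8
print(MemoryLayout<HeapObjectHeader>.size)   // 16
```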
Summary
  • The instance object t is essentially a HeapObject structure, with a default header size of 16 bytes (metadata 8 bytes + refCounts 8 bytes). The comparison with OC is as follows:
    • An instance object in OC is essentially a structure built on the objc_object template, which contains an isa pointer occupying 8 bytes;
    • A Swift instance has one more default field than an OC instance, the refCounts reference count, so its default header occupies 16 bytes;
  • The memory allocation path of an object in Swift is: __allocating_init --> swift_allocObject --> _swift_allocObject_ --> swift_slowAlloc --> malloc;
  • The responsibility of init is to initialize the variables, which is consistent with OC.
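The 16-byte header can also be observed on a live instance by reading its raw memory. The sketch below is for exploration only and relies on assumptions that the language does not guarantee: a 64-bit platform, a native Swift class with no ObjC superclass, and stored properties laid out in declaration order directly after the header. The variable names are illustrative.

```swift
final class YDWTeacher {
    var age: Int = 18
    var name: String = "YDW"
}

let t = YDWTeacher()

// Address of the instance on the heap, without changing its reference count.
let base = Unmanaged.passUnretained(t).toOpaque()

// Word 0: the metadata pointer. Word 1: the inline reference-count bits.
let metadataWord = base.load(as: UInt.self)
let refCountWord = base.load(fromByteOffset: 8, as: UInt.self)

// Under the layout assumptions above, the first stored property follows
// the 16-byte header, so `age` is read at byte offset 16.
let ageField = base.load(fromByteOffset: 16, as: Int.self)

print(String(metadataWord, radix: 16))
print(String(refCountWord, radix: 16))
print(ageField)

// Keep `t` alive until all raw reads have finished.
withExtendedLifetime(t) {}
```

The first two printed words vary between runs and platforms, so only the value read at offset 16 is meaningful to compare against the source (age is 18).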

Origin blog.csdn.net/Forever_wj/article/details/112001532