Block principle (1)

What exactly is a block? Let's start from the C++ code that clang generates

Start with the simplest block structure

image.png

clang -rewrite-objc main.m -o main.cpp && open main.cpp

image.png

image.png

To make it easier to read, let's simplify the code

image.png

To make the rest easier to follow, the names are simplified here; the simplified flow is shown below

image.png

  • Based on the intermediate C++ code that clang produces for block creation, and on the picture above, first form a rough mental sketch

    • There is a two-tier structure

      • The BlockCreate structure

      • The Block structure, a member of BlockCreate

    • The BlockCreate constructor takes the construction parameters and initializes the BlockCreate member Block::block

    • What is finally returned is a pointer to the BlockCreate structure

    • From the first address of the BlockCreate structure we can reach the member Block::block. The two addresses are identical, because block sits at the very beginning of BlockCreate's memory

      Since we have the first address (the address of the member block), we can also reach the member Desc through a memory offset

    • With the address of the member Block::block, we can call its member FuncPtr. FuncPtr is exactly the entry address of the func function, assigned when the BlockCreate constructor initializes Block::block

  • Be sure to understand this two-tier structure. It is not the real source code, but it helps a great deal when we analyze the source later
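The two-tier layout described above can be sketched in plain C++. This is only an illustration under assumptions, not the code clang emits: Block, BlockCreate and BlockDesc follow the simplified naming of this article, while func, g_calls, g_desc and the two demo functions are hypothetical helpers. In the clang output FuncPtr is stored as a void * and cast at the call site; a typed function pointer is used here to keep the sketch portable.

```cpp
#include <cassert>
#include <cstddef>

struct BlockCreate;  // forward declaration so FuncPtr can take it as a parameter

// Inner tier: the member that sits at the very start of BlockCreate.
struct Block {
    void *isa;
    int Flags;
    int Reserved;
    void (*FuncPtr)(BlockCreate *self);  // entry address of the block body
};

struct BlockDesc {
    std::size_t reserved;
    std::size_t Block_size;
};

// Outer tier: Block comes first, so a BlockCreate and its block member share an address.
struct BlockCreate {
    Block block;
    BlockDesc *Desc;
    BlockCreate(void (*fp)(BlockCreate *), BlockDesc *desc, int flags = 0) {
        block.isa = nullptr;
        block.Flags = flags;
        block.Reserved = 0;
        block.FuncPtr = fp;  // the body's entry address is stored at construction
        Desc = desc;
    }
};

static int g_calls = 0;                         // hypothetical call counter
static void func(BlockCreate *) { ++g_calls; }  // stands in for the block body
static BlockDesc g_desc = {0, sizeof(BlockCreate)};

// True when the structure and its first member share one address.
bool first_member_shares_address() {
    BlockCreate created(func, &g_desc);
    return static_cast<void *>(&created) == static_cast<void *>(&created.block);
}

// Reinterprets the first address as Block*, calls FuncPtr through it,
// and returns how many times the body ran during this call.
int invoke_via_first_address() {
    int before = g_calls;
    BlockCreate created(func, &g_desc);
    Block *blk = reinterpret_cast<Block *>(&created);
    assert(blk == &created.block);
    blk->FuncPtr(reinterpret_cast<BlockCreate *>(blk));
    return g_calls - before;
}
```

Calling invoke_via_first_address() from a main function runs the body exactly once, using nothing but the first address of the outer structure.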

The previous example captures no variables. Let's repeat the same steps with a block that reads a local variable and compare the results

image.png

image.png

Accessing a local variable declared outside the block

The C++ code generated by clang has changed. Compare it with the instantiation process above:

  • The BlockCreate structure has one more member, int a

  • The BlockCreate constructor takes one more parameter

  • Inside func(BlockCreate *self), the variable a is read through self, which yields a copy

  • The variable a now exists in 3 places

    • The local variable a inside the main function

    • The member variable a inside the BlockCreate structure

    • The local variable a inside the func function

    These are in fact 3 separate variables
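A small sketch makes the three copies visible. It assumes a stripped-down structure (ValueBlock, value_func and value_seen_by_block are hypothetical names, not clang output) and only models the by-value capture described above.

```cpp
#include <cassert>

struct ValueBlock {
    int (*FuncPtr)(ValueBlock *self);
    int a;  // second a: member copy made when the block is created
};

static int value_func(ValueBlock *self) {
    int a = self->a;  // third a: local copy inside the block body
    return a;
}

// Returns what the block body sees after the outer variable has been changed.
int value_seen_by_block() {
    int a = 10;                        // first a: local in the enclosing function
    ValueBlock blk = {value_func, a};  // a is copied by value here
    a = 20;                            // changing the outer a does not touch blk.a
    assert(a == 20 && blk.a == 10);
    return blk.FuncPtr(&blk);
}
```

The body still sees the value the variable had when the block was created, because each of the three variables has its own storage.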

Change the local variable a to static and inspect the clang C++ output again

image.png

image.png

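With a declared static, the output is different

The BlockCreate constructor now receives the address of a. The BlockCreate member a becomes a pointer, and the local variable a inside func becomes a pointer as well: inside func, the local pointer a is assigned from the pointer member a reached through BlockCreate *self

So once a is static, the a accessed inside func is still the a declared in main, reached through its pointer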
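The same experiment with a pointer member shows the difference. Again this is a sketch under assumptions (StaticBlock, static_func and value_seen_by_static_block are hypothetical names); it only models the capture of the address of a static local.

```cpp
#include <cassert>

struct StaticBlock {
    int (*FuncPtr)(StaticBlock *self);
    int *a;  // the member is now a pointer to the static variable
};

static int static_func(StaticBlock *self) {
    int *a = self->a;  // local pointer with the same target as the member
    return *a;
}

// Returns what the block body sees after the static variable has been changed.
int value_seen_by_static_block() {
    static int a;
    a = 10;
    StaticBlock blk = {static_func, &a};  // the address is captured, not the value
    a = 20;                               // the change is visible through the pointer
    assert(blk.a == &a);
    return blk.FuncPtr(&blk);
}
```

Here the body observes the later assignment, because every access goes through the one static variable.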

Change the local variable a to __block and inspect the clang C++ output again

image.png

image.png

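Hopefully this does not look too confusing; it is a bit more complex this time

  • A new structure appears, __Block_byref_a_0

  • The Desc structure referenced by BlockCreate gains two functions, copy & dispose

A brief explanation

  • The ordinary local variable a has become a structure, __Block_byref_a_0, and a is a member of that structure

    • Member void *__isa

    • Member __Block_byref_a_0 *__forwarding;

    • Member int __flags;

    • Member int __size

    • Member int a

    The __block local variable declared in main is itself turned into this structure: the address of the structure (written &a in the generated code) is assigned to __forwarding, and the initial value is assigned to the member a. Note this setup: although the member is also called a, it only stores the value; the key point is that every access goes through the __forwarding pointer

    First, let's see how the __block variable a is actually accessed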

image.png

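__forwarding has the type __Block_byref_a_0 *, much like the link pointer of a linked-list node, so it is a pointer to a __Block_byref_a_0 structure. What it is for is left open for now; the source analysis below comes back to it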
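As a rough model of this access path, the sketch below mirrors the five members listed earlier and reads and writes the value only through the forwarding pointer. Byref_a and access_through_forwarding are hypothetical names for illustration; the real structure is the __Block_byref_a_0 that clang generates.

```cpp
#include <cassert>

// Hypothetical mirror of the generated __Block_byref_a_0 layout.
struct Byref_a {
    void *isa;
    Byref_a *forwarding;
    int flags;
    int size;
    int a;
};

// Sets up the wrapper with forwarding pointing at the structure itself,
// then touches the value only through forwarding.
int access_through_forwarding() {
    Byref_a a = {nullptr, &a, 0, static_cast<int>(sizeof(Byref_a)), 10};
    assert(a.forwarding == &a);  // on the stack, forwarding points at the struct itself
    a.forwarding->a += 1;        // a write goes through forwarding
    return a.forwarding->a;      // and so does a read
}
```

While the wrapper lives only on the stack, the indirection looks redundant; it starts to matter once a heap copy exists.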

Seen side by side, the difference is obvious and not hard to understand

image.png

Block source code: inspecting libclosure-79

How do we find the entry point in the source? Let's look at the assembly first

image.png

The call to retainBlock suggests that the block allocates memory, so step into it

image.png

Follow the next jump, br x16

image.png

We have now found the symbol _Block_copy, so let's look it up in the source

image.png

There you will see a structure, Block_layout

image.png

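Block_layout corresponds to the two-tier BlockCreate structure derived earlier from the clang C++ code: its leading fields (isa, flags, reserved, invoke) match the member Block::block, and they are followed by the descriptor pointer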

Put the __block test code into the block source project and debug it

image.png

This code is tested inside the block source project

image.png

This creates a Block_layout structure on the heap that mirrors the layout of the Block_layout on the stack

At the same time, invoke of the newly allocated Block_layout is copied from invoke of the original stack Block_layout

image.png

Since this Block_layout structure is allocated on the heap, its isa naturally points to _NSConcreteMallocBlock (heap block)
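The stack-to-heap copy described above can be approximated as follows. This is a simplified sketch, not the libclosure implementation: it leaves out the copy helper call and pointer authentication, and it uses two tag variables in place of the _NSConcreteStackBlock and _NSConcreteMallocBlock class symbols. All Sketch* names, sketch_block_copy and copy_and_invoke are hypothetical; the flag values are the ones I understand Block_private.h to define.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

constexpr int kRefcountMask = 0xfffe;   // BLOCK_REFCOUNT_MASK
constexpr int kNeedsFree    = 1 << 24;  // BLOCK_NEEDS_FREE: block lives on the heap
constexpr int kIsGlobal     = 1 << 28;  // BLOCK_IS_GLOBAL

struct SketchDescriptor { std::size_t reserved; std::size_t size; };

struct SketchLayout {
    const void *isa;
    int flags;
    int reserved;
    int (*invoke)(SketchLayout *self);
    SketchDescriptor *descriptor;
    int captured;  // a captured variable follows the fixed header
};

// Stand-ins for the stack and malloc block class symbols (assumption).
static const char kStackClass = 's';
static const char kMallocClass = 'm';

SketchLayout *sketch_block_copy(SketchLayout *src) {
    if (src->flags & kNeedsFree) {  // already on the heap: only bump the refcount
        src->flags += 2;            // the count sits above bit 0, so one unit is 2
        return src;
    }
    if (src->flags & kIsGlobal) return src;  // global blocks are never copied

    std::size_t size = src->descriptor->size;
    SketchLayout *dst = static_cast<SketchLayout *>(std::malloc(size));
    if (!dst) return nullptr;
    std::memcpy(dst, src, size);   // bitwise copy, invoke and captures included
    dst->flags &= ~kRefcountMask;
    dst->flags |= kNeedsFree | 2;  // heap block with a logical refcount of 1
    dst->isa = &kMallocClass;      // the copy is now a heap ("malloc") block
    return dst;
}

static int read_captured(SketchLayout *self) { return self->captured; }

// Copies a stack block and returns the value seen through the heap copy.
int copy_and_invoke() {
    SketchDescriptor desc = {0, sizeof(SketchLayout)};
    SketchLayout on_stack = {&kStackClass, 0, 0, read_captured, &desc, 42};
    SketchLayout *on_heap = sketch_block_copy(&on_stack);
    assert(on_heap && on_heap != &on_stack && on_heap->isa == &kMallocClass);
    SketchLayout *again = sketch_block_copy(on_heap);  // second copy only retains
    assert(again == on_heap);
    (void)again;
    int result = on_heap->invoke(on_heap);
    std::free(on_heap);
    return result;
}
```

The point of the sketch is the order of events: allocate by the size recorded in the descriptor, copy the bytes, then patch flags and isa on the copy.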

Problems encountered while analyzing the block source

Two parts have still not been located in the source: the __Block_byref_a_0 structure seen in the C++ code generated by clang, and the BlockCreate construction logic

So where do we go from here?

I choose the most basic approach: assembly + symbolic breakpoints + the clang C++ code

image.png

First break here, so that other dyld activity does not interfere

image.png

Set the symbolic breakpoint, and also add the _Block_copy symbol analyzed earlier, to make the flow easier to follow

Continue debugging and step into _Block_object_dispose:

image.png

Go back to the C++ code generated by clang

image.png

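Since the symbol _Block_object_dispose was hit, also set a breakpoint on the symbol _Block_object_copy and continue debugging

If that symbol is not found, try _Block_object_assign. _Block_object_copy does not exist because the copy helper is generated by the compiler; the runtime entry point it calls is _Block_object_assign

The symbolic breakpoint on _Block_object_assign is hit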

image.png

With this lead, we naturally go back to the source

image.png

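  • Read the source comment

    When Blocks or Block_byrefs hold objects then their copy routine helpers use this entry point to do the assignment.

    That is, when a Block (think of the structure with the func member above) or a Block_byref holds an object, this entry point is called to perform the assignment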

image.png

  • __block int a = 10 falls under the case BLOCK_FIELD_IS_BYREF | BLOCK_FIELD_IS_WEAK or BLOCK_FIELD_IS_BYREF

    which executes _Block_byref_copy()
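The dispatch on the flags argument can be summarized with a small classifier. This is a sketch, assuming the flag values published in Block.h; AssignAction and classify_assign are hypothetical names, and the real _Block_object_assign performs the action instead of merely naming it.

```cpp
#include <cassert>

// Field-type flags passed to _Block_object_assign (values as in Block.h).
constexpr int BLOCK_FIELD_IS_OBJECT = 3;
constexpr int BLOCK_FIELD_IS_BLOCK  = 7;
constexpr int BLOCK_FIELD_IS_BYREF  = 8;
constexpr int BLOCK_FIELD_IS_WEAK   = 16;
constexpr int BLOCK_BYREF_CALLER    = 128;
constexpr int ALL_FLAGS = BLOCK_FIELD_IS_OBJECT | BLOCK_FIELD_IS_BLOCK |
                          BLOCK_FIELD_IS_BYREF | BLOCK_FIELD_IS_WEAK |
                          BLOCK_BYREF_CALLER;

enum class AssignAction { RetainObject, CopyBlock, CopyByref, PlainAssign, Unknown };

// Names the branch that a given flags value selects.
AssignAction classify_assign(int flags) {
    switch (flags & ALL_FLAGS) {
        case BLOCK_FIELD_IS_OBJECT:
            return AssignAction::RetainObject;  // retain the object, then store it
        case BLOCK_FIELD_IS_BLOCK:
            return AssignAction::CopyBlock;     // store the result of _Block_copy
        case BLOCK_FIELD_IS_BYREF | BLOCK_FIELD_IS_WEAK:
        case BLOCK_FIELD_IS_BYREF:
            return AssignAction::CopyByref;     // store the result of _Block_byref_copy
        case BLOCK_BYREF_CALLER | BLOCK_FIELD_IS_OBJECT:
        case BLOCK_BYREF_CALLER | BLOCK_FIELD_IS_BLOCK:
        case BLOCK_BYREF_CALLER | BLOCK_FIELD_IS_OBJECT | BLOCK_FIELD_IS_WEAK:
        case BLOCK_BYREF_CALLER | BLOCK_FIELD_IS_BLOCK | BLOCK_FIELD_IS_WEAK:
            return AssignAction::PlainAssign;   // called from a byref helper: plain store
        default:
            return AssignAction::Unknown;
    }
}
```

For the __block int a = 10 test case the flags value is 8 (or 8 | 16), so the byref branch is taken.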

_Block_byref_copy

Before analyzing the _Block_byref_copy flow, we need to understand what Block_byref is

image.png

From the C++ code produced by clang earlier, we can see that Block_byref wraps an ordinary variable; the wrapper additionally has the members isa and __forwarding

image.png

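The source also contains two more structures, Block_byref_2 and Block_byref_3; they are set aside for now and explained later

We can make an assumption. Our test case is a block that references an outer __block variable, which is how it is used in practice. Since the block accesses the outer variable, it must also affect that variable's reference count, and flags is where the reference count is stored

Reading _Block_byref_copy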

image.png

If the source byref structure is already on the heap, no copy is needed; its reference count is simply incremented

image.png

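There is a section of memory-offset code in the middle that has not been explained yet; let's continue

In the source we see

Block_byref_2 *src2 = (Block_byref_2 *)(src + 1)
Block_byref_3 *src3 = (Block_byref_3 *)(src2 + 1)

So Block_byref, Block_byref_2 and Block_byref_3 are laid out contiguously in memory, and the flags decide whether Block_byref_2 and Block_byref_3 are read at all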
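To see why src + 1 lands on the next part, the sketch below places three simplified structures back to back and compares addresses. Byref, Byref2, Byref3, FullByref and parts_are_contiguous are hypothetical stand-ins for the runtime structures, and the check assumes a common 32- or 64-bit ABI where no padding is inserted between the parts.

```cpp
#include <cassert>

struct Byref  { void *isa; Byref *forwarding; int flags; int size; };
struct Byref2 { void (*byref_keep)(void *, void *); void (*byref_destroy)(void *); };
struct Byref3 { const char *layout; };

// One contiguous object holding all three parts back to back.
struct FullByref { Byref head; Byref2 helpers; Byref3 extended; };

// Checks that "pointer + 1" on each part yields the address of the next part.
bool parts_are_contiguous() {
    FullByref full = {};
    Byref *src = &full.head;
    void *after_head = src + 1;               // address computed by (src + 1)
    void *after_helpers = &full.helpers + 1;  // address computed by (src2 + 1)
    return after_head == static_cast<void *>(&full.helpers)
        && after_helpers == static_cast<void *>(&full.extended);
}
```

Adding 1 to a typed pointer advances it by the size of the pointed-to structure, which is exactly the offset of the part that follows.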

Open questions

After searching the whole source, structures such as __main_block_impl_0 from the clang-generated C++ code are nowhere to be found

image.png

What exactly do byref_keep and byref_destroy do?

So far the test used a plain variable a; let's switch to an object and see

Testing with the variable a replaced by an object

image.png

clang C++ code

image.png

image.png

From the source we learn

image.png

At compile time, the flags of the Block_byref structure are set to 1 << 25, which marks the presence of the Block_byref_2 structure

image.png

image.png

What does 131 mean?

image.png

Both arguments use + 40; what does that mean?

image.png
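Both numbers can be checked with a few lines of C++. The sketch assumes the wrapper for a __block object has the member order shown earlier plus the two helper pointers, and that 128 and 3 are BLOCK_BYREF_CALLER and BLOCK_FIELD_IS_OBJECT as in Block.h; ByrefObject, HelperFlags, object_offset and split_helper_flags are hypothetical names. On a 64-bit target the fields in front of the object take 8 + 8 + 4 + 4 + 8 + 8 = 40 bytes, which would explain the + 40, and 131 is 128 + 3.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the byref wrapper generated for a __block object.
struct ByrefObject {
    void *isa;
    ByrefObject *forwarding;
    int flags;
    int size;
    void (*byref_keep)(void *dst, void *src);
    void (*byref_destroy)(void *src);
    void *object;  // the wrapped object pointer
};

// Byte offset of the wrapped object from the start of the wrapper.
std::size_t object_offset() { return offsetof(ByrefObject, object); }

// The two parts of a helper flag value such as 131.
struct HelperFlags {
    bool called_from_byref;  // bit 128 is set
    int field_kind;          // what remains once bit 128 is cleared
};

HelperFlags split_helper_flags(int flags) {
    return HelperFlags{(flags & 128) != 0, flags & ~128};
}
```

Under these assumptions, split_helper_flags(131) reports a call from a byref helper with field kind 3, an object.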

According to the compiled logic, byref_keep copies the object-typed variable

But the runtime adjusts this, so the actual flow differs

The same goes for byref_destroy:

image.png

That covers the Block_byref logic. Now look at how Block_layout is handled in the C++ code from clang

image.png

image.png

Confirm once more how a __block object is actually accessed inside the block body

image.png

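Summary

  • Once a variable is marked __block, the compiler builds a stack Block_byref on the stack (containing the variable pointer)

  • When a block is defined, you can think of the compiler as generating an intermediate structure, BlockCreate (the name is made up for this article; all that matters is that it is a structure, and the name is only there to make things easier to follow)

    • At the same time the compiler initializes a stack Block_layout on the stack (containing the func member)
  • The BlockCreate constructor executes

    • An offset from the first address of Block_layout gives the address of the Block_copy function; executing Block_copy copies the stack Block_byref to a heap Block_byref

    • The constructor argument is the stack Block_byref. An offset from the first address of Block_byref gives the first address of Block_byref_2 (which contains _Block_byref_copy, the byref copy function); executing _Block_byref_copy copies the stack Block_byref to the heap Block_byref

    • Continuing from the position of the previous step, a further 8-byte offset gives the first address of the object storage allocated on the heap, which is where the object is stored

    • One detail to note: after the stack Block_byref is copied to the heap Block_byref, the heap copy is new memory, so there are now two copies, one on the stack and one on the heap. How is it guaranteed that both access the same memory?

      I think the solution is that, after the copy, forwarding in both the stack Block_byref and the heap Block_byref points to the heap Block_byref; in other words, the heap forwarding points to itself

      Once a variable is marked __block, every access to it, whether inside or outside the block, first goes through forwarding to the heap copy and then reads the variable there, which ensures that all accesses hit the same memory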

    image.png

    image.png

image.png

  • When the life cycle of the variable held by Block_byref ends, _Block_object_dispose is executed

    • It runs the _Block_byref_release function: from the first address of Block_byref it offsets to the first address of Block_byref_2, then offsets a further 8 bytes to reach byref_destroy, and runs that destructor to reclaim the heap memory
  • When the Block_layout goes out of scope or its life cycle ends, _Block_release is executed

    • From the first address of Block_layout it offsets to the first address of Block_descriptor_2, then offsets a further 8 bytes to reach dispose, and runs that destructor to reclaim the Block_layout memory allocated on the heap
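The forwarding fix-up mentioned in the summary can be modeled as follows. This is a simplified sketch, not the libclosure code: it ignores the keep and destroy helpers and the extended layout, and IntByref, sketch_byref_copy and shared_value_after_copy are hypothetical names. The flag value 1 << 24 and the initial count of 4 follow my reading of _Block_byref_copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

constexpr int kByrefNeedsFree = 1 << 24;  // BLOCK_BYREF_NEEDS_FREE

struct IntByref {
    void *isa;
    IntByref *forwarding;
    int flags;
    int size;
    int value;
};

IntByref *sketch_byref_copy(IntByref *src) {
    if (src->forwarding->flags & kByrefNeedsFree) {
        src->forwarding->flags += 2;  // already on the heap: just retain
        return src->forwarding;
    }
    IntByref *copy =
        static_cast<IntByref *>(std::malloc(static_cast<std::size_t>(src->size)));
    if (!copy) return nullptr;
    copy->isa = nullptr;
    copy->flags = src->flags | kByrefNeedsFree | 4;  // two references: caller and stack
    copy->size = src->size;
    copy->value = src->value;
    copy->forwarding = copy;  // the heap copy points to itself
    src->forwarding = copy;   // the stack original is redirected to the heap copy
    return copy;
}

// Writes through the stack wrapper and reads through the heap copy.
int shared_value_after_copy() {
    IntByref on_stack = {nullptr, &on_stack, 0, static_cast<int>(sizeof(IntByref)), 10};
    IntByref *on_heap = sketch_byref_copy(&on_stack);
    assert(on_heap && on_stack.forwarding == on_heap && on_heap->forwarding == on_heap);
    IntByref *again = sketch_byref_copy(&on_stack);  // second copy only retains
    assert(again == on_heap);
    (void)again;
    on_stack.forwarding->value = 20;        // write from "outside the block"
    int seen = on_heap->forwarding->value;  // read from "inside the block"
    std::free(on_heap);
    return seen;
}
```

Because both forwarding pointers end up on the heap copy, the write made through the stack wrapper is the value the heap side reads back.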

Inspecting registers with register read

image.png

Symbolic breakpoint on _Block_copy

image.png

Before _Block_copy executes, read the register that holds its argument: rax in the x86_64 session shown here (on arm64 the first argument is passed in x0)

image.png

After it executes and ret returns, the rax register holds the return value

image.png

  • Change the variable a to __block

image.png

Because a is declared __block, a copy function address appears in Block_layout, and the copy goes through that function when _Block_copy runs

Without __block there are no copy and dispose functions, and _Block_copy performs only its default copy

The difference comes from the flags passed to the constructor: without __block the value is 0, and with __block it includes 1 << 25
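How the flag gates access to the copy and dispose helpers can be sketched like this. The structures are reduced stand-ins (Descriptor1, Descriptor2, FullDescriptor, MiniLayout, find_copy_dispose and helpers_found are hypothetical names), and the logic follows my understanding of how the runtime locates Block_descriptor_2: it is only looked up when bit 1 << 25 is set, and it sits directly after the first descriptor part.

```cpp
#include <cassert>
#include <cstddef>

constexpr int BLOCK_HAS_COPY_DISPOSE = 1 << 25;

struct Descriptor1 { std::size_t reserved; std::size_t size; };
struct Descriptor2 {
    void (*copy)(void *dst, const void *src);
    void (*dispose)(const void *src);
};
// The helper part follows the base part directly in memory.
struct FullDescriptor { Descriptor1 base; Descriptor2 helpers; };

struct MiniLayout { int flags; Descriptor1 *descriptor; };

// Returns the helper part, or nullptr when the flag says it is absent.
Descriptor2 *find_copy_dispose(MiniLayout *block) {
    if (!(block->flags & BLOCK_HAS_COPY_DISPOSE)) return nullptr;
    unsigned char *p = reinterpret_cast<unsigned char *>(block->descriptor);
    p += sizeof(Descriptor1);  // skip over the base part
    return reinterpret_cast<Descriptor2 *>(p);
}

static void noop_copy(void *, const void *) {}
static void noop_dispose(const void *) {}

// True when the lookup lands exactly on the helper part for the given flags.
bool helpers_found(int flags) {
    FullDescriptor desc = {{0, sizeof(MiniLayout)}, {noop_copy, noop_dispose}};
    MiniLayout block = {flags, &desc.base};
    return find_copy_dispose(&block) == &desc.helpers;
}
```

With flags of 0 the lookup returns nullptr, so only the plain copy happens; with 1 << 25 set the helpers are found and can be called.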

image.png


Origin juejin.im/post/7118386172414328868