[iOS] How the weak Keyword Is Implemented

Preface

For an introduction to what the weak keyword is, see my previous post: [OC] Attribute Keywords

weak principle

1. SideTable

Earlier writers gave the SideTable structure a vivid name: the reference count and weak reference dependency table, because it mainly manages an object's reference count and its weak table. Its data structure is declared in NSObject.mm:

struct SideTable {
    // Spin lock that keeps operations on this table atomic
    spinlock_t slock;
    // Hash table of reference counts
    RefcountMap refcnts;
    // Global hash table of weak references
    weak_table_t weak_table;

    SideTable() {
        memset(&weak_table, 0, sizeof(weak_table));
    }

    ~SideTable() {
        _objc_fatal("Do not delete SideTable.");
    }

    void lock() { slock.lock(); }
    void unlock() { slock.unlock(); }
    void reset() { slock.reset(); }

    // Address-ordered lock discipline for a pair of side tables.

    template<HaveOld, HaveNew>
    static void lockTwo(SideTable *lock1, SideTable *lock2);
    template<HaveOld, HaveNew>
    static void unlockTwo(SideTable *lock1, SideTable *lock2);
};

slock is a spin lock chosen to prevent races between threads.
refcnts is an auxiliary store for the extra_rc reference count kept in the object's isa pointer (the object structure is covered later).

Next, let's take a look at these three member variables in SideTable:

1.1 spinlock_t slock spin lock

For short critical sections, a spin lock can be more efficient than a mutex, because a waiting thread does not go to sleep and pay for a context switch. However, the CPU is not released while spinning, so the thread holding the lock should release it as quickly as possible; otherwise the threads waiting for it keep spinning and waste CPU time.

The SideTable is locked while its reference counts are being modified, to avoid data corruption.
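To make the spinning behaviour concrete, here is a minimal sketch of a spin lock built on std::atomic_flag. It is an illustration only: the SpinLock class and its try_lock helper are hypothetical names, and this is not how the runtime's spinlock_t is implemented.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical minimal spin lock, for illustration only.
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    // Returns true if the lock was free and is now held by the caller.
    bool try_lock() {
        return !flag_.test_and_set(std::memory_order_acquire);
    }
    // Busy-wait until the lock becomes free; the CPU is not released,
    // which is why the holder should keep its critical section short.
    void lock() {
        while (!try_lock()) {
            std::this_thread::yield();  // be polite while spinning
        }
    }
    void unlock() {
        flag_.clear(std::memory_order_release);
    }
};
```

A caller would wrap each reference count update in lock() and unlock(), exactly as SideTable does with slock.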

1.2 RefcountMap


typedef objc::DenseMap<DisguisedPtr<objc_object>,size_t,true> RefcountMap;

Among them, DenseMap is another template class:

template<typename KeyT, typename ValueT,
         bool ZeroValuesArePurgeable = false, 
         typename KeyInfoT = DenseMapInfo<KeyT> >
class DenseMap : public DenseMapBase<DenseMap<KeyT, ValueT, 
  ZeroValuesArePurgeable, KeyInfoT>, KeyT, ValueT, KeyInfoT, 
  ZeroValuesArePurgeable> {
  ...
  BucketT *Buckets;
  unsigned NumEntries;
  unsigned NumTombstones;
  unsigned NumBuckets;
  ...
}

The more important members are the following:

1. ZeroValuesArePurgeable defaults to false, but RefcountMap instantiates the template with true. This flag controls whether a bucket whose value is 0 (a reference count of 1) may be reused. Because an empty bucket is also initialized to 0, a bucket with value 0 is no different from an empty bucket. When reuse is allowed and a lookup finds neither the object's own bucket nor a tombstone bucket, a bucket with value 0 is used first.
2. Buckets points to a contiguous block of memory, that is, an array whose elements are objects of type BucketT. These elements are what we call buckets here. (Strictly speaking the array as a whole is the set of buckets; Apple calls each element a bucket to make the picture vivid, not in the sense of the chained buckets of a bucketed hash table.) After the array is allocated it is initialized with an empty bucket in every position (a bucket whose key is EmptyKey is an empty bucket). All later reference counting operations go through these buckets.
The data type of a bucket is std::pair, similar to a tuple in Swift: it combines the object address and the object's reference count into one value. (As with isa, some bits of the count field hold the reference count and a few bits are reserved for other flags.)

BucketT is defined as follows:

typedef std::pair<KeyT, ValueT> BucketT;

3. NumEntries records the number of non-empty buckets in use in the array.

4. NumTombstones: Tombstone literally means a tombstone. When an object's reference count drops to 0 and its entry is removed from its bucket, the position is marked as a tombstone. NumTombstones is the number of tombstones in the array. The role of tombstones is described below.

5. NumBuckets is the number of buckets. Because the array is always completely filled with buckets, it can be read as the size of the array.

inline uint64_t NextPowerOf2(uint64_t A) {
    A |= (A >> 1);
    A |= (A >> 2);
    A |= (A >> 4);
    A |= (A >> 8);
    A |= (A >> 16);
    A |= (A >> 32);
    return A + 1;
}

This function works on 64-bit values and is used to pick the size of the bucket array whenever space has to be allocated for it. The algorithm copies the highest set bit into all lower bits. For example, with A = 0b10000, (A >> 1) = 0b01000 and the bitwise OR gives A = 0b11000; then (A >> 2) = 0b00110 and the bitwise OR gives A = 0b11110. Continuing in the same way, the highest set bit is spread over the next 2, 4, 8, 16, and 32 bits until it reaches the lowest bit. Adding 1 to this run of 1s yields 0b1000...(N zeros). In other words, the size of the bucket array is always a power of two, 2^n.
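The worked example can be checked at compile time. The sketch below repeats the function as a constexpr (the constexpr qualifier is my addition so that static_assert can evaluate it) and verifies a few inputs, including the 0b10000 case traced above.

```cpp
#include <cstdint>

// Same bit-smearing logic as above, marked constexpr for compile-time checks.
constexpr uint64_t NextPowerOf2(uint64_t A) {
    A |= (A >> 1);
    A |= (A >> 2);
    A |= (A >> 4);
    A |= (A >> 8);
    A |= (A >> 16);
    A |= (A >> 32);
    return A + 1;
}

// 0b10000 smears to 0b11111, and adding 1 gives 0b100000.
static_assert(NextPowerOf2(0b10000) == 0b100000, "16 maps to 32");
// 0b101 smears to 0b111, and adding 1 gives 0b1000.
static_assert(NextPowerOf2(5) == 8, "5 maps to 8");
```

Note that an input that is already a power of two maps to the next larger power of two, not to itself.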

The working logic of RefcountMap
1. Use the object's address to obtain its SideTable from SideTables. Objects whose hash values collide keep their reference counts in the same SideTable.
2. Within the SideTable, the find() method and the overloaded [] operator locate the bucket for the object from the object's address. The search algorithm that finally runs is LookupBucketFor().
3. The search algorithm first checks the number of buckets. If it is 0, it returns false to the caller, which then calls the insertion method. If the search reaches an empty bucket or a tombstone bucket, it also returns false so that the caller runs the insertion algorithm, but it records the bucket it found first. If the bucket for the object is found, its reference count only needs to be incremented or decremented by 1. If the reference count reaches 0 and the object has to be destroyed, the key in that bucket is set to TombstoneKey.

value_type& FindAndConstruct(const KeyT &Key) {
    BucketT *TheBucket;
    if (LookupBucketFor(Key, TheBucket))
      return *TheBucket;
    return *InsertIntoBucket(Key, ValueT(), TheBucket);
  }

4. The insertion algorithm first checks how much room the table has. If the insert would leave 1/4 or less of the table available (tombstone buckets plus empty buckets), that is, the table would be at least 3/4 full, a larger table is allocated. If the empty buckets alone make up 1/8 of the table or less (which means there are too many tombstone buckets), the tombstones are cleaned out. In either situation the hash search would struggle to find the right position and could even loop forever, so the table is rebuilt first: all buckets are assigned new positions, and the insert position for the current object is then looked up again. If neither situation occurs, the reference count of the new object is placed directly into the bucket supplied by the caller.
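The two thresholds in step 4 can be isolated in a small sketch. The Rehash enum and the rehash_decision function are hypothetical names; the arithmetic mirrors the two checks in the InsertIntoBucketImpl listing further down, simplified so that only the first matching check is reported.

```cpp
// Hypothetical summary of the two checks made before inserting an entry.
enum class Rehash { None, GrowDouble, SameSize };

// entries: buckets in use, tombstones: tombstone buckets, buckets: array size.
// Assumes entries + 1 + tombstones <= buckets.
constexpr Rehash rehash_decision(unsigned entries, unsigned tombstones,
                                 unsigned buckets) {
    unsigned newEntries = entries + 1;          // count the pending insert
    if (newEntries * 4 >= buckets * 3)          // at least 3/4 full
        return Rehash::GrowDouble;
    if (buckets - (newEntries + tombstones) <= buckets / 8)  // few empty buckets
        return Rehash::SameSize;                // rehash in place, drop tombstones
    return Rehash::None;
}

static_assert(rehash_decision(47, 0, 64) == Rehash::GrowDouble, "48/64 is 3/4");
static_assert(rehash_decision(10, 50, 64) == Rehash::SameSize, "3 empty of 64");
static_assert(rehash_decision(10, 0, 64) == Rehash::None, "plenty of room");
```

With 64 buckets, the 48th entry triggers a doubling, while a table holding only 10 live entries but 50 tombstones is rebuilt at the same size.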


The function of tombstones:

  1. Suppose the bucket at index 2 were reset to an empty bucket, rather than a tombstone bucket, after object c is destroyed. If the reference count of object e is then incremented, the hash lookup reaches index 2, sees an empty bucket, and inserts a new entry there. The count that e already has in the bucket at index 4 is never updated. In the actual implementation, the bucket at index 2 becomes a tombstone once c is destroyed. When the lookup for e reaches index 2, it skips the tombstone and keeps probing until it finds the bucket that holds e, or until it reaches an empty bucket and creates a new entry for e.
  2. If a new object f is created at this point and its hash lookup lands on index 2, the tombstone at index 2 is recorded and probing continues. Reaching an empty bucket proves that f is not in the table. f is then stored in the recorded tombstone bucket at index 2 rather than in the empty bucket that was found, so the released position is reused and the front part of the hash table stays either in use or ready for reuse.
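Both points can be reproduced with a toy table. The sketch below is a simplified stand-in, not DenseMap: TinyTable, its fixed size of 8, linear probing, and a hash that sends every key to index 2 are all assumptions chosen to force the collision scenario.

```cpp
#include <array>
#include <cassert>

enum class State { Empty, Tombstone, Full };
struct Slot { State state = State::Empty; int key = 0; };

// Toy open-addressing table; every key hashes to index 2 to force collisions.
struct TinyTable {
    std::array<Slot, 8> slots{};
    static unsigned home(int) { return 2; }

    // Index of the slot holding key, or -1. Probing stops at an empty slot.
    int find(int key) const {
        unsigned i = home(key);
        for (unsigned n = 0; n < 8; ++n, i = (i + 1) & 7) {
            if (slots[i].state == State::Empty) return -1;
            if (slots[i].state == State::Full && slots[i].key == key)
                return static_cast<int>(i);
            // Tombstone or a different key: keep probing.
        }
        return -1;
    }
    // Insert key, reusing the first tombstone seen on the probe path.
    void insert(int key) {
        if (find(key) >= 0) return;
        unsigned i = home(key);
        int firstTombstone = -1;
        for (unsigned n = 0; n < 8; ++n, i = (i + 1) & 7) {
            if (slots[i].state == State::Tombstone && firstTombstone < 0)
                firstTombstone = static_cast<int>(i);
            if (slots[i].state == State::Empty) break;
        }
        unsigned target = firstTombstone >= 0 ? unsigned(firstTombstone) : i;
        slots[target].state = State::Full;
        slots[target].key = key;
    }
    // Erase key, leaving either a tombstone or (incorrectly) an empty slot.
    void erase(int key, bool useTombstone) {
        int i = find(key);
        if (i < 0) return;
        slots[i].state = useTombstone ? State::Tombstone : State::Empty;
        slots[i].key = 0;
    }
};

// True if key e (5) is still reachable after key c (3) is erased.
inline bool survives_erase(bool useTombstone) {
    TinyTable t;
    t.insert(3);
    t.insert(5);
    t.erase(3, useTombstone);
    return t.find(5) >= 0;
}
```

With a tombstone, the entry stored behind the erased key stays reachable, and a later insert reuses the tombstone slot. With an empty slot instead, the lookup stops early and the later entry is lost.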

The source code for finding the bucket corresponding to an object is as follows:

bool LookupBucketFor(const LookupKeyT &Val,
                       const BucketT *&FoundBucket) const {
    ...
    if (NumBuckets == 0) { // There are no buckets yet
      FoundBucket = 0;
      return false; // Return false so the caller runs the insertion function
    }
    ...
    unsigned BucketNo = getHashValue(Val) & (NumBuckets-1); // Mask the hash value with the largest array index
    unsigned ProbeAmt = 1; // Step used to find a new position when hash values collide
    while (1) {
      const BucketT *ThisBucket = BucketsPtr + BucketNo; // Head pointer + index, like indexing an array
      // The key in this bucket equals the object address: found
      if (KeyInfoT::isEqual(Val, ThisBucket->first)) {
        FoundBucket = ThisBucket;
        return true;
      }
      // The key in this bucket is the empty placeholder: the slot can take an insert
      if (KeyInfoT::isEqual(ThisBucket->first, EmptyKey)) {
        if (FoundTombstone) ThisBucket = FoundTombstone; // Prefer a tombstone seen earlier
        FoundBucket = FoundTombstone ? FoundTombstone : ThisBucket;
        return false; // An empty placeholder means no bucket in the table holds this object
      }
      // A tombstone was found
      if (KeyInfoT::isEqual(ThisBucket->first, TombstoneKey) && !FoundTombstone)
        FoundTombstone = ThisBucket;  // Remember the first tombstone
      // This relates to the third argument, true, in the definition
      // typedef objc::DenseMap<DisguisedPtr<objc_object>,size_t,true> RefcountMap.
      // It says zero values may be purged: when it is true and no tombstone has
      // been seen, the first bucket whose value is 0 is remembered instead.
      if (ZeroValuesArePurgeable  && 
          ThisBucket->second == 0  &&  !FoundTombstone) 
        FoundTombstone = ThisBucket;

      // If the probe counter exceeds the array capacity, the runtime aborts
      if (ProbeAmt > NumBuckets) {
          _objc_fatal("...");
      }
      BucketNo += ProbeAmt++; // This index did not match; use ProbeAmt to pick the next one
      BucketNo&= (NumBuckets-1); // Mask the new value with the largest array index
    }
  }
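The probe step BucketNo += ProbeAmt++ advances by 1, 2, 3, ... so the offsets from the starting index are the triangular numbers. A useful property, checked by the sketch below, is that on a table whose size is a power of two this sequence visits every slot exactly once in NumBuckets steps. The function name probe_visits_all is hypothetical and the check is limited to tables of at most 64 buckets.

```cpp
#include <cstdint>

// Returns true if the probe sequence from LookupBucketFor touches every
// slot of a power-of-two table within numBuckets steps (numBuckets <= 64).
constexpr bool probe_visits_all(unsigned numBuckets, unsigned start) {
    uint64_t seen = 0;                        // one bit per visited slot
    unsigned bucketNo = start & (numBuckets - 1);
    unsigned probeAmt = 1;
    for (unsigned step = 0; step < numBuckets; ++step) {
        seen |= (uint64_t{1} << bucketNo);
        bucketNo += probeAmt++;               // same update as in the listing
        bucketNo &= (numBuckets - 1);
    }
    uint64_t all = (numBuckets == 64) ? ~uint64_t{0}
                                      : ((uint64_t{1} << numBuckets) - 1);
    return seen == all;
}

static_assert(probe_visits_all(8, 3), "all 8 slots are reached");
static_assert(probe_visits_all(16, 0), "all 16 slots are reached");
```

This is why the array size is kept at a power of two: the masking stays cheap and the probe sequence cannot get stuck cycling over a subset of the slots.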

The code that inserts an object's reference count into a bucket is as follows:

BucketT *InsertIntoBucketImpl(const KeyT &Key, BucketT *TheBucket) {
    unsigned NewNumEntries = getNumEntries() + 1; // Buckets in use, plus one
    unsigned NumBuckets = getNumBuckets(); // Total number of buckets
    if (NewNumEntries*4 >= NumBuckets*3) { // Usage reaches 3/4
      this->grow(NumBuckets * 2); // Request twice the size; grow decides the exact value
      // grow repositions every bucket, so the slot for the new object must be looked up again
      LookupBucketFor(Key, TheBucket);
      NumBuckets = getNumBuckets(); // Read the updated array size
    }
    // If empty buckets make up 1/8 or less, probing has trouble reaching an empty bucket
    if (NumBuckets-(NewNumEntries+getNumTombstones()) <= NumBuckets/8) {
      // grow reallocates at the same size, repositions the buckets, and clears the tombstones
      this->grow(NumBuckets);
      LookupBucketFor(Key, TheBucket); // Look up the slot again after the rehash
    }
    assert(TheBucket);
    // The bucket found is marked EmptyKey and can be used directly
    if (KeyInfoT::isEqual(TheBucket->first, getEmptyKey())) {
      incrementNumEntries(); // Buckets in use +1
    }
    else if (KeyInfoT::isEqual(TheBucket->first, getTombstoneKey())) { // A tombstone was found
      incrementNumEntries(); // Buckets in use +1
      decrementNumTombstones(); // Tombstones -1
    }
    else if (ZeroValuesArePurgeable  &&  TheBucket->second == 0) { // The slot found has value 0
      TheBucket->second.~ValueT(); // In testing this statement was skipped and the value stayed 0
    } else {
      // Otherwise the entry count does not change (the official comment is "Updating an existing entry.")
    }
    return TheBucket;
  }

2. The weak part: weak_table_t

weak_table_t is the hash table inside the SideTable structure that stores the weak reference pointers to objects. It is the core data structure behind the weak feature.
First, let's look at the source code of the weak_table_t structure:

struct weak_table_t {
    weak_entry_t *weak_entries; // Head pointer of a contiguous array
    size_t    num_entries; // Number of occupied positions in the array
    uintptr_t mask; // Largest array index (array size - 1)
    uintptr_t max_hash_displacement; // Maximum hash displacement
};

weak_table is a hash table. The hash value is computed from the address of the object that the weak pointers point to. When two objects hash to the same index, the later one probes forward, index + 1 at a time, for a free position. This is typical closed hashing (open addressing with linear probing). The maximum hash displacement is the largest distance, over all objects, between the index computed from the hash and the position actually used, so it can serve as the upper bound of the loop during lookup.
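A small sketch shows how the maximum displacement bounds a lookup. MiniWeakTable and its hash function are hypothetical stand-ins (the real table stores weak_entry_t values and uses hash_pointer); keys are plain integers and 0 marks an empty slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical linear-probing table that tracks its maximum displacement.
struct MiniWeakTable {
    std::vector<uintptr_t> keys;   // 0 marks an empty slot
    std::size_t mask;              // table size - 1, size is a power of two
    std::size_t maxDisplacement = 0;

    explicit MiniWeakTable(std::size_t sizePow2)
        : keys(sizePow2, 0), mask(sizePow2 - 1) {}

    // Stand-in for hash_pointer(): drop the low alignment bits.
    static std::size_t hash(uintptr_t key) { return key >> 4; }

    // Insert a nonzero key. The caller must ensure the table is not full.
    void insert(uintptr_t key) {
        std::size_t index = hash(key) & mask;
        std::size_t displacement = 0;
        while (keys[index] != 0) {
            index = (index + 1) & mask;   // probe the next slot
            ++displacement;
        }
        keys[index] = key;
        if (displacement > maxDisplacement) maxDisplacement = displacement;
    }

    // Returns the slot index of key, or -1. The loop gives up as soon as it
    // has probed further than any stored key was ever displaced.
    long find(uintptr_t key) const {
        std::size_t index = hash(key) & mask;
        std::size_t displacement = 0;
        while (keys[index] != key) {
            index = (index + 1) & mask;
            if (++displacement > maxDisplacement) return -1;
        }
        return static_cast<long>(index);
    }
};
```

Three keys that collide at index 1 end up at indices 1, 2, and 3, so the maximum displacement becomes 2, and a lookup for an absent key stops after three probes instead of scanning the whole table.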

[Figure: weak_table structure diagram]

2.1 Members of weak_entry_t

struct weak_entry_t {
    DisguisedPtr<objc_object> referent; // Address of the object
    union {
        // Another union; Apple's data structures really are well designed
        struct {
            // This stores an array of weak pointers, so Apple again uses a hash table
            weak_referrer_t *referrers; // Array of weak pointers that point to referent
            uintptr_t        out_of_line_ness : 2; // Marks whether the inline limit was exceeded (see below)
            uintptr_t        num_refs : PTR_MINUS_2; // Number of occupied positions in the array
            uintptr_t        mask; // Largest array index (array size - 1)
            uintptr_t        max_hash_displacement; // Maximum hash displacement
        };
        struct {
            // The array of inline references
            weak_referrer_t  inline_referrers[WEAK_INLINE_COUNT]; // The macro's value is 4
        };
    };
    // Assignment copies other's memory into this with memcpy rather than
    // going through a copy constructor, presumably for efficiency.
    weak_entry_t& operator=(const weak_entry_t& other) {
        memcpy(this, &other, sizeof(other));
        return *this;
    }

    // true: the weak_referrer_t values live in the referrers hash array;
    // false: they live in the inline_referrers array
    bool out_of_line() {
        return (out_of_line_ness == REFERRERS_OUT_OF_LINE);
    }

    // Constructor.
    // newReferent is a pointer to the original object;
    // newReferrer is a pointer to the weak variable that points to newReferent.
    // The initializer referent(newReferent) calls the constructor
    // DisguisedPtr(T* ptr) : value(disguise(ptr)) { }, which uses disguise()
    // to convert newReferent into an integer stored in value.
    weak_entry_t(objc_object *newReferent, objc_object **newReferrer)
        : referent(newReferent)
    {
        // Store newReferrer in slot 0; this also goes through the DisguisedPtr
        // constructor, which stores newReferrer as an integer
        inline_referrers[0] = newReferrer;
        // Set the remaining 3 slots of inline_referrers to nil
        for (int i = 1; i < WEAK_INLINE_COUNT; i++) {
            inline_referrers[i] = nil;
        }
    }
};

Given the address of an object, we can find the corresponding weak_entry_t in weak_table_t. The weak_entry_t stores all the weak pointers that point to this object.

Apple uses another union inside weak_entry_t. In the first structure, out_of_line_ness occupies 2 bits and num_refs occupies 62 bits in a 64-bit environment, so both structures are 32 bytes and share the same address. When no more than 4 weak pointers point to the object, the inline_referrers array is used directly, which avoids hashing altogether. Once the number of weak pointers exceeds 4, the hash table in the first structure has to be used.
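The size claim can be verified with a stripped-down copy of the layout. This is a sketch under assumptions: the type names OutOfLine, Inline, and Storage are hypothetical, plain void ** replaces weak_referrer_t, and the result relies on the bit-field packing used by common compilers such as Clang and GCC.

```cpp
#include <cstddef>
#include <cstdint>

constexpr int kInlineCount = 4;                    // WEAK_INLINE_COUNT
constexpr int kPtrBits = sizeof(uintptr_t) * 8;    // 64 on a 64-bit target

// Out-of-line form: pointer to a hashed array plus bookkeeping fields.
struct OutOfLine {
    void    **referrers;
    uintptr_t out_of_line_ness : 2;                // 2 flag bits
    uintptr_t num_refs : kPtrBits - 2;             // remaining bits of the word
    uintptr_t mask;
    uintptr_t max_hash_displacement;
};

// Inline form: four slots stored directly in the entry.
struct Inline {
    void **inline_referrers[kInlineCount];
};

// Both forms share the same storage.
union Storage {
    OutOfLine out;
    Inline    in;
};

// The two bit-fields pack into one word, so both forms are four words long.
static_assert(sizeof(OutOfLine) == sizeof(Inline), "forms have equal size");
static_assert(sizeof(Storage) == kInlineCount * sizeof(void *), "4 words");
```

On a 64-bit target four words are 32 bytes, matching the figure above. In the runtime, out_of_line_ness overlaps the low two bits of inline_referrers[1], and the value REFERRERS_OUT_OF_LINE is chosen so that it cannot be mistaken for a stored inline pointer.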

2.2 The general logic of weak_table

  • Under ARC, the compiler automatically adds the code that manages reference counting. When a weak pointer is assigned, the compiler-generated call ends up in storeWeak, which performs the assignment. If the weak pointer already points to an object, storeWeak first calls weak_unregister_no_lock() to remove the pointer from that object's weak table, and then calls weak_register_no_lock() to insert the pointer into the weak table of the new object.
  • During lookup, the address of the target object is hashed to find the corresponding SideTable in SideTables(). The same object address is then used within that SideTable's weak_table to find the corresponding weak_entry_t, and all further operations act on that weak_entry_t.
    If the object has no more than 4 weak pointers, the inline_referrers array is used directly; otherwise memory is allocated for the referrers array and a hash algorithm manages the table.
  • When an old weak pointer is removed, the address of the object it pointed to is used to find the corresponding weak_entry_t, and the weak pointer is deleted from it. If the weak pointer array is empty after the deletion, the array is destroyed and the position of the weak_entry_t is cleared. Note that the weak_unregister_no_lock code in section 3.6 does not touch the weakly-referenced flag in the isa of the object.
  • When a new weak pointer is added, it is inserted into the referrers array of the matching weak_entry_t if one is found. If none is found, a new weak_entry_t is created, configured, and inserted into weak_table_t.
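The flow in the list above can be modelled with standard containers. The sketch below is a deliberately simplified model, not the runtime's code: WeakRegistry, Object, and the method names are hypothetical, a std::unordered_map replaces weak_table_t, and there is no locking.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Object { int payload; };

// Hypothetical model of the weak table: object -> addresses of weak pointers.
class WeakRegistry {
    std::unordered_map<Object *, std::vector<Object **>> table_;
public:
    // Model of storeWeak: unregister from the old object, register with the new.
    void store_weak(Object **location, Object *newObj) {
        if (Object *old = *location) unregister_weak(old, location);
        if (newObj) table_[newObj].push_back(location);
        *location = newObj;
    }
    // Model of weak_unregister_no_lock: drop one referrer, then the empty entry.
    void unregister_weak(Object *obj, Object **location) {
        auto it = table_.find(obj);
        if (it == table_.end()) return;
        auto &refs = it->second;
        refs.erase(std::remove(refs.begin(), refs.end(), location), refs.end());
        if (refs.empty()) table_.erase(it);
    }
    // What happens when obj goes away: every weak pointer to it becomes null.
    void clear_weak(Object *obj) {
        auto it = table_.find(obj);
        if (it == table_.end()) return;
        for (Object **loc : it->second) *loc = nullptr;
        table_.erase(it);
    }
    std::size_t entry_count() const { return table_.size(); }
};
```

Storing the address of each weak pointer, rather than its value, is what lets the table write nil back into every weak variable when the object is destroyed.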

3. Important implementation methods of weak

3.1 objc_initWeak function

The main job of objc_initWeak is to initialize a pointer declared __weak from the object newObj that is passed in, handling invalid objects and applying some performance optimizations.

id objc_initWeak(id *location, id newObj) {
    // Check whether the object is valid; for an invalid object the pointer is set to nil
    if (!newObj) {
        *location = nil;
        return nil;
    }
    // Three bool values are passed here.
    // Passing constants as template arguments is a performance optimization.
    return storeWeak<false/*old*/, true/*new*/, true/*crash*/>
    (location, (objc_object*)newObj);
}

Next, let's look at what the two parameters passed to objc_initWeak() represent:

  • location: the address of the __weak pointer. The address of the pointer is stored so that the pointer can be set to nil once the object it points to goes away.
  • newObj: the referenced object, that is, p in the example.

The function works as follows:

  • First, it checks whether newObj is valid. If newObj is invalid (i.e. nil), it sets the __weak pointer that location points to to nil and returns nil immediately.
  • If newObj is a valid object, it calls storeWeak to perform the actual weak reference initialization.
  • storeWeak is a low-level internal function that stores newObj into the memory that location points to, sets the flag bit, and applies some performance optimizations.

objc_initWeak has a precondition: object (the location argument) must be a valid pointer that has not yet been registered as a __weak object, and value (the newObj argument) is either nil or a pointer to a valid object.

3.2 objc_storeWeak()

The following code is the storeWeak function template that the Objective-C runtime uses to implement weak references. It updates what a weak pointer points to and handles contention between threads.

// HaveOld:  true - the variable has an existing value that must be cleaned up; it may be nil
//          false - there is no existing value to clean up
// HaveNew:  true - there is a new value to assign; it may be nil
//          false - no new value needs to be assigned
// CrashIfDeallocating: true - halt the process if newObj is deallocating or does not support weak references
//          false - store nil instead
template <bool HaveOld, bool HaveNew, bool CrashIfDeallocating>
static id storeWeak(id *location, objc_object *newObj) {
    // This function updates what the weak pointer points to
    // Initialize the previouslyInitializedClass pointer
    Class previouslyInitializedClass = nil;
    id oldObj;
    // Declare two SideTables
    // (1) Tables for the old and the new value
    SideTable *oldTable;
    SideTable *newTable;
    // Acquire the locks for the old and new values, ordered by lock address
    // to avoid lock-ordering problems.
    // Retry if the old value changes underneath us.
  retry:
    if (HaveOld) {
        // Read the current pointer and get the table indexed by oldObj
        oldObj = *location;
        oldTable = &SideTables()[oldObj];
    } else {
        oldTable = nil;
    }
    if (HaveNew) {
        // Get the table indexed by newObj
        newTable = &SideTables()[newObj];
    } else {
        newTable = nil;
    }
    // Lock to prevent races between threads
    SideTable::lockTwo<HaveOld, HaveNew>(oldTable, newTable);
    // Guard against a concurrent modification:
    // *location should still equal oldObj. If it does not, another thread
    // changed location after oldObj was read.
    if (HaveOld  &&  *location != oldObj) {
        SideTable::unlockTwo<HaveOld, HaveNew>(oldTable, newTable);
        goto retry;
    }
    // Prevent a deadlock between the weak reference machinery and +initialize
    // by making sure no weakly referenced object has an un-+initialized isa
    if (HaveNew  &&  newObj) {
        // Get the class from the new object's isa
        Class cls = newObj->getIsa();
        // The class is not the one initialized on a previous pass and is not yet initialized
        if (cls != previouslyInitializedClass  &&  
            !((objc_class *)cls)->isInitialized()) {
            // Unlock
            SideTable::unlockTwo<HaveOld, HaveNew>(oldTable, newTable);
            // Run +initialize for the class
            _class_initialize(_class_getNonMetaClass(cls, (id)newObj));
            // If the class has finished +initialize, all is well.
            // If +initialize is still running on this thread
            // (for example, +initialize itself called storeWeak),
            // the check above would still see the class as uninitialized,
            // so record it in previouslyInitializedClass to recognize it on retry.
            previouslyInitializedClass = cls;
            // Try again
            goto retry;
        }
    }
    // (2) Clear the old value
    if (HaveOld) {
        weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
    }
    // (3) Assign the new value
    if (HaveNew) {
        newObj = (objc_object *)weak_register_no_lock(&newTable->weak_table, 
                                                      (id)newObj, location, 
                                                      CrashIfDeallocating);
        // weak_register_no_lock returns nil if the weak store should be rejected
        // Set the is-weakly-referenced bit in the refcount table
        if (newObj  &&  !newObj->isTaggedPointer()) {
            // Mark the object as weakly referenced
            newObj->setWeaklyReferenced_nolock();
        }
        // Do not set *location anywhere else; this is where the pointer is updated
        *location = (id)newObj;
    }
    else {
        // No new value, so the storage is not changed
    }
    SideTable::unlockTwo<HaveOld, HaveNew>(oldTable, newTable);
    return (id)newObj;
}

The main steps of this function are as follows:

  1. First, the function declares a few variables: previouslyInitializedClass marks a class that has already been initialized, oldObj holds the old object, and oldTable and newTable are the SideTables (the structures that store weak reference information) of the old and the new object.
  2. The retry label implements a retry mechanism for handling thread conflicts.
  3. Depending on the template parameters HaveOld and HaveNew, the function obtains the SideTable of the old object and of the new object, and locks both with SideTable::lockTwo<HaveOld, HaveNew> to prevent races between threads.
  4. Guarding against thread conflicts: after locking, the function checks whether *location still equals oldObj. If it does not, another thread modified location after oldObj was read, so the function unlocks and jumps back to the retry label to read the old object again.
  5. Preventing a deadlock between weak references and +initialize: if there is a new object and it is not nil, the function gets the class from the object's isa and checks whether that class has been initialized. If not, it unlocks, initializes the class, records it in previouslyInitializedClass, and jumps back to the retry label.
  6. Next, if HaveOld is set, the function clears the old value (unregisters the weak reference from the old object), and if HaveNew is set, it assigns the new value (registers the weak reference with the new object).
  7. If the new object is registered successfully, the weakly-referenced bit is set and the pointer at location is updated to the new object.
  8. Finally, the function unlocks both SideTables with SideTable::unlockTwo<HaveOld, HaveNew> and returns the pointer to the new object.
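Step 3 relies on taking two locks in a fixed order, so that two threads storing in opposite directions cannot deadlock. The sketch below shows one way to do that by ordering on address. It is an assumption-laden illustration: lock_two and TracingLock are hypothetical names, and the real lockTwo is specialized on HaveOld and HaveNew instead of testing for null.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical lock that only records the order in which it is acquired.
struct TracingLock {
    std::vector<const TracingLock *> *log;
    void lock()   { log->push_back(this); }
    void unlock() {}
};

// Lock two (possibly null, possibly identical) locks, lower address first.
template <typename Lock>
void lock_two(Lock *a, Lock *b) {
    if (a == b) {                     // same table: lock it only once
        if (a) a->lock();
        return;
    }
    if (std::less<Lock *>()(b, a)) std::swap(a, b);   // a is the lower address
    if (a) a->lock();
    if (b) b->lock();
}
```

Because the order depends only on the addresses, lock_two(x, y) and lock_two(y, x) acquire the locks in the same sequence. Any type with lock() and unlock(), such as std::mutex, can be used in place of TracingLock.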

3.3 weak_register_no_lock

id 
weak_register_no_lock(weak_table_t *weak_table, id referent_id, 
                      id *referrer_id, bool crashIfDeallocating)
{
    objc_object *referent = (objc_object *)referent_id;
    objc_object **referrer = (objc_object **)referrer_id;

    // If referent is nil or a tagged pointer, return immediately without doing anything
    if (!referent  ||  referent->isTaggedPointer()) return referent_id;

    // Make sure the referenced object is usable (not deallocating, and it supports weak references)
    bool deallocating;
    if (!referent->ISA()->hasCustomRR()) {
        deallocating = referent->rootIsDeallocating();
    }
    else {
        BOOL (*allowsWeakReference)(objc_object *, SEL) = 
            (BOOL(*)(objc_object *, SEL))
            object_getMethodImplementation((id)referent, 
                                           SEL_allowsWeakReference);
        if ((IMP)allowsWeakReference == _objc_msgForward) {
            return nil;
        }
        deallocating =
            ! (*allowsWeakReference)(referent, SEL_allowsWeakReference);
    }
    // An object that is deallocating cannot be weakly referenced
    if (deallocating) {
        if (crashIfDeallocating) {
            _objc_fatal("Cannot form weak reference to instance (%p) of "
                        "class %s. It is possible that this object was "
                        "over-released, or is in the process of deallocation.",
                        (void*)referent, object_getClassName((id)referent));
        } else {
            return nil;
        }
    }

    // now remember it and where it is being stored
    // Find the weak_entry for referent in weak_table and add referrer to it
    weak_entry_t *entry;
    if ((entry = weak_entry_for_referent(weak_table, referent))) { // An entry was found
        append_referrer(entry, referrer); 	// Insert referrer into the entry's reference array
    } 
    else { // Not found: create a new entry
        weak_entry_t new_entry(referent, referrer);  
        weak_grow_maybe(weak_table);
        weak_entry_insert(weak_table, &new_entry);
    }

    // Do not set *referrer. objc_storeWeak() requires that the 
    // value not change.

    return referent_id;
}

This code is the implementation of weak_register_no_lock, the function the Objective-C runtime uses to register a weak reference in the weak reference table. It adds a weak reference relationship to weak_table and records the weak pointer for an object.

The main steps and their effects are as follows:

  1. First, the function casts the incoming referent_id and referrer_id to objc_object pointer types and assigns them to the referent and referrer variables.
  2. Then it checks whether referent is nil or a tagged pointer. (A tagged pointer is an optimization that stores the value directly in the pointer itself, without a separate memory allocation, so no weak reference handling is needed.) In either case it returns immediately.
  3. Next, it checks whether the referenced object is usable, i.e. it is not being deallocated and it supports weak references. In Objective-C an object can decide whether it supports weak references by overriding allowsWeakReference. For classes with custom retain/release methods, the function therefore calls allowsWeakReference to ask the object.
  4. If referent is being deallocated (deallocating is true), the crashIfDeallocating parameter decides what happens next. If it is true, _objc_fatal halts the process; otherwise nil is returned.
  5. If referent is usable and supports weak references, the function continues with the following steps.
  6. The function checks whether weak_table already contains a weak_entry for referent. If so, referrer is appended to the reference array of that weak_entry. If not, a new weak_entry is created with referrer in its reference array and inserted into weak_table.
  7. Finally, the function returns referent_id, the pointer to the referenced object that was passed in.

3.4 weak_entry_for_referent

static weak_entry_t *
weak_entry_for_referent(weak_table_t *weak_table, objc_object *referent)
{
    assert(referent);

    weak_entry_t *weak_entries = weak_table->weak_entries;

    if (!weak_entries) return nil;

    size_t begin = hash_pointer(referent) & weak_table->mask;  // Masking with weak_table->mask keeps index within bounds
    size_t index = begin;
    size_t hash_displacement = 0;
    while (weak_table->weak_entries[index].referent != referent) {
        index = (index+1) & weak_table->mask;
        if (index == begin) bad_weak_table(weak_table->weak_entries); // Triggers the bad weak table crash
        hash_displacement++;
        if (hash_displacement > weak_table->max_hash_displacement) { // Probed further than any stored entry was displaced, so the element is not in the table
            return nil;
        }
    }
    
    return &weak_table->weak_entries[index];
}

weak_entry_for_referent looks up the weak_entry_t entry for a given referenced object, referent, in the weak reference table.

3.5 append_referrer

static void append_referrer(weak_entry_t *entry, objc_object **new_referrer)
{
    if (! entry->out_of_line()) { // The entry does not use the dynamic array yet
        // Try to insert inline.
        for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
            if (entry->inline_referrers[i] == nil) {
                entry->inline_referrers[i] = new_referrer;
                return;
            }
        }
        
        // inline_referrers is full, so switch to referrers, the dynamic array.
        // Couldn't insert inline. Allocate out of line.
        weak_referrer_t *new_referrers = (weak_referrer_t *)
            calloc(WEAK_INLINE_COUNT, sizeof(weak_referrer_t));
        // This constructed table is invalid, but grow_refs_and_insert
        // will fix it and rehash it.
        for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
            new_referrers[i] = entry->inline_referrers[i];
        }
        entry->referrers = new_referrers;
        entry->num_refs = WEAK_INLINE_COUNT;
        entry->out_of_line_ness = REFERRERS_OUT_OF_LINE;
        entry->mask = WEAK_INLINE_COUNT-1;
        entry->max_hash_displacement = 0;
    }

    // Handling for the dynamic array:
    assert(entry->out_of_line()); // The dynamic array must be in use at this point

    if (entry->num_refs >= TABLE_SIZE(entry) * 3/4) { // At least 3/4 full: double the array
        return grow_refs_and_insert(entry, new_referrer); // Grow, then insert
    }
    
    // No growth needed: insert directly into the weak_entry.
    // The weak_entry is a hash table with key w_hash_pointer(new_referrer) and value new_referrer.
    
    // Note that weak_entry_t uses the same hashing scheme as weak_table_t, and the same grow/shrink scheme.
    size_t begin = w_hash_pointer(new_referrer) & (entry->mask); // '& (entry->mask)' keeps begin below the array length
    size_t index = begin;  // Initial hash index
    size_t hash_displacement = 0;  // Counts hash collisions, i.e. how many times the index was shifted
    while (entry->referrers[index] != nil) {
        hash_displacement++;
        index = (index+1) & entry->mask;  // Move to the next position and try again. entry->mask always has the form 0b111, 0b1111, 0b11111, ... because the array doubles each time (8, 16, 32, ...) and mask is the array length - 1.
        if (index == begin) bad_weak_table(entry); // index == begin means the probe wrapped all the way around without finding a slot, so something is wrong.
    }
    if (hash_displacement > entry->max_hash_displacement) { // Record the largest displacement: a lookup needs at most max_hash_displacement probes to reach a referrer's slot
        entry->max_hash_displacement = hash_displacement;
    }
    // Store the referrer in the hash array and update num_refs
    weak_referrer_t &ref = entry->referrers[index];
    ref = new_referrer;
    entry->num_refs++;
}

This code first determines whether the entry is using the fixed-length inline array or the dynamic array. If the inline array is in use and has a free slot, the weak pointer address is simply stored there. If the inline array is full, its elements are first moved to a dynamic array, and the new address is then inserted into that dynamic array by hashing, with the array doubling once it is 3/4 full.
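The out-of-line insertion path can be sketched in isolation. This is a minimal sketch, not the runtime's code: ReferrerTable, its members, and the hash function are hypothetical stand-ins, and only the ideas taken from the listing above (power-of-two capacity, mask, linear probing, growth at 3/4 load) are meant to carry over.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the out-of-line part of weak_entry_t.
struct ReferrerTable {
    std::vector<void **> slots;        // power-of-two sized, nullptr means empty
    std::size_t count = 0;             // number of stored referrers
    std::size_t max_displacement = 0;  // longest probe sequence seen so far

    explicit ReferrerTable(std::size_t capacity = 4) : slots(capacity, nullptr) {
        // The mask trick below only works for power-of-two capacities.
        assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    }

    std::size_t mask() const { return slots.size() - 1; }

    // Illustrative hash only (an assumption); the runtime uses its own pointer hash.
    static std::size_t hash(void **p) {
        return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(p) >> 3);
    }

    void insert(void **referrer) {
        if (count >= slots.size() * 3 / 4) grow();  // same 3/4 threshold as the listing
        place(referrer);
    }

private:
    // Linear probing: start at hash & mask, step by one, wrap around with the mask.
    void place(void **referrer) {
        std::size_t begin = hash(referrer) & mask();
        std::size_t index = begin;
        std::size_t displacement = 0;
        while (slots[index] != nullptr) {
            ++displacement;
            index = (index + 1) & mask();
            assert(index != begin);  // cannot happen while the table is under 3/4 full
        }
        if (displacement > max_displacement) max_displacement = displacement;
        slots[index] = referrer;
        ++count;
    }

    // Double the capacity and rehash every stored referrer.
    void grow() {
        std::vector<void **> old;
        old.swap(slots);
        slots.assign(old.size() * 2, nullptr);
        count = 0;
        max_displacement = 0;
        for (void **r : old) {
            if (r != nullptr) place(r);
        }
    }
};
```

With a capacity of 4, the first three inserts fit; the fourth finds the table at the 3/4 threshold, so the capacity doubles to 8 before the new referrer is placed.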

Next, let's look at the method that removes a weak pointer, weak_unregister_no_lock. It is called when a weak pointer has to be cleared from its weak_entry_t, and it removes the old weak pointer address from that entry.

3.6 weak_unregister_no_lock

void
weak_unregister_no_lock(weak_table_t *weak_table, id referent_id, 
                        id *referrer_id)
{
    // Address of the object
    objc_object *referent = (objc_object *)referent_id;
    // Address of the weak pointer
    objc_object **referrer = (objc_object **)referrer_id;

    weak_entry_t *entry;

    if (!referent) return;

    if ((entry = weak_entry_for_referent(weak_table, referent))) {
        // Found the weak_entry_t that belongs to referent
        remove_referrer(entry, referrer);  // Remove referrer from the hash array of that weak_entry_t
       
        // After the removal, check whether the hash array of the weak_entry_t is now empty
        bool empty = true;
        if (entry->out_of_line()  &&  entry->num_refs != 0) {
            empty = false;
        }
        else {
            for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
                if (entry->inline_referrers[i]) {
                    empty = false; 
                    break;
                }
            }
        }

        if (empty) {
            // The hash array of the weak_entry_t is empty, so remove the weak_entry_t from weak_table
            weak_entry_remove(weak_table, entry);
        }
    }
    }
    // Do not set *referrer = nil. objc_storeWeak() requires that the 
    // value not change.
}

Approximate process:

  • First, find the weak_entry_t that corresponds to the referent in the weak_table.
  • Remove the referrer from that weak_entry_t.
  • Check whether the weak_entry_t still contains any elements (empty == true?).
  • If the weak_entry_t has no elements left, remove it from the weak_table.
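The unregister flow can be modelled with standard containers. This is a sketch under simplifying assumptions: WeakTableSketch, weak_register_sketch and weak_unregister_sketch are hypothetical names, and a std::unordered_map of vectors replaces the runtime's hand-written hash tables.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical model: object address -> addresses of the weak variables pointing at it.
using WeakTableSketch = std::unordered_map<void *, std::vector<void **>>;

void weak_register_sketch(WeakTableSketch &table, void *referent, void **referrer) {
    assert(referrer != nullptr);
    if (referent == nullptr) return;
    table[referent].push_back(referrer);
}

void weak_unregister_sketch(WeakTableSketch &table, void *referent, void **referrer) {
    if (referent == nullptr) return;

    // Step 1: find the entry that belongs to the referent.
    auto it = table.find(referent);
    if (it == table.end()) return;

    // Step 2: remove this referrer from the entry.
    std::vector<void **> &refs = it->second;
    refs.erase(std::remove(refs.begin(), refs.end(), referrer), refs.end());

    // Steps 3 and 4: if the entry is now empty, drop it from the table.
    if (refs.empty()) table.erase(it);

    // As in the runtime, *referrer itself is deliberately left unchanged here.
}
```

Unregistering one of two weak variables keeps the entry alive; unregistering the last one removes the entry from the table, while the variables' own contents are never modified by this function.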

4. Deallocation

4.1 rootDealloc

When the reference count of the object is 0, the bottom layer will call the _objc_rootDealloc method to release the object, and the rootDealloc method will be called in the _objc_rootDealloc method. The following is the code implementation of the rootDealloc method:

inline void
objc_object::rootDealloc()
{
    if (isTaggedPointer()) return;  // fixme necessary?

    if (fastpath(isa.nonpointer  &&  
                 !isa.weakly_referenced  &&  
                 !isa.has_assoc  &&  
                 !isa.has_cxx_dtor  &&  
                 !isa.has_sidetable_rc))
    {
        assert(!sidetable_present());
        free(this);
    } 
    else {
        object_dispose((id)this);
    }
}

Approximate process:

  • First, check whether the object is a Tagged Pointer; if so, return immediately.
  • If the object uses the optimized (non-pointer) isa, is not weakly referenced (!isa.weakly_referenced), has no associated objects (!isa.has_assoc), has no C++ destructor (!isa.has_cxx_dtor), and does not use the SideTable for reference counting (!isa.has_sidetable_rc), it is released quickly with a direct call to free.
  • If the conditions in the previous step are not all met, the object_dispose method is called.
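The decision made by rootDealloc can be restated as a small pure function. This is only an illustrative sketch: IsaFlagsSketch, DeallocPath and choose_dealloc_path are hypothetical names, and the struct merely mirrors the isa bits tested in the listing above.

```cpp
#include <cassert>

// Hypothetical flag set mirroring the isa bits that rootDealloc tests.
struct IsaFlagsSketch {
    bool nonpointer;
    bool weakly_referenced;
    bool has_assoc;
    bool has_cxx_dtor;
    bool has_sidetable_rc;
};

enum class DeallocPath { TaggedPointerNoop, FastFree, ObjectDispose };

DeallocPath choose_dealloc_path(bool is_tagged_pointer, const IsaFlagsSketch &isa) {
    // Tagged pointers carry no heap allocation, so there is nothing to release.
    if (is_tagged_pointer) return DeallocPath::TaggedPointerNoop;

    // Any of these bits means extra cleanup is required before freeing.
    bool needs_cleanup = isa.weakly_referenced || isa.has_assoc ||
                         isa.has_cxx_dtor || isa.has_sidetable_rc;

    // Only an optimized isa with no cleanup work qualifies for the direct free().
    if (isa.nonpointer && !needs_cleanup) return DeallocPath::FastFree;

    return DeallocPath::ObjectDispose;
}
```

An object that has ever been weakly referenced therefore never takes the fast path, which is what gives the runtime the chance to nil out its weak pointers.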

4.2 object_dispose

void *objc_destructInstance(id obj)
{
    if (obj) {
        // Read all of the flags at once for performance.
        bool cxx = obj->hasCxxDtor();
        bool assoc = obj->hasAssociatedObjects();

        // This order is important.
        if (cxx) object_cxxDestruct(obj);
        if (assoc) _object_remove_associations(obj, /*deallocating*/true);
        obj->clearDeallocating();
    }

    return obj;
}

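object_dispose calls objc_destructInstance, shown above, and then frees the object's memory. Inside objc_destructInstance, if the object has a C++ destructor, that destructor is called first. If the object has associated objects, they are removed, and the object is removed from the Association Manager's map. Finally, the clearDeallocating method is called to clear the remaining references to the object.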

4.3 clearDeallocating

inline void 
objc_object::clearDeallocating()
{
    if (slowpath(!isa.nonpointer)) {
        // Slow path for raw pointer isa.
        sidetable_clearDeallocating();
    }
    else if (slowpath(isa.weakly_referenced  ||  isa.has_sidetable_rc)) {
        // Slow path for non-pointer isa with weak refs and/or side table data.
        clearDeallocating_slow();
    }

    assert(!sidetable_present());
}

There are two branches in clearDeallocating. It first checks whether the object uses the optimized isa for reference counting. If not, it calls sidetable_clearDeallocating to clean up the reference count data stored for the object in the SideTable. If the object does use the optimized isa, it checks whether the SideTable is used as an auxiliary reference count (isa.has_sidetable_rc) or whether the object is weakly referenced (isa.weakly_referenced). If either condition holds, it calls the clearDeallocating_slow method.
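The branch selection described here is compact enough to restate as a function. This is a sketch with hypothetical names (ClearPath, choose_clear_path); it models only which cleanup routine is chosen, not the cleanup itself.

```cpp
#include <cassert>  // used by callers that check the result

enum class ClearPath {
    SidetableClear,  // corresponds to sidetable_clearDeallocating()
    SlowClear,       // corresponds to clearDeallocating_slow()
    NothingToClear   // optimized isa with no weak refs and no side table count
};

ClearPath choose_clear_path(bool nonpointer, bool weakly_referenced, bool has_sidetable_rc) {
    // Raw pointer isa: all bookkeeping lives in the side table.
    if (!nonpointer) return ClearPath::SidetableClear;

    // Optimized isa: the side table is only involved if one of these bits is set.
    if (weakly_referenced || has_sidetable_rc) return ClearPath::SlowClear;

    return ClearPath::NothingToClear;
}
```

The third outcome is the common case for short-lived objects with an optimized isa: neither branch runs, and no side table lock is taken.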

4.4 sidetable_clearDeallocating

void 
objc_object::sidetable_clearDeallocating()
{
    SideTable& table = SideTables()[this];

    // clear any weak table items
    // clear extra retain count and deallocating bit
    // (fixme warn or abort if extra retain count == 0 ?)
    table.lock();
    RefcountMap::iterator it = table.refcnts.find(this);
    if (it != table.refcnts.end()) {
        if (it->second & SIDE_TABLE_WEAKLY_REFERENCED) {
            weak_clear_no_lock(&table.weak_table, (id)this);
        }
        table.refcnts.erase(it);
    }
    table.unlock();
}

4.5 clearDeallocating_slow

NEVER_INLINE void
objc_object::clearDeallocating_slow()
{
    assert(isa.nonpointer  &&  (isa.weakly_referenced || isa.has_sidetable_rc));

    SideTable& table = SideTables()[this]; // Look up the SideTable for this object in the global SideTables, using the this pointer as the key
    table.lock();
    if (isa.weakly_referenced) {
        // The object is weakly referenced
        weak_clear_no_lock(&table.weak_table, (id)this); // Clean up this object's entry in the SideTable's weak_table
    }
    if (isa.has_sidetable_rc) {
        // The SideTable is used for reference counting
        table.refcnts.erase(this); // Remove this object from the SideTable's reference counts
    }
    table.unlock();
}

4.6 weak_clear_no_lock

void 
weak_clear_no_lock(weak_table_t *weak_table, id referent_id) 
{
    objc_object *referent = (objc_object *)referent_id;

    weak_entry_t *entry = weak_entry_for_referent(weak_table, referent); // Find the weak_entry_t for referent in weak_table
    if (entry == nil) {
        /// XXX shouldn't happen, but does with mismatched CF/objc
        //printf("XXX no entry for clear deallocating %p\n", referent);
        return;
    }

    // zero out references
    weak_referrer_t *referrers;
    size_t count;
    
    // Get the array of weak pointer addresses that reference referent, and the array length
    if (entry->out_of_line()) {
        referrers = entry->referrers;
        count = TABLE_SIZE(entry);
    } 
    else {
        referrers = entry->inline_referrers;
        count = WEAK_INLINE_COUNT;
    }
    
    for (size_t i = 0; i < count; ++i) {
        objc_object **referrer = referrers[i]; // Address of each weak pointer
        if (referrer) {
            if (*referrer == referent) {
                // The weak pointer really does reference referent, so set it to nil. This is why weak pointers are automatically set to nil.
                *referrer = nil;
            }
            else if (*referrer) {
                // The stored weak pointer does not reference referent. This is probably caused by incorrect use of the runtime functions, so report an error.
                _objc_inform("__weak variable at %p holds %p instead of %p. "
                             "This is probably incorrect use of "
                             "objc_storeWeak() and objc_loadWeak(). "
                             "Break on objc_weak_error to debug.\n", 
                             referrer, (void*)*referrer, (void*)referent);
                objc_weak_error();
            }
        }
    }
    
    weak_entry_remove(weak_table, entry); // referent is about to be freed, so its weak_entry_t must also be removed from weak_table
}

Finally, let’s look at how to destroy weak pointers:

void
objc_destroyWeak(id *location)
{
    (void)storeWeak<DoHaveOld, DontHaveNew, DontCrashIfDeallocating>
        (location, nil);
}

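Here storeWeak is called with an old value but no new object. If the weak pointer originally pointed to an object, storeWeak calls weak_unregister_no_lock to remove the old weak pointer address from that object's entry. Because there is no new value to store, the storage at location itself is left unchanged, as the comment at the end of weak_unregister_no_lock also notes.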

Summary

  • weak is implemented by a hash table of type weak_table_t that the runtime maintains: the key is the address of the referenced object, and the value is an array of the addresses of the weak pointers.
  • The weak keyword declares a weak reference: the reference count of the referenced object is not incremented, and the pointer is automatically set to nil when the referenced object is released.
  • When the object is released, clearDeallocating is called. It uses the object's address to obtain the array of all weak pointer addresses, traverses the array and sets each weak pointer to nil, then deletes the entry from the weak table, and finally cleans up the object's records.
  • The article introduces three structures: SideTable, weak_table_t, and weak_entry_t.
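The zeroing behaviour summarized above can be demonstrated with a toy model. This is a sketch, not the runtime's data layout: ZeroingWeakTable, store_weak_sketch and clear_on_dealloc_sketch are hypothetical names, and a std::unordered_map stands in for weak_table_t and weak_entry_t.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical model of the weak table: key = object address,
// value = addresses of the weak variables that point at the object.
using ZeroingWeakTable = std::unordered_map<void *, std::vector<void **>>;

// Point a weak variable at an object and record the variable's address.
void store_weak_sketch(ZeroingWeakTable &table, void **location, void *object) {
    assert(location != nullptr);
    *location = object;
    if (object != nullptr) table[object].push_back(location);
}

// Called when the object is being deallocated: nil out every weak variable
// that still points at it, then drop the object's entry from the table.
void clear_on_dealloc_sketch(ZeroingWeakTable &table, void *object) {
    auto it = table.find(object);
    if (it == table.end()) return;
    for (void **referrer : it->second) {
        if (referrer != nullptr && *referrer == object) {
            *referrer = nullptr;
        }
    }
    table.erase(it);
}
```

After clear_on_dealloc_sketch runs for an object, every weak variable registered for it reads as null and the table no longer holds an entry for that address, which is the observable behaviour of a weak property in Objective-C.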

Origin blog.csdn.net/m0_63852285/article/details/131858578