【iOS】Property attribute keywords and the underlying principle of weak

Reference blog: IOS development foundation - attribute keywords (copy strong weak, etc.)

Let's take a look at the commonly used attribute keywords:

  • Keywords related to memory management: weak, assign, strong, retain, copy
  • Keywords related to thread safety: nonatomic, atomic
  • Keywords related to access rights: readonly, readwrite
  • Keywords that modify variables: const, static, extern

These are the keywords we use most often in daily development. Their usage and function are explained in detail below.

Keywords related to memory management: (weak, assign, strong, retain, copy)

keyword weak

weak is used for properties of Objective-C object type. It is a weak reference: it does not keep the object alive, and after the referenced object is released the pointer is automatically set to nil.

Note: Under ARC, the delegate property is usually declared weak to avoid circular references; under MRC it is declared assign. An object is released as soon as no strong pointer points to it, even if weak pointers still point to it, and those weak pointers are then cleared.

keyword assign

assign is normally used for non-pointer variables: basic data types (such as NSInteger) and C data types (int, float, double, char, etc.), as well as the id type. The setter performs a plain assignment and does not change the reference count. assign can also be used on object properties, but after an object referenced through assign is released, the pointer keeps its old address: it is not set to nil and becomes a wild pointer.

Note: Basic data types are safe with assign because the value is stored directly in the variable (generally on the stack, whose memory the system reclaims automatically), so there is no separately released object for a pointer to outlive and no wild pointer can result. Also, under MRC an id delegate property is usually declared assign rather than retain, to prevent a circular reference between the two ends of the delegation. For example: object A takes ownership of object B through retain, and B's delegate is A. If the delegate were also retained, both references would be strong, the two objects would hold each other, and there would be basically no chance to release either of them.

The difference between weak and assign:

  • What they modify: weak is used for Objective-C object types; assign is mainly used for non-pointer variables.
  • Reference count: neither weak nor assign increases the reference count.
  • Release: after an object referenced by weak is released, the pointer is automatically set to nil; after an object referenced by assign is released, the pointer keeps the old address and becomes a wild pointer.
  • delegate: declared assign under MRC and weak under ARC.
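
As a rough analogy outside Objective-C, the difference can be sketched in C++. Here std::weak_ptr stands in for a weak property (it does not add to the owner count and reports that the object is gone), and a plain stored address stands in for assign (nothing ever clears it). The names Delegate, weak_reference_is_cleared and assign_style_address_is_not_cleared are hypothetical, and this is only an illustration, not how the Objective-C runtime implements weak.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical stand-in for an Objective-C object.
struct Delegate { int tag = 7; };

// weak-style reference: does not raise the owner count and reports
// "no object" once the last strong reference is gone.
inline bool weak_reference_is_cleared() {
    auto strong = std::make_shared<Delegate>();   // one strong owner
    std::weak_ptr<Delegate> weak = strong;        // owner count stays at 1
    bool countUnchanged = (strong.use_count() == 1);
    strong.reset();                               // last strong owner goes away
    return countUnchanged && weak.expired() && weak.lock() == nullptr;
}

// assign-style reference: only the address is stored. It is kept as an
// integer here so the stale value can be inspected without touching
// freed memory; nothing resets it when the object is destroyed.
inline bool assign_style_address_is_not_cleared() {
    auto strong = std::make_shared<Delegate>();
    std::uintptr_t assigned = reinterpret_cast<std::uintptr_t>(strong.get());
    strong.reset();                               // object destroyed
    return assigned != 0;                         // old address still stored
}
```

Both functions return true: the weak-style reference notices that the object is gone, while the assign-style address is simply left behind.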

Keyword strong:

strong is used for Objective-C object types such as NSNumber, NSString, NSArray, NSDate, NSDictionary, and model classes. The property holds the object through a strong pointer, which is a strong reference. Under ARC it is equivalent to retain and is the opposite of weak. Assigning to a strong property is what we usually call a pointer copy (shallow copy): the memory address stays the same and no new object is created; only a new pointer to the object is added, so the new pointer and the original pointer refer to the same memory address.

Note: Because the same memory address is used, when the content stored at that address changes, the value seen through the property changes as well.

Keyword copy:

copy is also used for Objective-C object types. In the MRC (manual memory management) period it was additionally used for block properties, because a block has to be copied from the stack area to the heap area. Under ARC the system performs this operation automatically, so a block property can now be declared with either strong or copy. What copy and strong have in common is that both are strong references and both add one to the reference count of the object the property holds. The difference is that when a value is assigned to a copy property, the property stores a memory copy (deep copy): a new object at a new memory address is created, completely separate from the object that was assigned, so the property does not change when the original object changes.

The difference between copy and strong (deep copy shallow copy):

  • Shallow copy: pointer copy, the memory address remains the same, but the pointer address is different.
  • Deep copy: memory copy, different memory addresses, pointer addresses are also different.

Declare two copy attributes and two strong attributes, which are mutable and immutable types:

@property(nonatomic,strong)NSString * Strstrong;
@property(nonatomic,copy)NSString * Strcopy;
@property(nonatomic,copy)NSMutableString * MutableStrcopy;
@property(nonatomic,strong)NSMutableString * MutableStrstrong;

Assign values to the properties and print the results:

    NSString * OriginalStr = @"I have started testing";
    // Assigning an immutable object: with both strong and copy the original address is kept (the memory address is 0x10c6d75c0 in every case) and only a new pointer to the object is created (shallow copy)
    self.Strcopy = OriginalStr;
    self.Strstrong = OriginalStr;
    self.MutableStrcopy = OriginalStr;
    self.MutableStrstrong = OriginalStr;
    NSLog(@"\n Content: original=>%@\n normal:copy=>%@=====strong=>%@\n Mutable:copy=>%@=====strong=>%@",OriginalStr,_Strcopy,_Strstrong,_MutableStrcopy,_MutableStrstrong);
    NSLog(@"\n Memory address: original=>%p\n normal:copy=>%p=====strong=>%p\n Mutable:copy=>%p=====strong=>%p",OriginalStr,_Strcopy,_Strstrong,_MutableStrcopy,_MutableStrstrong);
    NSLog(@"\n Pointer address: original=>%p\n normal:copy=>%p=====strong=>%p\n Mutable:copy=>%p=====strong=>%p",&OriginalStr,&_Strcopy,&_Strstrong,&_MutableStrcopy,&_MutableStrstrong);

The printed result shows that when a strong property refers to an object, the memory address is the same and only the pointer address differs, and the same is true for the copy properties. Why? Isn't a copy property supposed to produce a new memory address? Why is the memory address still the same here?

Because when the assigned value is an immutable object, strong and copy behave the same: the original memory address is kept and only a new pointer is created.

Next we assign a mutable object to the properties:

NSMutableString * OriginalMutableStr = [NSMutableString stringWithFormat:@"I have started testing"];
self.Strcopy = OriginalMutableStr;
self.Strstrong = OriginalMutableStr;
self.MutableStrcopy = OriginalMutableStr;
self.MutableStrstrong = OriginalMutableStr;

The result shows that the memory address of the strong properties has not changed, while the memory address of the copy properties has. From this it can be concluded that:

When a mutable object is assigned, strong keeps the original address and adds 1 to the reference count (shallow copy), whereas copy creates a new object at a new address and stores a pointer to that new memory address (deep copy).

Now let's modify the value of OriginalMutableStr and see the result:

    [OriginalMutableStr appendFormat:@"changed"];

In the output, the strong properties change along with it.
Because OriginalMutableStr is a mutable type, the modification is made in place at the original memory address: neither the pointer address nor the memory address changes, only the data stored at that address. The strong properties have different pointer addresses, but their pointers still point to the original memory address, so they change together with OriginalMutableStr.

The copy properties differ from strong: not only is the pointer address different, the memory address they point to is also different from that of OriginalMutableStr, so they do not change when OriginalMutableStr changes.
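
The same behaviour can be modelled outside Objective-C. The following is a minimal C++ sketch under stated assumptions: Holder, setStrong, setCopy and assign_then_mutate are hypothetical names, std::shared_ptr stands in for a strong reference, and the copy setter always makes a new object (unlike NSString, which may return the same object when the source is immutable, as seen earlier).

```cpp
#include <memory>
#include <string>

// Hypothetical model: a "strong" property shares the object,
// a "copy" property stores its own copy made at assignment time.
struct Holder {
    std::shared_ptr<std::string> strongValue;  // like (nonatomic, strong)
    std::shared_ptr<std::string> copyValue;    // like (nonatomic, copy)

    void setStrong(const std::shared_ptr<std::string> &value) {
        strongValue = value;                   // pointer copy only
    }
    void setCopy(const std::shared_ptr<std::string> &value) {
        // The copy happens inside the setter, which is why bypassing
        // the setter would skip it.
        if (value) {
            copyValue = std::make_shared<std::string>(*value);
        } else {
            copyValue.reset();
        }
    }
};

// Assign one mutable string to both properties, then mutate the original.
inline Holder assign_then_mutate() {
    auto original = std::make_shared<std::string>("started");
    Holder holder;
    holder.setStrong(original);
    holder.setCopy(original);
    original->append(" changed");              // in-place modification
    return holder;
}
```

After assign_then_mutate(), strongValue reads "started changed" because it shares the original object, while copyValue still reads "started".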

Note:

  • Assigning through self.Strcopy and assigning directly to _Strcopy give different results, because the latter does not call the setter. The difference between copy and strong comes from the setter: for a copy property the setter performs _Strcopy = [Strcopy copy].
  • Copying itself comes in two forms, copy and mutableCopy, and they behave differently on container objects and non-container objects. This is analyzed below:

Multiple copy modes: copy and mutableCopy operate on container objects

When copying a container object (NSArray), there are several cases:

  • copy: only the pointer is copied (for an immutable NSArray).
  • mutableCopy: single-layer deep copy (content copy). "Single-layer" means that the NSArray object itself is deep-copied while the objects in the container are not processed (the NSArray object has a different memory address, but the memory addresses of its elements stay the same):
[arr mutableCopy];
  • Double-layer deep copy: "double-layer" means that both the NSArray object and the objects in the NSArray container are deep-copied. It is not a complete deep copy because it cannot handle an NSArray nested inside the NSArray. Use:
[[NSArray alloc] initWithArray:arr copyItems:YES];
  • Complete deep copy: also handles an NSArray nested inside an NSArray. Use archiving and unarchiving:
[NSKeyedUnarchiver unarchiveObjectWithData:[NSKeyedArchiver archivedDataWithRootObject:testArr]];

Keywords related to thread safety: (nonatomic, atomic)

  • Keyword nonatomic
    Non-atomic access: no lock is taken, so access is fast, but when multiple threads access the same property the result is unpredictable.
  • Keyword atomic
    Atomic access: a lock guarantees the thread safety of the getter and setter (only the getter and setter are locked).
    Because of the lock, an operation already in progress finishes before another thread can read or write the property.

For example:
Thread A calls the setter of a property, and thread B calls the getter before the setter has finished; thread B's getter then runs only after thread A's setter has completed. When several threads call the setter and getter of the same property at the same time, the getter always returns a valid value, but which value it returns cannot be controlled (because the order in which the threads run is not deterministic).

Note:
atomic only locks the property's getter and setter, so the safety it provides covers only those two methods, not thread safety as a whole, because a property is touched by more than its getter and setter. For example, if one thread is in the getter or setter while another thread releases the property's object and the release finishes first, the program crashes.

Here we use an example from when we learned the Tagged Pointer object:

    //@property (nonatomic, strong) NSString *name;

    dispatch_queue_t queue = dispatch_get_global_queue(0, 0);
    for (int i = 0; i < 1000; i++) {
        dispatch_async(queue, ^{
            self.name = [NSString stringWithFormat:@"addasdsaadss"];
        });
    }

This causes the program to crash.
Now let's change the keyword of the property and try again:

    //@property (atomic, strong) NSString *name;

    dispatch_queue_t queue = dispatch_get_global_queue(0, 0);
    for (int i = 0; i < 1000; i++) {
        dispatch_async(queue, ^{
            self.name = [NSString stringWithFormat:@"addasdsaadss"];
        });
    }

The program now runs normally, which shows that atomic effectively adds a lock around the setter.

Keywords to modify variables: (const, static, extern)

constant const

const is the constant modifier and means immutable. It can modify the basic variable or pointer variable to its right (whatever it is placed in front of is what it modifies: the basic data variable p or the pointer target *p).
Commonly used forms:
const type * variable name a: the pointer can be repointed, but the content it points to cannot be changed through it. const is placed in front of the *, so it constrains *a and *a is read-only. Only the address stored in a can be modified; the memory it refers to cannot be modified through a.

int x = 12;
int new_x = 21;
const int *px = &x; 
px = &new_x; // repoint px so that it now points to new_x

Type * const variable name a: the content the pointer points to can be changed, but the pointer itself cannot be repointed. const is placed after the *, so it constrains a itself and a is read-only. The address stored in a cannot be modified; only the value accessed through a can be modified.

int y = 12;
int new_y = 21;
int * const py = &y;
(*py) = new_y; // change the value of y, the variable py points to

The difference between constant (const) and macro definition (define):

When a macro is used to define a constant, macros and constants do not differ much in the memory they occupy. A constant is placed in the constant area, and only one copy exists in memory.
Disadvantages of macros:

  • Compilation time: a macro is handled by the preprocessor (before compilation), while const is handled at the compilation stage. With too many macro definitions, compilation becomes slower and slower as the project grows.
  • Macros are not checked: they are only text replacement and report no compilation errors on their own. const is compiled and type-checked, and compilation errors are reported.

Advantage of macros:

  • Macros can define some functions and methods; const cannot.
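
To make "just replace" concrete, here is a small C++ sketch. SQUARE_MACRO, square, kMaxCount, macro_result and function_result are hypothetical names chosen for illustration; the point is only that the macro argument is pasted in as text, whereas the typed alternatives are checked by the compiler.

```cpp
// Textual replacement: the argument is pasted in without parentheses.
#define SQUARE_MACRO(x) x * x

// A typed function: the argument is evaluated first, then squared.
constexpr int square(int x) { return x * x; }

// A typed constant: the compiler knows its type and rejects any
// attempt to assign to it.
static const int kMaxCount = 3;
static_assert(kMaxCount == 3, "checked at compile time");

// Expands to 1 + 2 * 1 + 2, so the precedence of * changes the result.
inline int macro_result() { return SQUARE_MACRO(1 + 2); }

// Evaluates 1 + 2 first and then squares it.
inline int function_result() { return square(1 + 2); }
```

Here macro_result() yields 5 while function_result() yields 9, which is the kind of surprise the compiler cannot flag for a macro.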

keyword static

The three functions of the static keyword:

  • On a local variable: the variable is stored in the static storage area.
  • On a global variable: the variable can only be accessed in the current source file.
  • On a function: the function can only be called in the current source file.

keyword extern

extern is only used to access the value of a global variable that is defined elsewhere (including global static variables visible in the current file); it cannot be used to define a variable. The current file is searched for the global variable first; if it is not found there, other files are searched.

static combined with const

This declares a static global read-only constant. Some global variables declared during development should not be changed by outside code and should only be readable.
In iOS, a common use of static together with const is to replace macros by defining a frequently used string constant as a static global read-only variable.

// Keys are often used in development to look up or set values, so key is marked const: key is read-only and cannot be modified.
static  NSString * const key = @"name";

// If const modifies *key1 instead, *key1 is read-only but key1 itself can still be changed.
static  NSString const *key1 = @"name";

__autoreleasing keyword

When it comes to this, I have to talk about autoreleasing again:

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        __autoreleasing id obj = [[NSObject alloc] init];
    }
    return 0;
}

Open the assembly and see: the @autoreleasepool{} keyword is converted by the compiler into the pair of functions objc_autoreleasePoolPush and objc_autoreleasePoolPop. The __autoreleasing modifier is converted into objc_autorelease, which adds obj to the autorelease pool.

The compiler's handling of the autorelease pool can be roughly divided into:

  • objc_autoreleasePoolPush is the first function called in the autorelease pool's scope.
  • objc_autorelease adds the object to the autorelease pool.
  • objc_autoreleasePoolPop is the last function called in the autorelease pool's scope.

Next, look at the implementation of objc_autoreleasePoolPush and objc_autoreleasePoolPop:

void *objc_autoreleasePoolPush(void) {
    // Calls the push method of AutoreleasePoolPage
    return AutoreleasePoolPage::push();
}

void objc_autoreleasePoolPop(void *ctxt) {
    // Calls the pop method of AutoreleasePoolPage
    AutoreleasePoolPage::pop(ctxt);
}

Let's analyze the implementation of AutoreleasePoolPage

AutoreleasePoolPage

class AutoreleasePoolPage {
#   define EMPTY_POOL_PLACEHOLDER ((id*)1)
    // Sentinel object
#   define POOL_BOUNDARY nil
    static pthread_key_t const key = AUTORELEASE_POOL_KEY;
    static uint8_t const SCRIBBLE = 0xA3;  // 0xA3A3A3A3 after releasing
    // Size of an AutoreleasePoolPage; the macro definition shows it is 4096 bytes
    static size_t const SIZE = 
#if PROTECT_AUTORELEASEPOOL
        PAGE_MAX_SIZE;  // must be multiple of vm page size
#else
        PAGE_MAX_SIZE;  // size and alignment, power of 2
        //4096
#endif
    static size_t const COUNT = SIZE / sizeof(id);

    // Integrity check of the current AutoreleasePoolPage; a flag used to tell whether the object has finished initializing
    magic_t const magic;

    // Points to where the next autoreleased object will be stored (next == begin() means the page is empty; next == end() means the page is full)
    id *next;

    // The current thread; shows that a page is tied to a thread
    pthread_t const thread;

    // Points to the parent node; the parent of the first node is nil
    AutoreleasePoolPage * const parent;

    // Points to the child node; the child of the last node is nil
    AutoreleasePoolPage *child;

    // Depth: the first page has depth 0, and depth increases by 1 with each further page
    uint32_t const depth;

    // The high water mark
    uint32_t hiwat;
};

general flow

  • When the @autoreleasepool scope is entered, objc_autoreleasePoolPush is called. The runtime adds a nil object to the current AutoreleasePoolPage as a sentinel object and returns the address of that sentinel object.
  • When an object calls autorelease, it is added to the corresponding AutoreleasePoolPage. The next pointer works like a cursor that keeps moving and records the position. If the added objects exceed the size of one page, a new page is added automatically.
  • When the @autoreleasepool scope is left, objc_autoreleasePoolPop(sentinel object address) is called. It searches backward from the element just before the current page's next pointer until it reaches the nearest sentinel object, sending a release message to each object in that range in turn.

Because of the sentinel objects, nested autorelease pools work as well: whether a pool is the outer one or a nested one, it only has to find its own sentinel object.
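
The role of the sentinel can be modelled with a toy stack. This is a minimal C++ sketch under stated assumptions: Obj and ToyAutoreleasePool are hypothetical, one flat vector replaces the linked pages, a nullptr entry plays the sentinel, and the token is the sentinel's index rather than its address.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object with a manual reference count.
struct Obj {
    int refCount = 1;
    void release() { --refCount; }
};

// Toy pool: a single stack in which nullptr entries are sentinels.
class ToyAutoreleasePool {
    std::vector<Obj *> stack_;
public:
    // Entering a scope: push a sentinel and return its position as token.
    std::size_t push() {
        stack_.push_back(nullptr);
        return stack_.size() - 1;
    }
    // autorelease: remember the object so it is released at pop time.
    void autorelease(Obj *obj) {
        assert(obj != nullptr);
        stack_.push_back(obj);
    }
    // Leaving a scope: release everything above the sentinel, newest
    // first, then remove the sentinel itself. Sentinels of inner pools
    // that were never popped are simply skipped.
    void pop(std::size_t token) {
        assert(token < stack_.size() && stack_[token] == nullptr);
        while (stack_.size() > token + 1) {
            if (Obj *obj = stack_.back()) obj->release();
            stack_.pop_back();
        }
        stack_.pop_back();
    }
    std::size_t size() const { return stack_.size(); }
};
```

Popping an inner token releases only the objects added after the inner sentinel; the objects of the outer pool stay alive until the outer token is popped.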

Source code process analysis

push function

// Definition of the sentinel object
#define POOL_BOUNDARY nil

static inline void *push()
{
    id *dest;
    if (slowpath(DebugPoolAllocation)) {
        // Each autorelease pool starts on a new pool page.
        dest = autoreleaseNewPage(POOL_BOUNDARY);
    } else {
        // Add a sentinel object to the autorelease pool
        dest = autoreleaseFast(POOL_BOUNDARY);
    }
    ...
    return dest;
}

// Add an object to the autorelease pool
static inline id *autoreleaseFast(id obj)
{
    // Get the hotPage: the page currently in use
    AutoreleasePoolPage *page = hotPage();
    // There is a page and it is not full
    if (page && !page->full()) {
        // Add the object to it
        return page->add(obj);
    } else if (page) {
        // The page is full: move to the next page (creating one if needed), then add the object
        return autoreleaseFullPage(obj, page);
    } else {
        // There is no page yet: create one
        return autoreleaseNoPage(obj);
    }
}

// Create a new page, point the current page->child at it, and add the object to it
id *autoreleaseFullPage(id obj, AutoreleasePoolPage *page)
{
    ...
    do {
        if (page->child) page = page->child;
        else page = new AutoreleasePoolPage(page);
    } while (page->full());

    setHotPage(page);
    return page->add(obj);
}

// Create a new page
id *autoreleaseNoPage(id obj)
{
    ...
    AutoreleasePoolPage *page = new AutoreleasePoolPage(nil);
    setHotPage(page);
    ...
    // Push the requested object or pool.
    return page->add(obj);
}

// The push function (unnecessary parts omitted)
// Push: store the object in the AutoreleasePoolPage, then advance the stack-top pointer
id *add(id obj) {
    id *ret = next;
    *next++ = obj;
    return ret;
}

pop function

static inline void pop(void *token)   // token is the address returned by push()
{
    // Use pageForPointer to get the AutoreleasePoolPage that contains token
    AutoreleasePoolPage *page;
    id *stop;

    page = pageForPointer(token);   // find the page from the token address
    stop = (id *)token;
    if (DebugPoolAllocation  &&  *stop != POOL_SENTINEL) {
        // This check is not valid with DebugPoolAllocation off
        // after an autorelease with a pool page but no pool in place.
        _objc_fatal("invalid or prematurely-freed autorelease pool %p; ", 
                    token);
    }

    if (PrintPoolHiwat) printHiwat();   // record the high water mark

    // Call releaseUntil to release the objects in the stack up to stop; stop is the argument passed in, normally the sentinel object
    page->releaseUntil(stop);   // pop from the top of the stack, sending release to each object, until the first sentinel object is reached

    // memory: delete empty children
    // Delete the nodes that are now empty
    if (DebugPoolAllocation  &&  page->empty()) {
        // special case: delete everything during page-per-pool debugging
        AutoreleasePoolPage *parent = page->parent;
        page->kill();
        setHotPage(parent);
    } else if (DebugMissingPools  &&  page->empty()  &&  !page->parent) {
        // special case: delete everything for pop(top) 
        // when debugging missing autorelease pools
        page->kill();
        setHotPage(nil);
    } 
    // Call kill on the child: releaseUntil released the objects in the page, but the pages themselves still take up space and are handled by kill()
    // If the current page is less than half full, kill all of its children; otherwise keep one child and start killing from the grandchild
    else if (page->child) {
        // hysteresis: keep one empty child if page is more than half full
        if (page->lessThanHalfFull()) {
            page->child->kill();
        }
        else if (page->child->child) {
            page->child->child->kill();
        }
    }
}


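The code above simplifies to:
static inline void pop(void *token) {
    // Use pageForPointer to get the AutoreleasePoolPage that contains token
    AutoreleasePoolPage *page = pageForPointer(token);
    id *stop = (id *)token;
    // Call releaseUntil to release the objects in the stack up to stop; stop is the argument passed in, normally the sentinel object
    page->releaseUntil(stop);

    // Call kill on the child: releaseUntil released the objects in the page, but the pages themselves still take up space and are handled by kill()
    // If the current page is less than half full, kill all of its children; otherwise keep one child and start killing from the grandchild
    if (page->child) {
        if (page->lessThanHalfFull()) {
            page->child->kill();
        } else if (page->child->child) {
            page->child->child->kill();
        }
    }
}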

The process is divided into two steps:

  • page->releaseUntil(stop): call objc_release() on every object between the top of the stack (page->next) and the stop address (POOL_SENTINEL), decrementing each reference count by 1.
  • page->kill(): free the page objects that are now empty.

basic principle of weak

weak foundation

weak is a weak reference: it does not increase the reference count of the referenced object, and it is automatically set to nil when the referenced object is released. It is used to solve circular reference problems.

The runtime maintains a weak table that stores all weak pointers to an object. The weak table is a hash table (a dictionary is also a hash table): the key is the address of the pointed-to object, and the value is the collection of addresses of the weak pointers (each stored address is the address of a pointer variable whose value is the object's address, that is, the address of an address).

The implementation principle of weak can be summarized in the following three steps:

  • Initialization: the runtime calls objc_initWeak to initialize a new weak pointer that points to the object's address.
  • Adding a reference: objc_initWeak calls objc_storeWeak(), which updates the pointer and creates the corresponding entry in the weak reference table.
  • Release: clearDeallocating is called. It first looks up the array of all weak pointer addresses by the object's address, then traverses the array and sets each of those pointers to nil, and finally deletes the entry from the weak table and clears the object's records.
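
These three steps can be sketched with an ordinary hash map. The following C++ model is an assumption-laden illustration, not the runtime's code: Object and ToyWeakTable are hypothetical, std::unordered_map replaces weak_table_t, and locking is left out. storeWeak and clearDeallocating only mirror the roles of objc_storeWeak and clearDeallocating described above.

```cpp
#include <algorithm>
#include <unordered_map>
#include <vector>

struct Object { int value = 0; };   // hypothetical referent type

// Key: object address. Value: addresses of the weak pointers to it.
class ToyWeakTable {
    std::unordered_map<Object *, std::vector<Object **>> entries_;

    void unregister(Object *obj, Object **location) {
        auto it = entries_.find(obj);
        if (it == entries_.end()) return;
        auto &refs = it->second;
        refs.erase(std::remove(refs.begin(), refs.end(), location), refs.end());
        if (refs.empty()) entries_.erase(it);
    }
public:
    // Point *location at newObj and register location under newObj,
    // removing the registration for the previous target first.
    void storeWeak(Object **location, Object *newObj) {
        if (Object *old = *location) unregister(old, location);
        *location = newObj;
        if (newObj) entries_[newObj].push_back(location);
    }
    // Set every registered weak pointer to null, then drop the entry.
    void clearDeallocating(Object *obj) {
        auto it = entries_.find(obj);
        if (it == entries_.end()) return;
        for (Object **location : it->second) *location = nullptr;
        entries_.erase(it);
    }
    bool hasEntry(Object *obj) const { return entries_.count(obj) != 0; }
};
```

Because the table stores the addresses of the pointer variables, clearDeallocating can reach each weak pointer and set it to null without the owners of those pointers doing anything.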

SideTable

struct SideTable {
    // Spin lock that guarantees atomic operations
    spinlock_t slock;
    // Hash table of reference counts
    RefcountMap refcnts;
    // Global hash table of weak references
    weak_table_t weak_table;

    SideTable() {
        memset(&weak_table, 0, sizeof(weak_table));
    }

    ~SideTable() {
        _objc_fatal("Do not delete SideTable.");
    }

    void lock() { slock.lock(); }
    void unlock() { slock.unlock(); }
    void reset() { slock.reset(); }

    // Address-ordered lock discipline for a pair of side tables.

    template<HaveOld, HaveNew>
    static void lockTwo(SideTable *lock1, SideTable *lock2);
    template<HaveOld, HaveNew>
    static void unlockTwo(SideTable *lock1, SideTable *lock2);
};

slock is a spin lock used to prevent contention.
refcnts stores reference counts; it assists the extra_rc bits in the object's isa pointer, and the two hold the reference count together (the object structure is covered later).

Next, let's take a look at the three member variables in SideTable:

1.spinlock_t slock spin lock

Locks are an important tool for thread synchronization.
There are five major kinds of lock in operating systems (I have not studied operating systems yet, so only the basic concepts are written down here; a later blog post will introduce them in detail):

  • Semaphore:
    Integer semaphore S: S<=0 indicates that the resource is occupied and S>0 indicates that the resource is available; it is accessed through P and V operations.
    Record semaphore: s.value > 0 is the number of available resources; s.value < 0 indicates the number of threads blocked in the waiting list.
    AND semaphore: a semaphore operation for when several resources are needed at the same time and one unit of each kind is occupied.
    Semaphore set: corresponds to several kinds of resources and is equivalent to a set of record semaphores.
  • Mutex: similar to a binary semaphore. The only difference is that a mutex must be acquired and released in the same thread: it is invalid for a thread to release a mutex it does not own, whereas a semaphore can be released by another thread.
  • Critical section: during concurrent execution, a program segment that must access a critical resource under mutual exclusion is called a critical section.
  • Read-write lock: a lock designed to solve the readers-writers problem.
  • Condition variable: acts as a notification mechanism. Several threads can wait on a condition variable, and once another thread signals it (equivalent to waking the condition variable), the waiting threads can continue to execute.

spin lock

To talk about spin locks, we need to compare them with mutex locks:

  • In common: both guarantee that only one thread accesses the shared resource at a time, and so both can guarantee thread safety.
  • Difference:
    Mutex lock: if the shared data is already locked by another thread, the thread goes to sleep and waits for the lock. Once the resource is unlocked, the thread waiting for it is woken up.
    Spin lock: if the shared data is already locked by another thread, the thread waits for the lock in a busy loop. Once the resource is unlocked, the thread waiting for it executes immediately.

Spin locks are more efficient than mutexes when the lock is held only briefly, because the waiting thread does not have to sleep and be woken up. Note, however, that the CPU is not given up while spinning, so the thread holding a spin lock should release it as soon as possible; otherwise the threads waiting for it keep spinning and waste CPU time.
The SideTable is locked while the reference count is being operated on, to avoid data errors.
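
A spin lock of this kind can be sketched in a few lines of C++. ToySpinLock and GuardedCounter are hypothetical names, and the sketch is built on std::atomic_flag purely for illustration; it is not the spinlock_t used by the runtime.

```cpp
#include <atomic>

// Minimal spin lock: a waiting thread loops instead of going to sleep.
class ToySpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    // Returns true if the lock was free and is now held by the caller.
    bool try_lock() {
        return !flag_.test_and_set(std::memory_order_acquire);
    }
    // Busy-wait until the lock becomes free.
    void lock() {
        while (!try_lock()) { /* spin */ }
    }
    void unlock() { flag_.clear(std::memory_order_release); }
};

// A counter guarded by the lock, mirroring the idea of locking the
// table while a reference count is being changed.
class GuardedCounter {
    ToySpinLock lock_;
    long count_ = 0;
public:
    long increment() {
        lock_.lock();
        long result = ++count_;
        lock_.unlock();     // release quickly so waiters stop spinning
        return result;
    }
};
```

The critical section in increment() is a single addition, which is the kind of very short hold time for which spinning is a reasonable choice.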

For knowledge about locks, you can read the blog written before:
[iOS] - related locks in iOS

2.RefcountMap


typedef objc::DenseMap<DisguisedPtr<objc_object>,size_t,true> RefcountMap;

Among them, DenseMap is a template class:

template<typename KeyT, typename ValueT,
         bool ZeroValuesArePurgeable = false, 
         typename KeyInfoT = DenseMapInfo<KeyT> >
class DenseMap : public DenseMapBase<DenseMap<KeyT, ValueT, 
  ZeroValuesArePurgeable, KeyInfoT>, KeyT, ValueT, KeyInfoT, 
  ZeroValuesArePurgeable> {
  ...
  BucketT *Buckets;
  unsigned NumEntries;
  unsigned NumTombstones;
  unsigned NumBuckets;
  ...
}

The more important members are:

1. ZeroValuesArePurgeable: the default value is false, but RefcountMap sets it to true. This member marks whether a bucket with a value of 0 (reference count of 1) may be used. Because the initial value of an empty bucket is 0, a bucket with a value of 0 is no different from an empty bucket. If buckets with a value of 0 may be used, then when the search finds neither the bucket belonging to the object nor a tombstone bucket, a bucket with a value of 0 is used first.

2. Buckets: this pointer manages a contiguous block of memory, i.e. an array whose elements are objects of type BucketT. Here we call a BucketT object a bucket (strictly speaking the array itself should be called the bucket; Apple probably calls the array elements buckets to make the picture more vivid, not in the sense of a bucket in a hash bucket). After the bucket array has been allocated it is initialized by placing an empty bucket at every position (a bucket whose key is EmptyKey is an empty bucket). All later operations on reference counts rely on the buckets.
The data type of a bucket is actually std::pair, similar to a tuple type in Swift: it combines the object's address and the object's reference count into one data type (the reference count here is similar to isa in that a few bits store the count and a few bits are reserved as flag bits).

BucketT is defined as follows:

typedef std::pair<KeyT, ValueT> BucketT;

3. NumEntries: records the number of non-empty buckets in use in the array.

4. NumTombstones: Tombstone literally means a gravestone. When an object's reference count reaches 0 and it is removed from its bucket, its position is marked as a Tombstone. NumTombstones is the number of tombstones in the array. The function of the tombstone is introduced later.

5. NumBuckets: the number of buckets. Because the array is always completely filled with buckets, this can be understood as the size of the array.

inline uint64_t NextPowerOf2(uint64_t A) {
    A |= (A >> 1);
    A |= (A >> 2);
    A |= (A >> 4);
    A |= (A >> 8);
    A |= (A >> 16);
    A |= (A >> 32);
    return A + 1;
}

This function provides the array size for 64-bit values: when space has to be allocated for the bucket array, the size of the array is determined by it. The algorithm spreads the highest 1 bit into all lower bits. For example, with A = 0b10000, (A >> 1) = 0b01000, and the bitwise OR gives A = 0b11000; then (A >> 2) = 0b00110, and the bitwise OR gives A = 0b11110. Continuing in the same way, the highest 1 bit of A is spread over the next 2 bits, the next 4 bits, the next 8 bits, and so on down to the lowest bit. Finally, adding 1 to this number made up entirely of 1s gives 0b1000... (a 1 followed by N zeros). In other words, the size of the bucket array is always 2^n.
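
The bit spreading can be checked with a standalone C++ sketch. The function is restated here under the hypothetical name next_power_of_2 so that it compiles on its own, and smear_low_bits is a helper added only to expose the intermediate value before the final + 1.

```cpp
#include <cstdint>

// Spread the highest set bit into every lower bit using OR.
inline uint64_t smear_low_bits(uint64_t a) {
    a |= (a >> 1);
    a |= (a >> 2);
    a |= (a >> 4);
    a |= (a >> 8);
    a |= (a >> 16);
    a |= (a >> 32);
    return a;
}

// Adding 1 to a run of 1 bits yields the next power of two, which is
// always strictly greater than the input.
inline uint64_t next_power_of_2(uint64_t a) {
    return smear_low_bits(a) + 1;
}
```

For the worked value 0b10000 (16), smear_low_bits gives 0b11111 (31) and next_power_of_2 gives 0b100000 (32). Note that an input that is already a power of two is still rounded up to the next one.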

The working logic of RefcountMap (code analysis is at the end)

1. By computing a hash of the object's address, the corresponding SideTable is obtained from SideTables. The reference counts of objects whose hash values collide are stored in the same SideTable.
2. The SideTable uses the find() method and the overloaded [] operator to locate the object's bucket by the object's address. The search algorithm that is finally executed is LookupBucketFor().
3. The search algorithm first checks the number of buckets; if there are none, it returns false so that the caller goes on to insert. If the search ends at an empty bucket or a tombstone bucket instead of the object's bucket, it also returns false, but the bucket it found is recorded first. If the object's bucket is found, its reference count only needs + 1 or - 1. If the reference count is 0 and the object has to be destroyed, the key in that bucket is set to TombstoneKey.

value_type& FindAndConstruct(const KeyT &Key) {
    BucketT *TheBucket;
    if (LookupBucketFor(Key, TheBucket))
      return *TheBucket;
    return *InsertIntoBucket(Key, ValueT(), TheBucket);
  }

4. The insertion algorithm first checks how much of the table is still available. If the available part of the hash table (tombstone buckets + empty buckets) is less than 1/4, a larger space has to be allocated for the table. If the number of empty buckets in the table is less than 1/8 (meaning there are too many tombstone buckets), the tombstones in the table have to be cleaned up. In both cases the hash lookup algorithm would have difficulty finding the correct position and could even loop forever, so the table is processed first and only then is the entry inserted. If neither case applies, the reference count of the new object is put directly into the bucket provided by the caller.

Illustration: [figure omitted]

The role of the tombstone:

  • Suppose object c is destroyed and the bucket with subscript 2 is set to an empty bucket instead of a tombstone. If the reference count of object e is then incremented, the hash algorithm lands on the bucket with subscript 2, sees that it is empty and inserts there, so the count of e that is already stored in the bucket with subscript 4 is never updated. In the normal flow, the bucket with subscript 2 becomes a tombstone after c is destroyed, so the next lookup for e skips subscript 2 and keeps probing until it finds e's bucket, or until it reaches an empty bucket and creates a new entry for e.
  • If a new object f is initialized at this point and the hash algorithm lands on the bucket with subscript 2, the lookup finds a tombstone there and records subscript 2. It then keeps probing; reaching an empty bucket proves that f is not in the table. f is then stored in the recorded tombstone bucket with subscript 2 rather than in the empty bucket, so the released position is reused.
The source code for finding the bucket corresponding to an object is as follows:
bool LookupBucketFor(const LookupKeyT &Val,
                       const BucketT *&FoundBucket) const {
    ...
    if (NumBuckets == 0) { // the table has no buckets
      FoundBucket = 0;
      return false; // return false so the caller goes on to insert
    }
    ...
    unsigned BucketNo = getHashValue(Val) & (NumBuckets-1); // mask the hash with the largest array index
    unsigned ProbeAmt = 1; // step used to find another position when hashes collide
    while (1) {
      const BucketT *ThisBucket = BucketsPtr + BucketNo; // head pointer + index, like array indexing
      // The key in this bucket equals the object address: found
      if (KeyInfoT::isEqual(Val, ThisBucket->first)) {
        FoundBucket = ThisBucket;
        return true;
      }
      // The key in this bucket is the empty-bucket placeholder: the object can be inserted
      if (KeyInfoT::isEqual(ThisBucket->first, EmptyKey)) {
        if (FoundTombstone) ThisBucket = FoundTombstone; // prefer a tombstone seen earlier
        FoundBucket = FoundTombstone ? FoundTombstone : ThisBucket;
        return false; // reaching an empty bucket means the object is not in the table
      }
      // A tombstone was found
      if (KeyInfoT::isEqual(ThisBucket->first, TombstoneKey) && !FoundTombstone)
        FoundTombstone = ThisBucket;  // remember the first tombstone
      // This relates to the original typedef objc::DenseMap<DisguisedPtr<objc_object>,size_t,true> RefcountMap
      // and its third argument, true. It says zero values may be purged: when it is true and no
      // tombstone has been seen yet, a bucket whose value is 0 is recorded as well.
      if (ZeroValuesArePurgeable  &&
          ThisBucket->second == 0  &&  !FoundTombstone)
        FoundTombstone = ThisBucket;

      // If the probe counter ProbeAmt exceeds the array capacity, abort with a fatal error
      if (ProbeAmt > NumBuckets) {
          _objc_fatal("...");
      }
      BucketNo += ProbeAmt++; // this index did not match; use ProbeAmt to move to the next one
      BucketNo&= (NumBuckets-1); // mask the new number with the largest array index
    }
  }
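
To make the role of the tombstone concrete, here is a small self-contained sketch of the same lookup idea. It is not the runtime's DenseMap: ToyRefcountTable, kToyEmpty and kToyTombstone are names invented for this example, keys are plain positive integers, the hash is the key itself, and the table never grows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sentinel keys for this sketch; real keys must be positive.
const int kToyEmpty = 0;
const int kToyTombstone = -1;

class ToyRefcountTable {
public:
    explicit ToyRefcountTable(std::size_t powerOfTwoSize)
        : keys_(powerOfTwoSize, kToyEmpty), counts_(powerOfTwoSize, 0) {
        assert(powerOfTwoSize != 0 && (powerOfTwoSize & (powerOfTwoSize - 1)) == 0);
    }

    // Returns true and the bucket index if key is present. Otherwise returns
    // false and the index to insert at: the first tombstone seen while
    // probing, or else the empty bucket that ended the probe.
    bool lookup(int key, std::size_t &bucket) const {
        const std::size_t mask = keys_.size() - 1;
        std::size_t index = static_cast<std::size_t>(key) & mask;
        std::size_t probe = 1;
        bool sawTombstone = false;
        std::size_t firstTombstone = 0;
        for (std::size_t steps = 0; steps < keys_.size(); ++steps) {
            if (keys_[index] == key) { bucket = index; return true; }
            if (keys_[index] == kToyEmpty) {
                bucket = sawTombstone ? firstTombstone : index;
                return false;
            }
            if (keys_[index] == kToyTombstone && !sawTombstone) {
                sawTombstone = true;
                firstTombstone = index;
            }
            index = (index + probe++) & mask;  // same stepping as BucketNo += ProbeAmt++
        }
        assert(sawTombstone);  // this toy table never grows, so it must not be full
        bucket = firstTombstone;
        return false;
    }

    void retain(int key) {
        std::size_t b = 0;
        if (!lookup(key, b)) { keys_[b] = key; counts_[b] = 0; }
        ++counts_[b];
    }

    // Destroying an entry leaves a tombstone so later probes keep going.
    void erase(int key) {
        std::size_t b = 0;
        if (lookup(key, b)) { keys_[b] = kToyTombstone; counts_[b] = 0; }
    }

    int count(int key) const {
        std::size_t b = 0;
        return lookup(key, b) ? counts_[b] : 0;
    }

    long bucketOf(int key) const {
        std::size_t b = 0;
        return lookup(key, b) ? static_cast<long>(b) : -1;
    }

private:
    std::vector<int> keys_;
    std::vector<int> counts_;
};
```

With 8 buckets, the keys 2, 10 and 18 all hash to subscript 2. After 2 and 10 are inserted and 2 is erased, a lookup for 10 still walks past the tombstone at subscript 2 and finds its bucket, and a newly inserted 18 reuses the tombstone at subscript 2.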

The code to insert into the reference count bucket of an object is as follows
BucketT *InsertIntoBucketImpl(const KeyT &Key, BucketT *TheBucket) {
    unsigned NewNumEntries = getNumEntries() + 1; // number of used buckets + 1
    unsigned NumBuckets = getNumBuckets(); // total number of buckets
    if (NewNumEntries*4 >= NumBuckets*3) { // usage would reach 3/4
      this->grow(NumBuckets * 2); // pass twice the array size; grow decides the exact value
      // grow re-places every bucket, so the position of the object being inserted must be looked up again
      LookupBucketFor(Key, TheBucket);
      NumBuckets = getNumBuckets(); // fetch the new array size
    }
    // If the empty buckets make up no more than 1/8, probing has a hard time reaching an empty bucket
    if (NumBuckets-(NewNumEntries+getNumTombstones()) <= NumBuckets/8) {
      // grow at the same size reallocates, re-places the buckets and clears the tombstones
      this->grow(NumBuckets);
      LookupBucketFor(Key, TheBucket); // look up the insert position again after the rehash
    }
    assert(TheBucket);
    // The bucket found is marked EmptyKey and can be used directly
    if (KeyInfoT::isEqual(TheBucket->first, getEmptyKey())) {
      incrementNumEntries(); // used buckets + 1
    }
    else if (KeyInfoT::isEqual(TheBucket->first, getTombstoneKey())) { // the bucket found is a tombstone
      incrementNumEntries(); // used buckets + 1
      decrementNumTombstones(); // tombstones - 1
    }
    else if (ZeroValuesArePurgeable  &&  TheBucket->second == 0) { // the bucket found holds the value 0
      TheBucket->second.~ValueT(); // in testing this statement was skipped and never ran; value stayed 0
    } else {
      // otherwise the number of entries does not change (official comment: Updating an existing entry.)
    }
    return TheBucket;
  }
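
The two threshold checks above are easy to verify with a few numbers. The sketch below only restates the two conditions; TableAction and actionBeforeInsert are hypothetical names, and unlike the real code (which can run both checks one after the other) it reports only the first check that fires.

```cpp
#include <cassert>

enum class TableAction { InsertDirectly, GrowDouble, RehashSameSize };

// Sketch: decide what has to happen before inserting one more entry into a
// table with numBuckets buckets, numEntries live entries and numTombstones
// tombstones.
inline TableAction actionBeforeInsert(unsigned numEntries,
                                      unsigned numTombstones,
                                      unsigned numBuckets) {
    unsigned newNumEntries = numEntries + 1;
    // Usage would reach 3/4 of the buckets: allocate a larger table.
    if (newNumEntries * 4 >= numBuckets * 3)
        return TableAction::GrowDouble;
    // Empty buckets are no more than 1/8: rehash in place to drop tombstones.
    if (numBuckets - (newNumEntries + numTombstones) <= numBuckets / 8)
        return TableAction::RehashSameSize;
    return TableAction::InsertDirectly;
}
```

For a table of 64 buckets, 47 existing entries trigger the doubling, while 10 entries plus 45 tombstones leave only 8 empty buckets and trigger the same-size rehash.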

The weak part: weak_table_t

weak_table_t is the hash table inside the SideTable structure that stores the weak reference pointers of objects. It is the core data structure behind the weak feature.
First, let's look at the source code of the weak_table_t structure:

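struct weak_table_t {
    weak_entry_t *weak_entries; // head pointer of a contiguous block of memory, used as an array
    size_t    num_entries; // number of occupied slots in the array
    uintptr_t mask; // largest array index (array size - 1)
    uintptr_t max_hash_displacement; // largest hash displacement
};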

weak_table is a hash table. The hash value is calculated from the address of the object that the weak pointer points to. Objects with the same hash value look for a free position by moving to subscript + 1, which is a typical closed hashing (open addressing) scheme. The maximum hash displacement is the largest distance, over all objects, between the computed hash position and the position where the object was actually inserted; it serves as the upper bound of the loop when searching.

weak_table structure diagram: [figure omitted]

Members of weak_entry_t

struct weak_entry_t {
    DisguisedPtr<objc_object> referent; // address of the object
    union {
        // Another union here; Apple's data structure design is really neat
        struct {
            // What has to be stored is again an array of weak pointers, so Apple uses hashing once more
            weak_referrer_t *referrers; // array of weak pointers that point to referent
            uintptr_t        out_of_line_ness : 2; // marks whether the inline limit is exceeded, see below
            uintptr_t        num_refs : PTR_MINUS_2; // number of occupied slots in the array
            uintptr_t        mask; // largest array index (array size - 1)
            uintptr_t        max_hash_displacement; // largest hash displacement
        };
        struct {
            // An array called the inline referrers
            weak_referrer_t  inline_referrers[WEAK_INLINE_COUNT]; // the macro is defined as 4
        };
    };
    // Assignment copies the memory of other into this with memcpy instead of
    // going through a copy constructor, presumably for efficiency
    weak_entry_t& operator=(const weak_entry_t& other) {
        memcpy(this, &other, sizeof(other));
        return *this;
    }

    // true: the referrers hash array is in use; false: the inline_referrers array holds the weak_referrer_t values
    bool out_of_line() {
        return (out_of_line_ness == REFERRERS_OUT_OF_LINE);
    }

    // Constructor of weak_entry_t
    // newReferent is a pointer to the original object,
    // newReferrer is a pointer to the weak variable that points to newReferent.
    // The initializer referent(newReferent) calls the constructor
    // DisguisedPtr(T* ptr) : value(disguise(ptr)) { }, which uses disguise
    // to convert newReferent to an integer and store it in value.
    weak_entry_t(objc_object *newReferent, objc_object **newReferrer)
        : referent(newReferent)
    {
        // Put newReferrer in slot 0; this also goes through the DisguisedPtr constructor and stores it as an integer
        inline_referrers[0] = newReferrer;
        // Set the remaining 3 slots of inline_referrers to nil
        for (int i = 1; i < WEAK_INLINE_COUNT; i++) {
            inline_referrers[i] = nil;
        }
    }
};

Through the address of the object we can find the corresponding weak_entry_t in weak_table_t; the weak_entry_t stores all the weak pointers that point to this object.

Apple uses another union inside weak_entry_t. In the first struct, out_of_line_ness occupies 2 bits and num_refs occupies 62 bits in a 64-bit environment, so both structs are 32 bytes and share the same piece of memory. When no more than 4 weak pointers point to the object, the inline_referrers array is used directly, which skips the hashing step. Once the number of weak pointers exceeds 4, the hash table in the first struct (referrers) must be used.
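
The size claim can be checked with a stand-in layout. This is only a sketch under assumptions: OutOfLineFields, InlineFields and ReferrerStorage are invented names, void ** replaces weak_referrer_t, the structs are named because anonymous structs are not standard C++, and a pointer is assumed to be as wide as uintptr_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::size_t kSketchInlineCount = 4;  // mirrors WEAK_INLINE_COUNT

// Stand-in for the first struct: pointer, one packed word, two more words.
struct OutOfLineFields {
    void **referrers;
    std::uintptr_t out_of_line_ness : 2;
    std::uintptr_t num_refs : (sizeof(std::uintptr_t) * 8 - 2);
    std::uintptr_t mask;
    std::uintptr_t max_hash_displacement;
};

// Stand-in for the second struct: four inline slots.
struct InlineFields {
    void **inline_referrers[kSketchInlineCount];
};

// Both layouts share the same storage.
union ReferrerStorage {
    OutOfLineFields out_of_line;
    InlineFields inline_;
};

inline std::size_t referrerStorageSize() { return sizeof(ReferrerStorage); }
```

On a typical 64-bit platform both structs come out at four machine words, so the union takes 32 bytes, the same as four inline pointers.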

The general logic of weak_table

  • Under ARC, the compiler automatically adds the code that manages reference counts. When a weak pointer is assigned, the compiler calls storeWeak to perform the assignment. If the weak pointer already points to an object, weak_unregister_no_lock() is called first to delete the weak pointer from the original table, and then weak_register_no_lock() is called to insert the weak pointer into the corresponding table.
  • When searching, the address of the pointed-to object is hashed first to find the corresponding SideTable in SideTables(), and the same object address is then used to find the corresponding weak_entry_t in the weak_table of that SideTable. The final operations act on this weak_entry_t.
  • If the object has no more than 4 weak pointers, the inline_referrers array is manipulated directly; otherwise memory is allocated for the referrers array and the hash algorithm is used to manage the table.
  • When an old weak pointer is deleted, the address of the object it originally pointed to is used to find the corresponding weak_entry_t, and the weak pointer is deleted from it. If the weak pointer array is empty after the deletion, the weak_entry_t is destroyed and its original position is cleared, and the weak reference flag in the isa of the originally pointed-to object is set to 0.
  • When a new weak pointer is added, if the corresponding weak_entry_t is found, the weak pointer is inserted into its referrers array. If it is not found, a weak_entry_t is created and configured, and then inserted into the weak_table_t array.

An important implementation method of weak

objc_initWeak function

When a weak variable is initialized, the runtime calls the objc_initWeak function in NSObject.mm to initialize a new weak pointer to the address of the object. The implementation of objc_initWeak is as follows:

id objc_initWeak(id *location, id newObj) {
    // Check whether the object is valid; for a nil object the weak pointer is simply set to nil
    if (!newObj) {
        *location = nil;
        return nil;
    }
    // Three bool values are passed here
    // They are passed as template constants to optimize performance
    return storeWeak<false/*old*/, true/*new*/, true/*crash*/>
    (location, (objc_object*)newObj);
}

Then let's take a look at what the two parameters passed to objc_initWeak() represent:

  • location: the address of the __weak pointer. The address of the pointer is stored so that the pointer can be set to nil at the end.
  • newObj: The referenced object. That is, p in the example.

It can be seen from the above code that objc_initWeak() first checks whether the object the pointer is going to point to is valid and returns directly if it is not; otherwise location is registered through storeWeak() as a __weak pointer to the object.

objc_initWeak calls storeWeak(), the function behind objc_storeWeak(). Its job is to update what the pointer points to and to create the corresponding weak reference entry.

objc_initWeak has a precondition: location must be a valid pointer that has not yet been registered as a __weak variable. newObj can be nil, or point to a valid object.

objc_storeWeak()

// HaveOld:  true  - the variable has an existing value that needs to be cleaned up; it may be nil
//           false - there is no old value to clean up
// HaveNew:  true  - there is a new value to assign to the variable; it may be nil
//           false - no new value needs to be assigned
// CrashIfDeallocating: true  - halt the process if newObj is deallocating or does not support weak references
//                      false - store nil instead
template <bool HaveOld, bool HaveNew, bool CrashIfDeallocating>
static id storeWeak(id *location, objc_object *newObj) {
    // This routine updates what the weak pointer points to
    // Initialize the previouslyInitializedClass pointer
    Class previouslyInitializedClass = nil;
    id oldObj;
    // Declare two SideTables
    // (1) the old and the new hash table
    SideTable *oldTable;
    SideTable *newTable;
    // Get the tables for the old and the new value (the address is the unique identifier)
    // Indexing by address prevents duplicate buckets
    // The operations below change the old value
  retry:
    if (HaveOld) {
        // Read the pointer and get the table indexed by oldObj
        oldObj = *location;
        oldTable = &SideTables()[oldObj];
    } else {
        oldTable = nil;
    }
    if (HaveNew) {
        // Get the table indexed by newObj
        newTable = &SideTables()[newObj];
    } else {
        newTable = nil;
    }
    // Lock to prevent races between threads
    SideTable::lockTwo<HaveOld, HaveNew>(oldTable, newTable);
    // Handle interference from other threads
    // location should still hold oldObj; if not, another thread changed it after we read it
    if (HaveOld  &&  *location != oldObj) {
        SideTable::unlockTwo<HaveOld, HaveNew>(oldTable, newTable);
        goto retry;
    }
    // Prevent a deadlock between the weak reference machinery and +initialize,
    // and make sure the class (isa) of every weakly referenced object is initialized
    if (HaveNew  &&  newObj) {
        // Get the isa of the new object
        Class cls = newObj->getIsa();
        // Check whether the class differs from the one already handled and is not yet initialized
        if (cls != previouslyInitializedClass  &&
            !((objc_class *)cls)->isInitialized()) {
            // Unlock
            SideTable::unlockTwo<HaveOld, HaveNew>(oldTable, newTable);
            // Initialize the class
            _class_initialize(_class_getNonMetaClass(cls, (id)newObj));
            // Ideally the class has finished running +initialize by now
            // If +initialize is still running on this thread,
            // for example because +initialize itself called storeWeak,
            // we have to protect against that by recording the class in previouslyInitializedClass
            previouslyInitializedClass = cls;
            // Try again
            goto retry;
        }
    }
    // (2) clean up the old value
    if (HaveOld) {
        weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
    }
    // (3) assign the new value
    if (HaveNew) {
        newObj = (objc_object *)weak_register_no_lock(&newTable->weak_table,
                                                      (id)newObj, location,
                                                      CrashIfDeallocating);
        // weak_register_no_lock returns nil if the weak store should be rejected
        // Set the weakly-referenced flag in the reference count table
        if (newObj  &&  !newObj->isTaggedPointer()) {
            // Set the weak reference bit
            // The object's entry in the reference count hash table is marked as weakly referenced
            newObj->setWeaklyReferenced_nolock();
        }
        // location must not be set anywhere before this point; update the pointer here
        *location = (id)newObj;
    }
    else {
        // No new value, so nothing is changed
    }
    SideTable::unlockTwo<HaveOld, HaveNew>(oldTable, newTable);
    return (id)newObj;
}

Although the implementation of storeWeak is a bit long, it is not difficult to understand. Let's analyze it:

  • storeWeak actually takes 3 template parameters: HaveOld, HaveNew and CrashIfDeallocating, all of type bool. They indicate, respectively, whether the weak pointer previously pointed to a weak reference, whether the weak pointer needs to point to a new reference, and whether to crash if the object being weakly referenced is in the middle of deallocation.
  • The method maintains oldTable and newTable, which stand for the old weak reference table and the new weak reference table; both are SideTable hash tables.
  • If the weak pointer previously pointed to a weak reference, weak_unregister_no_lock is called to remove the address of the old weak pointer.
  • If the weak pointer needs to point to a new reference, weak_register_no_lock is called to add the address of the new weak pointer to the weak reference table.
  • setWeaklyReferenced_nolock is called to set the weak reference flag of the newly referenced object (the weakly_referenced bit of the optimized isa).
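
The unregister-then-register sequence and the compile-time flags can be modelled in a few lines. This is a toy model, not the runtime: ToyObject, ToyWeakTable and storeWeakSketch are invented names, a std::map stands in for SideTable and weak_table_t, and locking, +initialize handling and the deallocating check are left out.

```cpp
#include <cassert>
#include <map>
#include <set>

struct ToyObject { int id; };

// Hypothetical stand-in for the weak table: object -> addresses of the weak
// variables that currently point at it.
using ToyWeakTable = std::map<ToyObject *, std::set<ToyObject **>>;

// HaveOld: *location holds an old value whose registration must be removed.
// HaveNew: newObj must be registered and written to *location.
template <bool HaveOld, bool HaveNew>
ToyObject *storeWeakSketch(ToyWeakTable &table, ToyObject **location,
                           ToyObject *newObj) {
    if (HaveOld) {
        ToyObject *oldObj = *location;
        auto it = table.find(oldObj);
        if (oldObj && it != table.end()) {
            it->second.erase(location);                // like weak_unregister_no_lock
            if (it->second.empty()) table.erase(it);   // drop the empty entry
        }
    }
    if (HaveNew) {
        if (newObj) table[newObj].insert(location);    // like weak_register_no_lock
        *location = newObj;                            // only now update the variable
    }
    // Without a new value the storage is left unchanged.
    return newObj;
}
```

Initializing a weak variable corresponds to storeWeakSketch<false, true>, reassigning it to <true, true>, and destroying it to <true, false>. Because the flags are template constants, each instantiation only contains the branches it needs.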

The method used to add the weak pointer of a newly pointed-to object to a weak_entry_t is weak_register_no_lock.

weak_register_no_lock

id 
weak_register_no_lock(weak_table_t *weak_table, id referent_id, 
                      id *referrer_id, bool crashIfDeallocating)
{
    objc_object *referent = (objc_object *)referent_id;
    objc_object **referrer = (objc_object **)referrer_id;

    // If referent is nil or uses the TaggedPointer scheme, return directly and do nothing
    if (!referent  ||  referent->isTaggedPointer()) return referent_id;

    // Make sure the referenced object is usable (not deallocating, and it supports weak references)
    bool deallocating;
    if (!referent->ISA()->hasCustomRR()) {
        deallocating = referent->rootIsDeallocating();
    }
    else {
        BOOL (*allowsWeakReference)(objc_object *, SEL) = 
            (BOOL(*)(objc_object *, SEL))
            object_getMethodImplementation((id)referent, 
                                           SEL_allowsWeakReference);
        if ((IMP)allowsWeakReference == _objc_msgForward) {
            return nil;
        }
        deallocating =
            ! (*allowsWeakReference)(referent, SEL_allowsWeakReference);
    }
    // An object that is being deallocated cannot be weakly referenced
    if (deallocating) {
        if (crashIfDeallocating) {
            _objc_fatal("Cannot form weak reference to instance (%p) of "
                        "class %s. It is possible that this object was "
                        "over-released, or is in the process of deallocation.",
                        (void*)referent, object_getClassName((id)referent));
        } else {
            return nil;
        }
    }

    // now remember it and where it is being stored
    // Find the weak_entry for referent in weak_table and add referrer to it
    weak_entry_t *entry;
    if ((entry = weak_entry_for_referent(weak_table, referent))) {
        // The weak_entry exists: insert referrer into its array of references
        append_referrer(entry, referrer);
    } 
    else {
        // Not found: create a new one
        weak_entry_t new_entry(referent, referrer);  
        weak_grow_maybe(weak_table);
        weak_entry_insert(weak_table, &new_entry);
    }

    // Do not set *referrer. objc_storeWeak() requires that the 
    // value not change.

    return referent_id;
}

This method takes four parameters, with the following meanings:

  • weak_table: the global weak reference table, of type weak_table_t.
  • referent_id: the object that the weak pointer points to.
  • referrer_id: the address of the weak pointer.
  • crashIfDeallocating: whether to crash if the object being weakly referenced is in the middle of deallocation.

The general flow of this method:

  • If referent is nil or uses the TaggedPointer scheme, return directly without doing anything.
  • If the object is being deallocated, a fatal error is raised when crashIfDeallocating is true; otherwise nil is returned.
  • If the object cannot be weakly referenced, return nil directly.
  • If the object is not being deallocated and can be weakly referenced, weak_entry_for_referent is called to look up the weak_entry for the object's address in the weak reference table. If it is found, append_referrer is called to insert the address of the weak pointer into it. Otherwise a new weak_entry is created.

The source code of weak_entry_for_referent, the method that looks up a weak_entry:

weak_entry_for_referent

static weak_entry_t *
weak_entry_for_referent(weak_table_t *weak_table, objc_object *referent)
{
    assert(referent);

    weak_entry_t *weak_entries = weak_table->weak_entries;

    if (!weak_entries) return nil;

    size_t begin = hash_pointer(referent) & weak_table->mask;  // the & weak_table->mask keeps index inside the array
    size_t index = begin;
    size_t hash_displacement = 0;
    while (weak_table->weak_entries[index].referent != referent) {
        index = (index+1) & weak_table->mask;
        if (index == begin) bad_weak_table(weak_table->weak_entries); // triggers the bad weak table crash
        hash_displacement++;
        if (hash_displacement > weak_table->max_hash_displacement) {
            // More probes than the largest recorded displacement: the element is not in the table
            return nil;
        }
    }
    
    return &weak_table->weak_entries[index];
}
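
The way max_hash_displacement bounds the search can be seen in a small linear probing sketch. LinearProbeTable is an invented name; keys are nonzero integers that hash to themselves, 0 marks an empty slot, and the table never grows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct LinearProbeTable {
    std::vector<int> keys;                  // 0 means "empty slot" in this sketch
    std::size_t mask;                       // array size - 1
    std::size_t maxHashDisplacement = 0;    // largest distance from hash slot to actual slot

    explicit LinearProbeTable(std::size_t powerOfTwoSize)
        : keys(powerOfTwoSize, 0), mask(powerOfTwoSize - 1) {}

    void insert(int key) {
        std::size_t begin = static_cast<std::size_t>(key) & mask;
        std::size_t index = begin;
        std::size_t displacement = 0;
        while (keys[index] != 0) {
            index = (index + 1) & mask;     // collision: try subscript + 1
            assert(index != begin);         // table full; the real table grows before this
            ++displacement;
        }
        keys[index] = key;
        if (displacement > maxHashDisplacement) maxHashDisplacement = displacement;
    }

    // Returns the slot index of key, or -1 when key is not in the table.
    long find(int key) const {
        std::size_t begin = static_cast<std::size_t>(key) & mask;
        std::size_t index = begin;
        std::size_t displacement = 0;
        while (keys[index] != key) {
            index = (index + 1) & mask;
            if (index == begin) return -1;
            if (++displacement > maxHashDisplacement) return -1;  // cannot be further away
        }
        return static_cast<long>(index);
    }
};
```

With 8 slots, the keys 3, 11 and 19 all hash to slot 3 and end up in slots 3, 4 and 5, so the largest displacement is 2. A lookup for 27 gives up after passing slot 5, without scanning the rest of the array.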

Then let's take a look at the source code implementation of append_referrer, the method of adding elements to weak_entry:

append_referrer

static void append_referrer(weak_entry_t *entry, objc_object **new_referrer)
{
    if (! entry->out_of_line()) {
        // The weak_entry does not use the dynamic array yet
        // Try to insert inline.
        for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
            if (entry->inline_referrers[i] == nil) {
                entry->inline_referrers[i] = new_referrer;
                return;
            }
        }
        
        // All inline_referrers slots are taken, so switch to referrers as a dynamic array.
        // Couldn't insert inline. Allocate out of line.
        weak_referrer_t *new_referrers = (weak_referrer_t *)
            calloc(WEAK_INLINE_COUNT, sizeof(weak_referrer_t));
        // This constructed table is invalid, but grow_refs_and_insert
        // will fix it and rehash it.
        for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
            new_referrers[i] = entry->inline_referrers[i];
        }
        entry->referrers = new_referrers;
        entry->num_refs = WEAK_INLINE_COUNT;
        entry->out_of_line_ness = REFERRERS_OUT_OF_LINE;
        entry->mask = WEAK_INLINE_COUNT-1;
        entry->max_hash_displacement = 0;
    }

    // Extra handling for the dynamic array:
    assert(entry->out_of_line()); // at this point the dynamic array must be in use

    if (entry->num_refs >= TABLE_SIZE(entry) * 3/4) {
        // The element count is at least 3/4 of the slots: double the array size
        return grow_refs_and_insert(entry, new_referrer); // grow, then insert
    }
    
    // No growth needed: insert directly into the weak_entry
    // Note that weak_entry is a hash table, key: w_hash_pointer(new_referrer), value: new_referrer
    
    // The hash algorithm of weak_entry_t is the same as that of weak_table_t, and so is the grow/shrink algorithm
    size_t begin = w_hash_pointer(new_referrer) & (entry->mask); // '& (entry->mask)' keeps begin below the array length
    size_t index = begin;  // initial hash index
    size_t hash_displacement = 0;  // counts hash collisions, i.e. how many times we moved on
    while (entry->referrers[index] != nil) {
        hash_displacement++;
        index = (index+1) & entry->mask;  // move to the next slot and try again (entry->mask is always 0b111, 0b1111, 0b11111, ..., because the array doubles each time: 8, 16, 32, and the mask is the array length - 1)
        if (index == begin) bad_weak_table(entry); // index == begin means we went all the way round without finding a slot, so something is wrong
    }
    if (hash_displacement > entry->max_hash_displacement) {
        // Record the largest displacement: within max_hash_displacement probes we are sure to reach the slot of any stored referrer
        entry->max_hash_displacement = hash_displacement;
    }
    // Store ref in the hash array and update the element count num_refs
    weak_referrer_t &ref = entry->referrers[index];
    ref = new_referrer;
    entry->num_refs++;
}

This code first determines whether the fixed-length array or the dynamic array is in use. If the fixed-length array is in use, the address of the weak pointer is simply added to it. If the fixed-length array is full, its elements are transferred to the dynamic array first.
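
The switch from the fixed-length array to the dynamic array can be sketched as follows. ToyReferrerList and kToyInlineCount are invented names, and the sketch swaps the runtime's out-of-line hash table for a plain std::vector, so it shows only the transition, not the hashing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t kToyInlineCount = 4;  // mirrors WEAK_INLINE_COUNT

class ToyReferrerList {
public:
    void append(void **referrer) {
        if (!outOfLine_) {
            // Try to use a free inline slot first.
            for (std::size_t i = 0; i < kToyInlineCount; ++i) {
                if (inline_[i] == nullptr) {
                    inline_[i] = referrer;
                    return;
                }
            }
            // Inline slots are full: move them to the dynamic array.
            dynamic_.assign(inline_, inline_ + kToyInlineCount);
            outOfLine_ = true;
        }
        dynamic_.push_back(referrer);
    }

    bool outOfLine() const { return outOfLine_; }

    std::size_t size() const {
        if (outOfLine_) return dynamic_.size();
        std::size_t n = 0;
        for (std::size_t i = 0; i < kToyInlineCount; ++i) {
            if (inline_[i] != nullptr) ++n;
        }
        return n;
    }

private:
    void **inline_[kToyInlineCount] = {};  // fixed-length storage, all slots empty
    std::vector<void **> dynamic_;         // used only after the fifth append
    bool outOfLine_ = false;
};
```

The first four appends stay inline; the fifth one copies the four inline entries out and from then on everything goes to the dynamic storage.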

Then let's take a look at weak_unregister_no_lock, the method called when a weak pointer stops referencing an object and has to be cleared from the weak_entry. The address of the old weak pointer is removed in this method.

weak_unregister_no_lock

void
weak_unregister_no_lock(weak_table_t *weak_table, id referent_id, 
                        id *referrer_id)
{
    // address of the object
    objc_object *referent = (objc_object *)referent_id;
    // address of the weak pointer
    objc_object **referrer = (objc_object **)referrer_id;

    weak_entry_t *entry;

    if (!referent) return;

    if ((entry = weak_entry_for_referent(weak_table, referent))) {
        // Found the weak_entry_t for referent; remove referrer from its hash array
        remove_referrer(entry, referrer);
       
        // After the removal, check whether the hash array of the weak_entry_t is empty
        bool empty = true;
        if (entry->out_of_line()  &&  entry->num_refs != 0) {
            empty = false;
        }
        else {
            for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
                if (entry->inline_referrers[i]) {
                    empty = false; 
                    break;
                }
            }
        }

        if (empty) {
            // The hash array is empty, so the weak_entry_t is removed from weak_table
            weak_entry_remove(weak_table, entry);
        }
    }
    // Do not set *referrer = nil. objc_storeWeak() requires that the 
    // value not change.
}

Approximate process:

  • First, it will find the weak_entry_t corresponding to the referent in the weak_table
  • Remove referrer in weak_entry_t
  • After removing the element, judge whether there are elements in weak_entry_t at this time (empty==true?)
  • If weak_entry_t has no elements at this time, you need to remove weak_entry_t from weak_table

How are all weak pointers that reference it automatically set to nil when the object is freed?

understanding

When the reference count of the object is 0, the bottom layer will call the _objc_rootDealloc method to release the object, and the rootDealloc method will be called in the _objc_rootDealloc method. The following is the code implementation of the rootDealloc method:

inline void
objc_object::rootDealloc()
{
    if (isTaggedPointer()) return;  // fixme necessary?

    if (fastpath(isa.nonpointer  &&  
                 !isa.weakly_referenced  &&  
                 !isa.has_assoc  &&  
                 !isa.has_cxx_dtor  &&  
                 !isa.has_sidetable_rc))
    {
        assert(!sidetable_present());
        free(this);
    } 
    else {
        object_dispose((id)this);
    }
}

Approximate process:

  • First check whether the object is a Tagged Pointer; if it is, return directly.
  • If the object uses the optimized isa for reference counting and, at the same time, is not weakly referenced (!isa.weakly_referenced), has no associated objects (!isa.has_assoc), has no custom C++ destructor (!isa.has_cxx_dtor) and does not use a SideTable for reference counting (!isa.has_sidetable_rc), it is freed directly on the fast path.
  • If the conditions in the second step are not met, object_dispose is called.
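
The fast-path condition is just a conjunction of flag checks, which can be written down as a tiny predicate. ToyIsaBits and canFreeDirectly are invented names; the five fields only mirror the isa bits used in the code above.

```cpp
#include <cassert>

// Hypothetical stand-in for the isa bits read by rootDealloc.
struct ToyIsaBits {
    bool nonpointer;
    bool weakly_referenced;
    bool has_assoc;
    bool has_cxx_dtor;
    bool has_sidetable_rc;
};

// True when the object can be freed directly; otherwise the slower
// object_dispose path is needed.
inline bool canFreeDirectly(const ToyIsaBits &isa) {
    return isa.nonpointer &&
           !isa.weakly_referenced &&
           !isa.has_assoc &&
           !isa.has_cxx_dtor &&
           !isa.has_sidetable_rc;
}
```

So an object that has ever been weakly referenced always takes the slower path, which is where the weak table cleanup described below happens.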

Then let's take a look at object_dispose. It calls objc_destructInstance and then frees the memory of the object; the source code of objc_destructInstance is as follows:

object_dispose

void *objc_destructInstance(id obj)
{
    if (obj) {
        // Read all of the flags at once for performance.
        bool cxx = obj->hasCxxDtor();
        bool assoc = obj->hasAssociatedObjects();

        // This order is important.
        if (cxx) object_cxxDestruct(obj);
        if (assoc) _object_remove_associations(obj, /*deallocating*/true);
        obj->clearDeallocating();
    }

    return obj;
}

If there is a custom C++ destructor, it is called. If there are associated objects, they are removed and the object removes itself from the map of the Association Manager. Finally, clearDeallocating is called to clear the references related to the object.

Next, let's analyze clearDeallocating, the method that clears the references related to the object.

clearDeallocating

inline void 
objc_object::clearDeallocating()
{
    if (slowpath(!isa.nonpointer)) {
        // Slow path for raw pointer isa.
        sidetable_clearDeallocating();
    }
    else if (slowpath(isa.weakly_referenced  ||  isa.has_sidetable_rc)) {
        // Slow path for non-pointer isa with weak refs and/or side table data.
        clearDeallocating_slow();
    }

    assert(!sidetable_present());
}

clearDeallocating has two branches. It first checks whether the object uses the optimized isa for reference counting; if not, sidetable_clearDeallocating is called to clear the reference counting data stored for the object in the SideTable. If the object does use the optimized isa, it checks whether the object has auxiliary reference counts in a SideTable (isa.has_sidetable_rc) or is weakly referenced (isa.weakly_referenced); if either condition holds, clearDeallocating_slow is called.

Let's take a look at the sidetable_clearDeallocating method and the clearDeallocating_slow method.

sidetable_clearDeallocating

void 
objc_object::sidetable_clearDeallocating()
{
    SideTable& table = SideTables()[this];

    // clear any weak table items
    // clear extra retain count and deallocating bit
    // (fixme warn or abort if extra retain count == 0 ?)
    table.lock();
    RefcountMap::iterator it = table.refcnts.find(this);
    if (it != table.refcnts.end()) {
        if (it->second & SIDE_TABLE_WEAKLY_REFERENCED) {
            weak_clear_no_lock(&table.weak_table, (id)this);
        }
        table.refcnts.erase(it);
    }
    table.unlock();
}

clearDeallocating_slow

NEVER_INLINE void
objc_object::clearDeallocating_slow()
{
    assert(isa.nonpointer  &&  (isa.weakly_referenced || isa.has_sidetable_rc));

    SideTable& table = SideTables()[this]; // find the SideTable for this in the global SideTables, keyed by the this pointer
    table.lock();
    if (isa.weakly_referenced) {
        // The object is weakly referenced: clean up this in the weak_table of the SideTable
        weak_clear_no_lock(&table.weak_table, (id)this);
    }
    if (isa.has_sidetable_rc) {
        // The SideTable is used for reference counting: remove this from its reference counts
        table.refcnts.erase(this);
    }
    table.unlock();
}

Both of the above methods call weak_clear_no_lock to clean up the weak_table.

weak_clear_no_lock

void 
weak_clear_no_lock(weak_table_t *weak_table, id referent_id) 
{
    objc_object *referent = (objc_object *)referent_id;

    weak_entry_t *entry = weak_entry_for_referent(weak_table, referent); // find the weak_entry_t for referent in weak_table
    if (entry == nil) {
        /// XXX shouldn't happen, but does with mismatched CF/objc
        //printf("XXX no entry for clear deallocating %p\n", referent);
        return;
    }

    // zero out references
    weak_referrer_t *referrers;
    size_t count;
    
    // Get the array of weak pointer addresses that reference referent, and its length
    if (entry->out_of_line()) {
        referrers = entry->referrers;
        count = TABLE_SIZE(entry);
    } 
    else {
        referrers = entry->inline_referrers;
        count = WEAK_INLINE_COUNT;
    }
    
    for (size_t i = 0; i < count; ++i) {
        objc_object **referrer = referrers[i]; // address of each weak pointer
        if (referrer) {
            if (*referrer == referent) {
                // The weak pointer really references referent: set it to nil. This is why weak pointers become nil automatically
                *referrer = nil;
            }
            else if (*referrer) {
                // The stored weak pointer does not reference referent, probably because of incorrect use of the runtime API: report it
                _objc_inform("__weak variable at %p holds %p instead of %p. "
                             "This is probably incorrect use of "
                             "objc_storeWeak() and objc_loadWeak(). "
                             "Break on objc_weak_error to debug.\n", 
                             referrer, (void*)*referrer, (void*)referent);
                objc_weak_error();
            }
        }
    }
    
    weak_entry_remove(weak_table, entry); // referent is going away, so its weak_entry_t is removed from weak_table as well
}
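
The core of this cleanup, setting every registered weak variable to nil and dropping the entry, fits into a short model. ToyObj, ToyWeakEntries and weakClearSketch are invented names, and a std::map of vectors stands in for weak_table_t and weak_entry_t.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct ToyObj { int id; };

// Hypothetical stand-in for the weak table: object -> addresses of the weak
// variables registered for it.
using ToyWeakEntries = std::map<ToyObj *, std::vector<ToyObj **>>;

// Sets every registered weak variable that still points at referent to
// nullptr, then drops the referent's entry. Returns how many were cleared.
inline std::size_t weakClearSketch(ToyWeakEntries &table, ToyObj *referent) {
    auto it = table.find(referent);
    if (it == table.end()) return 0;  // no entry, nothing to clear
    std::size_t cleared = 0;
    for (ToyObj **referrer : it->second) {
        if (referrer && *referrer == referent) {
            *referrer = nullptr;      // this is the "automatically set to nil" step
            ++cleared;
        }
    }
    table.erase(it);                  // the object is going away, so is its entry
    return cleared;
}
```

The weak variables themselves are never looked up by the object; the table stores their addresses, which is what makes it possible to reach and clear them when the object is released.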

Finally, let's take a look at the method of weak pointer destruction:

void
objc_destroyWeak(id *location)
{
    (void)storeWeak<DoHaveOld, DontHaveNew, DontCrashIfDeallocating>
        (location, nil);
}

When storeWeak is called here, there is no new object to point to. If the weak pointer already points to an object, the call goes to weak_unregister_no_lock to remove the address of the old weak pointer.

Summary

  • The principle behind weak is that the runtime maintains a weak_table_t hash table whose key is the address of the pointed-to object and whose value is an array of the addresses of the weak pointers.
  • The weak keyword creates a weak reference: the reference count of the referenced object is not incremented, and the pointer is automatically set to nil when the referenced object is released.
  • When the object is released, clearDeallocating is called. It uses the object's address to get the array of all weak pointer addresses, traverses the array to set each pointer to nil, then deletes the entry from the weak table, and finally cleans up the object's records.
  • The article covers three structures, SideTable, weak_table_t and weak_entry_t (the figure showing the relationship between them is omitted here).


Origin blog.csdn.net/m0_62386635/article/details/131767655