The most commonly used lock for iOS thread safety - @synchronized

The previous article surveyed several common locks in iOS. The one I use most is @synchronized, so next let's look at how it works under the hood.

How does @synchronized achieve recursive mutual exclusion? How does it achieve reentrancy? Let's analyze the source code with these two questions in mind.

First, we use the clang command (for example, clang -rewrite-objc) to see how @synchronized is implemented in the generated .cpp file:

#import <Cocoa/Cocoa.h>
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSObject *syObject = [NSObject alloc];
        @synchronized (syObject) {
        }
    }
    return NSApplicationMain(argc, argv);
}

The relevant part of the generated .cpp file:

NSObject *syObject = ((NSObject *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("NSObject"), sel_registerName("alloc"));
{ 
    id _rethrow = 0; 
    id _sync_obj = (id)syObject;
    objc_sync_enter(_sync_obj);
    try {
        struct _SYNC_EXIT { 
            _SYNC_EXIT(id arg) : sync_exit(arg) {}
            ~_SYNC_EXIT() {objc_sync_exit(sync_exit);}
            id sync_exit;
        } _sync_exit(_sync_obj);
    } catch (id e) {_rethrow = e;}
    { 
        struct _FIN { _FIN(id reth) : rethrow(reth) {}
        ~_FIN() { if (rethrow) objc_exception_throw(rethrow); }
        id rethrow;
    } _fin_force_rethow(_rethrow);}
}

Looking at the generated code, objc_sync_enter is called first, and then a _SYNC_EXIT struct is constructed. Its constructor only stores the object, and its destructor calls objc_sync_exit when the struct goes out of scope. In other words, @synchronized is compiled into a paired objc_sync_enter and objc_sync_exit.
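
The enter/exit pairing can be sketched in plain C++ as a scope guard. This is only an illustration of the pattern in the generated code: fake_sync_enter, fake_sync_exit, SyncExitGuard and synchronized_sketch are hypothetical names, and the stand-in functions just record the call order instead of taking a lock.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for objc_sync_enter / objc_sync_exit.
// They only record the order of calls; no real lock is taken.
static std::vector<std::string> g_trace;
static void fake_sync_enter(const void *) { g_trace.push_back("enter"); }
static void fake_sync_exit(const void *)  { g_trace.push_back("exit"); }

// Mirrors the generated _SYNC_EXIT struct: the constructor stores the
// object, the destructor calls the exit function when the scope ends.
struct SyncExitGuard {
    explicit SyncExitGuard(const void *obj) : obj_(obj) {}
    ~SyncExitGuard() { fake_sync_exit(obj_); }
    SyncExitGuard(const SyncExitGuard &) = delete;
    SyncExitGuard &operator=(const SyncExitGuard &) = delete;
    const void *obj_;
};

// Runs `body` between enter and exit, the way a @synchronized block would.
template <typename Body>
void synchronized_sketch(const void *obj, Body body) {
    fake_sync_enter(obj);
    SyncExitGuard guard(obj);
    body();  // the destructor still runs exit if body throws
}
```

Because the exit call lives in a destructor, it runs on every path out of the block, including when the body throws.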

Next, we look at the source code to see what these two functions do.

objc_sync_enter

int objc_sync_enter(id obj) {
    int result = OBJC_SYNC_SUCCESS;
    if (obj) {
        SyncData* data = id2data(obj, ACQUIRE);
        ASSERT(data);
        data->mutex.lock();
    } else {
        // @synchronized(nil) does nothing
        if (DebugNilSync) {
            _objc_inform("NIL SYNC DEBUG: @synchronized(nil); set a breakpoint on objc_sync_nil to debug");
        }
        objc_sync_nil();
    }
    return result;
}

objc_sync_exit

// End synchronizing on 'obj'. 
// Returns OBJC_SYNC_SUCCESS or OBJC_SYNC_NOT_OWNING_THREAD_ERROR
int objc_sync_exit(id obj) {
    int result = OBJC_SYNC_SUCCESS;
    if (obj) {
        SyncData* data = id2data(obj, RELEASE); 
        if (!data) {
            result = OBJC_SYNC_NOT_OWNING_THREAD_ERROR;
        } else {
            bool okay = data->mutex.tryUnlock();
            if (!okay) {
                result = OBJC_SYNC_NOT_OWNING_THREAD_ERROR;
            }
        }
    } else {
        // @synchronized(nil) does nothing
    }
    return result;
}

Both objc_sync_enter and objc_sync_exit first check whether obj is nil, and do nothing if it is. If obj is not nil, they obtain a SyncData structure through id2data, passing a second parameter (ACQUIRE or RELEASE) to tell id2data whether it is being called from enter or from exit. They then take the recursive lock out of the SyncData and lock or unlock it.

First, let's look at the SyncData data structure:
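
The overall shape of the two functions can be sketched in standalone C++. This is a heavily simplified illustration, not the runtime's code: the lookup that id2data performs is replaced by a plain hash map, and every name here (sketch_sync_enter, sketch_sync_exit, SketchEntry, the SKETCH_SYNC_* codes) is hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <unordered_map>

// Hypothetical result codes modelled on the OBJC_SYNC_* names in the excerpt.
enum SketchSyncResult {
    SKETCH_SYNC_SUCCESS = 0,
    SKETCH_SYNC_NOT_OWNING_THREAD_ERROR = -1
};

// One recursive lock per object address (stands in for SyncData).
struct SketchEntry {
    std::recursive_mutex mutex;
};

static std::mutex g_tableLock;                                 // guards g_table
static std::unordered_map<const void *, SketchEntry> g_table;  // object -> lock
thread_local std::unordered_map<const void *, int> t_held;     // this thread's hold counts

static SketchEntry *entry_for(const void *obj) {
    std::lock_guard<std::mutex> guard(g_tableLock);
    return &g_table[obj];  // element addresses stay valid across rehashing
}

int sketch_sync_enter(const void *obj) {
    if (!obj) return SKETCH_SYNC_SUCCESS;   // nil object: do nothing
    entry_for(obj)->mutex.lock();           // recursive, so re-entry is fine
    ++t_held[obj];
    return SKETCH_SYNC_SUCCESS;
}

int sketch_sync_exit(const void *obj) {
    if (!obj) return SKETCH_SYNC_SUCCESS;   // nil object: do nothing
    auto it = t_held.find(obj);
    if (it == t_held.end()) {
        return SKETCH_SYNC_NOT_OWNING_THREAD_ERROR;  // this thread never entered
    }
    if (--it->second == 0) t_held.erase(it);
    entry_for(obj)->mutex.unlock();
    return SKETCH_SYNC_SUCCESS;
}
```

The per-thread hold count plays the role that the per-thread caches play in id2data: it lets exit detect that the calling thread does not hold the lock without touching the lock itself.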

typedef struct alignas(CacheLineSize) SyncData {
    struct SyncData* nextData;
    DisguisedPtr<objc_object> object;
    int32_t threadCount;  // number of THREADS using this block
    recursive_mutex_t mutex;
} SyncData;

SyncData is a node in a singly linked list. For each object passed to @synchronized, it records a recursive lock (mutex) and the number of threads using that lock (threadCount). These two fields are the basis for recursive locking across multiple threads. (For @synchronized(objc1), the corresponding node has SyncData->object == objc1.) The lock type, recursive_mutex_t, is defined as follows:

using recursive_mutex_t = recursive_mutex_tt<LOCKDEBUG>;
class recursive_mutex_tt : nocopy_t {
    os_unfair_recursive_lock mLock;
    ......
}

OS_UNFAIR_RECURSIVE_LOCK_AVAILABILITY
typedef struct os_unfair_recursive_lock_s {
    os_unfair_lock ourl_lock;
    uint32_t ourl_count;
} os_unfair_recursive_lock, *os_unfair_recursive_lock_t;

As you can see, recursive_mutex_t is essentially a wrapper around os_unfair_lock. In earlier versions it was a wrapper around pthread_mutex_t.
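
The two fields ourl_lock and ourl_count suggest the common way to build a recursive lock on top of a non-recursive one: remember the owner and count re-entries. The sketch below shows that technique with std::mutex. It is an assumption about the general approach, not the implementation of os_unfair_recursive_lock, and RecursiveLockSketch is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

class RecursiveLockSketch {
public:
    void lock() {
        const std::thread::id self = std::this_thread::get_id();
        // Only the owning thread can ever observe its own id here.
        if (owner_.load(std::memory_order_relaxed) == self) {
            ++count_;  // re-entry by the owner: just count it
            return;
        }
        inner_.lock();  // first entry: take the underlying lock
        owner_.store(self, std::memory_order_relaxed);
        count_ = 1;
    }

    // Returns false if the calling thread does not own the lock.
    bool tryUnlock() {
        if (owner_.load(std::memory_order_relaxed) != std::this_thread::get_id()) {
            return false;
        }
        if (--count_ == 0) {
            owner_.store(std::thread::id(), std::memory_order_relaxed);
            inner_.unlock();  // last exit: release the underlying lock
        }
        return true;
    }

    // Nesting depth; only meaningful when called by the owning thread.
    std::uint32_t depth() const { return count_; }

private:
    std::mutex inner_;                                     // plays the role of ourl_lock
    std::atomic<std::thread::id> owner_{std::thread::id()};
    std::uint32_t count_ = 0;                              // plays the role of ourl_count
};
```

The tryUnlock name follows the call seen in objc_sync_exit, where a false return is mapped to OBJC_SYNC_NOT_OWNING_THREAD_ERROR.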

What we care about now is how the members of SyncData support recursive locking across multiple threads. The key logic is in id2data:

static SyncData* id2data(id object, enum usage why) {
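    // Get the lock for this object from the global hash table
    spinlock_t *lockp = &LOCK_FOR_OBJ(object);
    // Get the head pointer of this object's SyncData singly linked list from the global hash table
    SyncData **listp = &LIST_FOR_OBJ(object);
    // The result to return after the lookup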
    SyncData* result = NULL;

#if SUPPORT_DIRECT_THREAD_KEYS
    /* Fast cache: two fixed pthread keys store a single SyncCacheItem. 
    This avoids malloc of the SyncCache for threads that only synchronize a single object at a time. 
    SYNC_DATA_DIRECT_KEY == SyncCacheItem.data 
    SYNC_COUNT_DIRECT_KEY == SyncCacheItem.lockCount 
    */
    // Check per-thread single-entry fast cache for matching object
    bool fastCacheOccupied = NO;
    // Fetch the SyncData stored in the fast cache
    SyncData *data = (SyncData *)tls_get_direct(SYNC_DATA_DIRECT_KEY);
    if (data) {
        fastCacheOccupied = YES;
        // Same object as the cached one
        if (data->object == object) {
            // Found a match in fast cache.
            uintptr_t lockCount;
            result = data;
            lockCount = (uintptr_t)tls_get_direct(SYNC_COUNT_DIRECT_KEY);
            if (result->threadCount <= 0  ||  lockCount <= 0) {
                _objc_fatal("id2data fastcache is buggy");
            }
            switch(why) {
            // Locking (enter)
            case ACQUIRE: {
                lockCount++;
                // Store the updated lockCount in the fast cache
                tls_set_direct(SYNC_COUNT_DIRECT_KEY, (void*)lockCount);
                break;
            }
            // Unlocking (exit)
            case RELEASE:
                lockCount--;
                // Store the decremented lockCount back in the fast cache
                tls_set_direct(SYNC_COUNT_DIRECT_KEY, (void*)lockCount);
                if (lockCount == 0) {
                    // remove from fast cache
                    tls_set_direct(SYNC_DATA_DIRECT_KEY, NULL);
                    // atomic because may collide with concurrent ACQUIRE
                    // Decrement the thread count recorded in SyncData
                    OSAtomicDecrement32Barrier(&result->threadCount);
                }
                break;
            case CHECK:
                // do nothing
                break;
            }
            // Return the matching SyncData
            return result;
        }
    }
#endif

    /*
        SyncCache lookup
        Look for the object in the thread's cache, then maintain the two counts, lockCount and threadCount.
        Each thread has exactly one SyncCache.
    */
    // Check per-thread cache of already-owned locks for matching object
    SyncCache *cache = fetch_cache(NO);
    if (cache) {
        unsigned int i;
        // Linear search
        for (i = 0; i < cache->used; i++) {
            SyncCacheItem *item = &cache->list[i];
            if (item->data->object != object) continue;
            // Found a match.
            result = item->data;
            if (result->threadCount <= 0  ||  item->lockCount <= 0) {
                _objc_fatal("id2data cache is buggy");
            }
            switch(why) {
            case ACQUIRE:
                item->lockCount++;
                break;
            case RELEASE:
                item->lockCount--;
                // lockCount == 0 means this thread is done with the lock
                if (item->lockCount == 0) {
                    // remove from per-thread cache
                    cache->list[i] = cache->list[--cache->used];
                    // atomic because may collide with concurrent ACQUIRE
                    // threadCount - 1
                    OSAtomicDecrement32Barrier(&result->threadCount);
                }
                break;
            case CHECK:
                // do nothing
                break;
            }
            return result;
        }
    }

    /*
        sDataLists lookup
        The locked region covers both the sDataLists lookup and the creation of SyncData, to prevent duplicate SyncData from being created.
    */
    // Thread cache didn't find anything.
    // Walk in-use list looking for matching object
    // Spinlock prevents multiple threads from creating multiple 
    // locks for the same new object.
    // We could keep the nodes in some hash table if we find that there are
    // more than 20 or so distinct locks active, but we don't do that now.
    lockp->lock();
    {
        SyncData* p;
        SyncData* firstUnused = NULL;
        // Walk the linked list
        for (p = *listp; p != NULL; p = p->nextData) {
            // Found the SyncData
            if ( p->object == object ) {
                result = p;
                // atomic because may collide with concurrent RELEASE
                // threadCount + 1
                OSAtomicIncrement32Barrier(&result->threadCount);
                // Jump to done
                goto done;
            }
            if ( (firstUnused == NULL) && (p->threadCount == 0) )
                firstUnused = p;
        }
        // no SyncData currently associated with object
        if ( (why == RELEASE) || (why == CHECK) ) goto done;
        // an unused one was found, use it
        // Reuse an unused node from the list
        if ( firstUnused != NULL ) {
            result = firstUnused;
            result->object = (objc_object *)object;
            result->threadCount = 1;
            goto done;
        }
    }
    /*
        Create a SyncData
    */
    // Allocate a new SyncData and add to list.
    // XXX allocating memory with a global lock held is bad practice,
    // might be worth releasing the lock, allocating, and searching again.
    // But since we never free these guys we won't be stuck in allocation very often.
    posix_memalign((void **)&result, alignof(SyncData), sizeof(SyncData));
    result->object = (objc_object *)object;
    result->threadCount = 1;
    new (&result->mutex) recursive_mutex_t(fork_unsafe_lock);
    // Insert at the head of the list
    result->nextData = *listp;
    *listp = result;

 done:
     /* Cache the result in the thread */
    lockp->unlock();
    if (result) {
        // Only new ACQUIRE should get here.
        // All RELEASE and CHECK and recursive ACQUIRE are 
        // handled by the per-thread caches above.
        if (why == RELEASE) {
            // Probably some thread is incorrectly exiting 
            // while the object is held by another thread.
            return nil;
        }
        if (why != ACQUIRE) _objc_fatal("id2data is buggy");
        if (result->object != object) _objc_fatal("id2data is buggy");

#if SUPPORT_DIRECT_THREAD_KEYS
        // Fast thread cache is supported and currently empty
        if (!fastCacheOccupied) {
            // Save in fast thread cache
            tls_set_direct(SYNC_DATA_DIRECT_KEY, result);
            tls_set_direct(SYNC_COUNT_DIRECT_KEY, (void*)1);
        } else 
#endif
        {
            // Save in thread cache
            if (!cache) cache = fetch_cache(YES);
            cache->list[cache->used].data = result;
            cache->list[cache->used].lockCount = 1;
            cache->used++;
        }
    }
    return result;
}

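TLS fast cache lookup

The function is long, but it can be split into several parts. The first part is the fast cache lookup:

1. The TLS fast cache stores only a single SyncData. The object of the SyncData taken from it is compared with the object passed to @synchronized (if they are the same, this is the SyncData we are looking for).
2. If the SyncData is found, lockCount is updated (and threadCount is decremented when lockCount drops to 0), and the SyncData is returned directly.
3. If the SyncData is not found, move on to the next part.

Note ⚠️: TLS (Thread Local Storage) is per-thread storage. That is, every thread has its own FastCache; there is not a single FastCache for the whole process. If a match is found in the FastCache, it is returned immediately.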
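
The fast cache steps can be sketched with C++ thread_local variables. This is a minimal illustration under assumptions: the two thread_local variables stand in for SYNC_DATA_DIRECT_KEY and SYNC_COUNT_DIRECT_KEY, and FastNodeSketch, fast_lookup, fast_store and Why are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, reduced stand-in for SyncData.
struct FastNodeSketch {
    FastNodeSketch(const void *o, std::int32_t c) : object(o), threadCount(c) {}
    const void *object;
    std::atomic<std::int32_t> threadCount;  // number of threads using this node
};

// One slot per thread: the cached node and this thread's lock count for it.
thread_local FastNodeSketch *t_fastData = nullptr;
thread_local std::uintptr_t t_fastCount = 0;

enum class Why { Acquire, Release };

// Returns the cached node if it matches `object`, otherwise nullptr
// (the caller would then fall through to the slower lookups).
FastNodeSketch *fast_lookup(const void *object, Why why) {
    FastNodeSketch *data = t_fastData;
    if (!data || data->object != object) return nullptr;
    if (why == Why::Acquire) {
        ++t_fastCount;                   // recursive lock by this thread
    } else if (--t_fastCount == 0) {
        t_fastData = nullptr;            // remove from the fast cache
        data->threadCount.fetch_sub(1);  // this thread no longer uses the node
    }
    return data;
}

// Called after a slower lookup found or created the node for this thread.
void fast_store(FastNodeSketch *node) {
    t_fastData = node;
    t_fastCount = 1;
}
```

Since both variables are thread_local, no lock is needed to read or update them; only threadCount is shared between threads, which is why it is atomic.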

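SyncCache lookup

1. Walk the current thread's cache of already-owned locks, take out each SyncCacheItem, and compare the object of its SyncData with the object passed to @synchronized (if they are the same, this is the SyncData we are looking for).
2. If the SyncData is found, lockCount is updated (and threadCount is decremented when lockCount drops to 0), and the SyncData is returned directly.
3. If the SyncData is not found, move on to the third part.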

typedef struct {
    SyncData *data;
    unsigned int lockCount;  // number of times THIS THREAD locked this block
} SyncCacheItem;
typedef struct SyncCache {
    unsigned int allocated;
    unsigned int used;
    SyncCacheItem list[0];
} SyncCache;

This is also a cache lookup; because SyncCache holds an array, it is searched linearly. Here is the code behind fetch_cache(NO):

static SyncCache *fetch_cache(bool create) {
    _objc_pthread_data *data; 
    data = _objc_fetch_pthread_data(create); 
    if (!data) return NULL; 
    if (!data->syncCache) { 
        if (!create) { return NULL; } 
        else {
            int count = 4; 
            data->syncCache = (SyncCache *) calloc(1, sizeof(SyncCache) + count*sizeof(SyncCacheItem)); 
            data->syncCache->allocated = count; 
        } 
    }

    // Make sure there's at least one open slot in the list.
    if (data->syncCache->allocated == data->syncCache->used) { 
        data->syncCache->allocated *= 2; 
        data->syncCache = (SyncCache *) realloc(data->syncCache, sizeof(SyncCache) + data->syncCache->allocated * sizeof(SyncCacheItem));
    } 
    return data->syncCache; 
}

The _objc_pthread_data structure is as follows:

typedef struct {
    struct _objc_initializing_classes *initializingClasses; // for +initialize
    struct SyncCache *syncCache;  // for @synchronize
    struct alt_handler_list *handlerList;  // for exception alt handlers
    char *printableNames[4];  // temporary demangled names for logging
    const char **classNameLookups;  // for objc_getClass() hooks
    unsigned classNameLookupsAllocated;
    unsigned classNameLookupsUsed;
    // If you add new fields here, don't forget to update 
    // _objc_pthread_destroyspecific()
} _objc_pthread_data;

This further shows that, like the TLS fast cache, the SyncCache exists once per thread. The thread cache clearly holds multiple SyncData + lockCount pairs.
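
The per-thread cache logic can be sketched as follows. This is an illustration only: a thread_local std::vector stands in for the calloc/realloc-managed SyncCache, threadCount handling is left out, and ItemSketch, cache_acquire, cache_release and cache_add are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SyncCacheItem: object plus this thread's lock count.
struct ItemSketch {
    const void *object;
    unsigned lockCount;
};

// One cache per thread; the vector grows on demand like the realloc'd list.
thread_local std::vector<ItemSketch> t_cache;

// Recursive acquire: bump the count if this thread already holds the object.
bool cache_acquire(const void *object) {
    for (ItemSketch &item : t_cache) {
        if (item.object != object) continue;
        ++item.lockCount;
        return true;
    }
    return false;  // not cached: caller falls through to the global lookup
}

// Release: drop the count, and remove the entry when it reaches zero by
// overwriting it with the last entry (order in the cache does not matter).
bool cache_release(const void *object) {
    for (std::size_t i = 0; i < t_cache.size(); ++i) {
        if (t_cache[i].object != object) continue;
        if (--t_cache[i].lockCount == 0) {
            t_cache[i] = t_cache.back();
            t_cache.pop_back();
        }
        return true;
    }
    return false;
}

// First acquire of an object by this thread.
void cache_add(const void *object) {
    t_cache.push_back({object, 1});
}
```

Removing by overwriting with the last element mirrors cache->list[i] = cache->list[--cache->used] in id2data and keeps removal constant-time.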

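sDataLists lookup, or creating a SyncData

If nothing is found in either the fast cache or the thread cache, this is the first time this thread has reached @synchronized for this object. The runtime then looks for the matching SyncData in sDataLists. sDataLists is a global hash table, and id2data fetches the table entry for the object at the very start of the function.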

// Use multiple parallel lists to decrease contention among unrelated objects.
#define LOCK_FOR_OBJ(obj) sDataLists[obj].lock
#define LIST_FOR_OBJ(obj) sDataLists[obj].data
static StripedMap<SyncList> sDataLists;

template<typename T>
class StripedMap {
#if TARGET_OS_IPHONE && !TARGET_OS_SIMULATOR
    enum { StripeCount = 8 };
#else
    enum { StripeCount = 64 };
#endif
    struct PaddedT {
        T value alignas(CacheLineSize);
    };
    PaddedT array[StripeCount];
    ......
}

struct SyncList {
    SyncData *data;
    spinlock_t lock;
    constexpr SyncList() : data(nil), lock(fork_unsafe_lock) { }
};

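sDataLists is a StripedMap. It holds 8 stripes on a real device and 64 otherwise (including the simulator), and each stripe stores one SyncList, that is, a SyncData singly linked list plus a lock. Note that the object's address selects a single stripe (sDataLists[obj]), so a lookup only walks that stripe's list. The stripe count changes how often unrelated objects end up sharing a list and its lock, not how many tables are searched.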
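
The striping idea can be sketched as a small template. This is an illustration under assumptions: the excerpt elides how StripedMap turns a pointer into an index, so the hash in indexFor is only an example, the 64-byte alignment is an assumed cache line size, and StripedMapSketch and ListSketch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

template <typename T, std::size_t StripeCount>
class StripedMapSketch {
public:
    // The same pointer always maps to the same stripe.
    T &operator[](const void *p) { return stripes_[indexFor(p)].value; }

    // Example address hash (an assumption, not taken from the excerpt).
    static std::size_t indexFor(const void *p) {
        auto addr = reinterpret_cast<std::uintptr_t>(p);
        return ((addr >> 4) ^ (addr >> 9)) % StripeCount;
    }

private:
    // Pad each stripe to its own cache line so that threads working on
    // different stripes do not contend on the same line.
    struct alignas(64) Padded {
        T value;
    };
    Padded stripes_[StripeCount];
};

// Hypothetical stand-in for SyncList: a list head plus the lock guarding it.
struct ListSketch {
    void *head = nullptr;
    std::mutex lock;
};
```

A lookup costs one hash computation plus a walk of a single list, regardless of the stripe count; more stripes mainly mean fewer unrelated objects sharing one lock.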

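1. Walk the SyncData singly linked list taken from the global StripedMap hash table, comparing each node's object with the object passed to @synchronized (a match means this is the SyncData we are looking for). If a node does not match, move on to the next node in the list.
2. If no SyncData is currently associated with object and the call is a RELEASE or CHECK, return nil directly.
3. If an unused SyncData (threadCount == 0) is found, reuse it for object, cache it in the TLS fast cache or the thread cache, and return it.
4. If nothing is found in the TLS fast cache, the thread cache, or the global StripedMap hash table, object is being locked for the first time, so create a new SyncData and return it.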
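
The list handling in these steps can be sketched on its own. This is a simplified illustration: the caller is assumed to hold the stripe's lock, the thread caching that follows is left out, a matching node is returned whether or not the caller is acquiring (the runtime returns nil for a RELEASE that reaches this point), and ListNodeSketch and find_or_create are hypothetical names.

```cpp
#include <cassert>

// Hypothetical, reduced stand-in for SyncData.
struct ListNodeSketch {
    ListNodeSketch *next;
    const void *object;
    int threadCount;  // number of threads using this node
};

// Assumes the caller holds the lock that guards *listp.
// `acquiring` is true on the enter path and false otherwise.
ListNodeSketch *find_or_create(ListNodeSketch **listp, const void *object,
                               bool acquiring) {
    ListNodeSketch *firstUnused = nullptr;
    for (ListNodeSketch *p = *listp; p != nullptr; p = p->next) {
        if (p->object == object) {
            ++p->threadCount;  // one more thread uses this node
            return p;
        }
        if (firstUnused == nullptr && p->threadCount == 0) {
            firstUnused = p;   // remember a node nobody is using
        }
    }
    if (!acquiring) return nullptr;  // nothing to release or check
    if (firstUnused != nullptr) {
        firstUnused->object = object;  // recycle the unused node
        firstUnused->threadCount = 1;
        return firstUnused;
    }
    // No match and nothing to recycle: allocate and insert at the head.
    // Nodes are never freed, matching the comment in the runtime source.
    ListNodeSketch *node = new ListNodeSketch{*listp, object, 1};
    *listp = node;
    return node;
}
```

Recycling unused nodes keeps each list short even though nodes are never freed.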

Caching in the thread

After a SyncData is found or created in sDataLists, execution reaches done. The caching step there runs only for ACQUIRE (enter). If the fast cache is supported and currently empty, the SyncData is stored there so that the next recursive lock is fast; otherwise it is added to the thread cache.

The way id2data looks up locks resembles a three-level cache. It manages locks across multiple threads and lets each thread reach its lock as quickly as possible, which makes locking and unlocking more efficient.

Summary

  • Not found in the fast cache
  • Not found in the thread cache either
  • Not found in the global sDataLists either
  • Then a new SyncData is created


Origin juejin.im/post/7117441321270247438