Autorelease Pool Overview

What is AutoreleasePool

AutoreleasePool (autorelease pool) is a memory-management mechanism in Objective-C (OC) that defers the release of the objects added to it. Normally an object is released when the variable that owns it goes out of scope; an object added to an AutoreleasePool is instead released later, when the pool itself is drained.

Main thread AutoreleasePool creation and release

  • After the App starts, Apple registers two Observers on the main thread's RunLoop. Both use the callback _wrapRunLoopWithAutoreleasePoolHandler().

  • The first Observer monitors a single event, Entry (about to enter the Loop), and calls _objc_autoreleasePoolPush() in its callback to create an autorelease pool. Its order is -2147483647, the highest priority, which ensures that the pool is created before any other callback runs.

  • The second Observer monitors two events. On BeforeWaiting (about to go to sleep) it calls _objc_autoreleasePoolPop() and then _objc_autoreleasePoolPush() to release the old pool and create a new one. On Exit (about to exit the Loop) it calls _objc_autoreleasePoolPop() to release the pool. Its order is 2147483647, the lowest priority, which ensures that the pool is released after all other callbacks have run.

  • Code on the main thread usually runs inside event callbacks and Timer callbacks. These callbacks are wrapped by the AutoreleasePool that the RunLoop creates, so autoreleased objects do not leak and developers do not have to create a pool explicitly.

In other words, an AutoreleasePool is pushed before a pass of the RunLoop starts handling events, and it is popped when that pass is about to end (before the loop sleeps or exits).
Objects autoreleased while those events are handled are added to this pool, and they are released when the pool is popped.
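
The observer sequence described above can be mimicked with a small C++ model. This is only a sketch under stated assumptions: PoolStack and runLoopOnce are hypothetical names, strings stand in for objects, and none of it is Apple's RunLoop code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model: each pool is a list of object names; the innermost pool is last.
struct PoolStack {
    std::vector<std::vector<std::string>> pools;
    std::vector<std::string> released;   // log of released objects, in order

    void push() { pools.emplace_back(); }

    void autorelease(const std::string &name) {
        assert(!pools.empty());          // the RunLoop guarantees a pool is in place
        pools.back().push_back(name);
    }

    void pop() {
        assert(!pools.empty());
        // Release the newest object first, like a stack.
        for (auto it = pools.back().rbegin(); it != pools.back().rend(); ++it)
            released.push_back(*it);
        pools.pop_back();
    }
};

// Simulates one run of the loop that handles `passes` batches of events.
inline std::vector<std::string> runLoopOnce(PoolStack &stack, int passes) {
    stack.push();                                        // Entry observer: push
    for (int i = 0; i < passes; ++i) {
        stack.autorelease("event" + std::to_string(i));  // work done in a callback
        stack.pop();                                     // BeforeWaiting observer: pop the old pool
        stack.push();                                    // BeforeWaiting observer: push a new pool
    }
    stack.pop();                                         // Exit observer: pop
    return stack.released;
}
```

Calling runLoopOnce(stack, 2) on a fresh PoolStack releases each event's object at the end of its own pass and leaves no pool behind when the loop exits.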

Child thread AutoreleasePool creation and release

We already know that autorelease objects are released when the AutoreleasePoolPage is popped. In the main thread's RunLoop, two observers are responsible for creating and draining the autorelease pool. For details, see YY's in-depth article on understanding RunLoop.

The RunLoop of a child thread has to be started manually, so when is the AutoreleasePool of a child thread created and released?

If the current thread has no AutoreleasePoolPage yet, the call sequence is autorelease -> autoreleaseFast -> autoreleaseNoPage.
autoreleaseNoPage creates a hotPage and then calls page->add(obj). So even on a thread without an AutoreleasePoolPage, the first use of an autorelease object creates a new AutoreleasePoolPage to manage it, and there is no need to worry about memory leaks.

After clarifying when the autoreleasepool was created, the next question naturally arises. When will the autoreleasepool be emptied?

To answer this, observe the object with the LLDB command watchpoint set variable.
Start with the simplest scenario: creating a child thread.

__weak id obj;
...
[NSThread detachNewThreadSelector:@selector(createAndConfigObserverInSecondaryThread) toTarget:self withObject:nil];

The weak pointer obj is used to observe the autorelease object created in the child thread. The task performed in the child thread is:

- (void)createAndConfigObserverInSecondaryThread{
    __autoreleasing id test = [NSObject new];
    NSLog(@"obj = %@", test);
    obj = test;
    [[NSThread currentThread] setName:@"test runloop thread"];
    NSLog(@"thread ending");
}

Set a breakpoint at obj = test and run watchpoint set variable obj to watch obj. When obj is released, the watchpoint stops and shows the call stack of the release. (The screenshot is missing from the original.)

The call stack shows that the release happens inside _pthread_exit, which goes on to execute the tls_dealloc method of AutoreleasePoolPage. That method is as follows:

static void tls_dealloc(void *p)
{
    if (p == (void*)EMPTY_POOL_PLACEHOLDER) {
        // No objects or pool pages to clean up here.
        return;
    }

    // reinstate TLS value while we work
    setHotPage((AutoreleasePoolPage *)p);

    if (AutoreleasePoolPage *page = coldPage()) {
        if (!page->empty()) pop(page->begin());  // pop all of the pools
        if (DebugMissingPools || DebugPoolAllocation) {
            // pop() killed the pages already
        } else {
            page->kill();  // free all of the pages
        }
    }
    
    // clear TLS value so TLS destruction doesn't loop
    setHotPage(nil);
}

The key line here is if (!page->empty()) pop(page->begin());. Looking a little further, the following function is executed during _pthread_exit:

void
_pthread_tsd_cleanup(pthread_t self)
{
#if !VARIANT_DYLD
   int j;

   // clean up dynamic keys first
   for (j = 0; j < PTHREAD_DESTRUCTOR_ITERATIONS; j++) {
   	pthread_key_t k;
   	for (k = __pthread_tsd_start; k <= self->max_tsd_key; k++) {
   		_pthread_tsd_cleanup_key(self, k);
   	}
   }

   self->max_tsd_key = 0;

   // clean up static keys
   for (j = 0; j < PTHREAD_DESTRUCTOR_ITERATIONS; j++) {
   	pthread_key_t k;
   	for (k = __pthread_tsd_first; k <= __pthread_tsd_max; k++) {
   		_pthread_tsd_cleanup_key(self, k);
   	}
   }
#endif // !VARIANT_DYLD
}

That is to say, a thread releases its own resources when it exits, and this includes destroying the autorelease pool: tls_dealloc performs the pop operation.
So the autorelease pool is emptied when the thread is destroyed. The thread in the example above, however, never runs a RunLoop; it is just a one-shot thread.
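
The cleanup-at-thread-exit behaviour relies on a general pthread feature: a key created with a destructor has that destructor run for every exiting thread that stored a non-null value under the key. The sketch below shows only that mechanism. gPoolKey, poolDestructor and runOneShotThread are hypothetical names, and an int counter stands in for the pool's contents.

```cpp
#include <cassert>
#include <pthread.h>

static pthread_key_t gPoolKey;       // stands in for AUTORELEASE_POOL_KEY
static int gDestructorCalls = 0;     // how many times the destructor has run

// Plays the role of tls_dealloc: called while the owning thread is exiting.
static void poolDestructor(void *value) {
    int *pending = static_cast<int *>(value);
    gDestructorCalls += 1;
    *pending = 0;                    // "pop all of the pools"
}

static void *threadBody(void *arg) {
    // Like the first autorelease on a thread: install per-thread state.
    pthread_setspecific(gPoolKey, arg);
    return nullptr;                  // thread exit triggers poolDestructor
}

// Starts a one-shot thread and returns how many objects are still pending
// after it has exited.
inline int runOneShotThread() {
    static bool keyCreated = false;
    if (!keyCreated) {
        int rc = pthread_key_create(&gPoolKey, poolDestructor);
        assert(rc == 0);
        (void)rc;
        keyCreated = true;
    }
    int pending = 3;                 // pretend three objects were autoreleased
    pthread_t thread;
    pthread_create(&thread, nullptr, threadBody, &pending);
    pthread_join(thread, nullptr);   // the destructor has run by the time join returns
    return pending;
}
```

The thread body never cleans up explicitly; the pending count drops to zero only because the key's destructor runs during thread exit, which is the same hook the runtime uses for tls_dealloc.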

When the child thread does run a RunLoop:

  • If the child thread's RunLoop is run through the CF (Core Foundation) API, you have to maintain the autorelease pool yourself; otherwise cleanup is left entirely to thread exit. The CF source indeed contains no autorelease pool handling.
  • If it is run through the Foundation framework, the runMode:(NSString *)mode beforeDate:(NSDate *)date method wraps its work in an autorelease pool. Digging into other functions shows autorelease pool calls in many places, so even for a custom source there is no need to worry about draining the pool in the handler: the system has added these operations at each key entry point. You can refer to the GNU implementation for this part, or verify it by reading Apple's assembly code.

AutoreleasePool implementation principle

Use the clang -rewrite-objc command in the terminal to rewrite the following OC code into a C++ implementation:

#import <Foundation/Foundation.h>

int main(int argc, const char * argv[]) {
    
    @autoreleasepool {
        NSLog(@"Hello, World!");
    }
 
    return 0;
}

In the cpp file code we find the main function code as follows:

int main(int argc, const char * argv[]) {

    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool; 
        NSLog((NSString *)&__NSConstantStringImpl__var_folders_kb_06b822gn59df4d1zt99361xw0000gn_T_main_d39a79_mi_0);
    }

    return 0;
}

You can see that Apple implements @autoreleasepool{} by declaring a local variable __autoreleasepool of type __AtAutoreleasePool.
__AtAutoreleasePool is defined as follows:

extern "C" __declspec(dllimport) void * objc_autoreleasePoolPush(void);
extern "C" __declspec(dllimport) void objc_autoreleasePoolPop(void *);

struct __AtAutoreleasePool {
  __AtAutoreleasePool() {atautoreleasepoolobj = objc_autoreleasePoolPush();}
  ~__AtAutoreleasePool() {objc_autoreleasePoolPop(atautoreleasepoolobj);}
  void * atautoreleasepoolobj;
};

By the rules for constructors and destructors (the constructor of an automatic local variable runs when execution reaches the declaration, and the destructor runs when execution leaves the variable's scope), the two pieces of code above can be simplified into the following form:

int main(int argc, const char * argv[]) {

    /* @autoreleasepool */ {
        void *atautoreleasepoolobj = objc_autoreleasePoolPush();
        NSLog((NSString *)&__NSConstantStringImpl__var_folders_kb_06b822gn59df4d1zt99361xw0000gn_T_main_d39a79_mi_0);
        objc_autoreleasePoolPop(atautoreleasepoolobj);
    }

    return 0;
}
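
The constructor/destructor pairing can be checked in isolation. The following sketch has the same shape as __AtAutoreleasePool but calls stand-in functions that only record the call order. fakePoolPush, fakePoolPop, ScopedPool and runScopedBody are hypothetical names, not runtime functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

static std::vector<std::string> gTrace;   // records the order of calls

// Stand-ins for objc_autoreleasePoolPush / objc_autoreleasePoolPop.
static void *fakePoolPush() {
    gTrace.push_back("push");
    return &gTrace;                       // any opaque token will do here
}

static void fakePoolPop(void *token) {
    assert(token == &gTrace);             // pop receives the token that push returned
    (void)token;
    gTrace.push_back("pop");
}

// Same shape as the rewriter's __AtAutoreleasePool.
struct ScopedPool {
    ScopedPool() : token(fakePoolPush()) {}
    ~ScopedPool() { fakePoolPop(token); }
    void *token;
};

inline std::vector<std::string> runScopedBody() {
    gTrace.clear();
    {
        ScopedPool pool;                  // constructor runs: push
        gTrace.push_back("body");         // the code inside @autoreleasepool { }
    }                                     // scope ends, destructor runs: pop
    return gTrace;
}
```

The trace comes back as push, body, pop, which is the ordering the simplified main function above spells out by hand.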

From this we can see that the life cycle of a single autorelease pool is objc_autoreleasePoolPush() -> [object autorelease] -> objc_autoreleasePoolPop(void *).
Let's take a look at the implementation of objc_autoreleasePoolPush and objc_autoreleasePoolPop:

void *objc_autoreleasePoolPush(void) {
    return AutoreleasePoolPage::push();
}

void objc_autoreleasePoolPop(void *ctxt) {
    AutoreleasePoolPage::pop(ctxt);
}

These two functions are thin wrappers around the static methods push and pop of AutoreleasePoolPage.
Let's analyze AutoreleasePoolPage to uncover how AutoreleasePool is implemented.

AutoreleasePoolPage implementation

Introduction to AutoreleasePoolPage

AutoreleasePoolPage is a C++ class.
[Figure: structure of AutoreleasePoolPage (image unavailable)]

Each AutoreleasePool belongs to exactly one thread (the thread pointer in the structure points to the owning thread).

AutoreleasePool has no separate structure of its own. It is made up of several AutoreleasePoolPages linked as a doubly linked list (through the parent and child pointers in the structure).

The definition of AutoreleasePoolPage in NSObject.mm is as follows:

class AutoreleasePoolPage {
#   define EMPTY_POOL_PLACEHOLDER ((id*)1)

#   define POOL_BOUNDARY nil
    static pthread_key_t const key = AUTORELEASE_POOL_KEY;
    static uint8_t const SCRIBBLE = 0xA3;  // 0xA3A3A3A3 after releasing
    static size_t const SIZE = 
#if PROTECT_AUTORELEASEPOOL
        PAGE_MAX_SIZE;  // must be multiple of vm page size
#else
        PAGE_MAX_SIZE;  // size and alignment, power of 2
#endif
    static size_t const COUNT = SIZE / sizeof(id);

    magic_t const magic;
    id *next;
    pthread_t const thread;
    AutoreleasePoolPage * const parent;
    AutoreleasePoolPage *child;
    uint32_t const depth;
    uint32_t hiwat;
};
  • magic: used to check the integrity of the page structure
  • next: points to the next free slot in the current page, initially the bottom of the stack
  • thread: the thread that owns this page; each AutoreleasePool belongs to exactly one thread
  • parent: the parent node, pointing to the previous page
  • child: the child node, pointing to the next page
  • depth: the depth of this page in the linked list
  • hiwat: the high-water mark, recording the largest number of stored objects seen so far
  • EMPTY_POOL_PLACEHOLDER: the empty pool placeholder
  • POOL_BOUNDARY: the boundary object, defined as nil. Earlier versions of the source called it POOL_SENTINEL (sentinel object). It marks where each pool begins. One pool may span several AutoreleasePoolPages, and one AutoreleasePoolPage may hold several pools.
  • PAGE_MAX_SIZE: 4096 on the platform examined here. Why 4096? Because a page of virtual memory is 4096 bytes, and the allocation is 4K-aligned.
  • COUNT: the number of id-sized slots that fit in SIZE bytes (SIZE / sizeof(id))

The size of an AutoreleasePoolPage is fixed: it is created with 4K of memory, the value of SIZE = PAGE_MAX_SIZE. sizeof(*this), however, covers only the member fields, from magic through hiwat.

AutoreleasePoolPage allocates SIZE bytes through operator new and uses the remaining SIZE - sizeof(*this) bytes as a stack that stores pointers to autoreleased objects. sizeof is an operator evaluated at compile time, so it counts only the bytes occupied by the fields declared in AutoreleasePoolPage.

In addition, next == begin() means the AutoreleasePoolPage is empty, and next == end() means it is full.

    id * begin() {
        return (id *) ((uint8_t *)this+sizeof(*this));
    }

    id * end() {
        return (id *) ((uint8_t *)this+SIZE);
    }

    bool empty() {
        return next == begin();
    }

    bool full() { 
        return next == end();
    }
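
To make the header-plus-stack layout concrete, here is a reduced model, assuming a 4096-byte page. ToyPage and countSlots are hypothetical names, void * replaces id, and the header holds only a subset of the real fields, so the slot count differs from the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Toy stand-in for AutoreleasePoolPage with a reduced header.
struct ToyPage {
    static const std::size_t SIZE = 4096;   // assumed page size in bytes

    void **next;
    ToyPage *parent;
    ToyPage *child;
    std::uint32_t depth;
    std::uint32_t hiwat;

    // Always allocate SIZE bytes, although sizeof(ToyPage) is much smaller.
    static void *operator new(std::size_t) { return ::operator new(SIZE); }
    static void operator delete(void *p) { ::operator delete(p); }

    ToyPage() : next(begin()), parent(nullptr), child(nullptr), depth(0), hiwat(0) {}

    // The slot area starts right after the header and ends at the page end.
    void **begin() { return (void **)((std::uint8_t *)this + sizeof(*this)); }
    void **end()   { return (void **)((std::uint8_t *)this + SIZE); }
    bool empty()   { return next == begin(); }
    bool full()    { return next == end(); }

    void **add(void *obj) {
        assert(!full());
        void **slot = next;
        *next++ = obj;
        return slot;
    }
};

// Fills a fresh page and returns how many slots it accepted.
inline std::size_t countSlots() {
    ToyPage *page = new ToyPage;
    std::size_t count = 0;
    while (!page->full()) {
        page->add(nullptr);
        ++count;
    }
    delete page;
    return count;
}
```

countSlots() equals (SIZE - sizeof(ToyPage)) / sizeof(void *), which is the SIZE - sizeof(*this) relationship described above expressed in slots.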

objc_autoreleasePoolPush

Each time an autorelease pool calls objc_autoreleasePoolPush, a boundary object is placed on top of the stack, and the address of that boundary slot is returned so that the pool can be released later.

atautoreleasepoolobj = objc_autoreleasePoolPush();

atautoreleasepoolobj is the address of the slot that holds the boundary object (POOL_BOUNDARY); the value stored in that slot is 0 (that is, nil). (The figure showing the resulting page is missing from the original.)

Push is implemented as follows:

void *objc_autoreleasePoolPush(void) {
    return AutoreleasePoolPage::push();
}

It calls the static method push of AutoreleasePoolPage:

static inline void *push() {
   return autoreleaseFast(POOL_BOUNDARY);
}

This leads to the key method autoreleaseFast, which is passed the boundary object (POOL_BOUNDARY):

static inline id *autoreleaseFast(id obj)
{
   AutoreleasePoolPage *page = hotPage();
   if (page && !page->full()) {
       return page->add(obj);
   } else if (page) {
       return autoreleaseFullPage(obj, page);
   } else {
       return autoreleaseNoPage(obj);
   }
}

The method above takes one of three paths:

  • There is a hotPage and it is not full: call page->add(obj) to add the object to the stack of that AutoreleasePoolPage.
  • There is a hotPage and it is full: call autoreleaseFullPage to move to the next non-full page (creating one if necessary), then call page->add(obj) to add the object to the stack of that AutoreleasePoolPage.
  • There is no hotPage: call autoreleaseNoPage to create a hotPage, then call page->add(obj) to add the object to the stack of the new AutoreleasePoolPage.

In the end, page->add(obj) is called to add the object to the autorelease pool.

hotPage is the AutoreleasePoolPage node currently in use in the linked list of pages. It is essentially a pointer to an AutoreleasePoolPage, stored in the thread's TSD (Thread-Specific Data):

static inline AutoreleasePoolPage *hotPage() 
    {
        AutoreleasePoolPage *result = (AutoreleasePoolPage *)
            tls_get_direct(key);
        if ((id *)result == EMPTY_POOL_PLACEHOLDER) return nil;
        if (result) result->fastcheck();
        return result;
    }

static inline void *tls_get_direct(tls_key_t k)
{ 
    ASSERT(is_valid_direct_key(k));

    if (_pthread_has_direct_tsd()) {
        return _pthread_getspecific_direct(k);
    } else {
        return pthread_getspecific(k);
    }
}

Let’s take a look at the implementation of add

    id *add(id obj)
    {
        assert(!full());
        unprotect();
        id *ret = next;  // faster than `return next-1` because of aliasing
        *next++ = obj;
        protect();
        return ret;
    }

The add function stores obj at the location that next points to, advances next to the following slot, and returns the location that next pointed to before the call.

Now look at autoreleaseFullPage, which handles a full page. Starting from the page passed in, it walks down the child pages to the last one and, if every page is full, creates a new page after it. The constructor links the two: the new page becomes the child of the previous page, and the previous page becomes the parent of the new page. The page found or created is then set as the hot page, and obj is added to it.

 static __attribute__((noinline))
    id *autoreleaseFullPage(id obj, AutoreleasePoolPage *page)
    {
        // The hot page is full. 
        // Step to the next non-full page, adding a new page if necessary.
        // Then add the object to that page.
        assert(page == hotPage());
        assert(page->full()  ||  DebugPoolAllocation);

        do {
            if (page->child) page = page->child;
            else page = new AutoreleasePoolPage(page);
        } while (page->full());

        setHotPage(page);
        return page->add(obj);
    }
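
The page-chaining behaviour is easier to watch with a tiny page capacity. The sketch below is a simplified model, not the runtime's code: MiniPage and MiniPool are hypothetical names, a page holds only four slots, the hot page is a plain member instead of a TLS value, and the no-page case skips the placeholder logic.

```cpp
#include <cassert>
#include <cstddef>

// Toy page with a tiny capacity so that overflow is easy to trigger.
struct MiniPage {
    static const std::size_t CAPACITY = 4;   // assumed; a real page holds far more
    const void *slots[CAPACITY];
    std::size_t used = 0;
    MiniPage *parent = nullptr;
    MiniPage *child = nullptr;
    unsigned depth = 0;

    explicit MiniPage(MiniPage *newParent) : parent(newParent) {
        if (parent) {
            assert(!parent->child);
            parent->child = this;            // link the pages in both directions
            depth = parent->depth + 1;
        }
    }
    bool full() const { return used == CAPACITY; }
    const void **add(const void *obj) {
        assert(!full());
        slots[used] = obj;
        return &slots[used++];
    }
};

struct MiniPool {
    MiniPage *hot = nullptr;                 // stands in for the TLS hot page

    ~MiniPool() {
        MiniPage *page = hot;
        while (page && page->parent) page = page->parent;   // rewind to the first page
        while (page) {
            MiniPage *following = page->child;
            delete page;
            page = following;
        }
    }

    const void **autoreleaseFast(const void *obj) {
        if (hot && !hot->full()) return hot->add(obj);
        if (hot) return autoreleaseFullPage(obj, hot);
        hot = new MiniPage(nullptr);         // simplified no-page case
        return hot->add(obj);
    }

    const void **autoreleaseFullPage(const void *obj, MiniPage *page) {
        // Step to the next non-full page, adding a new page if necessary.
        do {
            if (page->child) page = page->child;
            else page = new MiniPage(page);
        } while (page->full());
        hot = page;
        return page->add(obj);
    }
};
```

Adding ten objects to an empty MiniPool leaves three linked pages holding 4, 4 and 2 entries, with the hot page at depth 2.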

Next is the no-page case. "No page" means either that no pool has been pushed yet, or that a pool represented only by the empty placeholder has been pushed and has no contents yet.

    static __attribute__((noinline))
    id *autoreleaseNoPage(id obj)
    {
        // "No page" could mean no pool has been pushed
        // or an empty placeholder pool has been pushed and has no contents yet
        assert(!hotPage());

        bool pushExtraBoundary = false;
        if (haveEmptyPoolPlaceholder()) {
            // We are pushing a second pool over the empty placeholder pool
            // or pushing the first object into the empty placeholder pool.
            // Before doing that, push a pool boundary on behalf of the pool 
            // that is currently represented by the empty placeholder.
            pushExtraBoundary = true;
        }
        else if (obj != POOL_BOUNDARY  &&  DebugMissingPools) {
            // We are pushing an object with no pool in place, 
            // and no-pool debugging was requested by environment.
            _objc_inform("MISSING POOLS: (%p) Object %p of class %s "
                         "autoreleased with no pool in place - "
                         "just leaking - break on "
                         "objc_autoreleaseNoPool() to debug", 
                         pthread_self(), (void*)obj, object_getClassName(obj));
            objc_autoreleaseNoPool(obj);
            return nil;
        }
        else if (obj == POOL_BOUNDARY  &&  !DebugPoolAllocation) {
            // We are pushing a pool with no pool in place,
            // and alloc-per-pool debugging was not requested.
            // Install and return the empty pool placeholder.
            return setEmptyPoolPlaceholder();
        }

        // We are pushing an object or a non-placeholder'd pool.

        // Install the first page.
        AutoreleasePoolPage *page = new AutoreleasePoolPage(nil);
        setHotPage(page);
        
        // Push a boundary on behalf of the previously-placeholder'd pool.
        if (pushExtraBoundary) {
            page->add(POOL_BOUNDARY);
        }
        
        // Push the requested object or pool.
        return page->add(obj);
    }

This code introduces EMPTY_POOL_PLACEHOLDER. When a pool is pushed for the first time and no object has been added yet, the runtime does not create an AutoreleasePoolPage. Instead it returns EMPTY_POOL_PLACEHOLDER as the pool pointer and records it in TLS, as an entry whose key is AUTORELEASE_POOL_KEY and whose value is EMPTY_POOL_PLACEHOLDER (1). This works like lazy loading: the page is created only when it is needed.

The no-page path first checks whether TLS holds the EMPTY_POOL_PLACEHOLDER flag. If there is no placeholder and this call pushes POOL_BOUNDARY, it just installs and returns the placeholder. If there is a placeholder, it creates a new AutoreleasePoolPage and adds first a POOL_BOUNDARY and then obj.
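
The lazy behaviour can be expressed as a three-state model of the TLS slot. This is a sketch with the debug flags left out: ToyThreadState is a hypothetical name, a vector stands in for the single page, and nullptr plays the role of POOL_BOUNDARY, so a null object cannot be told apart from a boundary here.

```cpp
#include <cassert>
#include <vector>

// Toy model of the per-thread TLS slot; not the runtime's real layout.
struct ToyThreadState {
    enum Slot { Nothing, EmptyPlaceholder, HasPage };
    Slot slot = Nothing;
    std::vector<const void *> page;   // contents of the single toy page
    int pagesAllocated = 0;

    static const void *boundary() { return nullptr; }   // POOL_BOUNDARY is nil

    // Mirrors the branches of autoreleaseNoPage, without the debug options.
    void autoreleaseNoPage(const void *obj) {
        assert(slot != HasPage);
        bool pushExtraBoundary = (slot == EmptyPlaceholder);
        if (!pushExtraBoundary && obj == boundary()) {
            slot = EmptyPlaceholder;  // remember the pool, allocate nothing yet
            return;
        }
        slot = HasPage;               // install the first page
        pagesAllocated += 1;
        if (pushExtraBoundary) page.push_back(boundary());
        page.push_back(obj);
    }

    void add(const void *obj) {
        if (slot == HasPage) page.push_back(obj);
        else autoreleaseNoPage(obj);
    }

    void push() { add(boundary()); }
    void autorelease(const void *obj) { add(obj); }
};
```

A first push() only flips the slot to EmptyPlaceholder. The page appears on the first autorelease(), which writes the deferred boundary followed by the object.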

AutoreleasePoolPage::autorelease(id obj)

Calling the autorelease method on an object puts it into the autorelease pool.
To see how autorelease is implemented, first look at the call stack of the method:

- [NSObject autorelease]
└── id objc_object::rootAutorelease()
    └── id objc_object::rootAutorelease2()
        └── static id AutoreleasePoolPage::autorelease(id obj)
            └── static id AutoreleasePoolPage::autoreleaseFast(id obj)
                ├── id *add(id obj)
                ├── static id *autoreleaseFullPage(id obj, AutoreleasePoolPage *page)
                │   ├── AutoreleasePoolPage(AutoreleasePoolPage *newParent)
                │   └── id *add(id obj)
                └── static id *autoreleaseNoPage(id obj)
                    ├── AutoreleasePoolPage(AutoreleasePoolPage *newParent)
                    └── id *add(id obj)

In the call stack of the autorelease method, the autoreleaseFast method mentioned above will eventually be called to add the current object to AutoreleasePoolPage.

// Called internally by [NSObject autorelease]
inline id objc_object::rootAutorelease() {
    if (isTaggedPointer()) return (id)this;
    if (prepareOptimizedReturn(ReturnAtPlus1)) return (id)this;

    return rootAutorelease2();
}

__attribute__((noinline,used)) id objc_object::rootAutorelease2() {
    return AutoreleasePoolPage::autorelease((id)this);
}

static inline id autorelease(id obj) {
   id *dest __unused = autoreleaseFast(obj);
   return obj;
}

The autorelease function works like push: the key step is calling autoreleaseFast to add an entry to the linked-list stack of the autorelease pool. The difference is that push adds a boundary object, while autorelease adds the object that is to be managed by the pool.

If the current thread has only one AutoreleasePoolPage and it already records many autorelease object addresses, memory looks like this:
[Figure: a single AutoreleasePoolPage holding many autorelease object addresses (image unavailable)]

In the situation shown in the figure, adding one more autorelease object fills the page (the next pointer reaches the top of the stack, end()). A page object for the next page is then created and linked into the list after this page. The next pointer of the new page starts at the bottom of its stack (the position of begin()), and new objects continue to be added toward the top. This is the full-page handling described in the previous section.

One thing to note is the handling in objc_object::rootAutorelease. A tagged pointer is returned directly. If the return value has been optimized, it is also returned directly. Only the remaining cases go through autoreleaseFast. The second case is the fast path for autoreleased return values described below.

Optimization mechanism of Autorelease return value

Under ARC, the runtime has a set of optimizations for autoreleased return values; see runtime/objc-object.h for details. Through cooperation between the caller and the callee, the returned object does not enter the autorelease pool, and unnecessary retain/release calls are avoided.

An optimized callee looks at the caller's instructions following the return. If the caller's instructions are also optimized, the callee skips all retain count operations: no autorelease, no retain/autorelease. Instead, it saves the result's current retain count (+0 or +1) in TLS (thread-local storage). If the caller does not look optimized, the callee performs autorelease or retain/autorelease as usual.

An optimized caller looks at TLS. If the result is set, it performs whatever retain or release is needed to change the result from the retain count left by the callee to the retain count the caller wants. Otherwise the caller assumes the result is currently at +0 from an unoptimized callee and performs any retain needed for that case.

There are two optimized callees:

  • objc_autoreleaseReturnValue: the result is currently at +1. The unoptimized path autoreleases it.
  • objc_retainAutoreleaseReturnValue: the result is currently at +0. The unoptimized path retains and autoreleases it.

There are two optimized callers:

  • objc_retainAutoreleasedReturnValue: the caller wants the value at +1. The unoptimized path retains it.
  • objc_unsafeClaimAutoreleasedReturnValue: the caller wants the value at +0 unsafely. The unoptimized path does nothing.

Example:

Callee:
    // compute ret at +1
    return objc_autoreleaseReturnValue(ret);

Caller:
    ret = callee();
    ret = objc_retainAutoreleasedReturnValue(ret);
    // use ret at +1 here

Callee sees the optimized caller, sets TLS, and leaves the result at +1.
Caller sees the TLS, clears it, and accepts the result at +1 as-is.

How the callee recognizes an optimized caller is architecture-dependent:

  x86_64: Callee looks for `mov rax, rdi` followed by a call or
          jump instruction to objc_retainAutoreleasedReturnValue or
          objc_unsafeClaimAutoreleasedReturnValue.
  i386:   Callee looks for a magic nop `movl %ebp, %ebp` (frame pointer register).
  armv7:  Callee looks for a magic nop `mov r7, r7` (frame pointer register).
  arm64:  Callee looks for a magic nop `mov x29, x29` (frame pointer register).

Tagged pointer objects do participate in the optimized return scheme, because it saves message sends. In the unoptimized case they are not entered in the autorelease pool either.
Key functions of this scheme:

static ALWAYS_INLINE bool 
callerAcceptsOptimizedReturn(const void * const ra0)
Determines whether the caller supports the optimized return.

static ALWAYS_INLINE ReturnDisposition 
getReturnDisposition()
static ALWAYS_INLINE void 
setReturnDisposition(ReturnDisposition disposition)
Read, set, and reset the value in TLS.

static ALWAYS_INLINE bool 
prepareOptimizedReturn(ReturnDisposition disposition)
Prepares an optimized return for the given disposition. Returns true if the caller supports the optimization (and sets TLS if disposition is ReturnAtPlus1). Otherwise returns false, meaning the returned object must be handled as usual: retain and autorelease, or autorelease.

static ALWAYS_INLINE ReturnDisposition 
acceptOptimizedReturn()
Fetches the disposition stored in TLS; if it is ReturnAtPlus1, it also resets the stored value to ReturnAtPlus0.

For example:

// Roughly what ARC emits for the callee and the caller
+ (instancetype)createPerson {
    id tmp = [self new];
    return objc_autoreleaseReturnValue(tmp);
}
- (void)testFoo {
    id tmp = objc_retainAutoreleasedReturnValue([Person createPerson]);
}

When objc_autoreleaseReturnValue is called, it first obtains the return address through the built-in function __builtin_return_address, and from that address determines whether the caller calls objc_retainAutoreleasedReturnValue immediately after the callee returns. If so, objc_autoreleaseReturnValue() does not register the returned object in the autorelease pool (it performs no autorelease). Instead, the runtime stores the value ReturnAtPlus1 under the key RETURN_DISPOSITION_KEY in TLS as an optimization mark, and returns the object directly to the caller. If not, it calls objc_autorelease on the object and then returns it.

objc_retainAutoreleasedReturnValue(), which receives the return value on the caller's side, first looks up the value stored under RETURN_DISPOSITION_KEY in TLS. If it is ReturnAtPlus1, the object is returned directly (without calling retain). If not, it calls objc_retain on the object.

So objc_autoreleaseReturnValue and objc_retainAutoreleasedReturnValue cooperate, with TLS as the relay between them, and under ARC both the autorelease step and the retain step are skipped.

The whole process involves several key points:
TLS: Thread Local Storage
Thread Local Storage (TLS) has a simple purpose: it sets aside a piece of memory as thread-specific storage that is read and written as key-value pairs. For example, on non-ARM architectures the functions provided by pthread are used:

void* pthread_getspecific(pthread_key_t);
int pthread_setspecific(pthread_key_t , const void *);

When objc_autoreleaseReturnValue is called on a return value, the runtime stores a mark under RETURN_DISPOSITION_KEY in TLS and returns the object directly (without calling autorelease). Meanwhile objc_retainAutoreleasedReturnValue, which receives the return value on the caller's side, finds the mark stored under RETURN_DISPOSITION_KEY and returns the object directly (without calling retain).
The caller and the callee thus use TLS as a relay and tacitly skip the memory management of the return value.
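
The handshake can be modelled with a thread_local variable in place of the RETURN_DISPOSITION_KEY slot. This is a sketch, not the runtime's implementation: ToyObject and the toy... functions are hypothetical, and a boolean parameter replaces the inspection of the caller's instructions.

```cpp
#include <cassert>

enum ReturnDisposition { ReturnAtPlus0 = 0, ReturnAtPlus1 = 1 };

// Minimal object model: a retain count plus a flag for "sits in a pool".
struct ToyObject {
    int retainCount = 1;
    bool inPool = false;
};

// Stand-in for the RETURN_DISPOSITION_KEY slot in TLS.
static thread_local ReturnDisposition tDisposition = ReturnAtPlus0;

// Callee side. callerIsOptimized replaces the real return-address check.
inline ToyObject *toyAutoreleaseReturnValue(ToyObject *obj, bool callerIsOptimized) {
    assert(obj != nullptr);
    if (callerIsOptimized) {
        tDisposition = ReturnAtPlus1;    // hand over the +1 without autoreleasing
        return obj;
    }
    obj->inPool = true;                  // usual path: autorelease
    return obj;
}

// Caller side.
inline ToyObject *toyRetainAutoreleasedReturnValue(ToyObject *obj) {
    assert(obj != nullptr);
    ReturnDisposition disposition = tDisposition;
    tDisposition = ReturnAtPlus0;        // reset after reading
    if (disposition == ReturnAtPlus1) return obj;   // accept the +1 as-is
    obj->retainCount += 1;               // usual path: retain
    return obj;
}
```

On the optimized path the object keeps a retain count of 1 and never enters a pool. On the unoptimized path it ends up retained once more and parked in the pool, which balances out later when the pool is popped.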

This raises another question: what happens if only one of the callee and the caller is compiled under ARC (for example, when an ARC project uses a third-party library compiled without ARC, or vice versa)? Two more pieces are needed.

The built-in function __builtin_return_address
Its prototype is void *__builtin_return_address(unsigned int level), and it returns the return address of a function. The argument is the number of frames to go up: __builtin_return_address(0) yields the return address of the current function, 1 yields the return address of the function that called it, and so on.

- (int)foo {
    NSLog(@"%p", __builtin_return_address(0)); // from this address, the address of ret below can be located
    return 1;
}
// caller
int ret = [sark foo];

It doesn't look like much, but the return address of a function corresponds to the address where the caller's call instruction ends (or differs from it by a fixed offset, depending on the compiler). In other words, the callee gets a chance to turn the tables and do something back to its caller.
Returning to the question above: if a function knows, before it returns, whether its caller is ARC or non-ARC, it can handle the two cases differently.
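
The builtin is available in plain C++ as well when compiling with GCC or Clang. A minimal sketch, in which whereWillIReturn and callAndReport are hypothetical names and noinline is used so that the callee stays a real call with its own return address:

```cpp
#include <cassert>

// __builtin_return_address is a GCC/Clang builtin, not standard C++.
__attribute__((noinline)) const void *whereWillIReturn() {
    return __builtin_return_address(0);   // address of the instruction after the call
}

// The caller. Using the result after the call keeps it from becoming a tail call,
// so the returned address points into this function.
__attribute__((noinline)) const void *callAndReport() {
    const void *address = whereWillIReturn();
    assert(address != nullptr);
    return address;
}
```

The value is an address inside the caller's machine code, which is what allows a callee to go on and read the instructions that follow the call.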

Inspecting the caller's assembly instructions
By adding an offset to the __builtin_return_address value above, the callee can locate the assembly instructions that follow the call in the caller:

// caller
int ret = [sark foo];
// the assembly instructions that follow in memory (x86; I don't know assembly, so this is made up)
movq ??? ???
callq ???

The encodings of these assembly instructions in memory are fixed; for example, the movq checked below begins with the byte 0x48.
Hence the following function, whose input parameter is the value obtained from __builtin_return_address:

//x86
static ALWAYS_INLINE bool 
callerAcceptsOptimizedReturn(const void * const ra0)
{
    const uint8_t *ra1 = (const uint8_t *)ra0;
    const unaligned_uint16_t *ra2;
    const unaligned_uint32_t *ra4 = (const unaligned_uint32_t *)ra1;
    const void **sym;

    // 48 89 c7    movq  %rax,%rdi
    // e8          callq symbol
    if (*ra4 != 0xe8c78948) {
        return false;
    }
    ra1 += (long)*(const unaligned_int32_t *)(ra1 + 4) + 8l;
    ra2 = (const unaligned_uint16_t *)ra1;
    // ff 25       jmpq *symbol@DYLDMAGIC(%rip)
    if (*ra2 != 0x25ff) {
        return false;
    }

    ra1 += 6l + (long)*(const unaligned_int32_t *)(ra1 + 2);
    sym = (const void **)ra1;
    if (*sym != objc_retainAutoreleasedReturnValue  &&  
        *sym != objc_unsafeClaimAutoreleasedReturnValue) 
    {
        return false;
    }

    return true;
}

It checks whether the caller calls objc_retainAutoreleasedReturnValue (or objc_unsafeClaimAutoreleasedReturnValue) immediately after the callee returns. If so, the caller is known to be compiled under ARC; otherwise the old, unoptimized logic is used.
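
The first comparison in that function, *ra4 != 0xe8c78948, is a byte-pattern match written as one 32-bit load. The sketch below reproduces just that step on an ordinary byte buffer. startsWithMovThenCall and loadLittleEndian32 are hypothetical helpers, and the second half of the real check (following the call target to the symbol) is not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Do the four bytes at the return address encode `movq %rax,%rdi` (48 89 c7)
// followed by the opcode of `callq` (e8)?
inline bool startsWithMovThenCall(const std::uint8_t *returnAddress) {
    assert(returnAddress != nullptr);
    static const std::uint8_t expected[4] = {0x48, 0x89, 0xc7, 0xe8};
    return std::memcmp(returnAddress, expected, sizeof(expected)) == 0;
}

// Assembles four bytes the way a little-endian 32-bit load would see them.
inline std::uint32_t loadLittleEndian32(const std::uint8_t *bytes) {
    assert(bytes != nullptr);
    return static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Read as a little-endian word, the bytes 48 89 c7 e8 give 0xe8c78948, which is why the constant in the source looks reversed compared with the instruction bytes in its comments.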

objc_autoreleasePoolPop

objc_autoreleasePoolPop(atautoreleasepoolobj);

objc_autoreleasePoolPush returns the address of the sentinel object it placed on the page, and objc_autoreleasePoolPop takes that address as its input parameter. Pop then works as follows:

1. Find the page that holds the sentinel object from the sentinel address passed in.

2. Send a -release message to every autorelease object that was inserted after the sentinel object, and move the next pointer back to the correct position.

3. As a supplement to step 2: cleanup starts from the most recently added object and works backward. It can cross several pages until it reaches the page that holds the sentinel.
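Step 1 works without any lookup table if pages are allocated at size-aligned addresses, because the owning page can then be recovered from any slot address by rounding down. A minimal sketch under that assumption follows. The 4096-byte size and the names `PAGE_BYTES` and `page_base_for` are chosen for illustration and are not taken from the runtime.

```c
#include <stdint.h>

/* Assumption: every pool page starts at an address that is a
   multiple of PAGE_BYTES (4096 here, for illustration only). */
#define PAGE_BYTES ((uintptr_t)4096)

/* Round a slot address down to the start of the page that holds it. */
static uintptr_t page_base_for(uintptr_t slot_addr) {
    return slot_addr - (slot_addr % PAGE_BYTES);
}
```

For example, a sentinel stored at 0x10120 maps back to a page that starts at 0x10000.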

After objc_autoreleasePoolPop has executed, every object added after the sentinel has been released, and next points back to the slot that the sentinel occupied.

Nested AutoreleasePool

With the above principle in mind, nested AutoreleasePools are simple. A pop always releases back to the position of the matching push. Multiple nested pools are just multiple sentinel objects, like peeling an onion one layer at a time, and the layers do not affect each other.
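The onion behaviour can be reproduced with a toy model. This is a simplified sketch, not the runtime's data structure: it uses one fixed-size array instead of linked pages, plain `int` counters instead of objects, no bounds checks, and the names `pool_push`, `pool_autorelease` and `pool_pop` are hypothetical.

```c
#include <stddef.h>

#define SENTINEL NULL          /* marks a pool boundary, in the spirit of POOL_SENTINEL */
#define CAPACITY 64

static int *slots[CAPACITY];   /* hypothetical single-page stack */
static size_t next_slot = 0;   /* index of the next free slot */

/* Push a sentinel and return its slot address as the pool token. */
static int **pool_push(void) {
    slots[next_slot] = SENTINEL;
    return &slots[next_slot++];
}

/* Record an "object": a counter that is decremented on release. */
static void pool_autorelease(int *retain_count) {
    slots[next_slot++] = retain_count;
}

/* Release everything added after the token, then drop the sentinel
   itself so that the next free slot is the token's slot again. */
static void pool_pop(int **token) {
    while (&slots[next_slot] != token) {
        int *obj = slots[--next_slot];
        if (obj != SENTINEL) {
            --*obj;            /* stand-in for sending -release */
        }
    }
}
```

Popping the inner token releases only the objects added after the inner sentinel. Objects registered in the outer pool stay untouched until the outer token is popped.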

AutoreleasePoolPage::pop() implementation:

static inline void pop(void *token)   // token points to the stack-top address recorded at push (the sentinel's slot)
{
    AutoreleasePoolPage *page;
    id *stop;

    page = pageForPointer(token);   // find the page that contains this address
    stop = (id *)token;
    if (DebugPoolAllocation  &&  *stop != POOL_SENTINEL) {
        // This check is not valid with DebugPoolAllocation off
        // after an autorelease with a pool page but no pool in place.
        _objc_fatal("invalid or prematurely-freed autorelease pool %p; ", 
                    token);
    }

    if (PrintPoolHiwat) printHiwat();   // report the high-water mark

    page->releaseUntil(stop);   // pop from the top of the stack, sending release to each object, until the sentinel at stop is reached

    // memory: delete empty children
    // delete pages that are now empty
    if (DebugPoolAllocation  &&  page->empty()) {
        // special case: delete everything during page-per-pool debugging
        AutoreleasePoolPage *parent = page->parent;
        page->kill();
        setHotPage(parent);
    } else if (DebugMissingPools  &&  page->empty()  &&  !page->parent) {
        // special case: delete everything for pop(top) 
        // when debugging missing autorelease pools
        page->kill();
        setHotPage(nil);
    } 
    else if (page->child) {
        // hysteresis: keep one empty child if page is more than half full
        if (page->lessThanHalfFull()) {
            page->child->kill();
        }
        else if (page->child->child) {
            page->child->child->kill();
        }
    }
}

The process is mainly divided into two steps:

  • page->releaseUntil(stop): call objc_release() on every object between the top of the stack (page->next) and the stop address (POOL_SENTINEL), reducing each reference count by 1.
  • page->kill(): free page objects that are no longer needed. Two comments in the source describe the policy:
// hysteresis: keep one empty child if this page is more than half full

// special case: delete everything for pop(0)
A pop(0) call cleans up all page objects.
Otherwise, when the current page is more than half full, one empty child page is kept,
presumably to save the cost of creating a page that may be needed again soon.
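The hysteresis rule in the last branch of pop() can be isolated as a small decision function. The sketch below covers only that non-debug branch. `struct page`, `enum trim` and `trim_policy` are hypothetical names, and the page is reduced to the fields the decision needs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified page: only the fields the policy needs. */
struct page {
    size_t used;            /* occupied slots */
    size_t capacity;        /* total slots */
    struct page *child;     /* next page in the list, or NULL */
};

enum trim { TRIM_NOTHING, TRIM_CHILD, TRIM_GRANDCHILD };

static bool less_than_half_full(const struct page *p) {
    return p->used < p->capacity / 2;
}

/* Decide what to free after a pop that ended on page p. */
static enum trim trim_policy(const struct page *p) {
    if (!p->child) return TRIM_NOTHING;            /* nothing beyond p */
    if (less_than_half_full(p)) return TRIM_CHILD; /* p has room: drop the child */
    if (p->child->child) return TRIM_GRANDCHILD;   /* keep one spare child only */
    return TRIM_NOTHING;
}
```

TRIM_CHILD corresponds to page->child->kill() and TRIM_GRANDCHILD to page->child->child->kill() in the listing above. A page that is at least half full keeps exactly one empty child as a spare.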

Summary

1. When a child thread uses an autorelease object and no autoreleasepool exists, one is created lazily in autoreleaseNoPage.
2. The runloop's runMode:beforeDate: and some source callbacks contain autoreleasepool push and pop operations. In short, the system performs similar autorelease management in many places.
3. It does not matter if an insertion is never followed by a pop. When the thread exits, AutoreleasePoolPage::tls_dealloc is executed and clears the autoreleasepool, so the resources are still released.
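Point 3 follows the general pattern of a thread-local storage destructor: a cleanup function is registered for a TLS key and runs when the thread exits. The sketch below shows that pattern with portable POSIX calls. It assumes the runtime's tls_dealloc is hooked up in a comparable way, and `drain_on_thread_exit`, `worker` and `run_worker_without_pop` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_key_t pool_key;
static int drained_count = 0;      /* hypothetical counter for the demo */

/* Runs automatically at thread exit when the key's value is non-NULL. */
static void drain_on_thread_exit(void *pending) {
    drained_count += *(int *)pending;   /* stand-in for releasing objects */
}

/* "Autorelease" some objects on this thread and never pop. */
static void *worker(void *arg) {
    pthread_setspecific(pool_key, arg);
    return NULL;
}

/* Start a worker that leaves `pending` objects behind, wait for it,
   and report how many were drained by the destructor. */
static int run_worker_without_pop(int pending) {
    pthread_t thread;
    drained_count = 0;
    pthread_key_create(&pool_key, drain_on_thread_exit);
    pthread_create(&thread, NULL, worker, &pending);
    pthread_join(thread, NULL);
    pthread_key_delete(pool_key);
    return drained_count;
}
```

The worker never cleans up after itself, yet the destructor still runs before pthread_join returns, which is why a missing pop on a child thread does not leak the pool's contents.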

Recommended reading

Behind the scenes of Autorelease http://blog.sunnyxx.com/2014/10/15/behind-autorelease/
Looking at the source code with questions: when is a child thread's AutoRelease object released? https://suhou.github.io/2018/01/21/%E5%B8%A6%E7%9D%80%E9%97%AE%E9%A2%98%E7%9C%8B%E6%BA%90%E7%A0%81----%E5%AD%90%E7%BA%BF%E7%A8%8BAutoRelease%E5%AF%B9%E8%B1%A1%E4%BD%95%E6%97%B6%E9%87%8A%E6%94%BE/
A brief discussion on the implementation principle of Autorelease Pool https://cloud.tencent.com/developer/article/1350726
In-depth understanding of Autorelease Pool https://cloud.tencent.com/developer/article/1006618?fromSource=gwzcw.700857.700857.700857
The underlying implementation principle of AutoreleasePool https://www.jianshu.com/p/50bdd8438857
The usage scenarios and principles of autoreleasepool https://www.jianshu.com/p/9da2929c9b61
The implementation principle of Objective-C Autorelease Pool http://blog.leichunfeng.com/blog/2015/05/31/objective-c-autorelease-pool-implementation-principle/#jtss-tsina
The past and present life of the automatic release pool http://www.cocoachina.com/ios/20160702/16569.html

Origin blog.csdn.net/Mamong/article/details/124678060