Actor life cycle management in the skynet source code

Skynet is multithreaded. Actors are scheduled on separate threads, and any actor can kill other actors, send messages to them, or create new ones. A single actor may therefore be held by several threads at once, which raises three problems:

1. When an actor is in use by several threads at the same time, how can it be released safely?
2. After an actor has been released, how do external users detect that it is no longer valid so that they can carry on?
3. If messages in the mailbox have request-response semantics, how is the message source notified?

The framework solves these with handle mapping plus reference counting: it exposes the handle of an sc (skynet_context) rather than a pointer. The module is implemented in /skynet-src/skynet_handle.c. To see how it works, start from the interface in the header file:

skynet_handle.h

#ifndef SKYNET_CONTEXT_HANDLE_H
#define SKYNET_CONTEXT_HANDLE_H

#include <stdint.h>

// reserve high 8 bits for remote id
#define HANDLE_MASK 0xffffff
#define HANDLE_REMOTE_SHIFT 24

struct skynet_context;

uint32_t skynet_handle_register(struct skynet_context *);
int skynet_handle_retire(uint32_t handle);
struct skynet_context * skynet_handle_grab(uint32_t handle);
void skynet_handle_retireall();

uint32_t skynet_handle_findname(const char * name);
const char * skynet_handle_namehandle(uint32_t handle, const char *name);

void skynet_handle_init(int harbor);

#endif

A handle is a uint32_t whose upper 8 bits identify the remote node. (This belongs to the cluster facility that ships with the framework. The rest of this analysis ignores it: first, it is not the core of the framework, and second, this cluster facility is not recommended.)
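To make the bit layout concrete, here is a minimal sketch that splits a handle into its two parts using the macros from the header. The helper names handle_harbor, handle_local and handle_make are hypothetical and do not exist in skynet.

```c
#include <assert.h>
#include <stdint.h>

#define HANDLE_MASK 0xffffff
#define HANDLE_REMOTE_SHIFT 24

/* Hypothetical helper: the remote node id stored in the high 8 bits. */
uint32_t handle_harbor(uint32_t handle) {
    return handle >> HANDLE_REMOTE_SHIFT;
}

/* Hypothetical helper: the node-local id stored in the low 24 bits. */
uint32_t handle_local(uint32_t handle) {
    return handle & HANDLE_MASK;
}

/* Hypothetical helper: combine a node id and a local id into one handle. */
uint32_t handle_make(uint32_t harbor, uint32_t local) {
    assert(harbor <= 0xff && local <= HANDLE_MASK);
    return (harbor << HANDLE_REMOTE_SHIFT) | local;
}
```

For example, node 1 and local id 5 combine into 0x01000005, and both parts can be recovered from that value.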

Let's take a look at its internal data structure:

#define DEFAULT_SLOT_SIZE 4
#define MAX_SLOT_SIZE 0x40000000

struct handle_name {
    char * name;
    uint32_t handle;
};

struct handle_storage {
    struct rwlock lock;

    uint32_t harbor;
    uint32_t handle_index;
    int slot_size;
    struct skynet_context ** slot;

    int name_cap;
    int name_count;
    struct handle_name *name;
};

static struct handle_storage *H = NULL;

At first glance it is just an array of sc pointers and does not reveal much. The functions are more informative, starting with skynet_handle_register:

 1 uint32_t
 2 skynet_handle_register(struct skynet_context *ctx) {
 3     struct handle_storage *s = H;
 4 
 5     rwlock_wlock(&s->lock);
 6     
 7     for (;;) {
 8         int i;
 9         for (i=0;i<s->slot_size;i++) {
10             uint32_t handle = (i+s->handle_index) & HANDLE_MASK;
11             int hash = handle & (s->slot_size-1);
12             if (s->slot[hash] == NULL) {
13                 s->slot[hash] = ctx;
14                 s->handle_index = handle + 1;
15 
16                 rwlock_wunlock(&s->lock);
17 
18                 handle |= s->harbor;
19                 return handle;
20             }
21         }
22         assert((s->slot_size*2 - 1) <= HANDLE_MASK);
23         struct skynet_context ** new_slot = skynet_malloc(s->slot_size * 2 * sizeof(struct skynet_context *));
24         memset(new_slot, 0, s->slot_size * 2 * sizeof(struct skynet_context *));
25         for (i=0;i<s->slot_size;i++) {
26             int hash = skynet_context_handle(s->slot[i]) & (s->slot_size * 2 - 1);
27             assert(new_slot[hash] == NULL);
28             new_slot[hash] = s->slot[i];
29         }
30         skynet_free(s->slot);
31         s->slot = new_slot;
32         s->slot_size *= 2;
33     }
34 }

This function adds a handle mapping for an sc.

Judging from the code, it is a kind of hash map, guarded by a read-write lock for thread safety. Lines 9-21 choose the handle: handle_index acts as an increasing counter, and each candidate handle is reduced modulo the array size to index into the sc array (slot_size is a power of two, so the mask on line 11 is a modulo). Collisions are resolved by probing: if the slot is occupied, the next candidate handle is tried, and the loop on line 9 ensures that the probe covers the whole array.

Reaching line 22 means the array is full. The array is then doubled, and every existing handle is remapped modulo the new size. This hashing scheme has two advantages: 1. No two live handles ever share a slot. 2. Lookup is truly O(1).
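The allocation and doubling logic can be tried in isolation with a small single-threaded sketch. It is a simplification under stated assumptions, not skynet's code: toy_ctx, toy_table, toy_init and toy_register are hypothetical names, there is no lock and no harbor, and the toy stores the handle into the context itself, whereas the real function only returns it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HANDLE_MASK 0xffffff

/* Toy stand-in for skynet_context: it only remembers its own handle. */
struct toy_ctx { uint32_t handle; };

struct toy_table {
    uint32_t next;           /* next candidate handle, like handle_index */
    int size;                /* always a power of two */
    struct toy_ctx **slot;
};

void toy_init(struct toy_table *t, int size) {
    t->next = 1;
    t->size = size;
    t->slot = calloc((size_t)size, sizeof(*t->slot));
    assert(t->slot != NULL);
}

/* Register ctx and return its handle; doubles the table when it is full. */
uint32_t toy_register(struct toy_table *t, struct toy_ctx *ctx) {
    for (;;) {
        for (int i = 0; i < t->size; i++) {
            uint32_t handle = (t->next + (uint32_t)i) & HANDLE_MASK;
            int hash = (int)(handle & (uint32_t)(t->size - 1));
            if (t->slot[hash] == NULL) {
                ctx->handle = handle;
                t->slot[hash] = ctx;
                t->next = handle + 1;
                return handle;
            }
        }
        /* Full: the stored handles differ in their low bits, so they still
           differ after masking with the doubled size and cannot collide. */
        int new_size = t->size * 2;
        struct toy_ctx **new_slot = calloc((size_t)new_size, sizeof(*new_slot));
        assert(new_slot != NULL);
        for (int i = 0; i < t->size; i++) {
            int hash = (int)(t->slot[i]->handle & (uint32_t)(new_size - 1));
            assert(new_slot[hash] == NULL);
            new_slot[hash] = t->slot[i];
        }
        free(t->slot);
        t->slot = new_slot;
        t->size = new_size;
    }
}
```

Registering five contexts into a table of four slots hands out handles 1 through 5 and grows the table to eight slots on the fifth call.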

Line 22 and the way handle_index is updated show that the function rests on two assumptions: 1. The array never grows beyond what a 24-bit handle can address (its largest index stays within 0xffffff). 2. handle_index never overflows. Overflow is not handled, so after wrapping through HANDLE_MASK the handle could become 0. Personally I think it would be better to handle the overflow explicitly: once handle_index exceeds 0xffffff, reset it to 1.
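The wrap-around proposed above could look like the following sketch. next_handle_index is a hypothetical helper, not part of the code analyzed here, and it assumes that 0 should never be handed out as a handle, which is what resetting to 1 implies.

```c
#include <assert.h>
#include <stdint.h>

#define HANDLE_MASK 0xffffff

/* Hypothetical helper: advance the handle counter, wrapping to 1 rather
   than 0 so that 0 is never produced as a handle. */
uint32_t next_handle_index(uint32_t handle) {
    uint32_t next = handle + 1;
    if (next > HANDLE_MASK) {
        next = 1;
    }
    return next;
}
```

Line 14 of the listing would then assign the result of this helper instead of handle + 1. The probe loop would need the same treatment, because the masked sum on line 10 can also yield 0.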

Let's take a look at skynet_handle_grab again:

struct skynet_context * 
skynet_handle_grab(uint32_t handle) {
    struct handle_storage *s = H;
    struct skynet_context * result = NULL;

    rwlock_rlock(&s->lock);

    uint32_t hash = handle & (s->slot_size-1);
    struct skynet_context * ctx = s->slot[hash];
    if (ctx && skynet_context_handle(ctx) == handle) {
        result = ctx;
        skynet_context_grab(result);
    }

    rwlock_runlock(&s->lock);

    return result;
}

This function looks up the sc for a handle and returns NULL if the handle is invalid. The lookup takes only the read lock and is very simple: reduce the handle modulo the array size, then check that the element at that index carries the same handle. A successful lookup increments the reference count of the sc (skynet_context_grab). The reference count is stored in the sc rather than in this module. I think it would be cleaner to keep it in this module, so that the sc only needs to know how to release itself.
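The handle comparison is what makes a stale handle detectable even when its slot has been reused. A minimal single-threaded sketch, with a hypothetical toy_ctx and toy_grab and a plain int in place of the atomic reference count:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for skynet_context. */
struct toy_ctx { uint32_t handle; int ref; };

/* Return the context registered under handle with its count bumped, or NULL
   if the slot is empty or belongs to a different handle.
   slot_size must be a power of two. */
struct toy_ctx *toy_grab(struct toy_ctx **slot, int slot_size, uint32_t handle) {
    struct toy_ctx *ctx = slot[handle & (uint32_t)(slot_size - 1)];
    if (ctx != NULL && ctx->handle == handle) {
        ctx->ref++;  /* the real code calls skynet_context_grab under the read lock */
        return ctx;
    }
    return NULL;
}
```

With four slots, handles 1 and 5 both map to slot 1. If the slot currently holds handle 5, a lookup for the old handle 1 returns NULL instead of the wrong context.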

Let's take a look at skynet_handle_retire:

int
skynet_handle_retire(uint32_t handle) {
    int ret = 0;
    struct handle_storage *s = H;

    rwlock_wlock(&s->lock);

    uint32_t hash = handle & (s->slot_size-1);
    struct skynet_context * ctx = s->slot[hash];

    if (ctx != NULL && skynet_context_handle(ctx) == handle) {
        s->slot[hash] = NULL;
        ret = 1;
        int i;
        int j=0, n=s->name_count;
        for (i=0; i<n; ++i) {
            if (s->name[i].handle == handle) {
                skynet_free(s->name[i].name);
                continue;
            } else if (i!=j) {
                s->name[j] = s->name[i];
            }
            ++j;
        }
        s->name_count = j;
    } else {
        ctx = NULL;
    }

    rwlock_wunlock(&s->lock);

    if (ctx) {
        // release ctx may call skynet_handle_* , so wunlock first.
        skynet_context_release(ctx);
    }

    return ret;
}

This function removes the handle mapping; its purpose is to unmap the handle, not merely to decrement the reference count.

The implementation has two steps: 1. Clear the slot for the handle and, once the write lock has been dropped, call skynet_context_release. 2. If any names are registered for the handle, delete those entries.

Again, it would be cleaner if this module controlled the release of the sc.

The remaining functions support handle naming. The name mappings are stored in an array sorted in lexicographic order, and lookups use binary search.
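As an illustration of that lookup, here is a minimal binary search over a name array sorted by strcmp order. toy_name and toy_findname are hypothetical, and returning 0 for an unknown name is an assumption of this sketch rather than something shown in the listings above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy counterpart of struct handle_name. */
struct toy_name { const char *name; uint32_t handle; };

/* Binary search over an array sorted by strcmp order.
   Returns the handle bound to name, or 0 if the name is not registered. */
uint32_t toy_findname(const struct toy_name *names, int count, const char *name) {
    int begin = 0;
    int end = count - 1;
    while (begin <= end) {
        int mid = begin + (end - begin) / 2;
        int c = strcmp(names[mid].name, name);
        if (c == 0) {
            return names[mid].handle;
        }
        if (c < 0) {
            begin = mid + 1;
        } else {
            end = mid - 1;
        }
    }
    return 0;
}
```

Keeping the array sorted makes registration cost an insertion shift, but names are registered far less often than they are looked up.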

With that in place, we can look at concrete scenarios in the sc life cycle. Two places matter:

1. Message dispatch, in the skynet_context_message_dispatch function.
2. The external interfaces of sc, mainly skynet_command.

In skynet_context_message_dispatch (line 285 of /skynet-src/skynet_server.c) you can see:

    struct skynet_context * ctx = skynet_handle_grab(handle);
    if (ctx == NULL) {
        struct drop_t d = { handle };
        skynet_mq_release(q, drop_message, &d);
        return skynet_globalmq_pop();
    }

When skynet_handle_grab returns NULL, the caller knows that the sc is no longer valid, which solves problem 2 raised at the beginning. The other external interfaces of sc perform the same check.

That leaves problem 1, safe release. The external interfaces for releasing an sc are cmd_exit and cmd_kill, and both end up in handle_exit:

static void
handle_exit(struct skynet_context * context, uint32_t handle) {
    if (handle == 0) {
        handle = context->handle;
        skynet_error(context, "KILL self");
    } else {
        skynet_error(context, "KILL :%0x", handle);
    }
    if (G_NODE.monitor_exit) {
        skynet_send(context,  handle, G_NODE.monitor_exit, PTYPE_CLIENT, 0, NULL, 0);
    }
    skynet_handle_retire(handle);
}

This function ends by calling skynet_handle_retire, which removes the handle mapping and then calls skynet_context_release.

Take a look at skynet_context_release:

static void 
delete_context(struct skynet_context *ctx) {
    if (ctx->logfile) {
        fclose(ctx->logfile);
    }
    skynet_module_instance_release(ctx->mod, ctx->instance);
    skynet_mq_mark_release(ctx->queue);
    CHECKCALLING_DESTROY(ctx)
    skynet_free(ctx);
    context_dec();
}

struct skynet_context * 
skynet_context_release(struct skynet_context *ctx) {
    if (ATOM_DEC(&ctx->ref) == 0) {
        delete_context(ctx);
        return NULL;
    }
    return ctx;
}

The sc is freed once the reference count reaches 0, so problem 1 is covered as follows.

After handle_exit is called there are two cases:

1. Other flows of control already hold the sc, so the reference count is still greater than 0 and the sc is not freed yet. It is freed when the last holder decrements the count, which is safe.

2. The handle has been retired and other flows call skynet_handle_grab afterwards. Because the handle mapping has been removed, every lookup fails. The caller can detect this and act accordingly, which is safe.
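The first case can be reproduced with a small sketch based on C11 atomics. It is an illustration under assumptions, not skynet's implementation: toy_ctx, toy_grab and toy_release are hypothetical names, and a destroyed flag stands in for delete_context so that the effect can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_ctx {
    atomic_int ref;
    bool destroyed;  /* stands in for delete_context */
};

/* Take one more reference on ctx. */
void toy_grab(struct toy_ctx *ctx) {
    atomic_fetch_add(&ctx->ref, 1);
}

/* Drop one reference; whoever drops the last one performs the teardown. */
struct toy_ctx *toy_release(struct toy_ctx *ctx) {
    /* atomic_fetch_sub returns the previous value, so 1 means we hit zero. */
    if (atomic_fetch_sub(&ctx->ref, 1) == 1) {
        ctx->destroyed = true;
        return NULL;
    }
    return ctx;
}
```

If a dispatcher grabs the context before the retire path releases it, the retire path sees a non-zero count and leaves the context alone. The teardown then happens in the dispatcher's own release call.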

When the sc is freed, its mailbox (message_queue) is not freed with it; delete_context only calls skynet_mq_mark_release to set a release flag. So where is the mailbox freed? Consider the situation: the sc is gone but the mailbox is not, so skynet_handle_grab fails while the mailbox is still in the level 1 queue. The only place left to free it is skynet_context_message_dispatch. Looking back at that code, the branch that finds the sc invalid calls skynet_mq_release, and that is where the mailbox is freed.

Why is the mailbox freed separately rather than together with the sc? Because the sc is freed through reference counting, the moment of release is not deterministic and may fall in any flow of control. The dispatcher could then not tell whether the mailbox should be pushed back to the level 1 queue, so the mailbox must have its own life cycle.
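The two-phase release can be modeled with a few flags. This sketch rests on an assumption about the mailbox code, which is not shown in this article: that the dispatcher only frees a mailbox whose release flag is set and otherwise puts it back into the level 1 queue. toy_mq and both functions are hypothetical, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_mq {
    bool marked_release;  /* set when the owning sc is torn down */
    bool in_global;       /* whether the mailbox sits in the level 1 queue */
    bool freed;           /* stands in for actually freeing the mailbox */
};

/* Called from the owner's teardown, which may run on any thread. */
void toy_mq_mark_release(struct toy_mq *q) {
    q->marked_release = true;
}

/* Called by the dispatcher after it popped q and the owner lookup failed. */
void toy_mq_release(struct toy_mq *q) {
    if (q->marked_release) {
        q->in_global = false;
        q->freed = true;      /* owner is gone for good: drop the mailbox */
    } else {
        q->in_global = true;  /* owner not torn down yet: keep it scheduled */
    }
}
```

The point of the split is that only the dispatcher, which owns the mailbox while it is popped from the level 1 queue, ever frees it.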

Only problem 3 is left. We just need to see how pending messages are handled when the mailbox is freed, in drop_message in /skynet-src/skynet_server.c:

static void
drop_message(struct skynet_message *msg, void *ud) {
    struct drop_t *d = ud;
    skynet_free(msg->data);
    uint32_t source = d->handle;
    assert(source);
    // report error to the message source
    skynet_send(NULL, source, msg->source, PTYPE_ERROR, 0, NULL, 0);
}

The problem is solved by sending a PTYPE_ERROR message to each message source, so that an sc waiting for a response gets the chance to end its suspended flow. I do have one question: why is the session not included in the response? Does the receiver have to match it up by source alone? I will come back to this when looking at message dispatch.
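The effect of draining a dead actor's mailbox can be sketched as follows. All names here are hypothetical, and instead of sending real messages the sketch only records who would be notified. Like the skynet_send call in drop_message, the notice carries the dead handle but no session.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX 8

/* A pending message: who sent it, and under which session. */
struct toy_msg { uint32_t source; int session; };

/* One error notice: who is told, and which handle is dead. */
struct toy_notice { uint32_t to; uint32_t dead; };

/* Drain a dead actor's mailbox, emitting one notice per pending message.
   Returns the number of notices written to out (at most TOY_MAX). */
int toy_drop_all(const struct toy_msg *box, int n, uint32_t dead,
                 struct toy_notice *out) {
    int count = 0;
    for (int i = 0; i < n && count < TOY_MAX; i++) {
        out[count].to = box[i].source;
        out[count].dead = dead;
        count++;
    }
    return count;
}
```

A sender with several requests outstanding to the same dead actor receives one notice per request, and each notice identifies only the dead handle.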

Without a GC, safely releasing resources is a problem every multithreaded program has to face. It is usually solved in a separate module, and there are two common approaches:

1. Handle mapping plus reference counting, as in this article. C++ usually uses smart pointers, which adjust the reference count automatically in the copy constructor and destructor as a hard guarantee. Personally I think the former is more flexible.
2. Only mark the resource on release, and reclaim marked resources periodically.
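The second approach can be sketched as a mark-then-sweep pass over a linked list. toy_res and toy_sweep are hypothetical names, and the sketch assumes that a single housekeeping thread runs the sweep and that nobody still uses a resource once it is marked. A real implementation needs some grace period or synchronization to guarantee that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_res {
    bool dead;             /* set by whoever "releases" the resource */
    struct toy_res *next;
};

/* Unlink and free every node marked dead; returns how many were reclaimed.
   Meant to be run periodically by a single housekeeping thread. */
int toy_sweep(struct toy_res **head) {
    int reclaimed = 0;
    while (*head != NULL) {
        struct toy_res *r = *head;
        if (r->dead) {
            *head = r->next;   /* unlink before freeing */
            free(r);
            reclaimed++;
        } else {
            head = &r->next;
        }
    }
    return reclaimed;
}
```

Releasing then costs a single flag write, and the price is that memory is held until the next sweep.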

Origin blog.csdn.net/lingshengxueyuan/article/details/111653972