[Original] In the interview, the "watchdog" shows up with a "time wheel" hung around its neck. Are you scared?

Redisson's watchdog and Netty's time wheel: want to get to know them? While writing this article I ended up slapping my own face. But that is technology: you keep growing in the process of being proven wrong.

Off-topic life chat

Hello everyone. The week went by fast; in the blink of an eye it is the weekend again.

As usual, this account keeps its off-topic life chat, so here it comes first.

The picture above shows a "Gouzi" route I ran when I was in Beijing. You can see both the track map and the Gouzi in it.

The track is 21 km long, exactly a half marathon, and along the way it passes well-known sights: Beihai, Shichahai, Nanluoguxiang, the Lama Temple, Ditan, Drum Tower Street and Deshengmen. I really like this route and ran it several times while I was in Beijing.

If you are in Beijing and would like to run this route and then show it off on WeChat Moments, follow the official account and reply "Gou Zi run" in the background. I will send you the detailed route map.

Because of the recent outbreak, it has been a long, long time since my last long-distance run. The company has a treadmill, but I have never gotten used to running on one.

Chengdu basically lifted its restrictions this week, and the runners in my neighbourhood are slowly coming back out. So I have decided that once this article is posted I will go out and run at least 10 km to get ready for summer. After all, summer is when my abs should "accidentally" show.

Okay, back to the article.

Reviewing after the interview is very important

Earlier, in the article "Please Hammer Hammer of the gods was a fight," in the section where the long-haired brother brings out the hammer, I wrote this passage:

Then a reader asked: brother, could you tell us about the watchdog? I was asked about it in an interview and could not answer.

When I heard this, a few questions of my own came to mind first:

1. You were asked in the interview and could not answer. And then what?

2. After the interview ended, did you not review it?

3. For the questions you could not answer, did you not go and dig into them?

You had even forgotten the interview question, and only when you saw my article did you suddenly remember: oh, I ran into this problem before and never resolved it.

That approach is wrong, friend.

An interview is a technical contest, so the review afterwards is very, very important. The first thing to do once it ends is to go back over the whole process: where did you know the answer but fail to explain it clearly, and what should you have known but did not really know, so that you need to improve?

Then immediately, immediately, immediately record the key points you summed up, in a memo on your phone or in the notebook you carry.

These key points can be places where you did well, but more of them should be places where you need to improve.

You may also have read about this trick: a few questions went unanswered during the interview, and at the end the interviewer says to go back and wait for notice. So after the interview you study the questions you could not answer, and then send a summary of what you learned to the interviewer. The interviewer looks at it: hey, this person is not bad, and a good learner too.

Then you may really get the notice to prepare for the next round of interviews.

I have not used this trick myself, but the idea it conveys is this: when you meet a problem you do not understand within your own field, take the initiative to solve it.

Reviewing after the interview is very important.

Putting the thoughts from your review into words and keeping them is very, very important.

Once it is written down, you can also share it to help those who come later.

Well, since a reader asked this question, I will expand on it a little and share what I know.

A look at the sample code

Most of you have probably used Redisson's distributed lock. Let's first look at how it is used.

Having seen these few lines of code, do not read on yet. First think about it: compared with a wheel you build yourself, what is obviously different?

Here are the two very direct questions I had the first time I used Redisson as a distributed lock.

1. Where did the value go?

2. Where did the expiration time go?

As I said before, if we build the wheel ourselves and implement a distributed lock on top of Redis, we need to send the following command to Redis:

SET key random_value NX PX 3000

In the sample code above, why is there only a key and no value?

We know the value must exist. Remember the article "Please Hammer Hammer of the gods was a fight," which said that when the interviewer asks:

Tell me about the details of acquiring and releasing a lock based on Redis.

We answer with three key points:

1. Locking must be an atomic command.

2. When setting the value, store a random_value.

3. The value is set to a random number mainly so that the lock can be released more safely. When releasing, you need to verify that the key exists and that its value is the one you set; release the lock only if it matches. So there are three operations here (get, compare, delete), and to guarantee atomicity we need a Lua script.

Therefore, this value is very important.

As for step 3, some readers have asked why releasing the lock needs a Lua script. It can be explained in a few words, so let me do that here:

Look at the three operations: get, compare, delete.

The get is a read, not a write, so it causes no problem. The problem lies between the compare and the delete. If the two are not atomic, the following can happen:

1. Thread A confirms that the value is the one it stored, but before it executes the delete, the program hits a GC stop-the-world (STW) pause.

2. During the STW pause thread A has not performed the delete, but the lock reaches its expiration time and Redis releases it.

3. After the STW pause, before thread A performs the delete, thread B acquires a lock on the same key.

4. Guess the result? Thread A deletes the lock that thread B acquired. That is a problem.

Why does a Lua script solve this problem?

Because a Lua script executes atomically, and Redis executes commands on a single thread, other commands have to wait until the script completes. So the situation described above cannot occur.
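
The race in the four steps above, and the fix, can be replayed without a Redis server. What follows is a minimal in-memory sketch, not Redisson's code: the class SafeReleaseSketch, its method names, and the HashMap standing in for Redis are all assumptions made for illustration. RELEASE_SCRIPT is the usual compare-and-delete Lua script for a lock taken with SET key random_value NX PX; in the model, Map.remove(key, value) plays the role of running that script atomically.

```java
import java.util.HashMap;
import java.util.Map;

class SafeReleaseSketch {
    // Compare-and-delete in one script: delete the key only if it still holds our value.
    static final String RELEASE_SCRIPT =
        "if redis.call('get', KEYS[1]) == ARGV[1] then "
      + "return redis.call('del', KEYS[1]) else return 0 end";

    // Hypothetical stand-in for Redis: lock key -> random value of the current owner.
    final Map<String, String> store = new HashMap<>();

    // Replays steps 1 to 4 with a non-atomic "compare, then delete".
    // Returns true if thread B still holds the lock at the end.
    static boolean nonAtomicRelease() {
        SafeReleaseSketch redis = new SafeReleaseSketch();
        redis.store.put("lock", "value-A");
        boolean stillMine = "value-A".equals(redis.store.get("lock")); // step 1: A compares
        redis.store.remove("lock");                                    // step 2: key expires during the pause
        redis.store.put("lock", "value-B");                            // step 3: B acquires the lock
        if (stillMine) {
            redis.store.remove("lock");                                // step 4: A deletes B's lock
        }
        return "value-B".equals(redis.store.get("lock"));
    }

    // Same interleaving, but compare and delete happen as a single step.
    static boolean atomicRelease() {
        SafeReleaseSketch redis = new SafeReleaseSketch();
        redis.store.put("lock", "value-A");
        redis.store.remove("lock");            // key expires during the pause
        redis.store.put("lock", "value-B");    // B acquires the lock
        redis.store.remove("lock", "value-A"); // A's release does nothing: the value no longer matches
        return "value-B".equals(redis.store.get("lock"));
    }
}
```

In the non-atomic replay thread B loses its lock; in the atomic one it keeps it, which is exactly the guarantee the Lua script buys.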

The second question: where did the expiration time go?

Looking at the locking code above, no expiration time seems to be set.

Let's first say what the problem is when there is no expiration time. Obviously, it easily leads to deadlock.

The server acquires the lock, and before it performs the release, the server crashes.

Whoops: congratulations on your deadlock.

Where did the value go?

For this question, the first thing to establish is that a value definitely exists.

When we set the value ourselves, we usually just generate a random value and stuff it in.

I have also seen some articles online analysing Redis distributed locks that simply throw "OK" in as the value. As we said before, that is wrong, friends. Be careful to tell the two apart.

When using Redisson, we know the value must be created for us by the framework. So we only need to verify that guess in the source code.

But do not panic yet; there is a simpler way to verify it: run the program, then look inside Redis. Is that not enough?

After looking: well, it not only confirms our guess, there are windfalls too.

Windfall one: we see TTL: 25, which shows that although we did not set an expiration time, the framework set one for us. I will leave this aside for now and describe it in detail in the next section.

Windfall two: what we stored is a Hash, not our usual String.

Obviously, the key inside the Hash is UUID:1. What does this 1 mean?

Why use a Hash instead of a String?

Let's take these questions to the source code.

Note that the Redisson Maven version used in this article is 3.12.3.

Redisson's source code is easy to debug; I suggest you step through it once yourself.

The lock operation first calls this method:

org.redisson.RedissonLock#lock(long, java.util.concurrent.TimeUnit, boolean)

Here you can see that the threadId obtained is 1. Is that not the 1 appended after the UUID in the key? Let's keep reading.

Step forward three more times in the debugger and you arrive at the following location:

org.redisson.RedissonLock#tryLockInnerAsync

Here getLockName(threadId) is what we are looking for:

See, is this string not exactly the UUID:1 we just saw? The 1 is the thread ID.

What? You ask why I say this id is a UUID?

Intuition; a programmer's gut tells me it is a UUID. But I can verify it for you.

The id comes from the following interface:

org.redisson.connection.ConnectionManager

This interface has 5 implementation classes:

When a ConnectionManager is created, the constructor of each implementation class is passed a UUID.

Therefore, we can conclude:

When using Redisson as a distributed lock, you do not need to specify the value explicitly. The framework generates a UUID and joins it with the locking thread's threadId using a colon.
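
The conclusion above amounts to one string concatenation. Here is a minimal sketch of it, assuming a per-client UUID as described; the class LockValueSketch and its field clientId are hypothetical names for illustration, and only the "UUID:threadId" shape comes from the debugging session above.

```java
import java.util.UUID;

class LockValueSketch {
    // One id per client instance, generated once (the article traces it to the ConnectionManager).
    final String clientId = UUID.randomUUID().toString();

    // The value identifying a lock holder: "<uuid>:<threadId>".
    String lockName(long threadId) {
        return clientId + ":" + threadId;
    }
}
```

The UUID part matters because thread IDs are only unique inside one JVM: thread 1 on two different servers would otherwise look like the same holder.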

A process of exploration with no challenge at all, even a little boring. (What I really want to say is that the source code is not hard. Do not be afraid of it; read it with a question in mind.)

But do not worry, this was just the appetizer.

As for the second question: why use a Hash instead of a String?

We will find the answer in the next section, while we look for where the expiration time went.

Where did the expiration time go?

We can find the answer to this question in this code:

org.redisson.RedissonLock#tryLockInnerAsync

First look at the arguments this method receives:

Focus on the parts I boxed:

script: the Lua script to be executed.

keys: the Redis keys. Here, why is KEYS[1].

params: the parameters of the Lua script. Here 30000 is ARGV[1], and UUID:threadId is ARGV[2].

So now we also know the expiration time: the default is 30000 ms, i.e. 30 s.

Knowing what these three parameters mean, taking the Lua script apart is simple. Let's split it into three parts:

Part I: locking

First look at the locking part:

Line 4 uses exists to check whether KEYS[1] (i.e. why) exists.

If it does not, line 5 runs the hincrby command. You know what hincrby does, right?

Line 6 then sets the expiration time of KEYS[1] to 30000 ms.

Then line 7 returns nil, and we are done.

Thus one atomic locking operation is complete.

Here we have verified from the source code: because hincrby is used, the key Redisson uses for the lock is indeed a Hash.

Part II: re-entry

If the check in the first part finds that KEYS[1] exists, execution enters this branch:

Since KEYS[1] is a Hash, line 13 looks up the field ARGV[2] in KEYS[1] and checks whether it exists.

If it does, line 14 uses hincrby to add one to the ARGV[2] field.

Line 15 needs no comment: it resets the expiration time to 30 s. Then line 16 returns nil, and we are done.

So what do you think line 14 is for? Enter, then add one: what does that remind you of?

Having seen this, you do not even need to read the unlock Lua script; you can guess that there must be a decrement, and that when the count reaches 0 the lock is released. We will verify this in a while.

So this also explains why Redisson uses a Hash for the lock: because it supports re-entry.

With a String, how would you implement re-entry? Here, take the keyboard, implement it, and let me learn from you. (It is actually possible, just rather awkward. There is no point.)

Part III: the return value

One line of code, nothing tricky. It simply returns the remaining time to live of KEYS[1].

From this three-part analysis of the Lua script we know: the default expiration time is 30 s. When the lock is acquired or re-entered successfully, the script returns nil; only when locking fails does it return the remaining time of the current lock.
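
The three parts of the script can be mirrored in a few lines of plain Java. This is a sketch of the logic as described above, not the script itself: LockScriptModel and its members are hypothetical names, a HashMap stands in for the Redis Hash, and the clock is passed in as nowMs so that the behaviour is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

class LockScriptModel {
    // Models one Redis key of type Hash: field "uuid:threadId" -> re-entry count.
    private final Map<String, Integer> fields = new HashMap<>();
    private long expiresAtMs;

    // Drops the key once its time to live has run out.
    private void expireIfDue(long nowMs) {
        if (!fields.isEmpty() && nowMs >= expiresAtMs) {
            fields.clear();
        }
    }

    // Returns null when the lock is acquired or re-entered, else the remaining TTL in ms.
    Long tryLock(String holder, long leaseMs, long nowMs) {
        expireIfDue(nowMs);
        if (fields.isEmpty()) {                    // Part I: the key does not exist
            fields.put(holder, 1);                 // incrementing a missing field gives 1
            expiresAtMs = nowMs + leaseMs;         // set the expiration time
            return null;
        }
        if (fields.containsKey(holder)) {          // Part II: same holder, so re-entry
            fields.merge(holder, 1, Integer::sum); // add one to the count
            expiresAtMs = nowMs + leaseMs;         // reset the expiration time
            return null;
        }
        return expiresAtMs - nowMs;                // Part III: held by someone else
    }

    int holdCount(String holder) {
        return fields.getOrDefault(holder, 0);
    }
}
```

For example, if "uuid:1" locks at time 0 and re-enters at 1000 ms with a 30000 ms lease, a call from "uuid:2" at 2000 ms gets back 29000, the time left on the other holder's lease.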

Remember this conclusion. We will use this return value in the next section, where the watchdog makes its grand entrance.

Also, while writing this article I found that in Redisson's latest version, 3.12.3, the locking Lua script differs slightly from previous versions, as follows:

Before 3.12.3 it used hset; now it uses hincrby. As a result the first and second parts look very similar, which is a little confusing.

What you find online will mostly describe the hset version, since 3.12.3 was released only about a month ago.

Congratulations, friends, you have learned something you will not find elsewhere.

How does the watchdog work?

Friends who have read this far: well done. In this section we finally meet the long-awaited watchdog.

org.redisson.RedissonLock#tryAcquireAsync

ttlRemaining here is the value returned by the Lua script. From the earlier analysis we know that it is null when locking or re-entry succeeds. In that case execution enters this method:

org.redisson.RedissonLock#scheduleExpirationRenewal

This method sets the stage for the watchdog's work.

Keep debugging and you will reach this method:

org.redisson.RedissonLock#renewExpiration

From the numbered markers you can see:

①: this is a task.

②: the core code this task needs to execute.

③: the task runs once every internalLockLeaseTime / 3 ms. internalLockLeaseTime defaults to 30,000, so the task runs every 10 s.

Now let's look at what the core code at ② executes:

The Lua script first checks whether UUID:threadId exists. If it does, it sets the key's expiration time back to 30 s. This is the renewal operation.
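
One run of the renewal described above can be modelled in a few lines. This is an in-memory sketch under assumptions, not Redisson's script: RenewalModel, its fields, and the explicit nowMs clock are hypothetical, and a HashSet stands in for the fields of the Redis Hash.

```java
import java.util.HashSet;
import java.util.Set;

class RenewalModel {
    // Holders ("uuid:threadId") currently present in the lock's Hash.
    final Set<String> fields = new HashSet<>();
    long expiresAtMs;

    // Simplified acquire: the holder takes a free lock with the given lease.
    void lock(String holder, long leaseMs, long nowMs) {
        fields.clear();
        fields.add(holder);
        expiresAtMs = nowMs + leaseMs;
    }

    // One watchdog run: true if the lease was pushed back, false if the holder no longer owns the lock.
    boolean renew(String holder, long leaseMs, long nowMs) {
        if (nowMs >= expiresAtMs) {
            fields.clear();              // the key has already expired
        }
        if (!fields.contains(holder)) {
            return false;                // nothing to renew
        }
        expiresAtMs = nowMs + leaseMs;   // set the expiration time back to the full lease
        return true;
    }
}
```

A false result is the natural signal for a watchdog to stop: either the lock was released, or it expired before the renewal could run.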

Here comes a second-grade arithmetic word problem:

Word problem: the key's default expiration time is 30 s, and the renewal runs every 30 s / 3. So what TTL (remaining time) does the key have each time the renewal runs?

Answer: from the problem, 30 s / 3 = 10 s. Hence the formula: 30 s - 10 s = 20 s.

Therefore, whenever the key's TTL (remaining time) drops to 20 s, the renewal runs and resets the key's expiration time to the default 30 s.

Note that I keep stressing that 30 s is the default.

That is because this time can be changed. For example, to change it to 60 s:

internalLockLeaseTime then becomes 60,000:

Then the bonus question arrives.

Bonus question: after reading the material above, if the default time is changed to 60 s, at what TTL (remaining time) does the renewal run?

Answer: from the problem, the task is triggered and the watchdog works every 60 s / 3 = 20 s.

Therefore, 60 s - 20 s = 40 s. Whenever the key's TTL drops to 40 s, the renewal runs.
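
Both word problems follow one formula: the renewal period is lease / 3, and the TTL seen at renewal is lease - lease / 3. The class and method names in this small sketch are made up for illustration; as far as I know the lease itself comes from Redisson's lockWatchdogTimeout setting, but treat that as an assumption.

```java
class WatchdogMath {
    // How often the renewal task runs, given the lease in milliseconds.
    static long renewalPeriodMs(long leaseMs) {
        return leaseMs / 3;
    }

    // The TTL left on the key at the moment the renewal runs.
    static long ttlAtRenewalMs(long leaseMs) {
        return leaseMs - renewalPeriodMs(leaseMs);
    }
}
```

With leaseMs = 30000 this gives 10000 and 20000; with leaseMs = 60000 it gives 20000 and 40000, matching the two answers above.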

You have to learn to adapt, friends, okay?

Next, let's look at how this task is implemented.

As you can see, this Timeout class comes from a Netty package.

The task is built on Netty's time wheel.

The interviewer asks you: what is a time wheel?

You do not know. Then read on.

What is a time wheel?

When you hear "time wheel", what do you think of first?

Even if you do not know what a time wheel is, the word should make you think of a wheel. Is that not a ring-shaped thing?

A quick search online shows that it really does look like a ring:

It works as follows:

The time wheel in the picture has 8 slots, and each slot holds a pointer to a list of tasks to be executed.

Suppose the wheel advances one slot every 1 s and currently points at slot 0. To add a task that runs after 5 s, compute 0 + 5 = 5, add the task node to the list at slot 5, and mark the node with round = 0.

Again suppose the wheel advances one slot every 1 s and currently points at slot 0. To add a task that runs after 17 s, compute (0 + 17) % 8 = 1, add the task node to the list at slot 1, and mark it with round = 2. Each time the pointer passes a slot, the round of every task in that slot's list is decremented by one. When the pointer reaches slot 1 for the third time, the task is executed.

Note that each time, the wheel only executes tasks with round = 0.
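
The slot and round bookkeeping described above fits in a small class. This is a teaching sketch, not Netty's HashedWheelTimer: MiniTimeWheel and its methods are hypothetical, and time advances only when tick() is called. One detail is my own choice: rounds are computed as (delay - 1) / size so that a delay that is an exact multiple of the wheel size fires on time; for the 5 s and 17 s examples this gives the same round values, 0 and 2.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class MiniTimeWheel {
    private static final class Entry {
        final Runnable task;
        int rounds; // full turns of the wheel still to wait

        Entry(Runnable task, int rounds) {
            this.task = task;
            this.rounds = rounds;
        }
    }

    private final List<List<Entry>> slots = new ArrayList<>();
    private int current = 0; // slot the pointer is on

    MiniTimeWheel(int size) {
        for (int i = 0; i < size; i++) {
            slots.add(new ArrayList<>());
        }
    }

    static int slotFor(int current, int delayTicks, int size) {
        return (current + delayTicks) % size;
    }

    static int roundsFor(int delayTicks, int size) {
        return (delayTicks - 1) / size;
    }

    // Schedules a task to run delayTicks ticks from now (at least 1).
    void schedule(Runnable task, int delayTicks) {
        if (delayTicks < 1) {
            throw new IllegalArgumentException("delayTicks must be >= 1");
        }
        int size = slots.size();
        slots.get(slotFor(current, delayTicks, size))
             .add(new Entry(task, roundsFor(delayTicks, size)));
    }

    // Advances the pointer one slot, runs tasks with rounds == 0, decrements the rest.
    void tick() {
        current = (current + 1) % slots.size();
        List<Runnable> due = new ArrayList<>();
        Iterator<Entry> it = slots.get(current).iterator();
        while (it.hasNext()) {
            Entry e = it.next();
            if (e.rounds == 0) {
                it.remove();
                due.add(e.task);
            } else {
                e.rounds--;
            }
        }
        for (Runnable r : due) {
            r.run(); // run after iterating, so a task may safely reschedule itself
        }
    }
}
```

The appeal of the structure is that each tick touches a single slot, however many timers are pending; a periodic job such as a renewal can simply schedule itself again when it runs.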

Now that we know how it works, let's go back to the Timeout class mentioned earlier. It is actually what the newTimeout method of HashedWheelTimer returns:

As we analysed earlier, Redisson implements the watchdog with the newTimeout method. The method takes three parameters:

1. task: the task. For Redisson's watchdog, the task resets the expiration time of the corresponding key; the default is 30 s.

2. delay: how long to wait before executing. For Redisson's watchdog this delay is computed as internalLockLeaseTime / 3; the default is 10 s.

3. unit: the unit of time.

In fact, you may have noticed that by now we have left Redisson and entered Netty.

We only need to tell the newTimeout method what task to execute and after how long.

So why not write a simpler, easier to understand demo to analyse the time wheel?

For example, this one:

The demo above should be easy to understand.

So far we know that the watchdog is implemented as a timed task, and that this timed task is built on Netty's time wheel.

As for the HashedWheelTimer source, I originally wanted to write a walkthrough of it. But while reading up on it, I found that the article at the link below already explains the source thoroughly, so I simply deleted the part I had written. If you are interested, go and read it:

https://www.jianshu.com/p/1eb1b7c67d63

For more on time wheels, you can also read the article "On time programming in Linux and realization of the principle" on IBM developerWorks:

https://www.ibm.com/developerworks/cn/linux/1308_liuming_linuxtime3/index.html

The unlock operation

Remember what we said during the lock operation?

Enter, then add one: what does that remind you of?

Is this not just a reentrant lock!

Having seen this, you do not even need to read the unlock Lua script; you can guess that there must be a decrement, and that when the count reaches 0 the lock is released. We will verify this in a while.

In this section we verify it. See the Lua script that is executed when the lock is released:

Sure enough, there is a check on the counter: if, after subtracting one, it is less than or equal to 0, the del key operation is executed.

Unlocking really is quite simple: after the del, a publish command is executed. Guess what is being published here?

Guess first, then verify: bold hypotheses, careful verification!

This is Redis's publish/subscribe feature. Unlocking publishes an event. Who do you think it notifies, and of what?

It must be telling other threads: I am done with the lock, come and get it.

Which other threads?

The threads that want to acquire the same lock.

In the tryAcquire code we analysed earlier, ttl is non-null in only one case, namely when locking failed:

So a thread that fails to lock calls the subscribe method to subscribe.

Thus the publish in the unlock operation and this subscribe match up.
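
The decrement, delete, and publish steps, together with the waiting threads' subscription, can be modelled in memory. This is only a sketch: UnlockModel and its methods are hypothetical, the subscriber list stands in for Redis publish/subscribe, and the three-valued return of unlock (null, false, true) is a modelling choice of mine rather than something shown in the article.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UnlockModel {
    // Field "uuid:threadId" -> re-entry count, as in the lock's Hash.
    private final Map<String, Integer> fields = new HashMap<>();
    // Callbacks of threads that failed to lock and are waiting for a release.
    private final List<Runnable> subscribers = new ArrayList<>();

    // Simplified acquire or re-entry; contention is not checked in this sketch.
    void lock(String holder) {
        fields.merge(holder, 1, Integer::sum);
    }

    void subscribe(Runnable onRelease) {
        subscribers.add(onRelease);
    }

    // null: holder does not own the lock; false: still held after the decrement; true: fully released.
    Boolean unlock(String holder) {
        Integer count = fields.get(holder);
        if (count == null) {
            return null;
        }
        int remaining = count - 1;          // the "minus one" step
        if (remaining > 0) {
            fields.put(holder, remaining);  // an outer lock() is still active
            return false;
        }
        fields.remove(holder);              // the "del key" step
        subscribers.forEach(Runnable::run); // the "publish" step: wake up the waiters
        return true;
    }
}
```

A thread that locked twice must unlock twice; only the second unlock deletes the key and notifies the subscribers.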

One problem remains to be solved: how does the watchdog learn that it no longer needs to renew?

In fact, after the unlock Lua script has executed, the cancel operation is completed in reactive-programming style.

With that, the full set of lock, watchdog renewal, and unlock is done.

A supplement, and a slap in my own face along the way

Before the slap, let me ask a question: under what circumstances does the watchdog not work?

Do not tell me "when the server goes down". When it goes down the thread is gone, so of course the watchdog stops renewing, and the key is deleted by Redis once it expires.

What I am asking is: when does the watchdog never start at all?

The answer: when you pass an explicit time into the lock method. In that case, if unlock is not called within the specified time, the lock is still released. Like this:

rLock.lock(5,TimeUnit.SECONDS);

After 5 s the lock is released automatically. No renewal is performed!!!

The corresponding source code is below; note the comment I wrote:

So the part boxed in red, which I said in the group a long time ago, is wrong:

When a timeout is specified explicitly, the watchdog mechanism does not start.
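
The rule can be written as a one-line decision. This is a sketch under assumptions: LeaseSketch and its methods are hypothetical, and the -1 sentinel for "no lease time given" is my recollection of what the no-argument lock() passes down in Redisson 3.12.x, so verify it against the source before relying on it.

```java
class LeaseSketch {
    // Assumed sentinel meaning "the caller did not specify a lease time".
    static final long NO_LEASE = -1;
    // The default expiration time found earlier: 30000 ms.
    static final long DEFAULT_LEASE_MS = 30_000;

    // The watchdog renews only when the caller did not choose a lease time.
    static boolean watchdogStarts(long leaseMs) {
        return leaseMs == NO_LEASE;
    }

    // The expiration time the key is created with.
    static long initialTtlMs(long leaseMs) {
        return watchdogStarts(leaseMs) ? DEFAULT_LEASE_MS : leaseMs;
    }
}
```

So rLock.lock() corresponds to watchdogStarts(-1), which is true with a 30000 ms initial TTL, while rLock.lock(5, TimeUnit.SECONDS) corresponds to watchdogStarts(5000), which is false: the key simply expires after 5 s.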

Slapping my own face ......

Well, that is fine; I do it often.

Besides, when it comes to learning, how can it be called a slap in the face? It is called growth.

Also, this picture is very good, so I am sharing it with everyone:

Image source: https://juejin.im/post/5bf3f15851882526a643e207

A reader also asked: when renewing, does the number of renewals need to be limited?

I think it is necessary to limit it. If your code keeps renewing, that points to one of two cases:

1. For some exceptional reason you have more data to process than before, so processing takes longer and the renewal keeps running.

2. Your code has a bug that results in an infinite loop. That deadlock is your own doing; Redisson does not take the blame.

Finally, one more question: is this lock safe, or do you think it can still have problems?

What? You do not know?

I shared this in a previous article:

Communication between the nodes is asynchronous, so the situation described above can occur. So what solution did Redis introduce?

RedLock. The earlier article "Please Hammer Hammer of the gods was a fight," was about RedLock. If you do not know it, go and read that one.

Later, one day it suddenly occurred to me that looking at the Redis distributed lock problem from the perspective of CAP might make it easier to understand.

A distributed lock requires consistency, i.e. CP, but a Redis cluster whose nodes communicate asynchronously is an AP architecture. The two do not match, and that is the problem.

So why is the Redis distributed lock so popular?

Probably because most scenarios can tolerate this problem; or because users have simply been lucky; or because users treat it as a black box and do not know that there may be a problem.

One last word (please follow)

When I finished writing, it was already past 2:00:

Please tap "look". Writing every week is tiring; do not just read for free, I need a little positive feedback.

My knowledge is limited, so flaws are inevitable. If you find a mistake, please leave a message and point it out, and I will fix it. (I put this sentence in every technical article, and I mean it.)

Thank you for reading. I stick to original content; you are welcome to follow, and thank you for it.

I am why: not a tech big shot, just a warm-hearted, substantial, good Sichuan man who likes to share.

You are welcome to follow the official account [why technology], which sticks to original content. Sharing technology and savouring life; I hope you and I make progress together.


Origin www.cnblogs.com/thisiswhy/p/12596069.html