Source code analysis of ServiceStack.Redis (connection and connection pool)

A few days ago, a Redis connection-creation failure occurred in our production environment. While analyzing it, I gained a better understanding of how ServiceStack.Redis creates connections and manages its connection pool. Now that the analysis is complete, this article organizes what I learned.

The process of getting RedisClient from the connection pool

In business code, a client object is obtained by calling the GetClient() method of a PooledRedisClientManager object, so the source code of that method is the entry point:

        public IRedisClient GetClient()
        {
            RedisClient redisClient = null;
            DateTime now = DateTime.Now;
            for (; ; )
            {
                if (!this.deactiveClientQueue.TryPop(out redisClient))
                {
                    if (this.redisClientSize >= this.maxRedisClient)
                    {
                        Thread.Sleep(3);
                        if (this.PoolTimeout != null && (DateTime.Now - now).TotalMilliseconds >= (double)this.PoolTimeout.Value)
                        {
                            break;
                        }
                    }
                    else
                    {
                        redisClient = this.CreateRedisClient();
                        if (redisClient != null)
                        {
                            goto Block_5;
                        }
                    }
                }
                else
                {
                    if (!redisClient.HadExceptions)
                    {
                        goto Block_6;
                    }
                    List<RedisClient> obj = this.writeClients;
                    lock (obj)
                    {
                        this.writeClients.Remove(redisClient);
                        this.redisClientSize--;
                    }
                    RedisState.DisposeDeactivatedClient(redisClient);
                }
            }
            bool flag2 = true;
            if (flag2)
            {
                throw new TimeoutException("Redis Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use.");
            }
            return redisClient;
        Block_5:
            this.writeClients.Add(redisClient);
            return redisClient;
        Block_6:
            redisClient.Active = true;
            this.InitClient(redisClient);
            return redisClient;
        }

The body of this method is an infinite loop that does the following:

  • this.deactiveClientQueue is the collection of idle clients; its type is ConcurrentStack<RedisClient>.
  • When a redisClient can be popped from this.deactiveClientQueue and its HadExceptions flag is not set, jump to the Block_6 branch: set redisClient.Active to true, call this.InitClient(redisClient), and return the instance. If HadExceptions is set, the client is removed from this.writeClients, this.redisClientSize is decremented, the client is disposed, and the loop continues.
  • When nothing can be popped from this.deactiveClientQueue, first check the upper limit on the number of clients: this.redisClientSize >= this.maxRedisClient.
    • If the limit has not been reached, call redisClient = this.CreateRedisClient(); a non-null result jumps to Block_5, where the client is added to this.writeClients and returned.
    • If the limit has been reached, sleep for 3 milliseconds and then check whether the wait has exceeded this.PoolTimeout (in milliseconds). On timeout, break out of the loop, after which a TimeoutException is thrown; otherwise continue with the next iteration.

The above is the main flow for obtaining a client from the connection pool, where this.deactiveClientQueue serves as the "client pool". Note that this.PoolTimeout is how long the caller waits when the pool is exhausted; if it is null, the loop keeps waiting until a client becomes available.
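
The flow described above can be sketched in isolation as follows. This is a simplified illustration, not the library's code: FakeClient, SimplePool, and the explicit Return() method are hypothetical names introduced here (the real pool recycles clients through the OnDispose event and also tracks them in this.writeClients).

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for RedisClient; only the flag the pool inspects is modeled.
public sealed class FakeClient
{
    public bool HadExceptions { get; set; }
}

// Hypothetical, simplified pool that mirrors the loop in GetClient().
public sealed class SimplePool
{
    private readonly ConcurrentStack<FakeClient> idle = new ConcurrentStack<FakeClient>();
    private readonly object sync = new object();
    private readonly int maxSize;
    private readonly int? poolTimeoutMs;
    private int size;

    public SimplePool(int maxSize, int? poolTimeoutMs)
    {
        this.maxSize = maxSize;
        this.poolTimeoutMs = poolTimeoutMs;
    }

    public int Size { get { lock (sync) { return size; } } }

    public FakeClient Get()
    {
        DateTime start = DateTime.UtcNow;
        while (true)
        {
            FakeClient client;
            if (idle.TryPop(out client))
            {
                if (!client.HadExceptions)
                {
                    return client;            // healthy idle client: reuse it
                }
                lock (sync) { size--; }       // faulted client: drop it and loop again
                continue;
            }

            lock (sync)
            {
                if (size < maxSize)
                {
                    size++;
                    return new FakeClient();  // room left: create a new client
                }
            }

            Thread.Sleep(3);                  // pool exhausted: wait briefly
            if (poolTimeoutMs != null &&
                (DateTime.UtcNow - start).TotalMilliseconds >= poolTimeoutMs.Value)
            {
                throw new TimeoutException("No pooled client became available in time.");
            }
        }
    }

    // Callers hand the client back when they are done with it.
    public void Return(FakeClient client)
    {
        idle.Push(client);
    }
}
```

With a maximum size of 1, a second Get() call made while the first client is still checked out waits until the timeout elapses and then throws, which is the same symptom the TimeoutException in GetClient() reports.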

[Figure: flowchart of the GetClient() process described above]

The process of creating a new Client: CreateRedisClient()

The source code is as follows:

		private RedisClient CreateRedisClient()
		{
			if (this.redisClientSize >= this.maxRedisClient)
			{
				return null;
			}
			object obj = this.lckObj;
			RedisClient result;
			lock (obj)
			{
				if (this.redisClientSize >= this.maxRedisClient)
				{
					result = null;
				}
				else
				{
					Random random = new Random((int)DateTime.Now.Ticks);
					RedisClient newClient = this.InitNewClient(this.RedisResolver.CreateMasterClient(random.Next(100)));
					newClient.OnDispose += delegate()
					{
						if (!newClient.HadExceptions)
						{
							List<RedisClient> obj2 = this.writeClients;
							lock (obj2)
							{
								if (!newClient.HadExceptions)
								{
									try
									{
										this.deactiveClientQueue.Push(newClient);
										return;
									}
									catch
									{
										this.writeClients.Remove(newClient);
										this.redisClientSize--;
										RedisState.DisposeDeactivatedClient(newClient);
									}
								}
							}
						}
						this.writeClients.Remove(newClient);
						this.redisClientSize--;
						RedisState.DisposeDeactivatedClient(newClient);
					};
					this.redisClientSize++;
					result = newClient;
				}
			}
			return result;
		}

For concurrency safety, creating a new client is guarded by a lock, namely lock (obj). If multiple threads enter CreateRedisClient() at the same time, only one of them runs the locked section while the others block until the lock is released. This can be observed with the syncblk and clrstack commands in WinDbg. Inside the lock, the method calls this.InitNewClient(this.RedisResolver.CreateMasterClient(random.Next(100))) to create the object and attaches handling logic to the OnDispose event of newClient.

Note that OnDispose here is not a destructor in the traditional sense. It is the hook through which the caller returns a RedisClient to the connection pool when finished with it: provided newClient has had no exceptions, it is pushed onto the this.deactiveClientQueue stack, which is how the pool is replenished. Otherwise the client is removed from this.writeClients, this.redisClientSize is decremented, and the client is disposed.
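
The two ideas in this method, the double-checked size limit and the dispose hook that recycles the client, can be sketched as below. RecyclableClient and RecyclingFactory are hypothetical names used only for illustration; the sketch leaves out the writeClients bookkeeping and the actual socket cleanup.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical client: Dispose() closes nothing itself, it only raises OnDispose.
public sealed class RecyclableClient : IDisposable
{
    public event Action OnDispose;
    public bool HadExceptions { get; set; }

    public void Dispose()
    {
        Action handler = OnDispose;
        if (handler != null) handler();
    }
}

// Hypothetical factory showing the check / lock / re-check pattern.
public sealed class RecyclingFactory
{
    public readonly ConcurrentStack<RecyclableClient> Idle = new ConcurrentStack<RecyclableClient>();
    private readonly object sync = new object();
    private readonly int maxSize;
    private int size;

    public RecyclingFactory(int maxSize)
    {
        this.maxSize = maxSize;
    }

    public int Size { get { lock (sync) { return size; } } }

    public RecyclableClient Create()
    {
        if (Volatile.Read(ref size) >= maxSize) return null;  // cheap check before locking
        lock (sync)
        {
            if (size >= maxSize) return null;                 // re-check under the lock
            RecyclableClient client = new RecyclableClient();
            client.OnDispose += () =>
            {
                if (!client.HadExceptions)
                {
                    Idle.Push(client);                        // healthy: back to the pool
                    return;
                }
                lock (sync) { size--; }                       // faulted: free its slot
            };
            size++;
            return client;
        }
    }
}
```

Because Dispose() only raises the event, wrapping a pooled client in a using block returns it to the pool rather than closing the connection.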

Interpretation of this.InitNewClient() method

This method initializes the newly created RedisClient object (Id, Active, and so on) and then calls this.InitClient() for further initialization.

Interpretation of this.RedisResolver.CreateMasterClient()

this.RedisResolver is of the IRedisResolver interface type, which has three implementations in the source code. Sentinel mode, which is common in production, is used here as the example for analysis.

The RedisSentinelResolver class corresponds to sentinel mode; the relevant source code is as follows:

		public RedisClient CreateMasterClient(int desiredIndex)
		{
			return this.CreateRedisClient(this.GetReadWriteHost(desiredIndex), true);
		}
		public RedisEndpoint GetReadWriteHost(int desiredIndex)
		{
			return this.sentinel.GetMaster() ?? this.masters[desiredIndex % this.masters.Length];
		}

		public virtual RedisClient CreateRedisClient(RedisEndpoint config, bool master)
		{
			RedisClient result = this.ClientFactory(config);
			if (master)
			{
				RedisServerRole redisServerRole = RedisServerRole.Unknown;
				try
				{
					using (RedisClient redisClient = this.ClientFactory(config))
					{
						redisClient.ConnectTimeout = 5000;
						redisClient.ReceiveTimeout = 5000;
						redisServerRole = redisClient.GetServerRole();
						if (redisServerRole == RedisServerRole.Master)
						{
							this.lastValidMasterFromSentinelAt = DateTime.UtcNow;
							return result;
						}
					}
				}
				catch (Exception exception)
				{
					Interlocked.Increment(ref RedisState.TotalInvalidMasters);
					using (RedisClient redisClient2 = this.ClientFactory(config))
					{
						redisClient2.ConnectTimeout = 5000;
						redisClient2.ReceiveTimeout = 5000;
						if (redisClient2.GetHostString() == this.lastInvalidMasterHost)
						{
							object obj = this.oLock;
							lock (obj)
							{
								if (DateTime.UtcNow - this.lastValidMasterFromSentinelAt > this.sentinel.WaitBeforeForcingMasterFailover)
								{
									this.lastInvalidMasterHost = null;
									this.lastValidMasterFromSentinelAt = DateTime.UtcNow;
									RedisSentinelResolver.log.Error("Valid master was not found at '{0}' within '{1}'. Sending SENTINEL failover...".Fmt(redisClient2.GetHostString(), this.sentinel.WaitBeforeForcingMasterFailover), exception);
									Interlocked.Increment(ref RedisState.TotalForcedMasterFailovers);
									this.sentinel.ForceMasterFailover();
									Thread.Sleep(this.sentinel.WaitBetweenFailedHosts);
									redisServerRole = redisClient2.GetServerRole();
								}
								goto IL_16E;
							}
						}
						this.lastInvalidMasterHost = redisClient2.GetHostString();
						IL_16E:;
					}
				}
				if (redisServerRole != RedisServerRole.Master && RedisConfig.VerifyMasterConnections)
				{
					try
					{
						Stopwatch stopwatch = Stopwatch.StartNew();
						for (;;)
						{
							try
							{
								RedisEndpoint master2 = this.sentinel.GetMaster();
								using (RedisClient redisClient3 = this.ClientFactory(master2))
								{
									redisClient3.ReceiveTimeout = 5000;
									redisClient3.ConnectTimeout = this.sentinel.SentinelWorkerConnectTimeoutMs;
									if (redisClient3.GetServerRole() == RedisServerRole.Master)
									{
										this.lastValidMasterFromSentinelAt = DateTime.UtcNow;
										return this.ClientFactory(master2);
									}
									Interlocked.Increment(ref RedisState.TotalInvalidMasters);
								}
							}
							catch
							{
							}
							if (stopwatch.Elapsed > this.sentinel.MaxWaitBetweenFailedHosts)
							{
								break;
							}
							Thread.Sleep(this.sentinel.WaitBetweenFailedHosts);
						}
						throw new TimeoutException("Max Wait Between Sentinel Lookups Elapsed: {0}".Fmt(this.sentinel.MaxWaitBetweenFailedHosts.ToString()));
					}
					catch (Exception exception2)
					{
						RedisSentinelResolver.log.Error("Redis Master Host '{0}' is {1}. Resetting allHosts...".Fmt(config.GetHostString(), redisServerRole), exception2);
						List<RedisEndpoint> list = new List<RedisEndpoint>();
						List<RedisEndpoint> list2 = new List<RedisEndpoint>();
						RedisClient redisClient4 = null;
						foreach (RedisEndpoint redisEndpoint in this.allHosts)
						{
							try
							{
								using (RedisClient redisClient5 = this.ClientFactory(redisEndpoint))
								{
									redisClient5.ReceiveTimeout = 5000;
									redisClient5.ConnectTimeout = RedisConfig.HostLookupTimeoutMs;
									RedisServerRole serverRole = redisClient5.GetServerRole();
									if (serverRole != RedisServerRole.Master)
									{
										if (serverRole == RedisServerRole.Slave)
										{
											list2.Add(redisEndpoint);
										}
									}
									else
									{
										list.Add(redisEndpoint);
										if (redisClient4 == null)
										{
											redisClient4 = this.ClientFactory(redisEndpoint);
										}
									}
								}
							}
							catch
							{
							}
						}
						if (redisClient4 == null)
						{
							Interlocked.Increment(ref RedisState.TotalNoMastersFound);
							string message = "No master found in: " + string.Join(", ", this.allHosts.Map((RedisEndpoint x) => x.GetHostString()));
							RedisSentinelResolver.log.Error(message);
							throw new Exception(message);
						}
						this.ResetMasters(list);
						this.ResetSlaves(list2);
						return redisClient4;
					}
					return result;
				}
				return result;
			}
			return result;
		}

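The logic of the GetReadWriteHost() method is: prefer the master node returned by this.sentinel.GetMaster(). If GetMaster() returns null, fall back to an entry of the existing masters array, chosen by desiredIndex % this.masters.Length. Because the caller passes random.Next(100) as desiredIndex, the fallback is effectively a random pick.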
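
The fallback can be isolated into a few lines. In this sketch, HostSelection and PickReadWriteHost are hypothetical names, and plain strings stand in for RedisEndpoint.

```csharp
using System;

public static class HostSelection
{
    // sentinelMaster: the host reported by the sentinel, or null if it reported none.
    // masters:        the previously known master hosts.
    // desiredIndex:   any non-negative number; the modulo keeps it inside the array.
    public static string PickReadWriteHost(string sentinelMaster, string[] masters, int desiredIndex)
    {
        return sentinelMaster ?? masters[desiredIndex % masters.Length];
    }
}
```

For example, with two known masters and desiredIndex 7, a null sentinel answer selects index 7 % 2 = 1.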

Then enter the CreateRedisClient() method:

  • First, a RedisClient object is created through the this.ClientFactory() factory, which internally handles the counting and the new RedisClient() call. There is not much to it.
  • Then a second, short-lived client calls redisClient.GetServerRole() to confirm with the server that the connected node really is the master. If so, the client is returned to the caller. [If the query throws, the same host also failed on the previous attempt, and no valid master has been seen for longer than this.sentinel.WaitBeforeForcingMasterFailover, a failover is requested through this.sentinel.ForceMasterFailover().]
  • If the connected node is not the master and RedisConfig.VerifyMasterConnections is enabled, this.sentinel.GetMaster() is called repeatedly to look up the master, and a new RedisClient is instantiated for it.
  • If no master can be reached before this.sentinel.MaxWaitBetweenFailedHosts elapses, a TimeoutException is thrown and handled in the catch block, which probes every node in this.allHosts and resets the master and slave lists accordingly.
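
The "keep asking until a master is confirmed or the time budget runs out" step can be sketched generically. MasterLookup, WaitForMaster, and the two delegates are hypothetical; in the real code the lookup is this.sentinel.GetMaster() and the check is GetServerRole() on a temporary client.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class MasterLookup
{
    // lookup:   returns a candidate master host (hypothetical stand-in for asking the sentinel).
    // isMaster: verifies the candidate's role (hypothetical stand-in for a role query).
    public static string WaitForMaster(
        Func<string> lookup, Func<string, bool> isMaster, TimeSpan maxWait, TimeSpan pause)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        while (true)
        {
            try
            {
                string candidate = lookup();
                if (candidate != null && isMaster(candidate))
                {
                    return candidate;
                }
            }
            catch (Exception)
            {
                // A failed probe counts as "no master yet"; keep polling.
            }

            if (stopwatch.Elapsed > maxWait)
            {
                throw new TimeoutException("No master confirmed within " + maxWait);
            }
            Thread.Sleep(pause);
        }
    }
}
```

Swallowing probe errors inside the loop matches the empty catch in the original: a single failed lookup should not abort the search while there is still time left.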

Through the above process, a RedisClient connected to the master node is eventually obtained and returned to the caller, unless no master can be found at all, in which case an exception is thrown.

Several of the methods involved are both important and fairly complex. They are explained one by one below:

Analysis of the implementation principle of GetMaster() of the RedisSentinel class

The call site is simple, but a lot happens inside this method. The relevant source code of the RedisSentinel class is as follows:

		public RedisEndpoint GetMaster()
		{
			RedisSentinelWorker validSentinelWorker = this.GetValidSentinelWorker();
			RedisSentinelWorker obj = validSentinelWorker;
			RedisEndpoint result;
			lock (obj)
			{
				string masterHost = validSentinelWorker.GetMasterHost(this.masterName);
				if (this.ScanForOtherSentinels && DateTime.UtcNow - this.lastSentinelsRefresh > this.RefreshSentinelHostsAfter)
				{
					this.RefreshActiveSentinels();
				}
				result = ((masterHost != null) ? ((this.HostFilter != null) ? this.HostFilter(masterHost) : masterHost).ToRedisEndpoint(null) : null);
			}
			return result;
		}

		private RedisSentinelWorker GetValidSentinelWorker()
		{
			if (this.isDisposed)
			{
				throw new ObjectDisposedException(base.GetType().Name);
			}
			if (this.worker != null)
			{
				return this.worker;
			}
			RedisException innerException = null;
			while (this.worker == null && this.ShouldRetry())
			{
				try
				{
					this.worker = this.GetNextSentinel();
					this.GetSentinelInfo();
					this.worker.BeginListeningForConfigurationChanges();
					this.failures = 0;
					return this.worker;
				}
				catch (RedisException ex)
				{
					if (this.OnWorkerError != null)
					{
						this.OnWorkerError(ex);
					}
					innerException = ex;
					this.worker = null;
					this.failures++;
					Interlocked.Increment(ref RedisState.TotalFailedSentinelWorkers);
				}
			}
			this.failures = 0;
			Thread.Sleep(this.WaitBetweenFailedHosts);
			throw new RedisException("No Redis Sentinels were available", innerException);
		}
		private RedisSentinelWorker GetNextSentinel()
		{
			object obj = this.oLock;
			RedisSentinelWorker result;
			lock (obj)
			{
				if (this.worker != null)
				{
					this.worker.Dispose();
					this.worker = null;
				}
				int num = this.sentinelIndex + 1;
				this.sentinelIndex = num;
				if (num >= this.SentinelEndpoints.Length)
				{
					this.sentinelIndex = 0;
				}
				result = new RedisSentinelWorker(this, this.SentinelEndpoints[this.sentinelIndex])
				{
					OnSentinelError = new Action<Exception>(this.OnSentinelError)
				};
			}
			return result;
		}
		private void OnSentinelError(Exception ex)
		{
			if (this.worker != null)
			{
				RedisSentinel.Log.Error("Error on existing SentinelWorker, reconnecting...");
				if (this.OnWorkerError != null)
				{
					this.OnWorkerError(ex);
				}
				this.worker = this.GetNextSentinel();
				this.worker.BeginListeningForConfigurationChanges();
			}
		}

First, a RedisSentinelWorker object is obtained through GetValidSentinelWorker(). This method contains the retry logic and ultimately assigns the this.worker field, a RedisSentinelWorker instance, via the this.GetNextSentinel() method.

The GetNextSentinel() method takes a lock, disposes the current this.worker if there is one, advances to the next sentinel endpoint in round-robin order (this.sentinelIndex is incremented and wraps back to 0), and instantiates a new RedisSentinelWorker.
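
The endpoint selection inside GetNextSentinel() reduces to an increment-and-wrap under a lock. In this sketch, RoundRobin is a hypothetical name, strings stand in for sentinel endpoints, and the starting index of -1 is an assumption, since the field's initial value is not shown in the code above.

```csharp
using System;

public sealed class RoundRobin
{
    private readonly object sync = new object();
    private readonly string[] endpoints;
    private int index = -1;   // assumed start value, so the first call yields endpoints[0]

    public RoundRobin(string[] endpoints)
    {
        this.endpoints = endpoints;
    }

    public string Next()
    {
        lock (sync)
        {
            index++;
            if (index >= endpoints.Length)
            {
                index = 0;    // wrap around to the first endpoint
            }
            return endpoints[index];
        }
    }
}
```

Each failure therefore moves on to the next configured sentinel instead of retrying the one that just failed.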

GetMaster() then locks validSentinelWorker and executes string masterHost = validSentinelWorker.GetMasterHost(this.masterName);

The code of the corresponding RedisSentinelWorker class is as follows:

		internal string GetMasterHost(string masterName)
		{
			string result;
			try
			{
				result = this.GetMasterHostInternal(masterName);
			}
			catch (Exception obj)
			{
				if (this.OnSentinelError != null)
				{
					this.OnSentinelError(obj);
				}
				result = null;
			}
			return result;
		}
		private string GetMasterHostInternal(string masterName)
		{
			List<string> list = this.sentinelClient.SentinelGetMasterAddrByName(masterName);
			if (list.Count <= 0)
			{
				return null;
			}
			return this.SanitizeMasterConfig(list);
		}
		public void Dispose()
		{
			new IDisposable[]
			{
				this.sentinelClient,
				this.sentinePubSub
			}.Dispose(RedisSentinelWorker.Log);
		}

Note that in the GetMasterHost() method, an exception triggers this object's OnSentinelError callback, which, as the name suggests, handles sentinel errors. Searching the source shows that only the GetNextSentinel() method assigns a handler to OnSentinelError: the private void OnSentinelError(Exception ex) method in RedisSentinel. That method logs the error, invokes this.OnWorkerError, and then calls GetNextSentinel() to reassign the this.worker field.

Note: the Dispose() method simply disposes this.sentinelClient and this.sentinePubSub in turn.
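
The interplay between the worker and its owner can be sketched as follows: the worker catches the error, notifies its owner, and returns null, while the owner reacts by swapping in a fresh worker. Worker and WorkerSupervisor are hypothetical names, and the query delegate stands in for the sentinel command.

```csharp
using System;

// Hypothetical worker: turns any query failure into a callback plus a null result.
public sealed class Worker
{
    public Action<Exception> OnError;
    private readonly Func<string> query;

    public Worker(Func<string> query)
    {
        this.query = query;
    }

    public string GetMasterHost()
    {
        try
        {
            return query();
        }
        catch (Exception ex)
        {
            if (OnError != null) OnError(ex);
            return null;
        }
    }
}

// Hypothetical owner: replaces the current worker whenever it reports an error.
public sealed class WorkerSupervisor
{
    private readonly Func<Func<string>> nextQuery;

    public Worker Current { get; private set; }
    public int Replacements { get; private set; }

    public WorkerSupervisor(Func<Func<string>> nextQuery)
    {
        this.nextQuery = nextQuery;
        Current = NewWorker();
    }

    private Worker NewWorker()
    {
        Worker worker = new Worker(nextQuery());
        worker.OnError = ex =>
        {
            Current = NewWorker();
            Replacements++;
        };
        return worker;
    }
}
```

The caller that received null from the failed worker still has to cope with the missing answer; the replacement only helps the next call.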

Related functions and implementation of the RedisNativeClient class

GetMasterHostInternal() in turn calls the SentinelGetMasterAddrByName() method of the RedisNativeClient class:

Taken together, the methods below send the sentinel query command to the server over a socket and convert the reply into a list of strings, which the callers then turn into the required RedisEndpoint.

The SendReceive() method also implements socket connection, retries, backoff between retries, and timeout control.

		public List<string> SentinelGetMasterAddrByName(string masterName)
		{
			List<byte[]> list = new List<byte[]>
			{
				Commands.Sentinel,
				Commands.GetMasterAddrByName,
				masterName.ToUtf8Bytes()
			};
			return this.SendExpectMultiData(list.ToArray()).ToStringList();
		}
		protected byte[][] SendExpectMultiData(params byte[][] cmdWithBinaryArgs)
		{
			return this.SendReceive<byte[][]>(cmdWithBinaryArgs, new Func<byte[][]>(this.ReadMultiData), (this.Pipeline != null) ? new Action<Func<byte[][]>>(this.Pipeline.CompleteMultiBytesQueuedCommand) : null, false) ?? TypeConstants.EmptyByteArrayArray;
		}

		protected T SendReceive<T>(byte[][] cmdWithBinaryArgs, Func<T> fn, Action<Func<T>> completePipelineFn = null, bool sendWithoutRead = false)
		{
			int num = 0;
			Exception ex = null;
			DateTime utcNow = DateTime.UtcNow;
			T t;
			for (;;)
			{
				try
				{
					this.TryConnectIfNeeded();
					if (this.socket == null)
					{
						throw new RedisRetryableException("Socket is not connected");
					}
					if (num == 0)
					{
						this.WriteCommandToSendBuffer(cmdWithBinaryArgs);
					}
					if (this.Pipeline == null)
					{
						this.FlushSendBuffer();
					}
					else if (!sendWithoutRead)
					{
						if (completePipelineFn == null)
						{
							throw new NotSupportedException("Pipeline is not supported.");
						}
						completePipelineFn(fn);
						t = default(T);
						t = t;
						break;
					}
					T t2 = default(T);
					if (fn != null)
					{
						t2 = fn();
					}
					if (this.Pipeline == null)
					{
						this.ResetSendBuffer();
					}
					if (num > 0)
					{
						Interlocked.Increment(ref RedisState.TotalRetrySuccess);
					}
					Interlocked.Increment(ref RedisState.TotalCommandsSent);
					t = t2;
				}
				catch (Exception ex2)
				{
					RedisRetryableException ex3 = ex2 as RedisRetryableException;
					if ((ex3 == null && ex2 is RedisException) || ex2 is LicenseException)
					{
						this.ResetSendBuffer();
						throw;
					}
					Exception ex4 = ex3 ?? this.GetRetryableException(ex2);
					if (ex4 == null)
					{
						throw this.CreateConnectionError(ex ?? ex2);
					}
					if (ex == null)
					{
						ex = ex4;
					}
					if (!(DateTime.UtcNow - utcNow < this.retryTimeout))
					{
						if (this.Pipeline == null)
						{
							this.ResetSendBuffer();
						}
						Interlocked.Increment(ref RedisState.TotalRetryTimedout);
						throw this.CreateRetryTimeoutException(this.retryTimeout, ex);
					}
					Interlocked.Increment(ref RedisState.TotalRetryCount);
					Thread.Sleep(RedisNativeClient.GetBackOffMultiplier(++num));
					continue;
				}
				break;
			}
			return t;
		}
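
The retry structure of SendReceive() can be reduced to the sketch below: retry only retryable errors, remember the first error, sleep with a growing delay, and give up once the retry window has elapsed. Retry, Execute, and BackoffMs are hypothetical names; in particular, the backoff formula here is an assumption, because the body of GetBackOffMultiplier() is not shown in this article.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Assumed backoff: doubles with each attempt and is capped at one second.
    public static int BackoffMs(int attempt)
    {
        return Math.Min(10 * (1 << Math.Min(attempt, 10)), 1000);
    }

    public static T Execute<T>(Func<T> operation, Func<Exception, bool> isRetryable, TimeSpan retryTimeout)
    {
        DateTime start = DateTime.UtcNow;
        Exception first = null;
        int attempt = 0;
        while (true)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                if (!isRetryable(ex))
                {
                    throw;                      // non-retryable errors surface immediately
                }
                if (first == null)
                {
                    first = ex;                 // keep the first error for the final report
                }
                if (DateTime.UtcNow - start >= retryTimeout)
                {
                    throw new TimeoutException("Retry window elapsed", first);
                }
                Thread.Sleep(BackoffMs(++attempt));
            }
        }
    }
}
```

One consequence worth keeping in mind when reading production logs: a command that fails on a dead socket does not fail fast; it can block the calling thread for up to the whole retry window before the timeout exception appears.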

Summary

This article walked through how Redis connections are created and obtained in ServiceStack.Redis, which gave me a deeper understanding of the SDK's internal mechanisms. With this foundation, it is easier to analyze Redis SDK failures in the production environment.

Original address: https://www.cnblogs.com/chen943354/p/15913197.html 
