Intelligent Multi-Path TCP Congestion Control for Video Streaming in Internet of Deep Space Things Communication|Literature Reading|Literature Analysis and Learning|Congestion Control|MPTCP|SVC

Foreword

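First, let me recommend a few columns that are packed with useful content!

The first is a collection of my high-quality blog posts. The posts in this column are the ones I put the most care into, and I hope they help you.

High-quality blog collection: https://blog.csdn.net/yu_cblog/category_12379430.html?spm=1001.2014.3001.5482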


Intelligent Multi-Path TCP Congestion Control for video streaming in Internet of Deep Space Things communication

Summary

Literature Analysis and Learning

Today I will walk you through this paper: intelligent multi-path TCP congestion control for video streams in deep space IoT communication. I noticed that many of the students who presented before me also talked about congestion control algorithms. Mine is a congestion control algorithm too, but the background is different.

To help everyone follow what comes next, let's be clear about the scope. First, this paper is about a congestion control algorithm based on TCP, so it operates at the transport layer. Second, it uses MPTCP, that is, multipath TCP. Third, it targets video streaming. Fourth, it targets deep space communication. Some of these terms may not be covered in the "Top-Down" textbook, so I will give some background first.

Background knowledge

What is deep space communication?

Deep space communication refers to communication with spacecraft far from Earth. It is mainly used in deep space exploration missions, such as Mars and Jupiter exploration. The distances involved are very long. For example, once Musk's Starship reaches Mars, it will need deep space communication to stay in touch with Earth.

What difficulties does deep space communication encounter?

TCP as we learned it in class relies on windows and ACKs, and it performs poorly on deep space links, which have extremely high propagation delays and high link error rates. A lot of earlier research has tried to solve this problem. However, these methods have the following limitations:

(1) Rule-based methods use a fixed set of rules to handle every situation, so they cannot always adapt flexibly.

(2) Information about link conditions is already old by the time it reaches the source after a long RTT, so congestion decisions based on it may not be optimal.

(3) Single-path TCP leads to long RTT and lower link utilization.

Article Solution

So the author proposes a method in this paper: DQN for SVC-based CC in MPTCP. What does this mean?

DQN stands for Deep Q-Network, SVC is a video coding technique, CC is congestion control, and MPTCP is the multipath TCP protocol.

In short: plain TCP does not perform well here, so the author first replaces TCP with MPTCP, then encodes the video with SVC on top of that, and finally adds a deep Q-network, so that congestion control for deep space video transmission works well. The paper gives a complete model diagram of the proposed method; below I explain each module and its function in detail.

SVC Scalable Video Coding Technology

Scalable Video Coding (SVC) is a video coding technology that allows video resolution and frame rate to adapt dynamically to different transmission networks and receiving devices. SVC divides the original video stream into multiple coding layers, each with a different bit rate and resolution, so the receiving device can choose the layers that best suit its bandwidth and processing power.

SVC first divides a block of data (that is, the video frames within a period of time) into M parts: one base layer and M-1 enhancement layers. Then we define the weight function.

 

Here Ch_{i,t} is the i-th block in the buffer at slot t, and L(·) is a function that returns the layer number of a block. W is the weight of the block, with a default value of 100, and M is the total number of layers, set to 3. The larger L is, the smaller (M - L) is, so this is a decreasing function: by definition, the weight increases as the layer number decreases. (I saw that some groups did not cover this earlier, so it is fine to skip it.)
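To make the "decreasing in the layer number" property concrete, here is a minimal sketch. The weight formula itself is not reproduced in this post, so the linear form `W * (M - L)` and the 0-based layer indexing below are assumptions made purely for illustration, and `layer_weight` is a hypothetical helper name. Only the values W = 100 and M = 3 come from the text.

```python
# Assumed form of the layer weight: W * (M - L). The post does not reproduce
# the paper's formula, so this linear form is an illustration only.
W = 100  # base block weight (default value given in the post)
M = 3    # total number of SVC layers (value given in the post)


def layer_weight(layer: int, base_weight: int = W, total_layers: int = M) -> int:
    """Return a weight that shrinks as the layer index grows.

    Layers are assumed to be indexed 0 (base layer) .. total_layers - 1.
    """
    if not 0 <= layer < total_layers:
        raise ValueError("layer index out of range")
    return base_weight * (total_layers - layer)


# Base layer first, highest enhancement layer last.
weights = [layer_weight(layer) for layer in range(M)]
```

Under this assumed form the base layer always gets the largest weight, which matches the idea that losing the base layer hurts playback the most.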

SVC is applied here to address HoL blocking (I remember a group mentioned this last week too). HoL (Head-of-Line) blocking means that when the packet at the head of a queue is held up, the packets behind it cannot be processed either, so the whole queue stalls.

With SVC, the video is carried as several layers, and the receiving device can choose which layers to decode as needed. If the packets of a high-resolution layer are delayed, the receiver can immediately fall back to a lower-resolution layer, which avoids the HoL blocking problem. Without SVC encoding, the receiving host can only block and wait for the missing data.
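The fallback behaviour can be sketched in a few lines. This is only an illustration of the idea, not the paper's receiver logic: `decodable_layers` is a hypothetical helper, layer 0 is taken to be the base layer, and each enhancement layer is assumed to depend on all layers below it.

```python
def decodable_layers(arrived: set[int], total_layers: int = 3) -> list[int]:
    """Return the layers the receiver can decode right now.

    Layer 0 is the base layer. Each enhancement layer is assumed to depend on
    every layer below it, so decoding stops at the first missing layer.
    """
    usable = []
    for layer in range(total_layers):
        if layer not in arrived:
            break  # a missing layer makes every higher layer unusable
        usable.append(layer)
    return usable
```

For example, if layers 0 and 2 have arrived but layer 1 is still in flight, the receiver can play the base layer immediately instead of stalling; if the base layer itself is missing, nothing can be played yet.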

Congestion Control on MPTCP

This part covers MPTCP.

What is MPTCP?
MPTCP (Multipath TCP) is a multipath transport protocol. The difference between it and TCP is that MPTCP combines multiple network paths in one connection.

So in essence, how is MPTCP realized?

MPTCP essentially splits the data handed down from the upper layer into multiple parts (similar in spirit to IP fragmenting packets that exceed the MTU, although the implementation details differ) and transmits the parts over different TCP subflows.

So what is the advantage of MPTCP?

We are all familiar with TCP's congestion control algorithm: additive increase, multiplicative decrease (AIMD). When the network is congested, TCP halves the congestion window (or, after a timeout, drops back to slow start) and then gradually grows the sending window again.
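As a quick refresher, the additive-increase, multiplicative-decrease rule can be written as a one-step update. This is a simplified sketch: the window is counted in segments, one update happens per round trip, and slow start and timeouts are left out. The function name `aimd_step` and its default parameters are illustrative choices.

```python
def aimd_step(cwnd: float, loss: bool, increase: float = 1.0,
              decrease: float = 0.5, min_cwnd: float = 1.0) -> float:
    """One AIMD update per round trip, with cwnd measured in segments."""
    if loss:
        return max(min_cwnd, cwnd * decrease)  # multiplicative decrease
    return cwnd + increase                     # additive increase


cwnd = 10.0
trace = []
for loss in [False, False, True, False]:
    cwnd = aimd_step(cwnd, loss)
    trace.append(cwnd)
# trace is now [11.0, 12.0, 6.0, 7.0]
```

The single loss event throws away half the window, and on a link with a very long RTT it takes many round trips to earn it back one segment at a time.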

But congestion control in MPTCP is different! When MPTCP detects that one path is congested, it lowers the sending rate on that path by shrinking that path's window, so the connection as a whole does not lose half its rate the way a single TCP flow does. Because MPTCP supports multipath transmission, it can also raise the sending rate on the other paths and use several paths together to achieve higher bandwidth utilization. For example, suppose MPTCP uses two paths at the same time. When one of them becomes congested, MPTCP can increase the rate on the other path to keep the overall rate high. Single-path TCP cannot do this: it can only halve its congestion window and slowly ramp up again.
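The two-path example can be checked with simple arithmetic. The sketch below assumes that only the congested subflow shrinks its window and that the others are left untouched; real MPTCP congestion control couples the subflows in more involved ways, so treat `total_window_after_loss` as a hypothetical helper for intuition only.

```python
def total_window_after_loss(windows: list[float], congested: int,
                            decrease: float = 0.5) -> float:
    """Shrink only the congested subflow's window and return the new sum."""
    updated = [w * decrease if i == congested else w
               for i, w in enumerate(windows)]
    return sum(updated)


# The same total window of 20 segments, carried by one flow or by two subflows.
single_path = total_window_after_loss([20.0], congested=0)
two_paths = total_window_after_loss([10.0, 10.0], congested=0)
```

With one flow the total drops from 20 to 10, while with two subflows it drops only to 15, and the uncongested path is still free to grow.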

So now, the final question is, how should this MPTCP adjust the transmission rate on each path when it is congested?

For example, by how much should one path be reduced, and by how much should another be increased? The author uses reinforcement learning (RL) to answer this quantitative question.

Reinforcement learning requires a reward function. Before we can define it, we first need to define the state.

In this project, the state is computed at every time step t. The states are defined as follows.

Here s_t^i is the state of the i-th TCP path in the time interval starting at time t. Cw_t^i is the congestion window size of that path at that time, the second component is the packet loss rate, the third is the available bandwidth, and the fourth is the delay.
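One way to picture the four-component state is as a small record per path that gets flattened into a single input vector. This is a sketch under assumptions: the class name `PathState`, the helper `overall_state`, the field names, and the sample numbers are all hypothetical, and the units are left unspecified because the post does not give them.

```python
from dataclasses import dataclass, astuple


@dataclass
class PathState:
    cwnd: float        # congestion window size of the subflow
    loss_rate: float   # packet loss rate observed on the subflow
    bandwidth: float   # available bandwidth of the path
    delay: float       # delay measured on the path


def overall_state(paths: list[PathState]) -> list[float]:
    """Concatenate the per-path 4-tuples into one flat input vector."""
    flat = []
    for p in paths:
        flat.extend(astuple(p))
    return flat


# Two paths with made-up values, purely for illustration.
state = overall_state([
    PathState(cwnd=10, loss_rate=0.01, bandwidth=2.0, delay=600.0),
    PathState(cwnd=8, loss_rate=0.05, bandwidth=1.0, delay=900.0),
])
```

With two paths the flattened vector has eight entries, four per path, in the same order as the components listed above.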

The overall state is the collection of the states of all paths. What we expect is that, once this state is fed into the reinforcement learning network, the network can tell us specifically how MPTCP should be adjusted. So we define the "action", which means how to adjust it. We define the action of MPTCP and SVC over a period starting at time t as A_i.

 Next, we define the reward function.

 

Here W(Ch_{i,t}) is the layer weight function, Lr_{i,t} is the loss rate of the i-th subflow, and Th_{i,t} is the achieved throughput. The greater the throughput, the greater the reward. The product of the layer weight function and the packet loss rate term can be understood as a success rate for packets reaching the remote host, and if it increases, the reward increases too.

Next, we use the variables defined above to walk through how the algorithm runs.

 

 

First, to be clear, what is fed into the network is the tuple [s_i, a_i, r_i, s_{i+1}], that is, state + action + reward + next state.

(Lines 1-3) The algorithm randomly initializes Q and the replay buffer. What is this buffer for? It can be understood as a way to train on the same experience repeatedly. After we feed a state to the network, the state may change, and its reward changes with it. We put the state that has already gone through the network into a waiting pool (not exactly a queue, because it is sampled randomly rather than first-in-first-out) so that it can be fed to the network again. By feeding the network many times, we can find the action with the maximum reward for each state.

(Lines 5-9) The system collects the congestion window size, the loss rate of each subflow, the bandwidth utilization, and the RTT of each subflow from the environment.

(Lines 11-13) The system then selects the action expected to return the maximum reward and performs it to change the congestion window size. (I looked for a long time, but the article does not say exactly how the window is changed.)

(Lines 14-15) After performing the action, the system stores the state, action, reward, and next state in the replay buffer and samples from it at random.

(Lines 16-19) Finally, gradient descent (backpropagation, as I understand it) is performed at each learning step to update the network parameters.
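The replay buffer and the action selection can be sketched with the standard building blocks of Q-learning. This is a generic sketch, not the paper's algorithm: the class and function names are hypothetical, epsilon-greedy exploration is a common choice that the post does not confirm, and the neural network itself is omitted (Q-values are passed in as plain lists).

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity: int = 10_000):
        self.buffer = deque(maxlen=capacity)  # oldest entries drop out first

    def push(self, state, action, reward, next_state) -> None:
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size: int):
        # Uniform random sampling, not first-in-first-out.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))


def epsilon_greedy(q_values: list[float], epsilon: float = 0.1) -> int:
    """Pick a random action with probability epsilon, else the best-valued one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def td_target(reward: float, next_q_values: list[float], gamma: float = 0.99) -> float:
    """Standard Q-learning target: r + gamma * max over a' of Q(s', a')."""
    return reward + gamma * max(next_q_values)
```

In a full DQN, each sampled transition would be turned into a `td_target`, and gradient descent would pull the network's prediction for the stored state and action toward that target.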

At this point we can obtain the optimal action for each state (the 1×4 vector described above), that is, how MPTCP should set the transmission rate of each path.

Performance analysis

That covers the author's model. So how does it perform? Because of the particular nature of the deep space environment, the communication environment in this experiment is produced by computer simulation rather than by real-world experiments.

To evaluate the performance of the proposed DQN-based MPTCP CC scheme, the author sets up two asymmetric links with different RTTs and loss rates. Detailed link settings are shown in Table 1.

Now let's analyze the performance of this DQN for SVC-based CC in MPTCP.

Let's look directly at the graph analysis.

 

Fig. 4 shows the percentage of layers that are successfully received and played when streaming data is sent over a simulated interplanetary backhaul link.

"Proposed" is the method proposed by the author, and the others are the methods used for comparison.

Whether we look at the reception rate of the SVC base layer (BL), the enhancement layer (EL), or BL+EL together, the proposed method far exceeds the other models. The BL+EL figure can be understood as the probability that the complete video data reaches the remote host. For the proposed method these values are all close to 100%, which I think is a bit exaggerated.

Then we mentioned earlier that the author introduced SVC to solve the HoL blocking problem. Now let's see how the solution works.

 

In this figure, the vertical axis is the probability of HoL blocking and the horizontal axis is the RTT asymmetry between the links. Asymmetry is defined as follows.

Performance is also evaluated as a function of the RTT asymmetry of the links. The link asymmetry is defined as a relative error.

As shown in Fig. 5(a), HoL blocking clearly increases as the asymmetry increases. At the maximum asymmetry of 60%, the proposed model showed only 12% HoL blocking, while the other models were mostly above 50%.

The blue one is the model proposed by the author, which is far ahead of other models.

Then there is a graph that measures the time it takes to transfer a complete stream file.

 

Obviously, the author's method has a clear advantage. Its performance is 20%, 21% and 19% better than that of TCP BBR, DRL-TCP and QLE-DS respectively.

What I think can be improved

Judging from the figures the authors present, the model looks completely SOTA, with no room for improvement. In fact, I think it still has plenty of room for improvement.

Here I mainly put forward some ideas for improving MPTCP.

MPTCP already supports multipath transmission. However, in a deep space network the transmission paths are complex and unstable, so MPTCP still needs further improvement to adapt to different transmission scenarios. Below are some directions for improvement, starting with security.

1. Strengthen security: deep space transmission has important security requirements, so measures such as data encryption, authentication, and tamper protection are needed. MPTCP could add a further layer of encapsulation to protect the data and ensure the security and integrity of transmission.

2. I feel the experimental part of the paper is a bit thin. For example, it could explore how the SVC layer count M affects the results.

3. In addition, the experiments so far simulate only two links. Deep space communication conditions are very complex, so I do not think they are well represented, and the paper needs supplementary experiments that model them more realistically.


Origin blog.csdn.net/Yu_Cblog/article/details/132041117