1 Overview of ACK response
While processing an incoming TCP segment, the kernel calls tcp_ack() for ACK-related processing whenever the segment carries ACK information. In practice almost every segment does, because carrying an ACK adds no overhead: the acknowledgment field is always present in the TCP header. The exceptions are special segments such as RST, which tears the connection down directly. This note looks at how TCP processes an incoming ACK.
2 ACK processing tcp_ack()
2.1 Parameter flag
tcp_ack() takes an important parameter, flag, which is carried through the whole of ACK processing. It accumulates everything learned from the input segment, such as whether it carries data, whether it is a duplicate ACK, and whether it carries SACK information. Later steps such as congestion control and RTT sampling consult it. flag may be any combination of the following values.
flag | value | description |
---|---|---|
FLAG_DATA | 0x01 | ACK segment carries data |
FLAG_WIN_UPDATE | 0x02 | The ACK segment updated the send window: the left edge moved, the right edge moved (the advertised window grew), or both |
FLAG_DATA_ACKED | 0x04 | ACK segment confirms new data |
FLAG_RETRANS_DATA_ACKED | 0x08 | Some of the data acknowledged by the ACK segment had been retransmitted |
FLAG_SYN_ACKED | 0x10 | ACK segment confirms SYN segment |
FLAG_DATA_SACKED | 0x20 | The ACK segment carries new SACK information, that is, it SACKs data that was not SACKed before |
FLAG_ECE | 0x40 | The ACK segment carries the ECE flag |
FLAG_DATA_LOST | 0x80 | SACK detected data loss |
FLAG_SLOWPATH | 0x100 | The ACK segment is processed by the slow path |
FLAG_ONLY_ORIG_SACKED | 0x200 | The SACK covers only original (non-retransmitted) segments sent before the RTO |
FLAG_SND_UNA_ADVANCED | 0x400 | The ACK segment has updated snd_una, that is, after receiving the ACK, the left boundary of the sending window can be moved to the right |
FLAG_DSACKING_ACK | 0x800 | The ACK segment contains DSACK information |
FLAG_NONHEAD_RETRANS_ACKED | 0x1000 | Retransmitted data other than the head of the retransmission queue was acknowledged |
FLAG_SACK_RENEGING | 0x2000 | The peer has discarded a segment it previously acknowledged via SACK (reneging, which the protocol permits) |
In addition, some basic flag combinations are defined:
#define FLAG_ACKED (FLAG_DATA_ACKED|FLAG_SYN_ACKED)
// Used to determine whether the input segment is a duplicate
#define FLAG_NOT_DUP (FLAG_DATA|FLAG_WIN_UPDATE|FLAG_ACKED)
#define FLAG_CA_ALERT (FLAG_DATA_SACKED|FLAG_ECE)
#define FLAG_FORWARD_PROGRESS (FLAG_ACKED|FLAG_DATA_SACKED)
#define FLAG_ANY_PROGRESS (FLAG_FORWARD_PROGRESS|FLAG_SND_UNA_ADVANCED)
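To make the combinations concrete, here is a minimal standalone sketch in plain userspace C (not kernel code). It copies a few flag values from the table and the macros above, then shows how a bitmask test reads in practice. The helper names ack_is_duplicate() and ack_made_progress() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values copied from the table above. */
#define FLAG_DATA             0x01
#define FLAG_WIN_UPDATE       0x02
#define FLAG_DATA_ACKED       0x04
#define FLAG_SYN_ACKED        0x10
#define FLAG_DATA_SACKED      0x20
#define FLAG_ECE              0x40
#define FLAG_SND_UNA_ADVANCED 0x400

/* Combinations copied from the definitions above. */
#define FLAG_ACKED            (FLAG_DATA_ACKED | FLAG_SYN_ACKED)
#define FLAG_NOT_DUP          (FLAG_DATA | FLAG_WIN_UPDATE | FLAG_ACKED)
#define FLAG_CA_ALERT         (FLAG_DATA_SACKED | FLAG_ECE)
#define FLAG_FORWARD_PROGRESS (FLAG_ACKED | FLAG_DATA_SACKED)
#define FLAG_ANY_PROGRESS     (FLAG_FORWARD_PROGRESS | FLAG_SND_UNA_ADVANCED)

/* Hypothetical helper: the segment counts as a duplicate when it carried
 * no data, updated no window, and acknowledged nothing new. */
bool ack_is_duplicate(int flag)
{
    assert(flag >= 0); /* flag is a non-negative bitmask */
    return !(flag & FLAG_NOT_DUP);
}

/* Hypothetical helper: true when the ACK moved the connection forward
 * in any way (new data or SYN acknowledged, new SACK, snd_una advanced). */
bool ack_made_progress(int flag)
{
    assert(flag >= 0);
    return (flag & FLAG_ANY_PROGRESS) != 0;
}
```

Note that a segment whose only contribution is new SACK information still counts as a duplicate under FLAG_NOT_DUP, even though it counts as forward progress under FLAG_FORWARD_PROGRESS.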
2.2 tcp_ack() processing flow
At its core, tcp_ack() does three things:
- Update the send window;
- Remove acknowledged data (including retransmitted data) from the send queue and take an RTT sample;
- Perform congestion control.
/* This routine deals with incoming acks, but not outgoing ones. */
static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
{
    struct inet_connection_sock *icsk = inet_csk(sk);
    struct tcp_sock *tp = tcp_sk(sk);
    // Lowest sequence number in the TCB that has not been acknowledged yet
    u32 prior_snd_una = tp->snd_una;
    // Sequence number of the ACK segment
    u32 ack_seq = TCP_SKB_CB(skb)->seq;
    // Acknowledgment number of the ACK segment
    u32 ack = TCP_SKB_CB(skb)->ack_seq;
    u32 prior_in_flight;
    u32 prior_fackets;
    int prior_packets;
    int frto_cwnd = 0;

    /* If the ack is newer than sent or older than previous acks
     * then we can probably ignore it.
     */
    // The ACK acknowledges data that has not been sent yet; it is meaningless, so return
    if (after(ack, tp->snd_nxt))
        goto uninteresting_ack;
    // This acknowledgment number is older than snd_una. It may be a stale duplicate,
    // or simply an ACK segment that was delayed in the network.
    // Such an ACK may still carry valid SACK information
    if (before(ack, prior_snd_una))
        goto old_ack;
    // Reaching here means the acknowledgment number is within the expected range [snd_una, snd_nxt].
    // If it acknowledges new data, set FLAG_SND_UNA_ADVANCED;
    // the if test excludes the case ack == prior_snd_una
    if (after(ack, prior_snd_una))
        flag |= FLAG_SND_UNA_ADVANCED;
    // Related to the tcp_abc feature
    if (sysctl_tcp_abc) {
        if (icsk->icsk_ca_state < TCP_CA_CWR)
            tp->bytes_acked += ack - prior_snd_una;
        else if (icsk->icsk_ca_state == TCP_CA_Loss)
            /* we assume just one segment left network */
            tp->bytes_acked += min(ack - prior_snd_una, tp->mss_cache);
    }
    prior_fackets = tp->fackets_out;
    prior_in_flight = tcp_packets_in_flight(tp);
    // Update the send window; the fast path and the slow path are handled separately
    if (!(flag & FLAG_SLOWPATH) && after(ack, prior_snd_una)) {
        /* Window is constant, pure forward advance.
         * No more checks are required.
         * Note, we use the fact that SND.UNA>=SND.WL2.
         */
        // Record the sequence number of the segment that most recently updated
        // the send window, i.e. tp->snd_wl1 = ack_seq
        tcp_update_wl(tp, ack, ack_seq);
        // Advance the left edge of the send window
        tp->snd_una = ack;
        // Mark the send window as updated
        flag |= FLAG_WIN_UPDATE;
        // Notify the congestion control algorithm of a CA_EVENT_FAST_ACK event
        tcp_ca_event(sk, CA_EVENT_FAST_ACK);
        NET_INC_STATS_BH(LINUX_MIB_TCPHPACKS);
    } else {
        // Slow path
        // The ACK segment also carries data, so set FLAG_DATA
        if (ack_seq != TCP_SKB_CB(skb)->end_seq)
            flag |= FLAG_DATA;
        else
            NET_INC_STATS_BH(LINUX_MIB_TCPPUREACKS);
        // Update the send window
        flag |= tcp_ack_update_window(sk, skb, ack, ack_seq);
        // SACK processing
        if (TCP_SKB_CB(skb)->sacked)
            flag |= tcp_sacktag_write_queue(sk, skb, prior_snd_una);
        // ECN processing
        if (TCP_ECN_rcv_ecn_echo(tp, tcp_hdr(skb)))
            flag |= FLAG_ECE;
        // Notify the congestion control algorithm of a CA_EVENT_SLOW_ACK event
        tcp_ca_event(sk, CA_EVENT_SLOW_ACK);
    }
    // Clear the soft error
    sk->sk_err_soft = 0;
    // Update the timestamp of the most recently received ACK segment
    tp->rcv_tstamp = tcp_time_stamp;
    // If no segments were awaiting acknowledgment, the retransmission queue and
    // congestion control processing below are unnecessary. In that case only the
    // persist timer needs attention, because a probe segment may have been sent earlier
    prior_packets = tp->packets_out;
    if (!prior_packets)
        goto no_queue;
    // Remove acknowledged segments from the retransmission queue and take an RTT sample
    flag |= tcp_clean_rtx_queue(sk, prior_fackets);
    // F-RTO algorithm
    if (tp->frto_counter)
        frto_cwnd = tcp_process_frto(sk, flag);
    /* Guarantee sacktag reordering detection against wrap-arounds */
    if (before(tp->frto_highmark, tp->snd_una))
        tp->frto_highmark = 0;
    // Congestion control
    if (tcp_ack_is_dubious(sk, flag)) {
        /* Advance CWND, if state allows this. */
        if ((flag & FLAG_DATA_ACKED) && !frto_cwnd &&
            tcp_may_raise_cwnd(sk, flag))
            tcp_cong_avoid(sk, ack, prior_in_flight);
        tcp_fastretrans_alert(sk, prior_packets - tp->packets_out,
                              flag);
    } else {
        if ((flag & FLAG_DATA_ACKED) && !frto_cwnd)
            tcp_cong_avoid(sk, ack, prior_in_flight);
    }
    if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
        dst_confirm(sk->sk_dst_cache);
    return 1;

no_queue:
    // Nothing was awaiting acknowledgment but an ACK arrived; handle the persist timer
    icsk->icsk_probes_out = 0;
    /* If this ack opens up a zero window, clear backoff. It was
     * being used to time the probes, and is probably far higher than
     * it needs to be for normal retransmission.
     */
    if (tcp_send_head(sk))
        tcp_ack_probe(sk);
    return 1;

old_ack:
    // Although this ACK is old, any SACK information it carries must still be applied
    if (TCP_SKB_CB(skb)->sacked)
        tcp_sacktag_write_queue(sk, skb, prior_snd_una);

uninteresting_ack:
    SOCK_DEBUG(sk, "Ack %u out of %u:%u\n", ack, tp->snd_una, tp->snd_nxt);
    return 0;
}
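The first checks in tcp_ack() sort the acknowledgment number into three cases using after() and before(), which must keep working when 32-bit sequence numbers wrap around. The standalone sketch below illustrates that idea in userspace C. The comparison follows the usual signed-difference trick; seq_before(), seq_after(), classify_ack() and the enum are hypothetical names chosen for this example, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Wraparound-safe comparison: the unsigned difference is reinterpreted
 * as a signed 32-bit value, so "a is before b" holds when a trails b by
 * less than 2^31 (assumes the usual two's-complement conversion). */
int seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
int seq_after(uint32_t a, uint32_t b)  { return seq_before(b, a); }

enum ack_class {
    ACK_TOO_NEW,   /* acknowledges data never sent: the uninteresting_ack case */
    ACK_OLD,       /* below snd_una: the old_ack case */
    ACK_IN_WINDOW  /* snd_una <= ack <= snd_nxt: processed normally */
};

enum ack_class classify_ack(uint32_t ack, uint32_t snd_una, uint32_t snd_nxt)
{
    /* Precondition: snd_una must not be ahead of snd_nxt. */
    assert(!seq_after(snd_una, snd_nxt));

    if (seq_after(ack, snd_nxt))
        return ACK_TOO_NEW;
    if (seq_before(ack, snd_una))
        return ACK_OLD;
    return ACK_IN_WINDOW;
}
```

For example, with snd_una = 0xFFFFFFF0 and snd_nxt = 0x10 the window straddles the wrap point, and an acknowledgment number of 0x5 is still classified as in-window, which a plain `<` comparison would get wrong.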
2.2.1 Update sending window
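On the slow path, tcp_ack() delegates the window update to tcp_ack_update_window(), whose body is not shown in this note. As a rough illustration of the kind of test that decides whether a segment may change the send window, here is a standalone userspace sketch. The condition is an assumption modeled on the classic SND.WL1 rule (accept the update from a segment that acknowledges new data, or is newer than the last one used, or is the same one advertising a larger window); struct snd_state and may_update_window() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal view of the sender state involved. */
struct snd_state {
    uint32_t snd_una;  /* oldest unacknowledged sequence number */
    uint32_t snd_wl1;  /* sequence number of the segment used for the last window update */
    uint32_t snd_wnd;  /* send window currently advertised by the peer */
};

/* Wraparound-safe "a is after b". */
bool seq_after(uint32_t a, uint32_t b) { return (int32_t)(b - a) < 0; }

/* Assumed rule: a segment may update the send window if it
 *  - acknowledges new data, or
 *  - is newer than the segment that last updated the window, or
 *  - is that same segment but advertises a larger window. */
bool may_update_window(const struct snd_state *s,
                       uint32_t ack, uint32_t ack_seq, uint32_t nwin)
{
    assert(s != NULL);
    return seq_after(ack, s->snd_una) ||
           seq_after(ack_seq, s->snd_wl1) ||
           (ack_seq == s->snd_wl1 && nwin > s->snd_wnd);
}
```

The point of the sequence-number test is that a reordered, older segment cannot shrink or overwrite a window that a newer segment has already reported.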
"Linux Kernel Protocol Stack TCP Data Transmission Window"
2.2.2 Delete the confirmed data segment + RTT sampling
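tcp_clean_rtx_queue() both frees acknowledged segments and feeds an RTT sample to the estimator; its body is outside this note. To show what an RTT sample is used for, the sketch below implements the textbook smoothed estimator from RFC 6298 in floating point, and skips samples when retransmitted data was acknowledged (Karn's rule, the situation FLAG_RETRANS_DATA_ACKED describes). This is an assumption-laden illustration, not the kernel's implementation: struct rtt_est, rtt_sample() and rto_ms() are hypothetical names, and clock granularity and the RTO lower and upper bounds are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

struct rtt_est {
    bool   have_sample; /* false until the first measurement arrives */
    double srtt_ms;     /* smoothed RTT */
    double rttvar_ms;   /* RTT variation */
};

/* Feed one RTT measurement. Returns true if the sample was used.
 * retrans_acked plays the role of FLAG_RETRANS_DATA_ACKED: a sample is
 * ambiguous when the acknowledged data had been retransmitted. */
bool rtt_sample(struct rtt_est *e, double r_ms, bool retrans_acked)
{
    assert(e != 0 && r_ms >= 0.0);

    if (retrans_acked)
        return false;               /* Karn's rule: discard the sample */

    if (!e->have_sample) {          /* first measurement */
        e->srtt_ms = r_ms;
        e->rttvar_ms = r_ms / 2.0;
        e->have_sample = true;
        return true;
    }

    double err = e->srtt_ms - r_ms;
    if (err < 0.0)
        err = -err;
    /* RFC 6298: update rttvar with the old srtt first (beta = 1/4),
     * then srtt (alpha = 1/8). */
    e->rttvar_ms = 0.75 * e->rttvar_ms + 0.25 * err;
    e->srtt_ms = 0.875 * e->srtt_ms + 0.125 * r_ms;
    return true;
}

/* Retransmission timeout without granularity or clamping. */
double rto_ms(const struct rtt_est *e)
{
    return e->srtt_ms + 4.0 * e->rttvar_ms;
}
```

A first sample of 100 ms gives srtt = 100, rttvar = 50 and an RTO of 300 ms; a second identical sample leaves srtt unchanged and shrinks rttvar to 37.5, so the RTO drops to 250 ms.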
"Linux kernel protocol stack TCP data receiving clear send queue + RTT sampling"
2.2.3 Congestion Control
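The last part of tcp_ack() decides, from flag and the congestion state, whether to grow the congestion window (tcp_cong_avoid()) and whether to run the loss-recovery state machine (tcp_fastretrans_alert()). The standalone sketch below reproduces only that branching so it can be tried in userspace. The definition of a "dubious" ACK used here (a duplicate, one that raises a congestion alert, or any ACK outside the Open state) is an assumption, since tcp_ack_is_dubious() is not shown in this note; tcp_may_raise_cwnd() is reduced to a boolean input, and all names in the sketch are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values and combinations copied from section 2.1. */
#define FLAG_DATA         0x01
#define FLAG_WIN_UPDATE   0x02
#define FLAG_DATA_ACKED   0x04
#define FLAG_SYN_ACKED    0x10
#define FLAG_DATA_SACKED  0x20
#define FLAG_ECE          0x40
#define FLAG_ACKED        (FLAG_DATA_ACKED | FLAG_SYN_ACKED)
#define FLAG_NOT_DUP      (FLAG_DATA | FLAG_WIN_UPDATE | FLAG_ACKED)
#define FLAG_CA_ALERT     (FLAG_DATA_SACKED | FLAG_ECE)

/* Hypothetical stand-ins for the congestion states. */
enum ca_state { CA_OPEN, CA_DISORDER, CA_CWR, CA_RECOVERY, CA_LOSS };

/* Actions the sketch reports instead of calling kernel functions. */
#define ACT_CONG_AVOID        0x1  /* would call tcp_cong_avoid() */
#define ACT_FASTRETRANS_ALERT 0x2  /* would call tcp_fastretrans_alert() */

/* Assumed definition of a dubious ACK. */
bool ack_is_dubious(int flag, enum ca_state state)
{
    return !(flag & FLAG_NOT_DUP) || (flag & FLAG_CA_ALERT) || state != CA_OPEN;
}

/* Mirrors the if/else at the end of tcp_ack(). */
int congestion_actions(int flag, enum ca_state state,
                       bool frto_cwnd, bool may_raise_cwnd)
{
    int actions = 0;
    bool acked = (flag & FLAG_DATA_ACKED) && !frto_cwnd;

    assert(flag >= 0); /* flag is a non-negative bitmask */

    if (ack_is_dubious(flag, state)) {
        if (acked && may_raise_cwnd)
            actions |= ACT_CONG_AVOID;
        actions |= ACT_FASTRETRANS_ALERT;
    } else if (acked) {
        actions |= ACT_CONG_AVOID;
    }
    return actions;
}
```

An ordinary ACK for new data in the Open state only grows the window, while a pure duplicate ACK (flag == 0) only triggers the recovery state machine.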