[Computer Network - Transport Layer] TCP Protocol

1 Overview of Transport Layer

1.1 Functions of the transport layer

  • End-to-end communication : Provides end-to-end (logical) communication between application processes, which is why transport-layer protocols are also called end-to-end protocols.
  • Error detection : Detects errors in both the header and the data part of a segment.
  • Two protocols : connection-oriented TCP and connectionless UDP.
  • Multiplexing and demultiplexing :


    • TCP multiplexing (transport layer) : At the sender, messages from some application processes are encapsulated with TCP at the transport layer.
    • UDP multiplexing (transport layer) : At the sender, messages from some application processes are encapsulated with UDP at the transport layer.
    • IP multiplexing (internet layer) : At the sender, data from different protocols is encapsulated into IP datagrams (the IP protocol field is 6 for TCP and 17 for UDP).
    • IP demultiplexing (internet layer) : The receiver's internet layer removes the IP header and delivers the data to the corresponding upper-layer protocol (TCP or UDP).
    • TCP demultiplexing (transport layer) : The receiver's transport layer uses TCP to remove the header and deliver the data to the destination application process.
    • UDP demultiplexing (transport layer) : The receiver's transport layer uses UDP to remove the header and deliver the data to the destination application process.
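
As a rough illustration of the two demultiplexing steps on the receiver, the sketch below first picks the transport protocol from the IP protocol field and then picks the process from the destination port. The names `IP_PROTOCOLS`, `demultiplex`, and `port_table` are hypothetical, and the lookup is simplified: a real TCP implementation identifies a connection by both endpoints, not by the destination port alone.

```python
# Hypothetical lookup tables, for illustration only.
IP_PROTOCOLS = {6: "TCP", 17: "UDP"}  # values of the IP protocol field


def demultiplex(protocol_field, dst_port, port_table):
    """Pick the transport protocol from the IP protocol field,
    then the destination process from the port number."""
    transport = IP_PROTOCOLS.get(protocol_field)
    if transport is None:
        raise ValueError(f"no transport protocol for protocol field {protocol_field}")
    return transport, port_table[(transport, dst_port)]


# Assumed mapping of (protocol, port) to a process name.
port_table = {("TCP", 80): "web server", ("UDP", 53): "DNS server"}

print(demultiplex(6, 80, port_table))
print(demultiplex(17, 53, port_table))
```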

1.2 Port number

  • Port : An application process is identified by its port number. A port number is 16 bits long, so there are 65536 possible values (0 ~ 65535).
    • 0 ~ 1023, well-known port numbers : Assigned by IANA to the most important application-layer protocols of the TCP/IP architecture.
    • 1024 ~ 49151, registered port numbers : Used by applications that have no well-known port number. They must be registered with IANA to prevent duplication.
    • 49152 ~ 65535, ephemeral port numbers : Used only by clients. The client process selects one dynamically at run time, and the system reclaims it when the communication ends so that other client processes can use it.
  • Well-known port numbers to remember:
    Protocol  FTP    SMTP  DNS  DHCP   HTTP  BGP  HTTPS  RIP
    Port      21/20  25    53   67/68  80    179  443    520
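
The three port ranges can be checked mechanically. The following is a minimal sketch; the function name `classify_port` is made up for illustration.

```python
def classify_port(port: int) -> str:
    """Return the name of the range a 16-bit port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError("a port number must fit in 16 bits (0 ~ 65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "ephemeral"


for p in (80, 8080, 50000):
    print(p, classify_port(p))
```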

[Note] OSPF does not use a transport-layer protocol, so it has no port number. Its messages are carried directly in IP datagrams, with the IP protocol field set to 89.

2 TCP segments

2.1 TCP segment header format


  • Source port and destination port : Each occupies 2B (16 bits). A port is the service interface between the transport layer and the application layer.
  • Sequence number (seq) : Occupies 32b, with values from 0 to 2^32 - 1. It gives the sequence number of the first data byte carried in this TCP segment.
  • Acknowledgment number (ack) : Occupies 32b, with values from 0 to 2^32 - 1. It gives the sequence number of the first data byte expected in the next segment from the other side, and it also acknowledges all data received before it. If the acknowledgment number is N, all data up to and including sequence number N-1 has been received.
  • Data offset (header length) : Occupies 4b and gives the length of the TCP header in units of 4B. The maximum field value is 15, so the TCP header is at most 60B long.
  • Reserved : Occupies 6b, reserved for future use, and should currently be set to 0.
  • Urgent bit (URG) : When URG=1, the segment carries urgent data that should be sent as soon as possible, and the urgent pointer field is valid.
  • Acknowledgment bit (ACK) : When ACK=1, the acknowledgment number field is valid. Once the TCP connection is established, every segment must have ACK=1.
  • Push bit (PSH) : When PSH=1, the receiver delivers the data to the application process as soon as possible instead of waiting until enough data has accumulated.
  • Reset bit (RST) : When RST=1, a serious error has occurred on the TCP connection; the connection must be released and then re-established.
  • Synchronization bit (SYN) : When SYN=1 and ACK=0, the segment is a TCP connection request. If the other side agrees to establish the connection, it replies with a segment whose header has SYN=1 and ACK=1.
  • Termination bit (FIN) : When FIN=1, the sender of this segment has finished sending its data and requests release of the TCP connection.
  • Window : Occupies 16b and gives the receive window of the side sending this segment, that is, the free space in its receive buffer, which expresses how much data it can currently accept. The field is in units of 1B.
  • Checksum : Occupies 16b and is used to detect bit errors anywhere in the TCP segment (header and data) during transmission.
  • Urgent pointer : Occupies 16b and gives the length of the urgent data in this segment, in units of 1B.
  • Options : Variable length. One option is the Maximum Segment Size (MSS) , which gives the maximum length of the data part of a segment.
  • Padding : Added so that the total header length is an integer multiple of 4B.
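
To make the field layout concrete, here is a minimal sketch that decodes the fixed 20-byte part of a TCP header with Python's standard `struct` module. The function name `parse_tcp_header` and the sample values are assumptions for illustration; options, padding, and checksum verification are deliberately left out.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit offset/flags word.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    if len(raw) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    src, dst, seq, ack, off_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", raw[:20]
    )
    header = {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # data offset counts 4-byte words
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }
    for name, bit in FLAG_BITS.items():
        header[name] = bool(off_flags & bit)
    return header


# A made-up connection request: SYN=1, ACK=0, data offset 5 (20-byte header).
sample = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(sample))
```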

[Note] Focus on the seq and ack fields and on the ACK, SYN, and FIN bits.

2.2 The process of TCP data transmission

Assume that client A and server B have established a TCP connection.

(1) A sends a TCP segment to B:

Header                       Data part
seq=201, ack=800, ACK=1      100B (seq=201~300)
  • seq=201: The data part of A's segment starts at sequence number 201 and is 100B long, so its last byte has sequence number 300.
  • ack=800: A expects the data part of the next segment from B to start at sequence number 800.
  • ACK=1: Once the connection is established, every segment sent must have ACK set to 1.

(2) B then sends a TCP segment to A:

Header                       Data part
seq=800, ack=301, ACK=1      200B (seq=800~999)
  • seq=800: The data part of B's segment starts at sequence number 800 and is 200B long, so its last byte has sequence number 999.
  • ack=301: B expects the data part of the next segment from A to start at sequence number 301, which also acknowledges that A's segment (seq=201~300) has been received.
  • ACK=1: Once the connection is established, every segment sent must have ACK set to 1.

(3) Suppose A sends three TCP segments to B, and the second one is lost:

Header                       Data part
seq=201, ack=x, ACK=1        100B (seq=201~300)
seq=301, ack=x, ACK=1        100B (seq=301~400) (lost)
seq=401, ack=x, ACK=1        100B (seq=401~500)

B then correctly receives only the 1st and 3rd segments. The segment B sends back to A must acknowledge the last segment that arrived correctly and in order; this is cumulative acknowledgment (see Section 4 for details):

Header                       Data part
seq=x, ack=301, ACK=1        200B (seq=x~x+199)
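
The rule in case (3) can be sketched in a few lines: the acknowledgment number only advances over data that has arrived without a gap. The function name `cumulative_ack` and the `(seq, length)` representation of a segment are assumptions made for this example.

```python
def cumulative_ack(first_expected, received):
    """Return the ack number after receiving the given (seq, length) segments.

    Only data that arrived in order, with no gap, is acknowledged.
    """
    next_expected = first_expected
    for seq, length in sorted(received):
        if seq > next_expected:  # gap: this and later segments are out of order
            break
        next_expected = max(next_expected, seq + length)
    return next_expected


# Segments 1 and 3 arrive, segment 2 (seq=301) is lost.
print(cumulative_ack(201, [(201, 100), (401, 100)]))               # 301
# After the lost segment is retransmitted, everything is acknowledged.
print(cumulative_ack(201, [(201, 100), (301, 100), (401, 100)]))   # 501
```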

3 TCP connection management

A TCP connection has three phases: connection establishment, data transfer (described above), and connection release.

3.1 Establishment of TCP connection - three-way handshake

Suppose client A and server B are about to establish a TCP connection.


Server B's process is in the LISTEN state, waiting for a connection request from client A.

3.1.1 The client sends a TCP connection request segment to the server

Client A sends a TCP connection request segment to server B:

Header                            Data part
seq=x, ack=0, SYN=1, ACK=0        A SYN segment cannot carry data
  • seq=x: Client A randomly chooses an initial sequence number x.
  • ack=0: Because ACK=0, the ack field is not valid.
  • SYN=1: The synchronization bit SYN must be set to 1 in a TCP connection request segment.
  • ACK=0: SYN=1 together with ACK=0 identifies a TCP connection request segment.

Client A now enters the SYN-SENT state.

[Note] TCP specifies that a segment with SYN=1 (the connection request segment and the connection request acknowledgment segment) cannot carry data, but it still consumes one sequence number.

3.1.2 The server sends a TCP connection request acknowledgment segment to the client

After server B receives the TCP connection request segment, it sends a TCP connection request acknowledgment segment to client A:

Header                              Data part
seq=y, ack=x+1, SYN=1, ACK=1        A SYN segment cannot carry data
  • seq=y: Server B randomly chooses its own initial sequence number y.
  • ack=x+1: Server B expects the next segment from client A to start at sequence number x+1, which also acknowledges that client A's segment (seq=x) has been received.
  • SYN=1: The synchronization bit SYN must be set to 1 in a TCP connection request acknowledgment segment.
  • ACK=1: SYN=1 together with ACK=1 identifies a TCP connection request acknowledgment segment.

Server B now enters the SYN-RCVD (synchronization received) state.

3.1.3 The client sends a TCP acknowledgment segment to the server

After client A receives the TCP connection request acknowledgment segment, it sends a TCP acknowledgment segment to server B:

Header                                Data part
seq=x+1, ack=y+1, SYN=0, ACK=1        An acknowledgment segment may carry data
  • seq=x+1: The sequence number of client A's segment is x+1.
  • ack=y+1: Client A expects the next segment from server B to start at sequence number y+1, which also acknowledges that server B's segment (seq=y) has been received.
  • SYN=0: The synchronization bit SYN must be 0 in an ordinary TCP acknowledgment segment.
  • ACK=1: SYN=0 together with ACK=1 identifies an ordinary TCP acknowledgment segment.

Client A now enters the ESTABLISHED state.

[Note] TCP specifies that an ordinary acknowledgment segment may carry data; if it carries none, it does not consume a sequence number. In that case, the next data segment client A sends still has sequence number x+1.

After server B receives the TCP acknowledgment segment, it also enters the ESTABLISHED state. The TCP connection is now established.
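
The seq and ack values of the three handshake segments follow directly from the two initial sequence numbers. The sketch below only reproduces that arithmetic; the function name `three_way_handshake` and the dictionary layout are hypothetical, and no real network traffic is involved.

```python
MOD = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


def three_way_handshake(x, y):
    """Return the three handshake segments for initial sequence numbers
    x (client A) and y (server B)."""
    return [
        {"from": "A", "seq": x, "ack": 0, "SYN": 1, "ACK": 0},
        {"from": "B", "seq": y, "ack": (x + 1) % MOD, "SYN": 1, "ACK": 1},
        {"from": "A", "seq": (x + 1) % MOD, "ack": (y + 1) % MOD, "SYN": 0, "ACK": 1},
    ]


for segment in three_way_handshake(x=100, y=300):
    print(segment)
```

Each SYN consumes one sequence number, which is why the acknowledgments are x+1 and y+1 even though no data has been sent.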

[Summary] The process of establishing a TCP connection:

Client: "I have something to tell you. Is that all right?"
Server: "Sure, go ahead!"
Client: "OK! blablabla"

3.2 Release of TCP connection - four-way handshake (four waves)

Suppose client A and server B are about to release their TCP connection.


The client A process and the server B process are in the ESTABLISHED (connection established) state .

3.2.1 The client sends a TCP connection release segment to the server

Client A sends a TCP connection release segment to server B:

Header                            Data part
seq=u, ack=v, FIN=1, ACK=1        A FIN segment may or may not carry data
  • seq=u: The sequence number of client A's segment is u, which equals the sequence number of the last data byte client A has sent plus 1.
  • ack=v: The ack field is v, which equals the sequence number of the last data byte client A has received plus 1.
  • FIN=1: The termination bit FIN must be set to 1 in a TCP connection release segment.
  • ACK=1: FIN=1 together with ACK=1 identifies a TCP connection release segment.

[Note] TCP specifies that a segment with FIN=1 consumes one sequence number even if it carries no data.

Client A now enters the FIN-WAIT-1 state.

3.2.2 The server sends a TCP acknowledgment segment to the client

After server B receives the TCP connection release segment, it sends a TCP acknowledgment segment to client A:

Header                              Data part
seq=v, ack=u+1, FIN=0, ACK=1        An acknowledgment segment may carry data
  • seq=v: The sequence number of server B's segment is v.
  • ack=u+1: Server B expects the next segment from client A to start at sequence number u+1, which also acknowledges that client A's connection release segment (seq=u) has been received.
  • FIN=0: The termination bit FIN must be 0 in an ordinary TCP acknowledgment segment.
  • ACK=1: FIN=0 together with ACK=1 identifies an ordinary TCP acknowledgment segment.

Server B now enters the CLOSE-WAIT state. Client A has no more data to send, but if server B still has data to send, client A must keep receiving it; the direction from server B to client A is not yet closed. This is called the half-closed state, and it may last for some time.

3.2.3 The server sends a TCP connection release segment to the client

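Server B sends client A a number of TCP acknowledgment segments until it sends its last segment, the TCP connection release segment:

Header                              Data part
seq=w, ack=u+1, FIN=1, ACK=1        A FIN segment may or may not carry data
  • seq=w: After server B has sent those further segments, the sequence number of its data has advanced to w.
  • ack=u+1: Server B expects the next segment from client A to start at sequence number u+1, and it acknowledges once more that client A's connection release segment (seq=u) has been received.
  • FIN=1: The termination bit FIN must be set to 1 in a TCP connection release segment.
  • ACK=1: FIN=1 together with ACK=1 identifies a TCP connection release segment.

Server B now enters the LAST-ACK state.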

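3.2.4 The client sends a TCP acknowledgment segment to the server

After client A receives the TCP connection release segment, it sends a TCP acknowledgment segment to server B:

Header                                Data part
seq=u+1, ack=w+1, FIN=0, ACK=1        An acknowledgment segment may carry data
  • seq=u+1: The sequence number of client A's segment is u+1.
  • ack=w+1: Client A expects the next segment from server B to start at sequence number w+1, which also acknowledges that server B's connection release segment (seq=w) has been received.
  • FIN=0: The termination bit FIN must be 0 in an ordinary TCP acknowledgment segment.
  • ACK=1: FIN=0 together with ACK=1 identifies an ordinary TCP acknowledgment segment.

Client A now enters the TIME-WAIT state, and server B enters the CLOSED state once it receives the acknowledgment segment. The TCP connection is not yet fully released, however: client A must wait 2MSL (MSL = Maximum Segment Lifetime) before it enters the CLOSED state.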
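
The four release segments and the state each one leads to can be tabulated from u, v, and w alone. The following is a sketch of that bookkeeping only; the function name `four_way_release` is hypothetical, sequence-number wraparound is ignored, and only the states named in this section are listed.

```python
def four_way_release(u, v, w):
    """Return (sender, header fields, state entered) for the four release segments.

    u: client A's next sequence number, v: server B's next sequence number,
    w: server B's sequence number after it has sent its remaining data.
    """
    return [
        ("A", {"seq": u, "ack": v, "FIN": 1, "ACK": 1}, "A enters FIN-WAIT-1"),
        ("B", {"seq": v, "ack": u + 1, "FIN": 0, "ACK": 1}, "B enters CLOSE-WAIT"),
        ("B", {"seq": w, "ack": u + 1, "FIN": 1, "ACK": 1}, "B enters LAST-ACK"),
        ("A", {"seq": u + 1, "ack": w + 1, "FIN": 0, "ACK": 1}, "A enters TIME-WAIT"),
    ]


for sender, header, state in four_way_release(u=1000, v=5000, w=5200):
    print(sender, header, state)
# After the last segment, B goes to CLOSED on receipt and A goes to CLOSED after 2MSL.
```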

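[Summary] The process of releasing a TCP connection:

Client: "I'm about to leave."
Server: "Wait a moment, I haven't finished talking yet. blablabla"
Server: "blablabla"
Server: "blablabla. I'm done, you can go now."
Client: "OK! I'm leaving then!"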

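4 TCP flow control and reliable transmission

TCP provides flow control based on the sliding window protocol. It achieves reliable transmission through checksums, sequence numbers, acknowledgments, and retransmission, as the walkthrough below shows.

Assume that sender A and receiver B have established a TCP connection, and ignore TCP congestion control. Assume further that A only sends data to B, and that B applies flow control to A.

After the connection is established, B tells A: "My receive window is rwnd=500", so A sets its own send window swnd to 500 as well. The window is counted in bytes here (equivalently, in units of an MSS taken to be 1B), so "swnd=500" means the send window is 500B.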

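A's send buffer looks like this:

Data : 1~100B, 101~200B, 201~300B, 301~400B, 401~500B, 501~600B, 601~700B, 701~800B
  • swnd : 1~500
  • Sent by A : none
  • Acknowledged by B : none
  • Received by B : none

A sends the segments seq=1, seq=101, seq=201, and seq=301. The send buffer now looks like this:

Data : 1~100B, 101~200B, 201~300B, 301~400B, 401~500B, 501~600B, 601~700B, 701~800B
  • swnd : 1~500
  • Sent by A : 1~400
  • Acknowledged by B : none
  • Received by B : none yet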

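4.1 First flow-control adjustment by the receiver

Unfortunately, seq=201 is lost, so B receives only seq=1, seq=101, and seq=301. The segment seq=301 is out of order, but B does not discard it. TCP specifies that each time an out-of-order segment arrives, the receiver sends a redundant ACK stating the sequence number it expects next.

B is clearly still waiting for seq=201, so seq=301 is not acknowledged; this is cumulative acknowledgment. B's receive window has accepted 200B of data and has 300B still free, so B sends A a redundant ACK with ACK=1, ack=201, rwnd=300.

Sender A adjusts swnd to 300 and slides the window forward to the first unacknowledged byte. The send buffer now looks like this:

Data : 1~100B, 101~200B, 201~300B, 301~400B, 401~500B, 501~600B, 601~700B, 701~800B
  • swnd : 201~500
  • Sent by A : 1~400
  • Acknowledged by B : 1~200
  • Received by B : 1~200, 301~400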
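
How the send window follows from each acknowledgment can be sketched as below. The function name `send_window` and its return layout are assumptions for illustration, and congestion control is ignored, as in the rest of this section.

```python
def send_window(ack, rwnd, next_seq):
    """Return (first byte, last byte, bytes still sendable) of the send window.

    ack:      acknowledgment number from the receiver (first unacknowledged byte)
    rwnd:     receive window advertised by the receiver, in bytes
    next_seq: sequence number of the next new byte the sender would transmit
    """
    first = ack
    last = ack + rwnd - 1
    sendable = max(0, last - next_seq + 1)
    return first, last, sendable


# ack=201, rwnd=300, and bytes up to 400 already sent: window 201~500, 100B left.
print(send_window(ack=201, rwnd=300, next_seq=401))
# ack=501, rwnd=100: window 501~600.
print(send_window(ack=501, rwnd=100, next_seq=501))
# ack=601, rwnd=0: the window is empty and nothing may be sent.
print(send_window(ack=601, rwnd=0, next_seq=601))
```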

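Sender A goes on to send seq=401. To receiver B, seq=401 is also an out-of-order segment, so B again sends A a redundant ACK with ACK=1, ack=201, rwnd=300. The send buffer now looks like this:

Data : 1~100B, 101~200B, 201~300B, 301~400B, 401~500B, 501~600B, 601~700B, 701~800B
  • swnd : 201~500
  • Sent by A : 1~500
  • Acknowledged by B : 1~200
  • Received by B : 1~200, 301~500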

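4.2 Second flow-control adjustment by the receiver

Sender A does not ignore seq=201. A retransmission timer started counting when seq=201 was first sent; when the timer expires with no acknowledgment from B, A performs a timeout retransmission.

[Note] In fact, once sender A has received three redundant ACKs in a row, it can retransmit seq=201 immediately without waiting for the timer to expire. This is the fast retransmit algorithm, which is covered in the next section on congestion control.

After receiver B receives seq=201, it has to shrink its receive window to 100B because of network traffic, so it returns a segment with ACK=1, ack=501, rwnd=100. The send buffer now looks like this:

Data : 1~100B, 101~200B, 201~300B, 301~400B, 401~500B, 501~600B, 601~700B, 701~800B
  • swnd : 501~600
  • Sent by A : 1~500
  • Acknowledged by B : 1~500
  • Received by B : 1~500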

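Sender A sends seq=501. The send buffer now looks like this:

Data : 1~100B, 101~200B, 201~300B, 301~400B, 401~500B, 501~600B, 601~700B, 701~800B
  • swnd : 501~600
  • Sent by A : 1~600
  • Acknowledged by B : 1~500
  • Received by B : 1~500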

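4.3 Third flow-control adjustment by the receiver

After receiver B receives seq=501, it returns a segment with ACK=1, ack=601, rwnd=0, meaning that B will not accept any more data. The send buffer now looks like this:

Data : 1~100B, 101~200B, 201~300B, 301~400B, 401~500B, 501~600B, 601~700B, 701~800B
  • swnd : empty (size 0)
  • Sent by A : 1~600
  • Acknowledged by B : 1~600
  • Received by B : 1~600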

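Because sender A's swnd is now 0, A cannot send any data. Receiver B must later send a non-zero window notification telling A: "You may start sending again, with a window of size xxx." A keeps waiting for this notification and resumes transmission as soon as it arrives.

However, the notification from B may be lost, in which case A would wait forever. To break the deadlock in which both sides wait for each other because a non-zero window notification was lost, TCP keeps a persistence timer for every connection.

  • As soon as one side of a TCP connection receives a zero-window notification from the other side, it starts the persistence timer.
  • When the persistence timer expires, it sends a zero-window probe segment that carries only 1 byte of data.
  • When the other side acknowledges the probe, it reports its current receive window rwnd.
    • If rwnd is still 0, the side that receives this acknowledgment restarts the persistence timer and repeats the process.
    • If rwnd is no longer 0, the deadlock is broken.

[Note] TCP specifies that even when its receive window is 0, a receiver must still accept zero-window probe segments, acknowledgment segments, and segments carrying urgent data.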

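5 TCP congestion control

5.1 Slow start and congestion avoidance

Set a slow start threshold ssthresh, with an initial value of 16. The sender runs a different algorithm depending on the size of its congestion window cwnd:

  • When cwnd < ssthresh, use the slow start algorithm;
  • When cwnd > ssthresh, use the congestion avoidance algorithm;
  • When cwnd = ssthresh, use the congestion avoidance algorithm.

Slow start algorithm: Starting from cwnd=1, cwnd grows exponentially, doubling once per transmission round (that is, per round-trip time RTT): cwnd=2, cwnd=4, cwnd=8. When cwnd = ssthresh = 16, the sender switches to the congestion avoidance algorithm.

Congestion avoidance algorithm: cwnd increases by 1 per transmission round (that is, per RTT), so it grows linearly. As soon as the sender judges that the network is congested, it sets ssthresh = cwnd / 2, then sets cwnd=1 and runs slow start again.

[Note] The difference between a transmission round and the round-trip time:

  • Transmission round: the time taken to send a batch of segments and receive their acknowledgments.
  • Round-trip time (RTT): the time from starting to send one batch of segments to starting to send the next batch.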

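For example, consider the following transmission:

Round  cwnd  Segments sent  Algorithm             Notes
1      1     0              slow start            Initially cwnd = 1, ssthresh = 16
2      2     1 ~ 2          slow start
3      4     3 ~ 6          slow start
4      8     7 ~ 14         slow start
5      16    15 ~ 30        congestion avoidance  cwnd = ssthresh = 16
6      17    31 ~ 47        congestion avoidance
7      18    48 ~ 65        congestion avoidance
...
13     24    171 ~ 194      congestion avoidance  The retransmission timer expires, indicating network congestion; ssthresh = cwnd/2 = 12
14     1     195            slow start            cwnd is reset to 1
15     2     196 ~ 197      slow start
16     4     198 ~ 201      slow start
17     8     202 ~ 209      slow start
18     12    210 ~ 221      congestion avoidance  Doubling would give cwnd = 16 > ssthresh = 12, so cwnd is capped at 12 and congestion avoidance takes over
19     13    222 ~ 234      congestion avoidance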
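
The cwnd column of this example can be reproduced with a short simulation. This is a sketch under the assumptions of the example only (cwnd counted in segments, one update per round, a single timeout in round 13); the function name `simulate_cwnd` and its parameters are hypothetical.

```python
def simulate_cwnd(n_rounds, ssthresh=16, timeout_rounds=(13,)):
    """Return (round, cwnd, algorithm) for each transmission round.

    A timeout in a round sets ssthresh = cwnd / 2 and restarts from cwnd = 1.
    """
    cwnd = 1
    rows = []
    for rnd in range(1, n_rounds + 1):
        algorithm = "slow start" if cwnd < ssthresh else "congestion avoidance"
        rows.append((rnd, cwnd, algorithm))
        if rnd in timeout_rounds:
            ssthresh = cwnd // 2             # congestion detected by timeout
            cwnd = 1                         # run slow start again
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # exponential growth, capped at ssthresh
        else:
            cwnd += 1                        # linear growth
    return rows


for row in simulate_cwnd(19):
    print(row)
```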

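The figure below illustrates how slow start and congestion avoidance proceed:

[Figure: slow start and congestion avoidance]

[Note] What "slow start" and "congestion avoidance" mean:

  • "Slow start" means that few segments are injected into the network at the beginning; it does not mean that the congestion window cwnd grows slowly.
  • "Congestion avoidance" does not mean that congestion is avoided completely; it means that cwnd is held to linear growth during this phase, which makes congestion less likely.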

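5.2 Fast retransmit and fast recovery

Fast retransmit and fast recovery are improvements on slow start and congestion avoidance. The sender runs a different algorithm depending on the size of its congestion window cwnd:

  • When cwnd < ssthresh, use the slow start algorithm during the initial transmission; after a fast recovery, cwnd restarts at ssthresh, so congestion avoidance is used instead;
  • When cwnd > ssthresh, use the congestion avoidance algorithm;
  • When cwnd = ssthresh, use the congestion avoidance algorithm.

Fast retransmit algorithm: When the sender receives three redundant ACKs in a row, it immediately retransmits the segment the receiver has not yet received, without waiting for that segment's retransmission timer to expire.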
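
The trigger for fast retransmit amounts to counting repeated acknowledgment numbers. Below is a minimal sketch; the function name `fast_retransmit_trigger` is hypothetical, and the input is simply the list of ack numbers in the order the sender receives them.

```python
def fast_retransmit_trigger(acks):
    """Return the sequence number to retransmit once three redundant ACKs
    (three repeats of the same ack number) are seen, or None otherwise."""
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == 3:
                return ack       # retransmit the segment starting at this byte
        else:
            last_ack = ack       # new data acknowledged, start counting again
            duplicates = 0
    return None


# The first ack=201 is a normal acknowledgment; the next three are redundant.
print(fast_retransmit_trigger([101, 201, 201, 201, 201]))  # 201
print(fast_retransmit_trigger([101, 201, 201, 201]))       # None
```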

Fast recovery algorithm: When the sender receives three redundant ACKs in a row, it sets ssthresh = cwnd / 2 and then lets cwnd grow linearly starting from that ssthresh value.

For example, consider the following transmission:

Round  cwnd  Segments sent  Algorithm             Notes
1      1     0              slow start            Initially cwnd = 1, ssthresh = 16
2      2     1 ~ 2          slow start
3      4     3 ~ 6          slow start
4      8     7 ~ 14         slow start
5      16    15 ~ 30        congestion avoidance  cwnd = ssthresh = 16
6      17    31 ~ 47        congestion avoidance
7      18    48 ~ 65        congestion avoidance
...
13     24    171 ~ 194      congestion avoidance  The sender receives three redundant ACKs in a row, indicating network congestion; ssthresh = cwnd/2 = 12
14     12    195 ~ 206      congestion avoidance  cwnd is set to ssthresh = 12
15     13    207 ~ 219      congestion avoidance
16     14    220 ~ 233      congestion avoidance
17     15    234 ~ 248      congestion avoidance
18     16    249 ~ 264      congestion avoidance
19     17    265 ~ 281      congestion avoidance
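
The cwnd values and segment ranges in this example can be recomputed with a small simulation. It is a sketch under the example's assumptions (cwnd counted in segments, segments numbered from 0, three redundant ACKs arriving in round 13); the function name `simulate_fast_recovery` and its parameters are hypothetical.

```python
def simulate_fast_recovery(n_rounds, ssthresh=16, dup_ack_rounds=(13,)):
    """Return (round, cwnd, first segment, last segment) for each round.

    Three redundant ACKs in a round set ssthresh = cwnd / 2 and continue
    from cwnd = ssthresh instead of restarting from cwnd = 1.
    """
    cwnd = 1
    next_segment = 0
    rows = []
    for rnd in range(1, n_rounds + 1):
        rows.append((rnd, cwnd, next_segment, next_segment + cwnd - 1))
        next_segment += cwnd
        if rnd in dup_ack_rounds:
            ssthresh = cwnd // 2             # congestion signalled by redundant ACKs
            cwnd = ssthresh                  # fast recovery: resume from ssthresh
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start
        else:
            cwnd += 1                        # congestion avoidance
    return rows


for row in simulate_fast_recovery(19):
    print(row)
```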

The figure below illustrates how fast retransmit and fast recovery proceed:

[Figure: fast retransmit and fast recovery]

Note that the sender's send window is the smaller of the receiver's receive window and the sender's congestion window, i.e. swnd = min(rwnd, cwnd).

Origin blog.csdn.net/baidu_39514357/article/details/130054073