Detailed Web Protocol and Packet Capture: HTTP/2 [3]
Preface
Study notes for the course "Detailed Web Protocol and Packet Capture", taught by Tao Hui
Learning Content:
- Learn web protocols from the top down, starting at the application layer: HTTP, TLS/SSL, TCP/IP
- Detailed knowledge of HTTP/2
- Practical verification with packet capture tools: the Chrome Network panel, tcpdump, and Wireshark
Article Directory
- Detailed Web Protocol and Packet Capture: HTTP/2 [3]
- Preface
- Concepts and features of HTTP/2
- Use WireShark to decrypt TLS-SSL messages
- h2c: Upgrade from HTTP1.1 to HTTP/2 on TCP
- h2: Upgrade from HTTP1.1 to HTTP/2 on TLS
- The relationship between frames, messages, and streams
- Frame format
- HPACK algorithm
- Active message push on the server side (PUSH_PROMISE frame)
- Stream state transition
- RST_STREAM frame and common error codes
- Stream priority and resource allocation principle (PRIORITY frame)
- Different from TCP flow control
- gRPC framework
- Problems with HTTP/2
- HTTP3 QUIC protocol
- Seven-layer load balancing
- reference
Concepts and features of HTTP/2
HTTP/2 (Hypertext Transfer Protocol version 2, originally named HTTP 2.0), abbreviated as h2 (encrypted connection over TLS 1.2 or above) or h2c (unencrypted connection) [1], is the second major version of the HTTP protocol used on the World Wide Web.
The HTTP/2 standard was officially published as RFC 7540 in May 2015.
The HTTP/2 protocol itself does not require encryption, but the developers of most clients (such as Firefox, Chrome, Safari, Opera, IE, Edge) have stated that they will only implement HTTP/2 over TLS. This has made TLS-encrypted HTTP/2 (i.e. h2) a de facto mandatory standard, and h2c has effectively been abandoned by mainstream browsers.
Main features
- Significant reduction in the amount of transmitted data
- Binary transmission
- Header compression: HPACK algorithm
- Multiplexing and related functions
- Message priority
- Server message push
- Parallel push
HTTP/2 does not change the semantics of HTTP/1.x; it repackages the header and body of the original HTTP/1.x message into frames.
Use WireShark to decrypt TLS-SSL messages
- Wireshark reads the keys generated during the TLS handshake phase
- HTTP/2 packets as decrypted from TLS by Wireshark
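One common way to hand those handshake keys to Wireshark is a key log file. The sketch below assumes a Python client (Python 3.8+ built against OpenSSL 1.1.1 or later); `make_keylog_context` and the file name are hypothetical names chosen for illustration. The resulting file can then be selected in Wireshark's TLS protocol preferences as the "(Pre)-Master-Secret log filename".

```python
import ssl


def make_keylog_context(keylog_path):
    """Build a client TLS context that appends its session secrets to keylog_path."""
    ctx = ssl.create_default_context()
    # h2 requires TLS 1.2 or above
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Offer h2 first so the captured traffic is HTTP/2 when the server supports it
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    # Secrets are written in the NSS key log format, which Wireshark understands
    ctx.keylog_filename = keylog_path
    return ctx
```

Any socket wrapped with this context logs its secrets while the capture is running, so the capture and the key file have to come from the same session.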
h2c: Upgrade from HTTP1.1 to HTTP/2 on TCP
The h2c upgrade negotiation is very similar to the HTTP/1.1 handshake that upgrades a connection to WebSocket
Step 1: The negotiation returns 101 Switching Protocols
Step 2: The client sends the Magic frame (the connection preface)
Unified connection process: from this point on, h2 and h2c proceed in the same way
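The two steps can be sketched as bytes on the wire. This is a minimal illustration assuming the upgrade mechanism described in RFC 7540; the function name, host, and the chosen setting (SETTINGS_MAX_CONCURRENT_STREAMS = 100) are made up for the example.

```python
import base64
import struct

# The "Magic" frame: a fixed 24-byte connection preface sent by the client
CONNECTION_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"


def build_h2c_upgrade_request(host, path="/", settings=None):
    """Build the HTTP/1.1 request that asks the server to switch to h2c."""
    settings = settings or {0x3: 100}  # hypothetical example setting
    # Each setting is a 16-bit identifier followed by a 32-bit value
    payload = b"".join(struct.pack(">HI", k, v) for k, v in settings.items())
    # HTTP2-Settings carries the payload as base64url without '=' padding
    token = base64.urlsafe_b64encode(payload).rstrip(b"=").decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade, HTTP2-Settings",
        "Upgrade: h2c",
        f"HTTP2-Settings: {token}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

After the server answers with 101 Switching Protocols, the client writes `CONNECTION_PREFACE` followed by a SETTINGS frame, which is the same sequence an h2 connection uses once the TLS handshake completes.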
h2: Upgrade from HTTP1.1 to HTTP/2 on TLS
- The protocol is negotiated through the ALPN extension
Upgrade process
- Client Hello message
- Lists the supported protocols in the ALPN extension
- Server Hello message
- The server selects the h2 protocol and tells the client
- The client then sends the Magic frame
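As a rough sketch of the ALPN step, the snippet below shows how a Python client could offer h2 and how the protocol list is laid out inside the Client Hello extension. The function names are hypothetical; the layout follows the ALPN extension format (a 2-byte list length, then one length-prefixed name per protocol).

```python
import ssl


def make_h2_client_context():
    """Client context that offers h2 first and HTTP/1.1 as a fallback."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def encode_alpn_protocol_list(protocols):
    """Encode the protocol name list as it appears in the ALPN extension body."""
    entries = b""
    for name in protocols:
        raw = name.encode("ascii")
        if not 1 <= len(raw) <= 255:
            raise ValueError("protocol name must be 1..255 bytes")
        entries += bytes([len(raw)]) + raw  # 1-byte length, then the name
    return len(entries).to_bytes(2, "big") + entries
```

Once a socket wrapped with this context has finished its handshake, `selected_alpn_protocol()` on that socket reports what the server picked, for example `"h2"`.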
The relationship between frames, messages, and streams
HTTP/2 core concepts
The relationship between Stream, Message, and Frame
- A Frame is tied to its Stream through the stream ID
- A Message is composed of a HEADERS frame and DATA frames
Multiplexing
- Frames of different streams are interleaved in transit and reassembled on receipt
- Frames of the same stream must be transmitted in order; the order between different streams can be arbitrary
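A toy illustration of that reassembly, assuming frames have already been parsed into `(stream_id, payload)` pairs (a simplification made up for this sketch):

```python
from collections import defaultdict


def reassemble(frames):
    """Group interleaved frame payloads by stream ID, keeping per-stream order."""
    streams = defaultdict(bytes)
    for stream_id, chunk in frames:
        # Frames of one stream arrive in order, so appending is enough
        streams[stream_id] += chunk
    return dict(streams)


# Streams 1 and 3 interleaved on the same connection
frames = [(1, b"he"), (3, b"wo"), (1, b"llo"), (3, b"rld")]
print(reassemble(frames))
```

Only the relative order within a stream matters; how streams 1 and 3 are interleaved does not change the result.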
Frame format
The role of the Stream ID
9-byte standard frame header
[Function one]: the key to multiplexing
- The receiving end can use it to assemble messages concurrently
- Frames within the same stream must be in order
- SETTINGS_MAX_CONCURRENT_STREAMS controls the number of concurrent streams
[Function two]: the key to pushing dependent requests
- Streams initiated by the client must have odd IDs
- Streams initiated by the server must have even IDs
[Function three]: binding rules for stream state management
[Function four]: application-layer flow control affects only DATA frames
- Stream ID 0 is used only to carry control frames
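The 9-byte header can be decoded in a few lines. This is a sketch based on the RFC 7540 layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream ID); `FrameHeader` and `parse_frame_header` are names invented here.

```python
from typing import NamedTuple

FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


class FrameHeader(NamedTuple):
    length: int
    type: int
    flags: int
    stream_id: int


def parse_frame_header(data):
    """Decode the fixed 9-byte header that precedes every HTTP/2 frame."""
    if len(data) < 9:
        raise ValueError("need at least 9 bytes")
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags = data[3], data[4]
    # The top bit of the last 4 bytes is reserved and must be ignored
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF
    return FrameHeader(length, frame_type, flags, stream_id)
```

An odd `stream_id` means the client opened the stream, an even one means the server did, and 0 refers to the connection as a whole.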
Frame types and the setting types of the SETTINGS frame
Using the 9-byte standard frame header as a reference, here are the frame length and frame type
- Frame length: Length
- Frame type: Type and Flags; the meaning of Flags differs for each Type
Format of the SETTINGS frame (type=0x4):
- A SETTINGS frame is not a negotiation; the sender uses it to notify the receiver of its own properties and capabilities
- One SETTINGS frame can set multiple parameters at once
- Identifier: the parameter being set
- Value: the value being set
Setting types: each one is selected by the Identifier field
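To make the Identifier/Value pairs concrete, here is a small parser for a SETTINGS payload. It assumes the six setting identifiers defined in RFC 7540; the function name is hypothetical.

```python
import struct

SETTING_NAMES = {
    0x1: "SETTINGS_HEADER_TABLE_SIZE",
    0x2: "SETTINGS_ENABLE_PUSH",
    0x3: "SETTINGS_MAX_CONCURRENT_STREAMS",
    0x4: "SETTINGS_INITIAL_WINDOW_SIZE",
    0x5: "SETTINGS_MAX_FRAME_SIZE",
    0x6: "SETTINGS_MAX_HEADER_LIST_SIZE",
}


def parse_settings_payload(payload):
    """Split a SETTINGS payload into {name: value}; each entry is 6 bytes."""
    if len(payload) % 6 != 0:
        raise ValueError("SETTINGS payload must be a multiple of 6 bytes")
    result = {}
    for offset in range(0, len(payload), 6):
        # 16-bit identifier followed by a 32-bit value, both big-endian
        identifier, value = struct.unpack_from(">HI", payload, offset)
        name = SETTING_NAMES.get(identifier, f"UNKNOWN_0x{identifier:x}")
        result[name] = value
    return result
```

Unknown identifiers are kept under a placeholder name here rather than rejected, since a receiver is expected to ignore settings it does not understand.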
HPACK algorithm
How HPACK reduces the size of HTTP headers
HPACK compression algorithm: simplifies the message content and reduces the message size in transit
- The message sender and message receiver jointly maintain a static table and a dynamic table (together the two act as a dictionary)
- For each request, the sender encodes and compresses the message headers according to the dictionary and a set of encoding rules
- The receiver decodes using the dictionary and, following the instructions in the encoding, decides whether to update the dynamic table
Three compression methods
- Static dictionary
- Dynamic dictionary
- Compression algorithm: Huffman coding
Static dictionary: https://httpwg.org/specs/rfc7541.html#static.table.definition
- Contains only well-known header fields
- Some entries fix both name and value
- Other entries fix only the name
The static dictionary and dynamic dictionary together are also called the index table
HPACK compression diagram
- The static table is used first; over multiple requests the dynamic dictionary is built up; finally, Huffman coding compresses what is left
Schematic diagram of index table usage
How to use Huffman coding in HPACK
Principle:
- Symbols that occur more often get shorter codes; symbols that occur less often get longer codes
Static Huffman coding: https://httpwg.org/specs/rfc7541.html#huffman.code
Dynamic Huffman coding
Huffman tree construction process
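The tree construction can be sketched generically as below. This illustrates the principle only: HPACK itself does not build a tree at runtime but uses the fixed code table linked above. The function name and the sample text are made up.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman tree built from text."""
    freq = Counter(text)
    # Heap entries: (weight, tie_breaker, {symbol: depth so far})
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {s: 1 for s in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol in them gets one level deeper
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


print(huffman_code_lengths("aaaaaaaabbbc"))
```

The frequent symbol `a` ends up with a 1-bit code while the rare `b` and `c` get 2 bits each, which is the principle stated above.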
Encoding of integers in HPACK (i.e. index value encoding)
- Encoding of integers less than 31
- Encoding of large integers
- Decoding of large integers
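The three cases can be sketched with the prefix-integer scheme from RFC 7541. With a 5-bit prefix, values below 31 fit in the prefix itself; larger values fill the prefix with ones and continue in 7-bit groups. Function names are hypothetical.

```python
def encode_int(value, prefix_bits, flags=0):
    """Encode value with an N-bit prefix; flags holds the bits above the prefix."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([flags | value])
    out = [flags | limit]          # prefix is all ones: more bytes follow
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)  # low 7 bits, continuation bit set
        value //= 128
    out.append(value)
    return bytes(out)


def decode_int(data, prefix_bits):
    """Decode a prefix integer; returns (value, number of bytes consumed)."""
    limit = (1 << prefix_bits) - 1
    value = data[0] & limit
    if value < limit:
        return value, 1
    shift, pos = 0, 1
    while True:
        byte = data[pos]
        pos += 1
        value += (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:        # continuation bit clear: last byte
            return value, pos
```

For example, 10 fits in one byte, while 1337 with a 5-bit prefix becomes the three bytes 31, 154, 10.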
Encoding of header names and values in HPACK (HEADERS frame)
- HEADERS frame format
When the HEADERS frame is too large, a CONTINUATION frame follows it (Type=0x9)
- It follows a HEADERS frame or PUSH_PROMISE frame and carries the rest of the HTTP header block
To better understand the Header Block Fragment
- Literal encoding
Next, let's look at this in practice:
The name and value are both in the index table (static table or dynamic table):
- Encoding: the first bit is 1 and the remaining 7 bits are the index
- For example, :method: GET has index 2 in the static table, so it is encoded as 1000 0010, which is 0x82 in hex
- Wireshark example
The name is in the index table; the value is encoded literally and also added to the dynamic table:
- Encoding: the first two bits are 01
- Wireshark example
Both the name and the value are encoded literally and added to the dynamic table:
- The first two bits are 01
The name is in the index table; the value is encoded literally and not added to the dynamic table:
- The first four bits are 0000
Both the name and the value are encoded literally and not added to the dynamic table:
- The first four bits are 0000
The name is in the index table; the value is encoded literally and must never be added to the dynamic table:
- The first four bits are 0001
Both the name and the value are encoded literally and must never be added to the dynamic table:
- The first four bits are 0001
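The bit patterns above can be turned into a small encoder. This sketch sends strings without Huffman coding (H bit = 0) and follows the representations in RFC 7541; the helper names and the `mode` labels are invented for the example.

```python
def encode_int(value, prefix_bits, flags=0):
    """HPACK prefix-integer encoding; flags holds the pattern bits above the prefix."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([flags | value])
    out = [flags | limit]
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)
        value //= 128
    out.append(value)
    return bytes(out)


def encode_string(raw):
    """Length (7-bit prefix, H bit = 0 so no Huffman coding) followed by raw bytes."""
    return encode_int(len(raw), 7) + raw


# Leading bit pattern and index prefix size for each literal representation
PATTERNS = {
    "incremental": (0x40, 6),  # 01......  added to the dynamic table
    "without": (0x00, 4),      # 0000....  not added to the dynamic table
    "never": (0x10, 4),        # 0001....  must never be added
}


def indexed(index):
    """Name and value both in the index table: 1 followed by a 7-bit index."""
    return encode_int(index, 7, 0x80)


def literal(mode, value, name_index=0, name=b""):
    """Literal header field; name_index 0 means the name is sent literally too."""
    flags, prefix_bits = PATTERNS[mode]
    out = encode_int(name_index, prefix_bits, flags)
    if name_index == 0:
        out += encode_string(name)
    return out + encode_string(value)
```

For instance, `indexed(2)` produces the single byte 0x82 for :method: GET, matching the example above.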
Active message push on the server side (PUSH_PROMISE frame)
- Pushes resources into the browser cache ahead of time
- Characteristic:
- Push is based on a request the client has already sent
- How it is implemented:
- A pushed resource must correspond to a request
- That request is sent by the server in a PUSH_PROMISE frame
- The response is sent on a stream with an even ID
In HTTP/1.1, after obtaining the HTML, when CSS resources are needed:
PUSH_PROMISE method in HTTP/2
- Format of the PUSH_PROMISE frame
- type=0x5; it can only be sent by the server
- Wireshark example
- Stream 1 notifies the client that the image resource is coming
- The image resource is sent in stream 2
Disabling push:
- See the setting types in the Frame format section (SETTINGS_ENABLE_PUSH)
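A minimal sketch of how such a frame is laid out, assuming the RFC 7540 format (a reserved bit and 31-bit promised stream ID, followed by the header block fragment). The function names are hypothetical, and the two-byte header block in the usage line is only a placeholder, not a complete request.

```python
import struct

PUSH_PROMISE = 0x5
END_HEADERS = 0x4


def build_frame(frame_type, flags, stream_id, payload):
    """Prefix payload with the 9-byte frame header."""
    header = (
        len(payload).to_bytes(3, "big")
        + bytes([frame_type, flags])
        + struct.pack(">I", stream_id & 0x7FFFFFFF)
    )
    return header + payload


def build_push_promise(request_stream_id, promised_stream_id, header_block):
    """PUSH_PROMISE sent on the client's stream, announcing an even-numbered stream."""
    if request_stream_id % 2 != 1:
        raise ValueError("the associated request stream must be client-initiated (odd)")
    if promised_stream_id == 0 or promised_stream_id % 2 != 0:
        raise ValueError("the promised stream must be server-initiated (even)")
    payload = struct.pack(">I", promised_stream_id & 0x7FFFFFFF) + header_block
    return build_frame(PUSH_PROMISE, END_HEADERS, request_stream_id, payload)


frame = build_push_promise(1, 2, b"\x82\x87")  # placeholder header block
print(frame.hex())
```

This mirrors the Wireshark example: the promise travels on stream 1, and the pushed response would follow on stream 2.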
Stream state transition
- Stream features
- Message features
- Stream states
- Frame symbols and how they relate to stream states
- State transition diagram
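As a rough companion to the state transition diagram, the table below sketches the main transitions from RFC 7540 section 5.1. It is a simplification: event names are invented for this example, and details such as PRIORITY frames or CONTINUATION handling are left out.

```python
# (current state, event) -> next state; "send"/"recv" are from this endpoint's view
TRANSITIONS = {
    ("idle", "send_headers"): "open",
    ("idle", "recv_headers"): "open",
    ("idle", "send_push_promise"): "reserved (local)",
    ("idle", "recv_push_promise"): "reserved (remote)",
    ("reserved (local)", "send_headers"): "half-closed (remote)",
    ("reserved (remote)", "recv_headers"): "half-closed (local)",
    ("open", "send_end_stream"): "half-closed (local)",
    ("open", "recv_end_stream"): "half-closed (remote)",
    ("half-closed (local)", "recv_end_stream"): "closed",
    ("half-closed (remote)", "send_end_stream"): "closed",
}


def next_state(state, event):
    """Return the stream state after event, or raise if the event is not allowed."""
    # RST_STREAM closes the stream from any state except idle
    if event in ("send_rst_stream", "recv_rst_stream") and state != "idle":
        return "closed"
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event} is not valid in state {state}") from None


# A simple request/response as seen by the client
state = "idle"
for event in ("send_headers", "send_end_stream", "recv_end_stream"):
    state = next_state(state, event)
print(state)
```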
RST_STREAM frame and common error codes
- RST_STREAM frame (type=0x3)
- Common error codes: used not only in RST_STREAM frames but also in GOAWAY frames and others
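For reference, the error codes can be written down as an enum, with a small parser for a complete RST_STREAM frame. The code values follow RFC 7540 section 7; the function name is hypothetical.

```python
import enum
import struct


class ErrorCode(enum.IntEnum):
    NO_ERROR = 0x0
    PROTOCOL_ERROR = 0x1
    INTERNAL_ERROR = 0x2
    FLOW_CONTROL_ERROR = 0x3
    SETTINGS_TIMEOUT = 0x4
    STREAM_CLOSED = 0x5
    FRAME_SIZE_ERROR = 0x6
    REFUSED_STREAM = 0x7
    CANCEL = 0x8
    COMPRESSION_ERROR = 0x9
    CONNECT_ERROR = 0xA
    ENHANCE_YOUR_CALM = 0xB
    INADEQUATE_SECURITY = 0xC
    HTTP_1_1_REQUIRED = 0xD


def parse_rst_stream(frame):
    """Return (stream_id, ErrorCode) from a full RST_STREAM frame (header + payload)."""
    length = int.from_bytes(frame[0:3], "big")
    frame_type = frame[3]
    stream_id = int.from_bytes(frame[5:9], "big") & 0x7FFFFFFF
    if frame_type != 0x3 or length != 4:
        raise ValueError("not a well-formed RST_STREAM frame")
    (code,) = struct.unpack(">I", frame[9:13])
    # Note: this sketch raises ValueError for codes outside the enum
    return stream_id, ErrorCode(code)
```

A browser cancelling a download it no longer needs would typically show up as `CANCEL` on that stream.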
Stream priority and resource allocation principle (PRIORITY frame)
Adjust the priority settings of different requests
- PRIORITY frame: sets the priority of a stream
- Exclusive flag
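The PRIORITY payload and the weight-based sharing can be sketched as follows, assuming the RFC 7540 layout (1-bit exclusive flag, 31-bit stream dependency, 8-bit weight stored as weight minus one). Function names and the sample weights are made up.

```python
import struct


def parse_priority_payload(payload):
    """Decode the 5-byte PRIORITY payload."""
    if len(payload) != 5:
        raise ValueError("PRIORITY payload must be exactly 5 bytes")
    dependency, weight = struct.unpack(">IB", payload)
    return {
        "exclusive": bool(dependency & 0x80000000),  # E flag is the top bit
        "depends_on": dependency & 0x7FFFFFFF,
        "weight": weight + 1,                        # wire value 0..255 means 1..256
    }


def bandwidth_shares(weights):
    """Sibling streams share their parent's resources in proportion to weight."""
    total = sum(weights.values())
    return {stream_id: w / total for stream_id, w in weights.items()}


print(parse_priority_payload(struct.pack(">IB", 0x80000003, 15)))
print(bandwidth_shares({5: 12, 7: 4}))
```

Here streams 5 and 7 depend on the same parent, so stream 5 would receive three quarters of the available resources and stream 7 the remaining quarter.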
Different from TCP flow control
- In HTTP/1.1 there is no multiplexing on a TCP connection
- In HTTP/2, multiplexing means that multiple streams must share TCP-layer flow control
- Problem: multiple streams compete for TCP flow control, and their mutual interference can block a stream
- A proxy server has limited memory; when the upstream and downstream network speeds differ, it manages memory through flow control
- Therefore, HTTP/2 needs flow control at the application layer
How is flow control performed at the application layer?
- The sending rate is determined by the application layer
- WINDOW_UPDATE frame
- Wireshark example
- Flow control window
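The window bookkeeping on the sending side can be sketched like this. It assumes the RFC 7540 default initial window of 65,535 bytes and the rule that a DATA frame counts against both its stream window and the connection window; the class and method names are hypothetical.

```python
DEFAULT_WINDOW = 65535


class SendWindow:
    """Tracks how many DATA bytes a sender may still transmit."""

    def __init__(self, initial_stream_window=DEFAULT_WINDOW):
        self.initial = initial_stream_window
        self.connection = DEFAULT_WINDOW  # connection window (stream 0)
        self.streams = {}

    def sendable(self, stream_id):
        stream = self.streams.setdefault(stream_id, self.initial)
        # Both the stream window and the connection window must have room
        return max(0, min(stream, self.connection))

    def on_data_sent(self, stream_id, size):
        if size > self.sendable(stream_id):
            raise ValueError("would exceed the flow control window")
        self.streams[stream_id] -= size
        self.connection -= size

    def on_window_update(self, stream_id, increment):
        if not 1 <= increment <= 2**31 - 1:
            raise ValueError("invalid window increment")
        if stream_id == 0:
            self.connection += increment
        else:
            self.streams[stream_id] = self.streams.get(stream_id, self.initial) + increment
```

A WINDOW_UPDATE on stream 0 reopens the connection for every stream, while one on a specific stream only helps that stream.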
gRPC framework
Problems with HTTP/2
- Connection setup over TCP, or over TCP plus TLS, takes too many handshake round trips
- Multiplexing runs into TCP's head-of-line blocking problem
- Essentially, TCP requires that data arrive in order, so one lost segment holds back every stream behind it
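A toy model of the head-of-line blocking problem, contrasted with independent per-stream ordering (the approach QUIC takes). Packets are modeled as `(sequence, stream_id, data)` tuples, which is a simplification made up for this sketch.

```python
def tcp_deliverable(packets, lost):
    """TCP hands data to the application only up to the first missing sequence number."""
    delivered = []
    for seq, stream_id, data in sorted(packets):
        if seq in lost:
            break  # everything after the gap waits for the retransmission
        delivered.append((stream_id, data))
    return delivered


def per_stream_deliverable(packets, lost):
    """With per-stream ordering, a loss only stalls the stream it belongs to."""
    blocked, delivered = set(), []
    for seq, stream_id, data in sorted(packets):
        if seq in lost:
            blocked.add(stream_id)
        elif stream_id not in blocked:
            delivered.append((stream_id, data))
    return delivered


packets = [(1, 1, b"a1"), (2, 3, b"b1"), (3, 1, b"a2"), (4, 3, b"b2")]
print(tcp_deliverable(packets, lost={2}))
print(per_stream_deliverable(packets, lost={2}))
```

Losing the packet that belongs to stream 3 stalls stream 1 as well in the TCP model, while stream 1 completes in the per-stream model.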
HTTP3 QUIC protocol
- Where the QUIC protocol sits in the stack
- Comparison of HTTP/2 and the QUIC protocol
- HTTP/3 connection migration
- The client can change its IP address and port and still reuse the earlier connection
- Solves the head-of-line blocking problem
- HTTP/3: complete handshake in 1 RTT
Seven-layer load balancing
- HTTP protocol conversion
- WAF (web application firewall)
- Reverse proxy and caching