Detailed Web Protocol and Packet Capture: HTTP/2 [3]

Preface

Study notes for the course "Detailed Web Protocol and Packet Capture", taught by Tao Hui

Learning Content:

  1. Learn web protocols from the top of the stack down, in the order HTTP, TLS/SSL, TCP/IP; this part covers HTTP/2 in detail
  2. Verify each topic in practice with packet capture tools: the Network panel in Chrome, tcpdump, and Wireshark

Concepts and features of HTTP/2

HTTP/2 (Hypertext Transfer Protocol version 2, originally named HTTP 2.0), abbreviated as h2 (encrypted connection over TLS 1.2 or above) or h2c (unencrypted connection) [1], is the second major version of the HTTP protocol used on the World Wide Web.

The HTTP/2 standard was officially published as RFC 7540 in May 2015.

The HTTP/2 protocol itself does not require encryption, but the developers of most clients (such as Firefox, Chrome, Safari, Opera, IE, and Edge) have stated that they will only implement HTTP/2 over TLS. This has made TLS-encrypted HTTP/2 (i.e. h2) a de facto mandatory standard, and h2c has effectively been abandoned by mainstream browsers.

Main features

  • Significant reduction in the amount of transmitted data
    • Binary transmission
    • Header compression: HPACK algorithm
  • Multiplexing and related functions
    • Message priority
  • Server message push
    • Parallel push

HTTP/2 does not change the semantics of HTTP/1.x; it repackages the original HTTP/1.x headers and body into frames.

image-20201216151044692

Using Wireshark to decrypt TLS/SSL messages

  • Wireshark obtains the keys generated during the TLS handshake phase

image-20201125162753581

  • HTTP/2 packets decrypted from TLS by Wireshark

image-20201125162946231
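Wireshark can only decrypt the session if it is given the secrets produced during the handshake. Browsers typically write them out when the SSLKEYLOGFILE environment variable is set. As a minimal sketch of the same idea, Python's ssl module (3.8 or later, built against OpenSSL 1.1.1 or later) can log the secrets from a client context. The function name and the log path are hypothetical, chosen only for illustration.

```python
import ssl


def make_keylogging_context(keylog_path):
    """Return a client TLS context that appends handshake secrets to keylog_path."""
    ctx = ssl.create_default_context()
    # keylog_filename is only available on Python 3.8+ with OpenSSL 1.1.1+,
    # so the attribute is checked before it is set.
    if hasattr(ctx, "keylog_filename"):
        ctx.keylog_filename = keylog_path
    return ctx
```

The resulting file is the one to select in Wireshark's TLS protocol preferences as the "(Pre)-Master-Secret log filename".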

h2c: Upgrade from HTTP/1.1 to HTTP/2 over TCP

The h2c upgrade negotiation closely resembles the HTTP/1.1 handshake that upgrades a connection to WebSocket

image-20201125170911963

Step 1: The negotiation returns 101 Switching Protocols

image-20201125171314318

Step 2: The client sends the Magic frame (the connection preface)

image-20201125171119206

image-20201125171657939
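The two steps can be sketched in code: the client first sends an HTTP/1.1 request carrying the Upgrade and HTTP2-Settings headers, and after the 101 response it sends the fixed 24-byte Magic frame. This is a rough sketch that only builds the bytes; the function name and the example host are hypothetical.

```python
import base64
import struct

# Fixed 24-byte client connection preface (the "Magic" frame) from RFC 7540.
CLIENT_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"


def build_h2c_upgrade_request(host, path="/", settings=None):
    """Build the HTTP/1.1 request that asks the server to switch to h2c."""
    settings = settings or {}
    # HTTP2-Settings carries a SETTINGS payload: a 16-bit identifier and a
    # 32-bit value per entry, base64url-encoded without padding.
    payload = b"".join(struct.pack("!HI", ident, value)
                       for ident, value in settings.items())
    token = base64.urlsafe_b64encode(payload).rstrip(b"=").decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade, HTTP2-Settings",
        "Upgrade: h2c",
        f"HTTP2-Settings: {token}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Example: advertise SETTINGS_MAX_CONCURRENT_STREAMS (0x3) = 100.
request = build_h2c_upgrade_request("example.com", settings={0x3: 100})
```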

Unified connection process: after negotiation, h2 and h2c proceed in the same way

image-20201125171600941

h2: Upgrade from HTTP/1.1 to HTTP/2 over TLS

  • The protocol is negotiated through the TLS ALPN extension

image-20201125172619078

Upgrade process

  • Client Hello message
    • Lists the supported protocols in the ALPN extension

image-20201125172449821

  • Server Hello message
    • The server selects the h2 protocol and tells the client

image-20201125172538157

  • The client then sends the Magic frame
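On the client side, the ALPN offer can be sketched with Python's ssl module. This is an illustrative sketch, not a full HTTP/2 client: the function names are hypothetical, and the second helper expects a socket that has already completed its TLS handshake.

```python
import ssl


def make_h2_client_context():
    """Client TLS context that offers h2 first and HTTP/1.1 as a fallback."""
    ctx = ssl.create_default_context()
    # These names are sent in the ALPN extension of the Client Hello.
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def negotiated_protocol(tls_socket):
    """Report the protocol the server selected in its Server Hello, or None."""
    return tls_socket.selected_alpn_protocol()
```

If `negotiated_protocol` returns "h2", the client continues with the Magic frame; otherwise it stays on HTTP/1.1.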

The relationship between frames, messages, and streams

HTTP/2 core concepts

image-20201125173452038

The relationship between Stream, Message, and Frame

  • Each Frame is associated with its Stream through the stream ID

image-20201125174214062

  • Composition of Message: HEADERS Frame and DATA Frame

image-20201125174349950

Multiplexing

  • Frames are interleaved during transmission and reassembled on receipt

  • Frames of the same stream must be transmitted in order; the order between different streams can be arbitrary

    image-20201125174745541
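A small sketch of what "interleaved during transmission, reassembled on receipt" means for the receiver. Frames are modeled here as hypothetical (stream ID, data) pairs rather than real HTTP/2 frames.

```python
from collections import defaultdict


def reassemble(frames):
    """Group interleaved (stream_id, chunk) pairs back into per-stream bodies."""
    streams = defaultdict(bytearray)
    for stream_id, chunk in frames:
        # Frames of one stream arrive in order, so appending is sufficient.
        streams[stream_id] += chunk
    return {sid: bytes(body) for sid, body in streams.items()}


# Hypothetical wire order: streams 1 and 3 interleaved on one connection.
wire = [(1, b"he"), (3, b"wor"), (1, b"llo"), (3, b"ld")]
result = reassemble(wire)
```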

Frame format

The role of the Stream ID

The 9-byte standard frame header

image-20201127104858255

  • [Function one]: The key to multiplexing

    • The receiving end can assemble messages concurrently based on the stream ID
    • Frames within the same Stream must be in order
    • SETTINGS_MAX_CONCURRENT_STREAMS controls the number of concurrent streams
  • [Function two]: The key to server push of dependent requests

    • Streams initiated by the client must have odd IDs
    • Streams initiated by the server must have even IDs

    image-20201127134336509

  • [Function three]: Constraints for stream state management

    image-20201127135224324

  • [Function four]: Application-layer flow control only affects DATA frames

    • The stream with stream ID 0 is only used to transmit control frames

    image-20201127135501538
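The 9-byte header and the stream ID rules above can be sketched as a small parser. The field layout and type codes follow RFC 7540; the function names are hypothetical.

```python
import struct

FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING",
    0x7: "GOAWAY", 0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


def parse_frame_header(header):
    """Decode the 9-byte header: 24-bit length, type, flags, 31-bit stream ID."""
    if len(header) != 9:
        raise ValueError("frame header must be exactly 9 bytes")
    length_hi, length_lo, ftype, flags, stream = struct.unpack("!BHBBI", header)
    return {
        "length": (length_hi << 16) | length_lo,
        "type": FRAME_TYPES.get(ftype, f"UNKNOWN(0x{ftype:x})"),
        "flags": flags,
        "stream_id": stream & 0x7FFFFFFF,  # drop the reserved top bit
    }


def initiator(stream_id):
    """Stream 0 is the connection itself; odd IDs are client, even are server."""
    if stream_id == 0:
        return "connection"
    return "client" if stream_id % 2 else "server"


# A SETTINGS frame header with a 12-byte payload on stream 0.
header = parse_frame_header(bytes.fromhex("00000c040000000000"))
```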

Frame types and the setting types of the SETTINGS frame

Taking the 9-byte standard frame header as a reference, this part introduces the frame length and the frame type

image-20201127104858255

  • Frame length: Length

image-20201127140435234

  • Frame type: Type and Flags. The meaning of the Flags differs for each Type

image-20201127140647850

  • Format of the SETTINGS frame (type=0x4):

    • A SETTINGS frame is not a negotiation; the sender uses it to notify the receiver of its own properties and capabilities

    • One SETTINGS frame can carry multiple settings at the same time

      image-20201127141912043

      • Identifier: the setting being set
      • Value: the value of the setting
    • Setting types:

      image-20201127142150761

image-20201127142631122
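Since each setting is a 16-bit Identifier followed by a 32-bit Value, a SETTINGS payload can be decoded six bytes at a time. A minimal sketch, using the identifier names defined in RFC 7540; the function name is hypothetical.

```python
import struct

SETTINGS_NAMES = {
    0x1: "SETTINGS_HEADER_TABLE_SIZE",
    0x2: "SETTINGS_ENABLE_PUSH",
    0x3: "SETTINGS_MAX_CONCURRENT_STREAMS",
    0x4: "SETTINGS_INITIAL_WINDOW_SIZE",
    0x5: "SETTINGS_MAX_FRAME_SIZE",
    0x6: "SETTINGS_MAX_HEADER_LIST_SIZE",
}


def parse_settings(payload):
    """Decode a SETTINGS payload into a {name: value} dictionary."""
    if len(payload) % 6:
        raise ValueError("SETTINGS payload must be a multiple of 6 bytes")
    decoded = {}
    for ident, value in struct.iter_unpack("!HI", payload):
        name = SETTINGS_NAMES.get(ident, f"UNKNOWN(0x{ident:x})")
        decoded[name] = value
    return decoded


# One frame carrying two settings at once.
example = parse_settings(struct.pack("!HIHI", 0x3, 100, 0x4, 65535))
```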

HPACK algorithm

How HPACK reduces the size of HTTP headers

HPACK compression algorithm: simplify the message content and reduce the message size during transmission

  • The message sender and the message receiver jointly maintain a static table and a dynamic table (together the two act as a dictionary)

  • For each request, the sender encodes and compresses the message headers according to the contents of the dictionary and a set of encoding rules

  • The receiver decodes according to the dictionary and, following the instructions in the encoding, decides whether the dynamic table needs to be updated

  • Three compression methods

    • Static dictionary
    • Dynamic dictionary
    • Compression algorithm: Huffman coding

Static dictionary: https://httpwg.org/specs/rfc7541.html#static.table.definition

  • Contains only well-known header fields
    • Entries where both name and value are fixed
    • Entries with a name only

The static dictionary and the dynamic dictionary together are also called the index table

HPACK compression diagram

  • The static table is used first; the dynamic dictionary is built up over multiple requests; finally the result is combined with Huffman coding

image-20201204162745674

Schematic diagram of index table usage

image-20201204162606677
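How the two tables share one index space can be sketched as follows. Only the first eight static entries from RFC 7541 are listed (the full table has 61), and size-based eviction of dynamic entries is left out, so this is an illustration rather than a usable HPACK table. The class name and the example headers are hypothetical.

```python
# First entries of the RFC 7541 static table; index 1 is the first entry.
STATIC_TABLE = [
    (":authority", ""), (":method", "GET"), (":method", "POST"),
    (":path", "/"), (":path", "/index.html"),
    (":scheme", "http"), (":scheme", "https"), (":status", "200"),
]
STATIC_SIZE = 61  # dynamic indices start right after the static table


class IndexTable:
    """Static entries at 1..61, dynamic entries from 62 upward, newest first."""

    def __init__(self):
        self.dynamic = []

    def add(self, name, value):
        # A new entry always takes the lowest dynamic index (62).
        self.dynamic.insert(0, (name, value))

    def lookup(self, index):
        if 1 <= index <= len(STATIC_TABLE):
            return STATIC_TABLE[index - 1]
        if index > STATIC_SIZE:
            return self.dynamic[index - STATIC_SIZE - 1]
        raise KeyError(f"index {index} is not part of this truncated sketch")


table = IndexTable()
table.add("x-trace", "abc")
table.add("x-user", "bob")
```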

How Huffman coding is used in HPACK

Principle:

  • Symbols with greater probability of occurrence use shorter codes, and symbols with lower probability use longer codes

Static Huffman coding: https://httpwg.org/specs/rfc7541.html#huffman.code

image-20201204164609526

Dynamic Huffman coding

Huffman tree construction process

image-20201204164725478
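The construction process can be sketched with a priority queue: repeatedly merge the two least frequent nodes until one tree remains. Note that HPACK itself does not build a tree at run time; it uses the fixed code table linked above. The sketch below only illustrates the general algorithm, and the function name is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return {symbol: bitstring} built from the symbol frequencies in text."""
    if not text:
        return {}
    freq = Counter(text)
    # Heap items are (weight, tie_breaker, tree); a tree is either a single
    # symbol or a (left, right) pair. The unique tie_breaker keeps the heap
    # from ever comparing two trees directly.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, counter, (left, right)))
        counter += 1

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


codes = huffman_code("aaaabbc")
```

In this example the most frequent symbol, "a", receives a 1-bit code while "b" and "c" receive 2-bit codes.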

Encoding of integers in HPACK (i.e. index value encoding)

  • Encoding of integers less than 31 (the value fits in a 5-bit prefix)

image-20201204165501524

  • Encoding of large integers

image-20201204165716005

  • Decoding of large integers

image-20201204165757980
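The prefix-based integer encoding from RFC 7541 can be sketched in a few lines. The encoder returns only the integer's own bits, so the caller would still OR the representation flag bits into the first byte. The function names are hypothetical.

```python
def encode_integer(value, prefix_bits):
    """Encode value with an N-bit prefix, N = prefix_bits (1..8)."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        # Small values fit entirely in the prefix.
        return bytes([value])
    out = [limit]  # prefix filled with ones signals continuation bytes
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)  # 7 payload bits, top bit = more
        value //= 128
    out.append(value)
    return bytes(out)


def decode_integer(data, prefix_bits):
    """Return (value, bytes_consumed) for an integer at the start of data."""
    limit = (1 << prefix_bits) - 1
    value = data[0] & limit
    if value < limit:
        return value, 1
    shift = 0
    pos = 1
    while True:
        byte = data[pos]
        value += (byte & 0x7F) << shift
        shift += 7
        pos += 1
        if not byte & 0x80:
            return value, pos
```

With a 5-bit prefix, 10 encodes to the single byte 0x0a, while 1337 encodes to the three bytes 1f 9a 0a; both are the worked examples given in RFC 7541.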

Encoding of header names and values in HPACK (HEADERS frame)

  • HEADERS frame format

image-20201215162457545

When the HEADERS frame is too large, a CONTINUATION frame (type=0x9) follows it

  • It follows a HEADERS frame or PUSH_PROMISE frame and supplies the rest of the HTTP headers

    image-20201215162746996

To better understand the Header Block Fragment:

image-20201215164119228

  • Literal encoding

    image-20201215164236155

Next, let's look at this in practice:

  • Both the name and the value are in the index table (static table or dynamic table)

    • Encoding: the first bit is 1 and the remaining 7 bits hold the index

      image-20201215164539913

    • For example, :method: GET has index 2 in the static table, so it is encoded as 1000 0010, which is 0x82 in hex

    • Wireshark example

      image-20201215170923512

  • The name is in the index table; the value is encoded literally, and the field is added to the dynamic table

    • Encoding: the first two bits are 01

      image-20201215171336984

    • Wireshark example

      image-20201215173900032

  • Both the name and the value are encoded literally, and the field is added to the dynamic table

    • The first two bits are 01

      image-20201215174025681

  • The name is in the index table; the value is encoded literally, and the field is not added to the dynamic table

    • The first four bits are 0000

      image-20201215174506744

  • Both the name and the value are encoded literally, and the field is not added to the dynamic table

    • The first four bits are 0000

      image-20201215174614658

  • The name is in the index table; the value is encoded literally, and the field is never added to the dynamic table

    • The first four bits are 0001

      image-20201215174729479

  • Both the name and the value are encoded literally, and the field is never added to the dynamic table

    • The first four bits are 0001

      image-20201215174711439

image-20201215174838263
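The leading bit patterns listed above are enough to classify a header field from its first byte. The sketch below also recognizes the 001 pattern, which RFC 7541 uses for a dynamic table size update and which is not covered in these notes. The function names are hypothetical.

```python
def classify_field(first_byte):
    """Map the first byte of a header field to (representation, prefix_bits)."""
    if first_byte & 0x80:  # 1xxxxxxx
        return "indexed", 7
    if first_byte & 0x40:  # 01xxxxxx
        return "literal, added to dynamic table", 6
    if first_byte & 0x20:  # 001xxxxx (not discussed in the notes)
        return "dynamic table size update", 5
    if first_byte & 0x10:  # 0001xxxx
        return "literal, never added to dynamic table", 4
    return "literal, not added to dynamic table", 4  # 0000xxxx


def prefix_value(first_byte, prefix_bits):
    """Extract the prefix bits; for literals, 0 means the name is also literal."""
    return first_byte & ((1 << prefix_bits) - 1)


kind, bits = classify_field(0x82)
index = prefix_value(0x82, bits)
```

For the byte 0x82 from the :method: GET example, this yields an indexed field with index 2.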

Server-initiated message push (PUSH_PROMISE frame)

  • Pushes resources into the browser cache in advance
  • Characteristic:
    • A push is based on a request that the client has already sent
  • How it is implemented:
    • Every pushed resource must correspond to a request
    • That request is conveyed by a PUSH_PROMISE frame sent by the server
    • The response is sent on a stream with an even ID

In HTTP/1.1, after the HTML has been fetched, the CSS resources it needs must be requested separately:

image-20201216101249776

The PUSH_PROMISE approach in HTTP/2

image-20201216101545636

  • Format of the PUSH_PROMISE frame

    • type=0x5; it can only be sent by the server

      image-20201216101831028

  • Wireshark example

    • On stream 1, the server notifies the client that the image resource is coming

    image-20201216102557617

    • The image resource is sent on stream 2

    image-20201216103104934

  • Disabling server push:

    image-20201216103210301
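A PUSH_PROMISE payload carries the promised stream ID followed by the header block of the request being promised. A minimal sketch of reading it, assuming the layout from RFC 7540 (optional Pad Length, reserved bit plus 31-bit Promised Stream ID, header block fragment, padding); the function name is hypothetical.

```python
import struct

FLAG_PADDED = 0x8


def parse_push_promise(flags, payload):
    """Return (promised_stream_id, header_block_fragment)."""
    pad = 0
    if flags & FLAG_PADDED:
        # The first byte gives the number of padding bytes at the end.
        pad, payload = payload[0], payload[1:]
    (raw,) = struct.unpack("!I", payload[:4])
    promised = raw & 0x7FFFFFFF  # drop the reserved top bit
    if promised == 0 or promised % 2:
        raise ValueError("promised stream ID must be a non-zero even number")
    fragment = payload[4:len(payload) - pad]
    return promised, fragment


# Promise stream 2 with a one-byte header block (0x82 = :method: GET).
plain = parse_push_promise(0, struct.pack("!I", 2) + b"\x82")
padded = parse_push_promise(
    FLAG_PADDED, b"\x02" + struct.pack("!I", 2) + b"\x82" + b"\x00\x00")
```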

Stream state transition

  • Stream features

    image-20201216111543568

  • Message features

    image-20201216111703608

  • Stream states

    • Meaning of the frame symbols and the stream states

      image-20201216112555558

    • State transition diagram

      image-20201216112708493
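The state diagram can be approximated with a lookup table. This is a simplified sketch of the transitions in RFC 7540: "ES" stands for the END_STREAM flag and "PP" for a PUSH_PROMISE frame, and details such as PRIORITY frames on idle or closed streams are ignored. The event strings and the function name are hypothetical.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("idle", "send HEADERS"): "open",
    ("idle", "recv HEADERS"): "open",
    ("idle", "send PP"): "reserved (local)",
    ("idle", "recv PP"): "reserved (remote)",
    ("reserved (local)", "send HEADERS"): "half-closed (remote)",
    ("reserved (remote)", "recv HEADERS"): "half-closed (local)",
    ("open", "send ES"): "half-closed (local)",
    ("open", "recv ES"): "half-closed (remote)",
    ("half-closed (local)", "recv ES"): "closed",
    ("half-closed (remote)", "send ES"): "closed",
}


def next_state(state, event):
    """Advance one stream by one event, or raise on an illegal transition."""
    resets = ("send RST_STREAM", "recv RST_STREAM")
    if event in resets and state not in ("idle", "closed"):
        return "closed"
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state!r}") from None


# A plain request/response exchange as seen by the client.
state = "idle"
for event in ("send HEADERS", "send ES", "recv ES"):
    state = next_state(state, event)
```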

RST_STREAM frame and common error codes

  • RST_STREAM frame (type=0x3)

    image-20201216113405009

  • Common error codes: not only used in RST_STREAM frames, but also in GOAWAY frames, etc.

    image-20201216113536379

    image-20201216113640776

    image-20201216113701466
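An RST_STREAM payload is a single 32-bit error code. As a sketch, the codes defined in RFC 7540 can be kept in an enum and looked up when a frame arrives; the class and function names are hypothetical.

```python
import enum
import struct


class ErrorCode(enum.IntEnum):
    NO_ERROR = 0x0
    PROTOCOL_ERROR = 0x1
    INTERNAL_ERROR = 0x2
    FLOW_CONTROL_ERROR = 0x3
    SETTINGS_TIMEOUT = 0x4
    STREAM_CLOSED = 0x5
    FRAME_SIZE_ERROR = 0x6
    REFUSED_STREAM = 0x7
    CANCEL = 0x8
    COMPRESSION_ERROR = 0x9
    CONNECT_ERROR = 0xA
    ENHANCE_YOUR_CALM = 0xB
    INADEQUATE_SECURITY = 0xC
    HTTP_1_1_REQUIRED = 0xD


def parse_rst_stream(payload):
    """Return the ErrorCode in an RST_STREAM payload (or the raw int if unknown)."""
    if len(payload) != 4:
        raise ValueError("RST_STREAM payload must be exactly 4 bytes")
    (code,) = struct.unpack("!I", payload)
    try:
        return ErrorCode(code)
    except ValueError:
        return code


reason = parse_rst_stream(b"\x00\x00\x00\x08")
```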

Stream priority and resource allocation principles (PRIORITY frame)

Adjusting the priority of different requests

  • The PRIORITY frame

    image-20201216114858950

    • Exclusive flag

      image-20201216115331082
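The PRIORITY payload is five bytes: the exclusive flag in the top bit, a 31-bit stream dependency, and an 8-bit weight. A minimal decoding sketch following RFC 7540; the function name is hypothetical.

```python
import struct


def parse_priority(payload):
    """Decode a PRIORITY payload into its exclusive flag, parent, and weight."""
    if len(payload) != 5:
        raise ValueError("PRIORITY payload must be exactly 5 bytes")
    dependency, weight = struct.unpack("!IB", payload)
    return {
        "exclusive": bool(dependency & 0x80000000),
        "depends_on": dependency & 0x7FFFFFFF,
        # The wire value 0..255 represents a weight of 1..256.
        "weight": weight + 1,
    }


# Exclusive dependency on stream 3 with weight 16.
priority = parse_priority(bytes.fromhex("800000030f"))
```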

Differences from TCP flow control

  • In HTTP/1.1 there is no multiplexing on a TCP connection

    image-20201216145904009

  • In HTTP/2, multiplexing means that multiple streams must share the flow control of the TCP layer

    • Problem: multiple Streams compete for TCP flow control, and the mutual interference may cause Streams to block

      image-20201216150148349

    • A proxy server has limited memory; when upstream and downstream network speeds differ, it manages memory through flow control

    • Therefore, HTTP/2 needs flow control at the application layer

How to perform flow control at the application layer?

  • The sending speed is determined by the application layer

    image-20201216150337392

  • WINDOW_UPDATE frame

    image-20201216150624145

  • Wireshark example

    image-20201216150821064

  • Flow control window

    image-20201216151004237
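The sender's side of the window accounting can be sketched as below: DATA frames consume the window and WINDOW_UPDATE frames replenish it. The initial size of 65,535 bytes and the 2^31-1 ceiling come from RFC 7540. Window changes caused by SETTINGS_INITIAL_WINDOW_SIZE are left out, and the class name is hypothetical.

```python
DEFAULT_WINDOW = 65535
MAX_WINDOW = 2**31 - 1


class SendWindow:
    """Tracks how many DATA bytes the peer currently allows us to send."""

    def __init__(self, size=DEFAULT_WINDOW):
        self.available = size

    def consume(self, nbytes):
        """Reserve up to nbytes of window; return how many may be sent now."""
        allowed = max(0, min(nbytes, self.available))
        self.available -= allowed
        return allowed

    def window_update(self, increment):
        """Apply a WINDOW_UPDATE frame received from the peer."""
        if not 1 <= increment <= MAX_WINDOW:
            raise ValueError("increment must be between 1 and 2**31 - 1")
        if self.available + increment > MAX_WINDOW:
            raise ValueError("flow-control window overflow")
        self.available += increment


window = SendWindow()
first = window.consume(60000)   # fits in the window
second = window.consume(10000)  # only the remaining 5535 bytes may be sent
window.window_update(1000)      # the peer grants 1000 more bytes
```

In a real connection one such window exists per stream plus one for the connection as a whole, and a DATA frame must fit in both.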

gRPC framework

Problems with HTTP/2

  • Connection setup over TCP and TCP+TLS requires too many handshake round trips

    image-20201216152601654

  • Multiplexing and TCP's head-of-line blocking problem

    • Essentially, TCP requires that data be delivered in order

      image-20201216153041321

HTTP/3 and the QUIC protocol

  • Position of the QUIC protocol in the stack:

    image-20201216153219240

  • Comparison of HTTP/2 and QUIC protocol

    image-20201216153712609

  • HTTP/3 connection migration

    • After the client's IP address or port changes, the existing connection can still be reused
  • Solves the head-of-line blocking problem

    image-20201216160058829

  • HTTP/3: the handshake completes in 1 RTT

    image-20201216160131834

Layer 7 load balancing

image-20201216155617265

  • HTTP protocol conversion

    image-20201216155733352

    • WAF (web application firewall)

      image-20201216155804301

  • Reverse proxy and cache function

    image-20201216155920707

    image-20201216155935721

References

Explain the http-2 header compression algorithm in detail

HTTP/2 stream status

Origin blog.csdn.net/weixin_39664643/article/details/110238244