IS-IS GR (Graceful Restart) process

Overview:

To keep services running, a highly available network must continue forwarding traffic when a failure occurs. GR (Graceful Restart) is a technology that keeps service forwarding uninterrupted during an active/standby switchover or a protocol restart.

On a distributed device, control and forwarding are separated: the main control board is responsible for control and management of the entire device, including protocol operation and route calculation, while the interface boards are responsible for data forwarding. Without GR, an active/standby switchover or a protocol restart tears down the neighbor relationships with surrounding devices. Losing the neighbor relationships resets the routes (new LSPs are generated and routes are recalculated), the routing table update in turn changes the FIB table, and the end result is a service interruption.

To address this, the IETF published a GR specification for IS-IS (RFC 3847, since obsoleted by RFC 5306). The basic idea of the GR specification is that a device undergoing a switchover or protocol restart notifies its neighboring devices so that they keep their adjacencies and routing information. After the switchover or restart, the neighbors help the device restore its previous link state database and routing table, and the neighbors' own link state databases and routing tables also stay stable, which avoids route flapping. Because no route flapping occurs, the FIB table of the device does not change, and service forwarding continues uninterrupted.

Basic terms:

GR Restarter: A GR-capable device on which a protocol restart event occurs.
GR Helper: A device that has a neighbor relationship with the GR Restarter and helps it complete the GR process.
GR Session: When IS-IS neighbors establish an adjacency, they negotiate GR capability; this negotiation is generally called the GR Session. It covers whether both sides are GR-capable. Once the negotiation succeeds, the GR process can be entered when the protocol restarts.

Note that devices with a distributed architecture can act as both GR Restarter and GR Helper, whereas centralized devices can only act as GR Helper, assisting a GR Restarter in completing the GR process.

IS-IS GR TLV:

The Restart TLV is an extension carried in IIH (IS-to-IS Hello) PDUs, with Type 211. Every IIH packet sent by a device that supports IS-IS GR includes the Restart TLV, which carries the parameters for a protocol restart. (OSPF implements GR with the Type 9 LSA, whereas IS-IS implements it with a TLV field.) The message format is shown in the following figure:
[Figure: Restart TLV message format]
The fields are explained as follows:
[Figure: Restart TLV field descriptions]
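
As a rough illustration of the TLV described above, the following Python sketch encodes and decodes a Restart TLV. The field layout is an assumption taken from RFC 5306 rather than from the missing figures: a flags octet with RR in the lowest bit, RA next and SA after that, followed by an optional 2-octet Remaining Time. The class and constant names are hypothetical, and the optional Restarting Neighbor System ID field is left out to keep the example short.

```python
import struct
from dataclasses import dataclass
from typing import Optional

RESTART_TLV_TYPE = 211  # Type value given in the text
FLAG_RR = 0x01          # Restart Request
FLAG_RA = 0x02          # Restart Acknowledgement
FLAG_SA = 0x04          # Suppress adjacency advertisement (added by RFC 5306)


@dataclass
class RestartTLV:
    """Hypothetical in-memory form of a Restart TLV (illustration only)."""
    rr: bool = False
    ra: bool = False
    sa: bool = False
    remaining_time: Optional[int] = None  # seconds, 0..65535

    def encode(self) -> bytes:
        flags = 0
        if self.rr:
            flags |= FLAG_RR
        if self.ra:
            flags |= FLAG_RA
        if self.sa:
            flags |= FLAG_SA
        value = bytes([flags])
        if self.remaining_time is not None:
            # Remaining Time is a 2-octet unsigned integer in network byte order.
            value += struct.pack("!H", self.remaining_time)
        return bytes([RESTART_TLV_TYPE, len(value)]) + value

    @classmethod
    def decode(cls, data: bytes) -> "RestartTLV":
        if len(data) < 3 or data[0] != RESTART_TLV_TYPE:
            raise ValueError("not a Restart TLV")
        length = data[1]
        value = data[2:2 + length]
        if length < 1 or len(value) != length:
            raise ValueError("truncated Restart TLV")
        flags = value[0]
        remaining = struct.unpack("!H", value[1:3])[0] if length >= 3 else None
        return cls(
            rr=bool(flags & FLAG_RR),
            ra=bool(flags & FLAG_RA),
            sa=bool(flags & FLAG_SA),
            remaining_time=remaining,
        )


# A helper's acknowledgement: RA set, 30 seconds of holding time remaining.
ack = RestartTLV(ra=True, remaining_time=30)
raw = ack.encode()
```

Decoding `raw` with `RestartTLV.decode` gives back an object equal to `ack`. A restarting router's request would be `RestartTLV(rr=True)`, which encodes to three octets.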

IS-IS GR timer:

The GR extension of IS-IS introduces three timers: T1, T2, and T3.

T1 timer: If the GR Restarter has sent an IIH with RR set but has not received the GR Helper's acknowledgment (an IIH containing the Restart TLV with RA set) by the time T1 expires, it resets T1 and sends the IIH containing the Restart TLV again. T1 is cancelled when the acknowledgment is received or after T1 has expired 3 times. The default T1 value is 3 seconds. A process with IS-IS GR enabled maintains one T1 timer per interface. On a Level-1-2 router, a broadcast interface maintains one T1 timer per level.
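
The T1 retry behaviour can be sketched as a small simulation. This is a minimal model under stated assumptions, not a router implementation: time is counted in T1 expiries instead of real seconds, and `run_t1`, `send_hello` and `ack_arrives_after` are hypothetical names introduced for the example.

```python
T1_INTERVAL = 3      # seconds, default from the text (not used by the simulation)
T1_MAX_EXPIRIES = 3  # T1 is cancelled after expiring this many times


def run_t1(ack_arrives_after, send_hello):
    """Simulate T1 on one interface.

    ack_arrives_after: number of T1 expiries that pass before the helper's
        RA acknowledgment is seen, or None if it never arrives.
    send_hello: callable invoked each time an IIH with RR set is sent.
    Returns (acked, expiries).
    """
    expiries = 0
    send_hello()  # initial IIH with RR set; T1 starts here
    while True:
        if ack_arrives_after is not None and expiries >= ack_arrives_after:
            return True, expiries   # acknowledgment received: cancel T1
        expiries += 1               # T1 expired without an acknowledgment
        if expiries >= T1_MAX_EXPIRIES:
            return False, expiries  # give up on this interface: cancel T1
        send_hello()                # reset T1 and resend the IIH


sent = []
result = run_t1(1, lambda: sent.append("IIH(RR)"))
```

In this run the acknowledgment shows up after one expiry, so `result` is `(True, 1)` and two hellos were sent. With `ack_arrives_after=None` the function stops after three expiries.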

T2 timer: The time from the restart of the GR Restarter until LSDB synchronization at a given level completes. T2 is the maximum time the system waits for the LSDB of each level to synchronize, typically 60 seconds. The Level-1 and Level-2 LSDBs each maintain their own T2 timer.

T3 timer: The maximum time allowed for the GR Restarter to complete GR successfully. T3 is initialized to 65535 seconds. As neighbors respond with IIH packets that have RA set, T3 is lowered to the smallest value found in the Remaining Time field of those IIH packets. If T3 expires, GR has failed.

The entire system maintains a T3 timer.
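
The way T3 shrinks as acknowledgments arrive amounts to taking a running minimum. The sketch below illustrates that rule only. The function name `t3_after_acks` and the sample Remaining Time values are made up for the example.

```python
T3_INITIAL = 65535  # seconds, initial value given in the text


def t3_after_acks(remaining_times, initial=T3_INITIAL):
    """Return T3 after processing the Remaining Time of each RA hello."""
    t3 = initial
    for remaining in remaining_times:
        # Each acknowledgment can only lower T3, never raise it.
        t3 = min(t3, remaining)
    return t3


# Three neighbors acknowledge with 30, 27 and 40 seconds remaining.
t3 = t3_after_acks([30, 27, 40])
```

Here `t3` ends up as 27: the restart must finish before the most impatient neighbor would otherwise drop the adjacency.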

The IS-IS GR process is as follows:

[Figure: IS-IS GR process between R1 and R2]
R1 is the GR Restarter and R2 is the GR Helper. The detailed GR process between R1 and R2 is as follows:

  1. When the IS-IS protocol on R1 is re-enabled globally, the T2 and T3 timers are started. When an interface of R1 comes up again with the protocol enabled, the T1 timer is started on that interface and a Hello message is sent.
  2. When R2 receives the Hello message from R1, it keeps the state of neighbor R1 unchanged and immediately replies with a Hello message. R2 then sends CSNP and LSP packets to R1 to help it synchronize its LSDB.
  3. Once R1 has received the Hello message and all CSNP messages on the interface, it cancels the T1 timer. Otherwise, it keeps sending Hello messages periodically until it has received the Hello message and all CSNP messages, or until T1 has expired the maximum number of times, at which point the T1 timer of the interface is cancelled.
  4. When LSDB synchronization is complete, R1 cancels the T2 timer.
  5. After all T2 timers have been cancelled, the T3 timer is cancelled, the GR process ends, and normal IS-IS operation resumes. At this point the IIH timer is started on all interfaces and normal Hello messages are sent periodically.
  6. Having restored all routing information, R1 recalculates its routes and refreshes the FIB table.
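
The six steps above can be summarized, from R1's point of view, as a small state tracker. This is a simplified sketch under several assumptions: timers are modelled as sets of "running" entries with no real clock, timeouts are not handled, and the `Restarter` class with its method names is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Restarter:
    """Tracks which GR timers are still running on the restarting router."""
    levels: set        # levels whose LSDB must resynchronize, e.g. {1, 2}
    interfaces: set    # names of interfaces running IS-IS
    t1_running: set = field(default_factory=set)   # one T1 per interface
    t2_running: set = field(default_factory=set)   # one T2 per level
    t3_running: bool = False                       # one T3 per system
    done: bool = False
    log: list = field(default_factory=list)

    def start(self):
        # Step 1: T2 and T3 start globally, T1 starts per interface,
        # and a Hello with RR set goes out on each interface.
        self.t3_running = True
        self.t2_running = set(self.levels)
        for ifname in sorted(self.interfaces):
            self.t1_running.add(ifname)
            self.log.append(f"send IIH(RR) on {ifname}")

    def on_hello_and_csnps(self, ifname):
        # Step 3: Hello and the full CSNP set received, so T1 is cancelled.
        self.t1_running.discard(ifname)

    def on_lsdb_synced(self, level):
        # Step 4: this level's LSDB is synchronized, so its T2 is cancelled.
        self.t2_running.discard(level)
        if not self.t2_running:
            self._finish()

    def _finish(self):
        # Steps 5 and 6: cancel T3, resume normal Hellos, refresh routes.
        self.t3_running = False
        self.done = True
        self.log.append("start IIH timers on all interfaces")
        self.log.append("recompute routes and refresh FIB")


r1 = Restarter(levels={1, 2}, interfaces={"eth0"})
r1.start()
r1.on_hello_and_csnps("eth0")
r1.on_lsdb_synced(1)
r1.on_lsdb_synced(2)
```

After the last call `r1.done` is true and T3 is no longer running. GR only completes once every level has cancelled its T2, which is why the check sits in `on_lsdb_synced`.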

Sources: "HCIE Routing and Switching Learning Guide", Huawei HedEx documentation

Origin blog.csdn.net/tushanpeipei/article/details/112668817