Technical interpretation | Logical backup: a high-availability and disaster recovery tool for the GaussDB data warehouse

Abstract: The Roach tool of the GaussDB data warehouse provides two main forms of backup: physical backup and logical backup. Logical backup extracts and backs up the logical objects in the database, so it handles fine-grained backups at the single-table or schema level effectively and is more flexible and convenient.

1. Introduction

In the era of big data, data integrity and reliability have become core capabilities of a data warehouse. The GaussDB data warehouse is widely favored by users for its outstanding distributed computing and storage capabilities, and it also invests in innovation and refinement in data backup and disaster recovery. Data reliability can be called the "lifeline" of a data warehouse. For enterprise and government users, data that is damaged or lost because of a hardware failure or an accidental deletion during business operations can mean immeasurable losses. With stable, fast, and reliable backup capabilities, the Roach tool provided by GaussDB gives customers a dependable "regret medicine": databases or business tables can be restored from backups, which effectively limits customer losses.

Figure 1 Schematic diagram of data warehouse backup and recovery

2. Roach backup and recovery basic framework

The Roach tool of the GaussDB data warehouse provides two main forms of backup: physical backup and logical backup. Physical backup copies file blocks directly to the backup medium; on restore, the backed-up file blocks are used to rebuild the data directories of the DN and CN instances in the cluster. This article focuses on logical backup. In the current GaussDB data warehouse, logical backup is more flexible than physical backup and makes full use of GaussDB's powerful data import and export capabilities. Unlike physical backup, which copies files wholesale, logical backup extracts and backs up the logical objects of the database. The granularity can be table level, schema level, or database level, chosen according to customer needs. In a customer data warehouse with thousands of tables, if only one table needs to be backed up, logical backup is the better choice.

Before explaining logical backup, let's first look at the design architecture of the Roach tool. This framework is the basis for all logical and physical backups:

Figure 2 Schematic diagram of Roach backup and recovery tool framework

Roach is a distributed backup and recovery tool. Take a cluster composed of Node1, Node2, and Node3 as an example. The overall entry point for backup is the Python process GaussRoach.py, which starts a Roach master process on the current node, while each node in the cluster starts a Roach agent process. This is a typical master-slave framework. The master process establishes long-lived TCP connections with all agent processes, encapsulates messages to communicate with each node, and issues tasks such as backup. Each node then backs up the database objects on that node, such as CNs and DNs, in a distributed manner.
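The master-agent pattern described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the newline-terminated message format, and the use of socket.socketpair() in place of real TCP connections between nodes. It is not Roach's actual protocol.

```python
import socket
import threading


def agent(conn: socket.socket, node: str) -> None:
    """Hypothetical agent: wait for one task line, then acknowledge it."""
    with conn, conn.makefile("rw", encoding="utf-8", newline="\n") as stream:
        task = stream.readline().strip()
        stream.write(f"{node} finished {task}\n")
        stream.flush()


def run_master(nodes: list[str], task: str) -> list[str]:
    """Hypothetical master: hold one connection per agent and issue the task to each."""
    links, threads = [], []
    for node in nodes:
        # socketpair() stands in for a long-lived TCP connection to a remote node.
        master_end, agent_end = socket.socketpair()
        worker = threading.Thread(target=agent, args=(agent_end, node))
        worker.start()
        links.append(master_end)
        threads.append(worker)

    replies = []
    for link in links:
        with link, link.makefile("rw", encoding="utf-8", newline="\n") as stream:
            stream.write(task + "\n")   # issue the task to this agent
            stream.flush()
            replies.append(stream.readline().strip())  # wait for its acknowledgement

    for worker in threads:
        worker.join()
    return replies


if __name__ == "__main__":
    for reply in run_master(["Node1", "Node2", "Node3"], "BACKUP"):
        print(reply)
```

In this sketch the master talks to the agents one after another for brevity; a real master would typically multiplex its connections so that all nodes work at the same time.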

3. The principle of logical backup

The following briefly describes the execution process of logical backup:

1) Export and backup of the definition of the table to be backed up

If it is a database-level backup, metadata is exported schema by schema; when each schema is processed, all of its table definitions are exported one by one. The following figure therefore shows how Roach logical backup exports the DDL of a single table. After receiving the backup instruction, the Roach Master issues an instruction to the Roach Agent on a node that hosts a CN. The Agent process then calls gs_dump, which connects to the CN and exports the table definition DDL.

Figure 3 Schematic diagram of logical backup table metadata DDL export backup
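As a rough sketch of the gs_dump call in this step, the function below only builds a command line and does not run it. The article does not give the exact command that the Roach Agent issues, so the choice of options here is an assumption based on the pg_dump-style options that gs_dump documents (-p for port, -t for table, -s for schema only, -f for the output file). The function name, database name, table name, port, and path are all hypothetical.

```python
def build_ddl_export_cmd(database: str, table: str, cn_port: int, out_file: str) -> list[str]:
    """Build (but do not run) a gs_dump command that exports only one table's definition.

    Assumption: the option set is illustrative, not Roach's real invocation.
    """
    return [
        "gs_dump",
        "-p", str(cn_port),  # port of the CN to connect to
        "-t", table,         # restrict the dump to this table
        "-s",                # schema only: export the DDL, no row data
        "-f", out_file,      # file that receives the exported DDL
        database,            # database name comes last
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(" ".join(build_ddl_export_cmd("mydb", "public.a", 8000, "/tmp/a_ddl.sql")))
```

For a schema-level or database-level backup, the same call would simply be repeated table by table, matching the "one by one" export described above.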

2) Create the external table

Roach logical backup is essentially a process of creating an external table and exporting data through it. As with the table definition export in the previous step, the Roach Agent acts after receiving the Master's command: it creates the external table to write to, based on the exported table definition. The external table uses the gsmpp_server server, and in its options the location is roach://{Roach Agent listening port}. The Roach Agent listening port is a configurable parameter, and the Agent accepts connections on it from all DN instances on the node. The external table definition for Roach logical backup looks similar to the following, where the table to be backed up has a single int field, id. In the example in the figure, the Roach Agent listening port is 8080 and the export format is csv.

Figure 4 External table created by Roach logical backup
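The external table described above can be sketched as a small helper that assembles the statement as text. The server name gsmpp_server, the roach://{port} location, the int column id, port 8080, and the csv format come from the description; the table name roachft is taken from the insert statement used later in this article. The overall statement shape, including the WRITE ONLY clause, is an assumption modeled on GDS-style export foreign tables, and the real definition may carry more options.

```python
def roach_foreign_table_ddl(name: str, columns: dict[str, str],
                            agent_port: int, fmt: str = "csv") -> str:
    """Assemble an illustrative CREATE FOREIGN TABLE statement for a Roach export.

    Assumption: the clause layout follows GDS-style foreign tables; it is a sketch,
    not the exact statement that Roach generates.
    """
    cols = ", ".join(f"{col} {typ}" for col, typ in columns.items())
    return (
        f"CREATE FOREIGN TABLE {name} ({cols}) "
        f"SERVER gsmpp_server "
        f"OPTIONS (location 'roach://{agent_port}', format '{fmt}') "
        f"WRITE ONLY;"
    )


if __name__ == "__main__":
    # One int column named id, Agent listening port 8080, csv export format.
    print(roach_foreign_table_ddl("roachft", {"id": "int"}, 8080))
```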

3) Connection between the Roach tool and the DNs, and data export and backup

At present, the main external table types for data import and export in the GaussDB data warehouse include GDS, HDFS, OBS, and Roach. The Roach external table is similar to the other types: all of them are implemented through FDW (Foreign Data Wrapper), with Roach implementing its own set of FDW API interfaces. In addition, Roach implements five main low-level read and write APIs, Open/Read/Write/Close/ErrorReport, for data interaction between the DN and the Roach Agent.

Figure 5 Schematic diagram of the logical backup table data backup process

As shown in Figure 5, the logical backup of table data can be briefly described in five phases, Phase1 to Phase5:

  • Phase1:  The Master issues the data backup command to all Agents. A connection is made to a CN and to the database in order to create the export server and the external table used for export. The Roach Agent on each node also creates a TblServer thread that listens on the Agent Port and waits for DNs to connect;
  • Phase2:  Over the connection to the CN, insert into roachft select * from A; is executed, and the SQL query is sent to all DNs. Through the registered Roach FDW API, each DN calls the callback function to encapsulate a PGXCNode message that identifies its own instance, and tries to connect to the Agent Port of its own node given in the server URL;
  • Phase3:  Each time the Roach Agent's TblServer receives a DN connection, it allocates a data communication socket slot and forks a child process to serve the backup of that DN instance. The Agent waits for all DNs on the node to connect and creates one child process per DN on the node, so that data is backed up in parallel;
  • Phase4:  Each backup child process continuously reads table data through the established connection. The data read from the DN is first stored in the buffer of the Agent child process. After all data blocks of the table have been read, a FINISH_BACKUP message is sent to the Roach Agent and data transmission stops;
  • Phase5:  Each Agent process creates a BackupSender thread, which consumes the table data stored in the buffer, establishes a connection with the backup medium, and streams the data to it. In actual operation Phase4 and Phase5 run asynchronously and in parallel: data is sent to the backup medium without waiting for all table data to be written into the buffer.
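The overlap between Phase4 and Phase5 is a producer-consumer arrangement, which can be sketched as below. This is a simplification under stated assumptions: threads and a single shared queue stand in for the forked child processes and their buffers, a Python list stands in for the backup medium, and a sentinel object stands in for the FINISH_BACKUP message. All function names are hypothetical.

```python
import queue
import threading

FINISH_BACKUP = object()  # sentinel standing in for the FINISH_BACKUP message


def dn_reader(blocks: list[bytes], buffer: queue.Queue) -> None:
    """Phase4 (simplified): stage the data blocks read from one DN in the buffer."""
    for block in blocks:
        buffer.put(block)      # blocks while the buffer is full
    buffer.put(FINISH_BACKUP)  # signal that this DN has no more data


def backup_sender(buffer: queue.Queue, reader_count: int, media: list[bytes]) -> None:
    """Phase5 (simplified): drain the buffer and stream each block to the medium."""
    finished = 0
    while finished < reader_count:
        item = buffer.get()
        if item is FINISH_BACKUP:
            finished += 1
        else:
            media.append(item)  # stands in for writing to the backup medium


def backup_table(dn_blocks: dict[str, list[bytes]]) -> list[bytes]:
    """Run one reader per DN and one sender concurrently; return what reached the medium."""
    buffer: queue.Queue = queue.Queue(maxsize=4)  # small on purpose: forces overlap
    media: list[bytes] = []
    sender = threading.Thread(target=backup_sender, args=(buffer, len(dn_blocks), media))
    readers = [threading.Thread(target=dn_reader, args=(blocks, buffer))
               for blocks in dn_blocks.values()]
    sender.start()
    for reader in readers:
        reader.start()
    for reader in readers:
        reader.join()
    sender.join()
    return media


if __name__ == "__main__":
    data = {"dn1": [b"a1", b"a2", b"a3"], "dn2": [b"b1", b"b2", b"b3"]}
    print(len(backup_table(data)), "blocks written to the backup medium")
```

Because the buffer is bounded, the readers can only make progress while the sender is draining it, which mirrors the point that sending does not wait for the whole table to be buffered first.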

4. Summary

This article has outlined the principle of Roach logical backup. Logical backup handles fine-grained backups at the single-table or schema level effectively, and is more flexible and convenient. Recovery from a logical backup is basically the reverse of the backup process above: in short, table definition recovery, node and DN metadata recovery, and data import. A major advantage of this kind of recovery is that it does not require stopping the cluster or moving other data, so it has little impact on the business of other databases or tables. A follow-up blog post can cover it in more detail.

 


Origin blog.csdn.net/devcloud/article/details/109092573