PostgreSQL Technical Insider (10): Basic Principles of the WAL Log Module

The transaction log is an important part of a database: it records the history of all changes and operations in the database system. The WAL log (Write-Ahead Logging), also known as xlog, is a type of transaction log and a family of techniques used in relational database systems to guarantee data consistency and transaction integrity. It plays an extremely important role in database recovery, high availability, streaming replication, logical replication and other modules.

In this live broadcast we introduced the basic principles, composition and characteristics of the WAL log module. The following content is compiled from the live transcript.

Introduction to WAL log

When the database writes or updates data, it must ensure that transactions always preserve their ACID properties. When a system failure occurs, the database replays the transaction log so that no data is lost after recovery.


Figure 1: Schematic diagram of stand-alone WAL log process

As shown in Figure 1, in a single-machine scenario, if every write or update went directly to the table file, the cost of each update would be relatively high, because random writes to the hard disk perform poorly. Instead, a buffer pool can be introduced so that the data is first written into memory, which performs much better than writing the table files directly.

At the same time, to guarantee durability, the WAL log is introduced: before the in-memory page is updated, the corresponding WAL record is written first, and only then is the memory updated. In this case, even after a power outage or crash the data can be accurately restored, preserving the database's ACID properties.

Compared with updating table files directly, writing the WAL log is cheaper and has a shorter execution path. In PostgreSQL, WAL writing is also sequential (append-only) rather than random, which suits the disk much better.
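The write-ahead ordering described above can be summarized in a few lines of C. The sketch below is purely illustrative and is not PostgreSQL's buffer manager code; all names here (wal_append, wal_flush, update_page, flush_dirty_page) are invented for the example, which only demonstrates the rule that the WAL covering a page must be durable before the page itself is written back.

    /* Minimal sketch of the "log before data" rule; not PostgreSQL's real code. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    typedef uint64_t Lsn;          /* log sequence number: byte position in the WAL stream */

    static Lsn wal_end = 0;        /* next free position in the (pretend) WAL buffer   */
    static Lsn wal_durable = 0;    /* everything up to here is assumed fsync'ed        */

    typedef struct Page {
        Lsn  lsn;                  /* LSN of the last WAL record that touched the page */
        char data[8192];
    } Page;

    /* Append a record to the WAL and return its end LSN (a sequential write). */
    static Lsn wal_append(const char *rec) { wal_end += strlen(rec); return wal_end; }

    /* Make WAL durable up to the given LSN (stand-in for fsync of the WAL file). */
    static void wal_flush(Lsn upto) { if (upto > wal_durable) wal_durable = upto; }

    /* Update a page in the buffer pool: write the WAL record first, then change memory. */
    static void update_page(Page *page, const char *payload)
    {
        Lsn lsn = wal_append(payload);                 /* 1. describe the change in WAL   */
        memcpy(page->data, payload, strlen(payload));  /* 2. apply it to the cached page  */
        page->lsn = lsn;                               /* 3. remember the covering LSN    */
    }

    /* A dirty page may only reach the table file after its WAL is durable. */
    static void flush_dirty_page(const Page *page)
    {
        wal_flush(page->lsn);                          /* enforce write-ahead             */
        printf("page flushed, covered by WAL up to %llu\n",
               (unsigned long long) page->lsn);
    }

    int main(void)
    {
        Page p = {0};
        update_page(&p, "row v2");
        flush_dirty_page(&p);
        return 0;
    }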


Figure 2: Schematic diagram of online WAL log process

In addition, in online scenarios the WAL log also supports features such as master-slave synchronization and hot backup.

Taking Greenplum as an example: without the WAL log, the master and slave would need to agree on a synchronization/backup protocol, or the slave node would have to execute the same SQL statements. This not only complicates operations, but also makes hot switchover difficult.

Once the WAL log is introduced, it can be shipped directly between the master and slave nodes to keep their data consistent. When the master node fails, the slave node can quickly replay the corresponding WAL records to bring the data back to a usable state, which makes the whole process much easier to operate.

WAL log implementation

Different databases have different requirements for their WAL log implementation, mainly reflected in four aspects:

  • The first is the format, which generally consists of two parts: meta + data. The meta part records the metadata of the associated resource, and the data part is raw data defined by that resource. Meta and data can be stored separately or together. With separate storage, reading a single WAL record requires parsing the complete meta section first and then decoding the data as needed; with unified storage, the entries can be decoded one by one. For example, separate storage typically lays the record out as meta1+meta2+...+metaN+data1+data2+...+dataN, while unified storage typically lays it out as meta1+data1+meta2+data2+...+metaN+dataN (see the sketch after this list).
  • Second, there are two kinds of logs for data modifications: undo logs and redo logs. An undo log is applied from back to front (rolling changes back), while a redo log is applied from front to back (reapplying changes). PostgreSQL uses redo logs.
  • In addition, the cyclic redundancy check (CRC) can cover either the complete record or individual segments. Segmented CRCs make it possible to quickly locate a bad block when an error occurs and keep the damage confined to a small range, at the cost of slower computation; a CRC over the complete record is faster to compute and verify, but if a single meta section is damaged the whole WAL record may become unusable, which makes recovery expensive.
  • Finally, whether the WAL files need to be removed from disk depends mainly on the specific scenario; if the log is only used for synchronization and backup, you may choose not to remove it.
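To make the first bullet concrete, here is a toy illustration of the two meta+data layouts, with a minimal meta struct. This is not PostgreSQL's actual record format (that is shown later in Figure 3); the field names are invented for the example.

    /* Toy illustration of the two meta+data layouts; not PostgreSQL's real format. */
    #include <stdint.h>

    typedef struct Meta {
        uint8_t  resource_id;   /* which resource manager owns the data              */
        uint16_t data_len;      /* length of the raw data this entry describes       */
        uint32_t crc;           /* CRC over this segment (the segmented-CRC variant) */
    } Meta;

    /*
     * Separate storage: all meta entries first, then all data blobs.
     *   on disk:  meta1 | meta2 | ... | metaN | data1 | data2 | ... | dataN
     * A reader must parse the full meta array before it can locate data_i.
     *
     * Unified (interleaved) storage: each meta is immediately followed by its data.
     *   on disk:  meta1 | data1 | meta2 | data2 | ... | metaN | dataN
     * A reader can decode the entries one by one while scanning forward.
     */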

The composition of WAL log

In PostgreSQL, a WAL record consists of four parts: the record header, the block headers, the block-private data, and the custom resource data.


Figure 3: WAL log composition diagram in PostgreSQL

The header and block header correspond to the meta described above. They are mainly used to quickly locate the data blocks, describe them, and carry the CRC for the record. The block header is block-private and is bound to a specific page. The block-private data and the record's own (rmgr) data belong to the data part and store the actual payload.

Within the WAL record itself, the resource manager (rmgr, from the resource managers definition) is the main carrier of custom resources and acts as both the producer and the consumer of the WAL record's data content.
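For reference, the header and block header correspond to concrete C structures in the PostgreSQL source. The sketch below is a simplified rendering of the layout in src/include/access/xlogrecord.h, with PostgreSQL's own typedefs (TransactionId, XLogRecPtr, RmgrId, pg_crc32c) replaced by plain fixed-width integers; consult the header file of your PostgreSQL version for the authoritative definition.

    /* Simplified rendering of PostgreSQL's WAL record headers (see access/xlogrecord.h). */
    #include <stdint.h>

    /* Fixed-size header at the start of every WAL record (the "header" in Figure 3). */
    typedef struct XLogRecordSketch {
        uint32_t xl_tot_len;   /* total length of the entire record                */
        uint32_t xl_xid;       /* id of the transaction that produced the record   */
        uint64_t xl_prev;      /* pointer (LSN) to the previous record in the log  */
        uint8_t  xl_info;      /* flag bits: partly generic, partly rmgr-specific  */
        uint8_t  xl_rmid;      /* resource manager that owns this record           */
        /* 2 bytes of padding */
        uint32_t xl_crc;       /* CRC covering the whole record                    */
    } XLogRecordSketch;

    /* Per-block header (the "block header" in Figure 3), one per page the record touches. */
    typedef struct XLogRecordBlockHeaderSketch {
        uint8_t  id;           /* block reference ID within the record             */
        uint8_t  fork_flags;   /* relation fork plus flags (e.g. full-page image)  */
        uint16_t data_length;  /* length of the block-private data that follows    */
    } XLogRecordBlockHeaderSketch;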

WAL log checkpoint

As WAL records are written, the amount of log data keeps accumulating. Once it reaches a certain volume it starts to affect system performance, so the WAL data needs to be cleaned up regularly.

Cleaning the page cache and the xlog files relies on the checkpoint mechanism. After a checkpoint has been executed, the page cache can be cleaned, which prevents performance from degrading as the page cache grows too large.

The main tasks of a checkpoint include writing back dirty data blocks, recycling xlog files (non-archived xlog and xlog that has already been synchronized), and recording the checkpoint redo point.

Usually, four scenarios trigger a checkpoint: periodic cleanup, reaching the maximum WAL length, the CHECKPOINT statement, and database shutdown. Checkpoints may also be triggered in other scenarios, which are not listed one by one here.

An automatic checkpoint executes the checkpoint operation at a fixed time interval. The interval can be configured in the postgresql.conf file (the checkpoint_timeout parameter); the default is 5 minutes.
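The trigger conditions above can be condensed into a short C sketch. This is not the real checkpointer code (which lives in src/backend/postmaster/checkpointer.c); the helper functions and the perform_checkpoint call below are hypothetical stand-ins, and the values shown simply mirror the defaults checkpoint_timeout = 5min and max_wal_size = 1GB.

    /* Illustrative sketch of when a checkpoint fires; not the real checkpointer code. */
    #include <stdbool.h>

    /* Stand-ins for the relevant settings (defaults as in postgresql.conf). */
    static const double checkpoint_timeout_s = 300;            /* checkpoint_timeout = 5min */
    static const double max_wal_bytes = 1024.0 * 1024 * 1024;  /* max_wal_size = 1GB        */

    /* Hypothetical helpers; real PostgreSQL tracks these quite differently. */
    extern double seconds_since_last_checkpoint(void);
    extern double wal_bytes_since_last_checkpoint(void);
    extern bool   checkpoint_requested_by_user(void);          /* CHECKPOINT statement      */
    extern bool   shutdown_requested(void);
    extern void   perform_checkpoint(void);                    /* write back dirty blocks,
                                                                   recycle xlog, record the
                                                                   new redo point            */

    void checkpointer_tick(void)
    {
        if (seconds_since_last_checkpoint() >= checkpoint_timeout_s ||  /* periodic        */
            wal_bytes_since_last_checkpoint() >= max_wal_bytes       ||  /* too much WAL   */
            checkpoint_requested_by_user()                           ||  /* CHECKPOINT     */
            shutdown_requested())                                        /* clean shutdown */
        {
            perform_checkpoint();
        }
    }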

WAL log recovery and replay

As shown in Figure 4, in GPDB the data recovery process includes data replay. When the database starts, a startup process opens the WAL file at the checkpoint redo location, reads the xlog sequentially, and performs the recovery operations.


Figure 4: Recovery process diagram
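Conceptually, the startup process is a loop that reads WAL records one after another and hands each record to the redo callback of the resource manager that produced it. The sketch below is a heavily simplified C version of that idea; the real logic lives in PostgreSQL's recovery code, and all of the helper names used here (read_next_record, rmgr_table, record_rmgr_id) are placeholders, not actual PostgreSQL APIs.

    /* Heavily simplified sketch of the startup process's replay loop; placeholder APIs. */
    #include <stddef.h>
    #include <stdint.h>

    typedef struct WalRecord WalRecord;              /* an opaque, already-decoded WAL record */

    typedef struct Rmgr {
        const char *name;
        void      (*redo)(const WalRecord *rec);     /* re-applies the change to data pages   */
    } Rmgr;

    extern const Rmgr  rmgr_table[];                 /* indexed by the record's rmgr id        */
    extern WalRecord  *read_next_record(void);       /* returns NULL when no more WAL is ready */
    extern uint8_t     record_rmgr_id(const WalRecord *rec);

    /* Start at the checkpoint's redo location and replay everything that follows. */
    void replay_from_redo_point(void)
    {
        WalRecord *rec;

        while ((rec = read_next_record()) != NULL)   /* a standby keeps waiting for more WAL  */
        {
            const Rmgr *rm = &rmgr_table[record_rmgr_id(rec)];
            rm->redo(rec);                           /* e.g. the heap rmgr redoes an insert   */
        }
    }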

In the online scenario, after the primary/master cluster completes data recovery it exits recovery mode, but its WAL sender process continues to send xlog to the standby. The startup process in the mirror/standby cluster does not exit; instead, it keeps receiving xlog through the WAL receiver and replays it in the startup process.


Figure 5: Schematic diagram of replay operation flow

As shown in Figure 5, the standby continuously synchronizes the corresponding log data from the primary and applies each WAL record on the standby; streaming replication ships the WAL records one by one. The primary starts a WAL sender process, which is responsible for sending the WAL records generated by the primary to the standby.

Correspondingly, the standby starts a WAL receiver process that communicates with the matching WAL sender process and receives the WAL records sent by the primary. At the same time, the standby's startup process replays the received WAL records on the standby, achieving primary-standby data synchronization. In GPDB, synchronous replication is used by default, and asynchronous replication is also supported.

Example: Changes in WAL log in insert scenario

Figure 6 shows how the WAL log changes in the single-row insert scenario. Interested readers can debug the code corresponding to the function names marked in the figure; a simplified sketch of the record-building flow follows the figure.


Figure 6: Changes in WAL log in insert scenario
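As a rough guide to what those functions do, the sketch below is modeled on heap_insert() in PostgreSQL's heapam.c and uses the real xloginsert API names (XLogBeginInsert, XLogRegisterData, XLogRegisterBuffer, XLogRegisterBufData, XLogInsert). It is simplified and omits error handling, visibility-map flags and version-specific details, so treat it as an outline rather than the exact backend code.

    /* Outline of how a heap insert builds its WAL record; simplified from heap_insert(). */
    #include "postgres.h"
    #include "access/heapam_xlog.h"   /* xl_heap_insert, XLOG_HEAP_INSERT          */
    #include "access/xloginsert.h"    /* XLogBeginInsert, XLogRegisterData, ...    */
    #include "storage/bufmgr.h"
    #include "storage/bufpage.h"

    static XLogRecPtr
    log_heap_insert_sketch(Buffer buffer, xl_heap_insert *xlrec,
                           char *tuple_data, int tuple_len)
    {
        XLogRecPtr recptr;

        XLogBeginInsert();                                   /* start assembling a record        */

        XLogRegisterData((char *) xlrec, SizeOfHeapInsert);  /* rmgr-private "main data"         */

        XLogRegisterBuffer(0, buffer, REGBUF_STANDARD);      /* block reference 0: the heap page */
        XLogRegisterBufData(0, tuple_data, tuple_len);       /* the new tuple rides with block 0 */

        recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_INSERT);   /* write the record, get its LSN    */

        PageSetLSN(BufferGetPage(buffer), recptr);           /* stamp the modified page          */

        return recptr;
    }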

The Custom WAL Resource Managers feature

In earlier versions of PostgreSQL, the rmgr list was a static enumeration: adding a new resource manager required defining it in the kernel code.

PostgreSQL 15 changed the xlog module to support Custom WAL Resource Managers: the rmgr structures can now be registered dynamically, and several new callback functions were added.

Custom WAL Resource Managers allow external extensions to dynamically add custom resource types, such as a table access method or an index access method implemented in the extension.
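A minimal registration sketch, loosely modeled on PostgreSQL 15's in-tree test_custom_rmgrs module, looks like the following. RegisterCustomRmgr and the RmgrData callbacks are the PostgreSQL 15 extension API (declared in access/xlog_internal.h); the rmgr name, the record contents and the choice of RM_EXPERIMENTAL_ID as the rmgr id are illustrative, and the library must be loaded via shared_preload_libraries so that _PG_init runs early enough.

    /* Sketch of registering a custom WAL resource manager (PostgreSQL 15+). */
    #include "postgres.h"
    #include "access/xlog_internal.h"  /* RmgrData, RegisterCustomRmgr, RM_EXPERIMENTAL_ID */
    #include "access/xlogreader.h"
    #include "fmgr.h"
    #include "lib/stringinfo.h"

    PG_MODULE_MAGIC;

    /* Replay our records during recovery; the actual redo work is omitted in this sketch. */
    static void
    my_rmgr_redo(XLogReaderState *record)
    {
        /* decode the record and reapply the change to the extension's data */
    }

    /* Human-readable description, e.g. for pg_waldump output. */
    static void
    my_rmgr_desc(StringInfo buf, XLogReaderState *record)
    {
        appendStringInfoString(buf, "my_custom_rmgr record");
    }

    static const char *
    my_rmgr_identify(uint8 info)
    {
        return "MY_RECORD";
    }

    static const RmgrData my_rmgr = {
        .rm_name     = "my_custom_rmgr",
        .rm_redo     = my_rmgr_redo,
        .rm_desc     = my_rmgr_desc,
        .rm_identify = my_rmgr_identify,
        /* rm_startup, rm_cleanup, rm_mask and rm_decode are optional and left NULL here */
    };

    void
    _PG_init(void)
    {
        /* Only effective when the library is listed in shared_preload_libraries. */
        RegisterCustomRmgr(RM_EXPERIMENTAL_ID, &my_rmgr);
    }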

At present, HashData's enterprise product series fully supports the new features of PostgreSQL 15, and HashData will continue to improve the related functionality to further enhance product usability.

Summary

The core idea of the WAL mechanism in PostgreSQL is: the log is written to disk first, and only then is the data written to disk. Before a data change is written to disk and becomes permanent, it must first be recorded in the log.

Origin blog.csdn.net/m0_54979897/article/details/133139230