Apache Ignite Transaction Architecture: Ignite Persistent Transaction Processing

The previous article in this series covered failure and recovery. The topics that remain for the rest of the series are:

  • Transactions for Ignite persistence (WAL, checkpoints, and more)
  • Third-party persistent transaction processing

In this article, I will focus on Ignite persistent transaction processing.

Anyone who uses Apache Ignite as an in-memory data grid (IMDG) knows that losing all in-memory data when the entire cluster goes down is a serious problem, and other IMDG and caching technologies face the same issue. One solution is to integrate Ignite with a third-party persistent store and provide read-through and write-through capabilities, as shown in Figure 1.

Figure 1: Persistence using third-party storage

However, this approach has some drawbacks, which will be explained in the next article in this series.

As an alternative to third-party persistence, Ignite has developed a durable memory architecture, shown in Figure 2, which stores and processes data and indexes both in memory and on disk. The feature is simple to enable and lets an Ignite cluster keep the full data set on disk while delivering in-memory performance.

Figure 2: Durable memory

Durable memory works much like the virtual memory found in modern operating systems, but with one key difference: when Ignite persistence is enabled, durable memory keeps the entire data set and all indexes on disk, whereas virtual memory uses the disk only for swapping.

In Figure 2, you can also see several functional highlights of the durable memory architecture. On the memory side, the advantages include [1]:

  • Off-heap : all data and indexes are stored off the Java heap, which helps when handling large amounts of data per cluster node;
  • No noticeable garbage collection pauses : because all data and indexes are kept off-heap, only application code can trigger long garbage collection pauses;
  • Predictable memory usage : memory consumption is fully configurable to match the needs of a specific application (see the sketch after this list);
  • Automatic memory defragmentation : Ignite avoids memory fragmentation by running defragmentation as part of regular data access;
  • Optimized memory usage and performance : Ignite uses the same format for data and indexes in memory and on disk, avoiding unnecessary format conversions;
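
As an illustration of the configurable memory usage mentioned above, the following minimal sketch sizes an off-heap data region programmatically. The region name and sizes are arbitrary example values, not recommendations from this article.

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class MemoryRegionExample {
    public static void main(String[] args) {
        // Off-heap data region with explicit initial and maximum sizes (example values).
        DataRegionConfiguration region = new DataRegionConfiguration()
            .setName("default_Region")
            .setInitialSize(256L * 1024 * 1024)   // 256 MB
            .setMaxSize(2L * 1024 * 1024 * 1024); // 2 GB

        DataStorageConfiguration storageCfg = new DataStorageConfiguration()
            .setDefaultDataRegionConfiguration(region);

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg);

        Ignition.start(cfg);
    }
}
```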

For native persistence, the advantages include:

  • Optional persistence : data storage is easy to configure as memory-only, memory + disk, or memory acting as a caching layer on top of disk (see the configuration sketch below);
  • Data recovery : native persistence holds the complete data set, so a cluster crash or restart does not lose any data, and strong transactional consistency is preserved;
  • Hot data cached in memory : durable memory keeps hot data in memory and automatically evicts cold data when memory runs low;
  • SQL over the full data set : Ignite can be used as a fully featured distributed SQL database;
  • Fast cluster restarts : if the cluster fails, Ignite restarts and becomes operational quickly;

In addition, persistent storage is ACID compliant and can keep data and indexes on flash, SSD, Intel 3D XPoint, and other non-volatile storage. Whether or not persistence is used, each node manages only a subset of the overall data set.
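
The snippet below is a minimal sketch of how native persistence is typically switched on for the default data region; when persistence is enabled, the cluster starts inactive and has to be activated before it accepts updates. The configuration values are illustrative, not taken from the article.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cluster.ClusterState;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class NativePersistenceExample {
    public static void main(String[] args) {
        // Enable Ignite native persistence on the default data region.
        DataStorageConfiguration storageCfg = new DataStorageConfiguration()
            .setDefaultDataRegionConfiguration(
                new DataRegionConfiguration().setPersistenceEnabled(true));

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg);

        Ignite ignite = Ignition.start(cfg);

        // With persistence enabled, the cluster must be activated explicitly
        // (ClusterState is available in recent Ignite 2.x releases).
        ignite.cluster().state(ClusterState.ACTIVE);
    }
}
```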

Ignite manages persistence and provides transaction and consistency guarantees through two mechanisms: the write-ahead log (WAL) and checkpointing. Both are described below, starting with the WAL.

Write Ahead Log (WAL)

When Ignite persistence is enabled, Ignite maintains a dedicated partition file for each partition held on the node. When data in memory is updated, the update is not written directly to the corresponding partition file; instead, it is appended to the end of the write-ahead log (WAL). Compared with updating the partition files directly, this gives a significant performance improvement. The WAL also provides a recovery mechanism in the event of a node or cluster failure.

The WAL is split into several files called segments, which are filled sequentially. By default 10 segments are created, but this number is configurable. When the first segment is full, its contents are copied to the WAL archive, which is kept for a configurable period of time. While the first segment is being copied to the archive, the second segment becomes the active segment, and this process repeats in a loop over the segment files.
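
A rough sketch of how the number, size, and location of WAL segments and the WAL archive can be tuned is shown below; the paths and sizes are placeholder examples, not values from the article.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;

public class WalSegmentConfig {
    public static DataStorageConfiguration walStorage() {
        return new DataStorageConfiguration()
            .setWalSegments(10)                        // number of active WAL segments (default is 10)
            .setWalSegmentSize(64 * 1024 * 1024)       // 64 MB per segment
            .setWalPath("/ignite/wal")                 // where active segments live (example path)
            .setWalArchivePath("/ignite/wal/archive"); // where full segments are archived (example path)
    }
}
```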

The WAL can operate in several modes that offer different consistency guarantees, ranging from strong consistency with no data loss to no consistency guarantees with potential data loss.
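
In Apache Ignite this trade-off is expressed through the WALMode setting. The sketch below picks the strongest mode; the choice is only an example of the API, not advice from the article.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.WALMode;

public class WalModeConfig {
    public static DataStorageConfiguration strictWal() {
        // FSYNC gives the strongest guarantee (no data loss on failure);
        // LOG_ONLY and BACKGROUND trade some guarantees for throughput,
        // and NONE disables the WAL entirely.
        return new DataStorageConfiguration().setWalMode(WALMode.FSYNC);
    }
}
```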

As mentioned above, every update is first appended to the WAL, and each update is uniquely identified by its cache ID and entry key. In the event of a failure or restart, the cluster can therefore always be recovered to the most recently committed transaction or atomic update.

The WAL stores both logical and physical records. Logical records describe the behavior of transactions and come in the following types:

  • Operation description (DataRecord) : stores the operation type (create, update, delete) together with (key, value, version);
  • Transaction record (TxRecord) : stores transaction state information (begin prepare, prepared, committed, rolled back);
  • Checkpoint record (CheckpointRecord) : stores information about the start of a checkpoint;

The structure of the data record is shown in Figure 3.

Figure 3: Structure of the data record

As Figure 3 shows, a data record consists of a set of operation entries (Entry 1, Entry 2, and so on), each of which includes the following fields (a simplified sketch appears after the operation-type list below):

  • cache ID;
  • type of operation;
  • transaction ID;
  • key;
  • value;

The operation type can be one of the following:

  • Create : the first insertion of an entry into the cache, carrying (key, value);
  • Update : an update of existing data, carrying (key, value);
  • Delete : removal of data, carrying only (key);
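
Putting the entry fields and operation types together, here is a purely illustrative sketch of the information each entry carries; it is not Ignite's internal WAL record implementation, and all names are descriptive placeholders.

```java
// Simplified, illustrative model of one operation entry in a WAL data record.
// Field names are placeholders, not Ignite's internal API.
public class WalDataEntry {
    enum OperationType { CREATE, UPDATE, DELETE }

    int cacheId;             // cache the operation belongs to
    OperationType opType;    // create, update, or delete
    long txId;               // transaction the entry was written under
    byte[] key;              // serialized entry key
    byte[] value;            // serialized entry value (absent for deletes)
}
```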

If the same key is updated several times within one transaction, the updates are merged into a single update carrying the latest value.
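
A short usage sketch of this behavior, assuming a transactional cache named "accounts" (the cache name and values are made up for illustration): within one transaction, repeated puts to the same key leave only the final value to be applied.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.transactions.Transaction;

public class MergedUpdateExample {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        // Transactional cache; name and types are example choices.
        IgniteCache<Integer, Double> accounts = ignite.getOrCreateCache(
            new CacheConfiguration<Integer, Double>("accounts")
                .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL));

        try (Transaction tx = ignite.transactions().txStart()) {
            accounts.put(1, 100.0);  // first update of key 1
            accounts.put(1, 250.0);  // second update of the same key in the same tx
            tx.commit();             // only the latest value (250.0) is applied
                                     // (and, with persistence on, logged to the WAL)
        }
    }
}
```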

Checkpointing

In the event of a cluster failure, recovery can take a long time because the WAL can grow large. To address this, Ignite introduced checkpointing. Checkpointing is also necessary when the full data set does not fit into memory and must be written to disk so that all data remains available.

Recall that update operations are appended to the WAL, but the updated (dirty) pages in memory still have to be copied to the corresponding partition files. This process of copying dirty pages from memory to the partition files is called checkpointing. Its benefit is that the pages on disk are kept up to date, which in turn allows the WAL archive to be compacted as old segments are removed.

Figure 4 shows how checkpointing works.

Figure 4: Ignite native persistence

The message flow is as follows:

  1. When the node receives an update request ( 1. Update ), it finds the in-memory data page that the inserted or updated entry belongs to; that page is updated and marked as dirty;
  2. The update is appended to the end of the WAL ( 2. Persist );
  3. The node sends an acknowledgment to the update initiator confirming that the operation succeeded ( 3. Ack );
  4. Checkpointing is triggered periodically ( 4. Checkpointing ); its frequency is configurable (see the sketch after this list). Dirty pages are copied from memory to disk and written to the corresponding partition files.
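
For reference, the checkpointing frequency (and a few related knobs) can be tuned roughly as in the sketch below; the numbers are placeholder examples rather than recommendations from the article.

```java
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;

public class CheckpointConfig {
    public static DataStorageConfiguration checkpointTuning() {
        return new DataStorageConfiguration()
            .setCheckpointFrequency(180_000)   // run a checkpoint every 3 minutes (example value)
            .setCheckpointThreads(4)           // threads used to write dirty pages (example value)
            .setDefaultDataRegionConfiguration(
                new DataRegionConfiguration()
                    .setPersistenceEnabled(true)
                    // buffer for pages updated while a checkpoint is in progress (example value)
                    .setCheckpointPageBufferSize(512L * 1024 * 1024));
    }
}
```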

When transactions and checkpointing interact, Ignite uses a checkpointLock to guarantee that at any moment one of the following holds:

  1. a checkpoint is starting and 0 updates are executing, or
  2. no checkpoint is starting and N updates are executing.

The start of a checkpoint does not wait for ongoing transactions to finish. This means a transaction may start before a checkpoint and commit while the checkpoint is in progress or after it completes.

Snapshots

Another mechanism that facilitates data recovery is snapshots, the equivalent of backups in database systems; this feature is provided by GridGain in its flagship edition.

Summary

Ignite provides robust persistent data management. The WAL delivers high update performance while enabling data recovery in the event of node or cluster failure. Checkpointing flushes dirty pages to disk and keeps the WAL at a manageable size. In short, persistent storage offers the same transactional consistency as in-memory storage.
