Linux snapshot (snapshot) principle and practice (1) Basic principle of snapshot

Principles and Practices of Linux Snapshots

0. Background

There are many articles on the Internet that introduce the principles of storage snapshots, arguably too many, but most of them cover only the most basic principles.

I have not found one that starts from the basic principles, maps them onto the implementation in the ubiquitous Linux, and then walks through concrete examples.

So I decided to write a popular introduction to snapshots that combines the basic principles with the specific snapshot implementation under Linux, followed by some hands-on experiments to deepen the understanding of snapshots.

About a third of the way through, I found the article was getting too long, so I split it into two parts under the title "Linux snapshot (snapshot) principle and practice":

  • The first introduces the principles, including:
    • Snapshot Fundamentals
    • Snapshot model under Linux
  • The second is more hands-on, including several Linux labs:
    • Experiment of writing and verifying data multiple times in COW mode
    • Experiment of writing and verifying data multiple times in ROW mode
    • Validation of COW mode and ROW mode data changes in merge operations

After I finished writing both articles, I found that there are still many problems that have not been explained clearly, such as:

  • How the various snapshot devices under Linux behave and relate to each other,
  • Details of the COW device,
  • Analysis of the driver code,
  • How to expand the snapshot device,
  • How to debug the snapshot device, etc.

However, the original goal of this article has been achieved, so the introduction to snapshot devices under Linux stops here for now. Whether I continue with the topics listed above depends on how things go.

Since so many people have already introduced the basic principles of snapshots, this article treats them only briefly, as a short summary, and does not cover topics such as the significance or history of snapshots. If the principles part of this article is not detailed enough for you, please refer to the articles I recommend, or search for more yourself.

This article is the principles part of "Linux snapshot (snapshot) principle and practice":

Section 1 gives my own understanding of snapshots. I used to be a little confused about why it is called a snapshot rather than simply a photograph, and about what a snapshot really is. If this seems like idle talk to you, feel free to skip it.

Section 2.1 summarizes the two full snapshot methods. My impression is that few people use full snapshots;

Section 2.2 summarizes the two incremental snapshot methods, COW and ROW. There are plenty of introductions on the Internet, so I did not even draw my own figures and simply borrowed existing ones;

Section 3 introduces the snapshot model under Linux. This is the core of this article, and as far as I know it is the only article on the Internet that covers the details of Linux snapshots comprehensively;

1. How to understand snapshot?

In Chinese, snapshot is translated as 快照, but what exactly is a snapshot? I saw an explanation on the Internet that I think is particularly accurate:

Snapshot, a concept from the field of photography, refers to a state at a specific point in time.

Source: "How to understand Git snapshot (snapshot)?", https://www.h5w3.com/82381.html

so,

  • When we take a photo, the photo represents a state of the object being photographed at a specific point in time;

  • When we use git for version management, each node (commit, branch, tag) corresponds to a snapshot that records the state of the git repository at that node;

  • When we create a snapshot of the file system, the snapshot records the state of the file system at that point in time;

What is the use of snapshots?

The article "How to understand Git snapshot (snapshot)?" quotes the following scene:

Imagine taking a photo of a table and recording the position and status of all items on the table, which can be called a snapshot.

We don’t have to store all the items, we only need to store this photo. Next time we want to restore the previous state, we only need to find the photo at that time, and then place the items according to the position in the photo. OK.

– From V2EX: https://www.v2ex.com/t/124019

Therefore, for a git repository, you only need to restore the content of each file to its state at that node; for a file system, you only need to restore the data of each block to its state at the time of the snapshot, and the scene at that time is completely restored.

After writing this paragraph, I went out and read the explanation of snapshot on Wikipedia:

**Figure 1. Wikipedia's explanation of snapshot**

  • In photographic terms, a photo taken without preparation is called a snapshot.
  • A computer storage term, snapshot represents the state of a system at a specific point in time.

Figure 2. Wikipedia's explanation of Snapshot (computer storage)

The page goes on to explain snapshot:

In computer systems, a snapshot is the state of the system at a specific point in time. The term was coined as an analogy in photography. It can refer to an actual copy of the system state or to some system-provided functionality.

I was stunned. For me, here is a simple and clear explanation of what a snapshot is. The damn wall~

In addition, it also shows that there are some subtle differences between a snapshot and ordinary photography:

Figure 3. Wikipedia's explanation of Snapshot (Photography)

In fact, I think this explanation on the Chinese page of Wikipedia is closer to our understanding habits:

Figure 4. Wikipedia's explanation of capture

Therefore, a snapshot is really a capture of the system's state, representing the state of the system at a specific point in time.

2. The principle of snapshot

The snapshots discussed here are storage snapshots.

For the principles of storage snapshots, you can refer to the following articles:

The first few articles explain the basic principles of snapshots from a macro perspective, while "COW and ROW Snapshot Technology Principles" not only describes the principles but also covers the read and write operations of COW and ROW snapshots in depth.

With these articles available, there is no real need to repeat the principles of snapshots. For completeness, though, here is a brief explanation; for more detail, please refer to the articles above~

According to the definition of SNIA, there are two types of snapshots: full snapshot and incremental snapshot. A variety of different snapshot technologies are mentioned in articles on the Internet that introduce the principles of snapshots:

  • full snapshot
    • Clone
    • Split Mirror
  • incremental snapshot
    • Copy-on-write, COW (Copy-On-Write)
    • Redirect-on-write, ROW (Redirect-On-Write)

2.1 Full snapshot

1. Clone

A clone snapshot creates a complete copy of the data. The object of the snapshot can be a storage volume, a file system, or a LUN. The advantage of a clone is high availability; the disadvantage is equally obvious: a complete copy of the data must be made when the snapshot is created.

A very serious problem that needs to be faced when using clone (Clone) snapshots is that each snapshot needs to occupy the same storage space as the source data space. Especially in the case of a large number of snapshots, the resource cost will be very high. In addition, it takes a long time to create.

When the clone (Clone) snapshot is created, all operations of the system will stop, a snapshot space with the same size as the source data space will be created, and the data will be completely copied to the snapshot space. After the snapshot is completed, the system will resume normal operation.

This passage comes from: https://www.cnblogs.com/zhaochunhui/p/13597675.html

Read carefully, this passage describes essentially the same thing as a backup operation.

2. Split Mirror

The split mirror snapshot is also a full snapshot, and the method is relatively simple. First, create a mirror volume of the source volume. Every time data is written to disk, it is written to the source volume and the mirror volume at the same time. When a snapshot is taken, the mirror volume is quickly detached and becomes the snapshot volume.

In short, a split mirror snapshot has no effect on read operations, and has two write operations on write operations, one for the source volume and one for the mirror volume.
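The double write and the detach step can be sketched as a toy in-memory model. This is only an illustration of the idea, not a real volume manager; the class name `SplitMirror` and the block contents are made up for the example.

```python
class SplitMirror:
    """Toy model of a split-mirror snapshot over a list of blocks."""

    def __init__(self, blocks):
        self.source = list(blocks)
        self.mirror = list(blocks)   # mirror volume, kept in sync with the source
        self.attached = True

    def write(self, index, data):
        self.source[index] = data
        if self.attached:
            # Second write: only happens while the mirror is still attached.
            self.mirror[index] = data

    def split(self):
        """Detach the mirror; from now on it is a frozen full snapshot."""
        self.attached = False
        return self.mirror


volume = SplitMirror(["a", "b", "c"])
volume.write(0, "A")        # written to both source and mirror
snapshot = volume.split()   # the mirror becomes the snapshot volume
volume.write(1, "B")        # after the split, only the source changes
print("source:  ", volume.source)
print("snapshot:", snapshot)
```

Reads never touch the mirror in this model, which matches the statement that a split mirror snapshot has no effect on read operations.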

2.2 Incremental snapshot

1. Copy-On-Write

Create a snapshot volume outside of the source volume to store snapshot data.

Write data

When new data is first written to a storage location on the source volume:

  1. First read the original content and write it to the snapshot volume
  2. Then write new data to the storage device of the source volume.

Operation 1 occurs only when data is written for the first time, and the next write operation for this location will directly write new data to the source volume, and no copy-on-write operation will be performed.

Figure 5. Copy-on-write (COW) model

Image source: https://www.cnblogs.com/jing99/p/7446214.html

Therefore, the source volume always stores the latest current data.

Read data

If you need to access snapshot data at a certain point in time:

  • Read directly from the source volume for unchanged blocks
  • Blocks that have changed and been copied are read from the snapshot volume

If you need to access the latest data, read directly from the source volume.
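The write and read rules above can be put together in a minimal in-memory sketch. It assumes a volume is simply a Python list of blocks; the class name `CowSnapshot` and the I/O counter are my own additions for illustration and do not correspond to any real driver.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot over a list of blocks."""

    def __init__(self, source):
        self.source = source   # source volume: always holds the latest data
        self.saved = {}        # snapshot volume: block index -> original content
        self.io_count = 0      # counts reads and writes caused by write()

    def write(self, index, data):
        if index not in self.saved:
            old = self.source[index]      # 1. read the original content
            self.io_count += 1
            self.saved[index] = old       # 2. copy it to the snapshot volume
            self.io_count += 1
        self.source[index] = data         # 3. write new data to the source volume
        self.io_count += 1

    def read_snapshot(self, index):
        # Changed blocks come from the snapshot volume, the rest from the source.
        if index in self.saved:
            return self.saved[index]
        return self.source[index]

    def read_latest(self, index):
        return self.source[index]


cow = CowSnapshot(["a", "b", "c"])
cow.write(1, "B1")   # first write to block 1: three I/O operations
cow.write(1, "B2")   # later write to the same block: one I/O operation
snapshot_view = [cow.read_snapshot(i) for i in range(3)]
latest_view = [cow.read_latest(i) for i in range(3)]
print(snapshot_view, latest_view, cow.io_count)
```

The snapshot view stays at the state of the snapshot time point no matter how many times block 1 is rewritten, while the source volume always shows the newest data.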

Advantages and disadvantages

Advantages: a COW snapshot occupies almost no storage when it is created, and it does not affect system performance until data on the source volume is modified.

Disadvantages:

  1. Reduces the write performance of the source volume. The first modification of a block of source data takes three I/O operations (one read and two writes):

    1. Read source data.
    2. Write source data to the snapshot volume.
    3. Write new data to the source volume.

    If data is written frequently, this method will consume a lot of I/O.

  2. There is no complete physical copy of the snapshot. The snapshot volume only saves part of the data of the source volume.

  3. If the amount of data copied to the snapshot volume exceeds the reserved space, the snapshot will become invalid.

2. Redirect-On-Write

The implementation principle of ROW is very similar to that of COW, the difference is that the first write operation of ROW to the source volume will redirect new data to the reserved snapshot volume. Therefore, the original data in the ROW snapshot remains in the source volume, and in order to ensure the integrity of the snapshot data, some systems change the status of the source volume from read-write to read-only when creating a snapshot.

Write data

When new data is written to a storage location on the source volume for the first time, it will be redirected and written to the snapshot volume.

When the data is rewritten again, the system will select a new storage location for the updated data in the snapshot volume, and the data in the snapshot volume for the previous write redirection will be retained, but will not be referenced.

Figure 6. The redirect-on-write (ROW) model

Image source: https://www.cnblogs.com/jing99/p/7446214.html

Therefore, the source volume always stores the old data when the snapshot is created.

Read data

If you need to access snapshot data at a certain point in time, read it directly from the source volume.

If you need to access the latest data:

  • Read directly from the source volume for unchanged blocks
  • Blocks that have changed and are redirected are read from the snapshot volume

Advantages and disadvantages

Advantage: Does not degrade the write performance of the source volume. After the snapshot is created, all write I/O to the source volume is redirected to the snapshot volume, while all snapshot data (old data) remains in the source volume. Updating data therefore takes only one write operation, which avoids the extra read and write that COW performs on the first modification.

Disadvantages:

  1. There is no complete physical copy of the latest data.

    The ROW snapshot volume saves the new data after the source volume has been changed, but it is not a complete copy of the latest data. Therefore, when multiple snapshots are created, a snapshot chain will be generated, and the change tracking of source volume data and the deletion of snapshots will become extremely complicated. When restoring the snapshot, the snapshot files will be continuously merged, resulting in a large system overhead.

  2. Read performance degrades. Because writes are redirected, data that was originally contiguous ends up scattered across the disk, so sequential reads turn into random reads.

In addition, since the source volume stores the old data when the snapshot was created, in order for the source volume to obtain updated data, a merge operation is required to write the changed data in the snapshot volume back to the source volume.
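The ROW behavior described in this section, including the final merge, can be sketched in the same toy style. The assumptions here are mine: the snapshot volume is modeled as an append-only list, a dictionary tracks which entry is the newest for each block, and the name `RowSnapshot` is hypothetical.

```python
class RowSnapshot:
    """Toy redirect-on-write snapshot over a list of blocks."""

    def __init__(self, source):
        self.source = source   # source volume: frozen at the snapshot time point
        self.log = []          # snapshot volume: redirected writes, append-only
        self.table = {}        # block index -> position of the newest data in log

    def write(self, index, data):
        # One write only, always to a new location in the snapshot volume.
        self.log.append(data)
        # Older entries for this block stay in the log but are no longer referenced.
        self.table[index] = len(self.log) - 1

    def read_snapshot(self, index):
        return self.source[index]

    def read_latest(self, index):
        if index in self.table:
            return self.log[self.table[index]]
        return self.source[index]

    def merge(self):
        """Write the newest redirected data back into the source volume."""
        for index, position in self.table.items():
            self.source[index] = self.log[position]
        self.log.clear()
        self.table.clear()


row = RowSnapshot(["a", "b", "c"])
row.write(1, "B1")
row.write(1, "B2")   # rewritten: lands in a new slot, "B1" becomes unreferenced
snapshot_view = [row.read_snapshot(i) for i in range(3)]
latest_view = [row.read_latest(i) for i in range(3)]
log_entries = len(row.log)
row.merge()
print(snapshot_view, latest_view, log_entries, row.source)
```

Before the merge, the source volume still shows the old data; only after `merge()` does it contain the latest state.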

3. Linux snapshot (snapshot)

Based on the principle of snapshots, there can be various specific implementations. Here we mainly discuss the implementation of snapshots in the Linux system.

3.1 Principle of Linux snapshot

The snapshot (snapshot) under Linux is an incremental snapshot, which is implemented based on the Device Mapper driver. The snapshot device is one of the many virtual devices of the Device Mapper.

Device Mapper's snapshot support lives in the drivers/md directory of the kernel source. For documentation, see:

Documentation/admin-guide/device-mapper/snapshot.rst

Online version: https://docs.kernel.org/admin-guide/device-mapper/snapshot.html

If the Documentation/admin-guide directory does not exist, check the Documentation/device-mapper directory instead.

If you don't want to go deep into the code, but want to have a thorough understanding of snapshot, it is strongly recommended to refer to the official documentation. Because this is first-hand information, all other interpretations on the Internet are second-hand, including this one.

For incremental snapshots, whether in COW or ROW mode, a snapshot volume must be allocated in addition to the source volume. In Linux, this snapshot volume is uniformly called the COW device.

In fact, the Linux code and documentation never distinguish the two incremental snapshot methods, COW and ROW; the three letters "ROW" do not even appear. Theory describes two kinds of incremental snapshots, yet the actual Linux driver and documentation mention only one, and this mismatch cost me a lot of detours.

In short, Linux snapshots support both the COW and ROW methods, but both go by the name Copy-On-Write.

For the snapshot under linux, there are three types of virtual target devices, namely: snapshot-origin, snapshot and snapshot-merge. These 3 devices correspond to their own different scenarios:

  • snapshot-origin, all read operations are directly mapped to the back-end device. The so-called back-end device is the source volume mentioned here. For each write operation, the original data (old data in the source volume) is saved to the COW device, and new data is written to the source volume.
  • snapshot, all write operations are saved to the COW device. For read operations, if the data has not changed, the data is read from the source volume; if the data has changed, the data is read from the COW device;
  • snapshot-merge, used to merge the data in the COW device back to the source volume;
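To make the three targets more concrete, here is a small sketch that builds Device Mapper table lines for them, following the formats given in the kernel's snapshot.rst (`snapshot-origin <origin>` and `snapshot <origin> <COW device> <persistent?> <chunksize>`, with snapshot-merge taking the same arguments as snapshot). The helper names, the device paths /dev/loop0 and /dev/loop1, and the size of 2097152 sectors are assumptions chosen for illustration only; the strings are just printed here, not passed to dmsetup.

```python
def origin_table(sectors, origin):
    """Table line for a snapshot-origin target covering the whole device."""
    return f"0 {sectors} snapshot-origin {origin}"


def snapshot_table(sectors, origin, cow, persistent=True, chunk_sectors=8,
                   target="snapshot"):
    """Table line for a snapshot or snapshot-merge target."""
    if target not in ("snapshot", "snapshot-merge"):
        raise ValueError("target must be 'snapshot' or 'snapshot-merge'")
    if target == "snapshot-merge" and not persistent:
        # snapshot-merge only works with persistent snapshots.
        raise ValueError("snapshot-merge requires a persistent snapshot")
    flag = "P" if persistent else "N"   # P = persistent, N = not persistent
    return f"0 {sectors} {target} {origin} {cow} {flag} {chunk_sectors}"


SECTORS = 2097152   # hypothetical 1 GiB device, in 512-byte sectors
print(origin_table(SECTORS, "/dev/loop0"))
print(snapshot_table(SECTORS, "/dev/loop0", "/dev/loop1"))
print(snapshot_table(SECTORS, "/dev/loop0", "/dev/loop1",
                     target="snapshot-merge"))
```

Note that the snapshot and snapshot-merge lines differ only in the target name, which is what "uses the same parameters" means in practice.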

Are the read and write operations of snapshot-origin and snapshot devices familiar?

That's right, the snapshot-origin device corresponds to the COW operation model, and the snapshot device corresponds to the ROW operation model.

It is assumed here that there are two physical devices source and cow, corresponding to the source volume and the snapshot volume respectively. Based on these two physical devices, two snapshot devices snapshot-origin and snapshot are virtualized, as shown in the following figure:

The source and cow devices can also be virtual devices (in practice, the cow device is almost always virtual). For convenience of description, both are assumed to be physical devices.

Figure 7. COW and ROW for a Linux snapshot device

For the snapshot-origin device, when writing, copy the old data in the source volume source device to the snapshot volume cow device (step 1'), and then write the new data to the source volume source device (step 1). The read operation is directly mapped to the source volume source device (step 2);

For the snapshot device, when writing, write new data directly to the snapshot volume cow device (step 3). When reading, the unchanged data is mapped to the source volume source device (step 4''), and the changed data is mapped to the snapshot volume cow device (step 4').

In addition, the snapshot-origin and snapshot virtual devices can exist at the same time, and the two can share a snapshot volume cow device. Therefore, both COW and ROW operations are stored in the snapshot volume cow device.

As for snapshot-merge, its sole purpose is to merge the data in the snapshot volume cow back to the source volume source, such as the following scenarios:

  • For COW operations, the source volume always holds the latest data, and the snapshot volume holds the old data. To roll the source volume back to its state at the snapshot time point, a merge operation is needed to write the old data in the snapshot volume back to the source volume.

  • For ROW operations, the source volume always maintains the state of the snapshot time point, and the snapshot volume saves the latest data. If the source volume needs to be updated to the latest state, a merge operation is also required to write the new data in the snapshot volume into the source volume.

In a word, restoring the source volume in COW mode (back to the snapshot point) and updating the source volume in ROW mode (forward to the latest data) both require a merge operation. Updates on Android's virtual partition system fall into the latter category.
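Both merge scenarios can be checked with a toy model of the source and cow devices described in this section. This is a deliberately simplified sketch under my own assumptions (volumes are Python lists, the cow device is a dictionary keyed by chunk index, and the class name `DmSnapshotModel` is hypothetical); it ignores persistence, chunk allocation, and everything else the real driver does.

```python
class DmSnapshotModel:
    """Toy model: one source device and one cow device shared by both views."""

    def __init__(self, source):
        self.source = source   # source volume
        self.cow = {}          # cow device: chunk index -> saved data

    def write_origin(self, chunk, data):
        # snapshot-origin write: save the old data first, then update the source.
        if chunk not in self.cow:
            self.cow[chunk] = self.source[chunk]
        self.source[chunk] = data

    def read_origin(self, chunk):
        return self.source[chunk]

    def write_snapshot(self, chunk, data):
        # snapshot write: new data goes straight to the cow device.
        self.cow[chunk] = data

    def read_snapshot(self, chunk):
        if chunk in self.cow:
            return self.cow[chunk]
        return self.source[chunk]

    def merge(self):
        # snapshot-merge: copy every chunk in the cow device back to the source.
        for chunk, data in self.cow.items():
            self.source[chunk] = data
        self.cow.clear()


# COW scenario: writes went through snapshot-origin, merge rolls the source back.
rollback = DmSnapshotModel(["a", "b", "c"])
rollback.write_origin(0, "A")
rollback.merge()

# ROW scenario: writes went through snapshot, merge brings the source up to date.
update = DmSnapshotModel(["a", "b", "c"])
update.write_snapshot(2, "C")
update.merge()

print(rollback.source, update.source)
```

The same merge operation restores old data in the first case and applies new data in the second, because it only ever copies whatever the cow device holds back to the source.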

Therefore, there are some special requirements for the snapshot-merge device:

  • snapshot-merge takes the same parameters as snapshot, but only works with persistent snapshots
  • snapshot-merge assumes the role of snapshot-origin, so it must not be loaded while a snapshot-origin device still exists for the source volume
  • snapshot-merge merges changed blocks from the snapshot volume cow back into the source volume source
  • After the merge starts, other modifications to the source volume are deferred until the corresponding chunks have been merged

So much is explained here, the summary is:

If operating on the snapshot-origin device, it corresponds to the COW model;

If operating on the snapshot device, it corresponds to the ROW model;

snapshot-merge assumes the role of snapshot-origin, and merges the changes in the snapshot volume cow back to the source volume source;

3.2 Structure of the COW device

The snapshot device on Linux is a block device, and its smallest block unit is sector, with a size of 512 bytes.

Various snapshot operations are performed in the form of chunk data blocks. Chunk is a data block larger than a sector, and the size is set by chunksize when creating a snapshot. For example, setting chunksize = 8 indicates that a chunk is composed of 8 sectors, so 1 x chunk = 8 x sector = 4KB. The default chunksize in the snapshot driver is 32, corresponding to a chunk size of 16KB.
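The chunk arithmetic above is easy to verify with a few lines of Python. The function names are my own; the only facts used are the 512-byte sector size and the chunksize values quoted in the text.

```python
SECTOR_BYTES = 512   # size of one sector


def chunk_bytes(chunksize):
    """Size of one chunk in bytes, given chunksize in sectors."""
    return chunksize * SECTOR_BYTES


def chunk_of_sector(sector, chunksize):
    """Index of the chunk that contains the given sector."""
    return sector // chunksize


print(chunk_bytes(8))            # 4096 bytes, i.e. 4KB
print(chunk_bytes(32))           # 16384 bytes, i.e. 16KB
print(chunk_of_sector(100, 8))   # sector 100 falls into chunk 12
```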

As the previous section shows, data belonging to a device mapped through an incremental snapshot may be stored either on the source volume or on the snapshot volume.

How do we determine whether a piece of data lives on the source volume or on the snapshot volume? The answer is a lookup table, or mapping table, stored on the snapshot volume.

For Linux snapshots:

  • If there is a record for a piece of data in the lookup table, then the data exists in both the source volume and the snapshot volume, and corresponding selection processing is required for reading and writing.
  • If there is no record for a certain piece of data in the lookup table, the data only exists on the source volume, and only needs to be operated on the source volume when reading.
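The two cases can be expressed as a single lookup. The sketch below assumes the lookup table is a plain dictionary from a chunk index on the source volume to a chunk index on the snapshot volume; this is a simplification for illustration, not the driver's actual data structure, and the function name is hypothetical.

```python
def resolve_snapshot_read(chunk, lookup_table):
    """Decide where a read of the snapshot device should fetch a chunk from."""
    if chunk in lookup_table:
        # A record exists: the snapshot's copy lives on the snapshot volume.
        return ("cow", lookup_table[chunk])
    # No record: the chunk only exists on the source volume.
    return ("source", chunk)


# Hypothetical table: source chunks 5 and 9 have copies on the snapshot volume.
lookup_table = {5: 2, 9: 3}
print(resolve_snapshot_read(5, lookup_table))
print(resolve_snapshot_read(7, lookup_table))
```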

In general, the approximate structure of a snapshot volume device is as follows:

Figure 8. Structure of a COW device

In a snapshot volume COW device:

  1. The device starts with a disk header that occupies one chunk;
  2. Next comes the lookup table itself. A small table occupies 1 chunk; a large one may occupy several chunks;
  3. Then comes the data block area that the lookup table refers to, where each block of data occupies 1 chunk;
  4. Everything after the end of the data block area is free space;
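As a rough aid, the four regions can be turned into chunk ranges. This sketch follows only the simplified picture in the list above (header, then lookup table, then data, then free space); it is not the exact on-disk format used by the driver, and the function name and the example sizes are made up.

```python
def cow_layout(total_chunks, table_chunks, data_chunks):
    """Chunk ranges of each region in the simplified COW device picture."""
    header = range(0, 1)                                  # 1 chunk of disk header
    table = range(1, 1 + table_chunks)                    # lookup table
    data = range(table.stop, table.stop + data_chunks)    # saved data chunks
    if data.stop > total_chunks:
        raise ValueError("COW device is too small for this many chunks")
    free = range(data.stop, total_chunks)                 # remaining free space
    return {"header": header, "lookup_table": table, "data": data, "free": free}


# Hypothetical device of 16 chunks with a 1-chunk table and 5 saved data chunks.
layout = cow_layout(total_chunks=16, table_chunks=1, data_chunks=5)
for name, chunks in layout.items():
    print(name, list(chunks))
```

When the free range becomes empty, there is no room left for further copies, which corresponds to the earlier point that a snapshot becomes invalid once the reserved space is exceeded.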

For now, I do not plan to describe the COW device in more detail. Whether to cover it in a separate article will be decided later.

4. Other

The next article will verify the various characteristics of Linux snapshots through the command line tools of the Linux terminal in a practical way, and further deepen the understanding of snapshots, including:

  • Experiment of writing and verifying data multiple times in COW mode
  • Experiment of writing and verifying data multiple times in ROW mode
  • Validation of COW mode and ROW mode data changes in merge operations

This article draws on the following articles, and I am grateful to their authors:

Origin blog.csdn.net/guyongqiangx/article/details/128494795