After reading this story, you will understand how Redis persistence works!

I am Redis, and a man named Antirez brought me into this world.


"Wake up! Wake up!", vaguely, I heard someone calling me.

Slowly opened his eyes, it turned out that MySQL was next to him.

"Why did I fall asleep?"

"Hey, did you make an error just now, and the whole process crashed! A lot of query requests were sent to me!" MySQL said.

I had just woken up and my mind was still a bit foggy. Big Brother MySQL helped me up so I could get back to work.

"Oh! All the data I cached before is gone!"

"WTF? Didn't you do persistence ?" My face changed as soon as I heard it.

I shook my head awkwardly. "I keep everything in memory; that's why I'm so fast."

"Then you can also save it on the hard disk. In this case, you can create a cache from scratch. It's not a waste of time!"


I nodded. "Let me think about how to do this persistence thing."

RDB persistence

Within a few days, I came up with a plan: RDB.

Since my data all lives in memory, the easiest way is to iterate over it and write everything into a file.

To save space, I defined a compact binary format, encoded the entries one by one, and generated an RDB file.
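For reference, in the real redis.conf the snapshot file's name and location are configurable; the stock defaults look like this:

  • dbfilename dump.rdb # the name of the snapshot file
  • dir ./ # the directory it is written to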


However, my data volume is fairly large. Backing it all up at once takes a long time, so I can't do it too often, or I'd spend my time on backups instead of serving requests.

Also, if there have been only reads and no writes since the last backup, there's no point in backing up again; that would just be wasted work.

After thinking it over, I decided to provide a configuration parameter that supports periodic backups while avoiding useless work.

Like this:

  • save 900 1 # at least 1 write within 900 seconds (15 minutes)
  • save 300 10 # at least 10 writes within 300 seconds (5 minutes)
  • save 60 10000 # at least 10000 writes within 60 seconds (1 minute)

Multiple conditions can be combined; as long as any one of them is met, I go off and make a backup.
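And for anyone who doesn't want snapshots at all, the real config allows disabling them outright:

  • save "" # disable RDB snapshotting entirely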

Then I thought about it again: this still isn't enough. I should fork a child process to do the backup, so it doesn't eat into my own time serving requests.
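In real Redis, both behaviors exist as commands:

  • SAVE # snapshot in the main process, blocking every client until it finishes
  • BGSAVE # fork a child process and write the snapshot in the background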

With the backup file, the next time I crash, or even if the server loses power, as long as the backup file is still there, I can read it at startup and quickly restore my previous state!
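A minimal Python sketch of the fork idea, just to make the mechanics concrete (illustrative only, not Redis source: dataset, bgsave, and the JSON encoding are hypothetical stand-ins, since the real RDB format is binary):

    import json
    import os

    dataset = {"msg": "hello"}  # hypothetical stand-in for the in-memory data

    def bgsave(path="dump.rdb"):
        pid = os.fork()  # the child gets a copy-on-write view of memory
        if pid == 0:
            # Child: write the snapshot to a temp file, then rename atomically.
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(dataset, f)  # stand-in for the real binary RDB encoding
                f.flush()
                os.fsync(f.fileno())  # make sure it is really on disk
            os.replace(tmp, path)  # never leave a half-written snapshot behind
            os._exit(0)
        return pid  # parent: keep serving clients without waiting

The parent returns immediately, which is exactly why forking solves the "I can't waste my time" concern.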


MySQL: binlog

I happily took this plan to show Big Brother MySQL, expecting him to give me some encouragement.

"Brother, there is something wrong with your plan." Unexpectedly, he poured me a basin of cold water.

"Problem? What's the problem?"

"Look, you go to backup periodically, the cycle is still on the minute level, do you know how many requests our service has to respond to every second, and how much data must not be lost like you?" MySQL said earnestly.


I was a little deflated. "But this backup has to traverse all the data every time, and the overhead is quite large. It isn't suited to high-frequency execution."

"Who told you to traverse all the data at once? Come on, let me show you something", MySQL Big Brother took me to a file directory:

  • mysql-bin.000001
  • mysql-bin.000002
  • mysql-bin.000003
  • ···

"Look, these are my binary logs binlog . Guess what is installed in it?", Big Brother MySQL said, pointing to this pile of files.

I took a look: it was all binary data I couldn't make sense of, so I shook my head.

"It records all the operations that I perform changes to the data, such as INSERT , UPDATE , DELETE, etc., which can be used when I want to restore the data."

Hearing that, I was inspired! I said goodbye to Big Brother MySQL and went back to work out a new plan.

AOF persistence

As you know, I am command-based too: my daily job is responding to command requests from business programs.

After coming back, I decided to follow Big Brother MySQL's example: I would record every write command I execute by appending it to a dedicated file, and I gave this persistence method a name: AOF (Append Only File).
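In the real redis.conf, switching this scheme on takes two lines:

  • appendonly yes # enable AOF persistence
  • appendfilename "appendonly.aof" # the name of the log file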


But I ran into the same question as with the RDB scheme: how often should I write to the file?

I definitely can't write to the file every time I execute a write command; that would seriously drag down my performance! So I decided to set aside a buffer, temporarily stash the commands to be recorded there, and pick the right moments to write them out to the file. I call this buffer aof_buf.
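A toy Python sketch of the aof_buf idea (illustrative only, not Redis source; handle_write_command and flush_aof are hypothetical names):

    aof_buf = []  # commands waiting to be written to the AOF file

    def handle_write_command(cmd, memory):
        memory.append(cmd)          # hypothetical: apply the command in memory first
        aof_buf.append(cmd + "\n")  # then record it in the buffer; no disk I/O yet

    def flush_aof(f):
        f.write("".join(aof_buf))   # one file write per flush, not one per command
        aof_buf.clear()

Note that flush_aof never forces the data onto the disk; that detail comes back to bite me next.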


No sooner said than done. But when I tried it, I found the data never actually reached the file. After asking around, it turned out the operating system has a cache of its own: the data I wrote was being held there and had not been written to the file yet. That's cheating!

So after writing, I also have to flush, to make sure the data really lands on disk. After thinking it over, I decided to expose a parameter and let the business program choose when to flush.
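The underlying point: a plain write only hands data to the kernel's page cache, while fsync is what forces it onto the disk. In Python terms:

    import os

    with open("appendonly.aof", "a") as f:
        f.write("SET msg hello\r\n")  # lands in a user/OS buffer, not necessarily on disk
        f.flush()                     # push Python's own buffer down to the OS
        os.fsync(f.fileno())          # force the OS to write it to the physical disk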

The appendfsync parameter, with three possible values:

  • always: flush to disk after every write command (the safest, but the slowest)
  • everysec: flush once per second (a balance between safety and speed)
  • no: I just write; the operating system decides when the data actually reaches the disk
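In the real redis.conf this is one line, and everysec is the stock default, the usual compromise between safety and speed:

  • appendfsync everysec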

AOF rewrite

This time I was not as impulsive as before. I decided to run it for a while before telling Big Brother MySQL, so he couldn't poke holes in it again.

I ran it for a while and everything worked well. But I noticed that, as time passed, my AOF backup file grew bigger and bigger! It not only eats disk space; copying it, moving it, and loading it for recovery all become slow and troublesome.

I had to find a way to compact the file. I call this process AOF rewriting.


At first I planned to analyze the existing AOF file and strip out the redundant commands to slim it down, but I quickly gave up on the idea: the workload was huge, the analysis was complicated and error-prone, and it would waste a lot of energy and time.

The original record-every-command approach is too naive anyway. The data gets changed over and over, and many of the intermediate states are useless. Why not just record the final state of the data?

For example:

  • RPUSH name_list 'Programming Technology Universe'
  • RPUSH name_list 'Play programming handsomely'
  • RPUSH name_list 'Backend Technology School'

can be merged into a single command:

  • RPUSH name_list 'Programming Technology Universe' 'Play programming handsomely' 'Backend Technology School'

So I had the idea for rewriting AOF files, but doing it would still take a lot of time. I decided to fork a child process to handle it, the same way as with RDB.
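Real Redis exposes exactly this as a command, plus two config knobs that trigger it automatically (the values shown are the stock defaults):

  • BGREWRITEAOF # manually start a background AOF rewrite
  • auto-aof-rewrite-percentage 100 # rewrite when the file has doubled since the last rewrite
  • auto-aof-rewrite-min-size 64mb # ...but only once it has reached at least this size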

Being cautious, I then spotted a hole: if I modify data while the child process is rewriting, the rewritten file will be inconsistent with my latest state! Big Brother MySQL would certainly call me out on that, so I had to plug this loophole.


So, in addition to the earlier aof_buf, I prepared another buffer: the AOF rewrite buffer.

From the moment the rewrite child process is created, I also copy every subsequent write command into this rewrite buffer. Once the child finishes rewriting the AOF file, I append the commands accumulated in this buffer to the new AOF file.

Finally, I rename the new AOF file to replace the original bloated one, and the job is done!


After convincing myself the design was sound, I took the new plan to Big Brother MySQL once more. I had covered everything; surely this time he would have nothing to say?

Big Brother MySQL looked at my plan and smiled with satisfaction, but he asked just one question:

"Since this AOF scheme is so good, can the RDB scheme be retired?"

I had not expected that question, and it left me deep in thought. How do you think I should answer it?


Source: blog.csdn.net/qq_46388795/article/details/108682270