Database Persistence Comparison

Advantages of mysql bin-log

oracle flashback

redis
RDB:
RDB is a very compact file that holds the Redis dataset at a point in time. This kind of file is great for backups: for example, you can keep an hourly RDB file for the last 24 hours and a daily RDB file for each day of the month. This way, you can always restore the dataset to a different version if you run into problems. RDB is also great for disaster recovery: it is a single, very compact file that can be sent (after encryption) to another data center, or to Amazon S3. RDB maximizes Redis performance: the only thing the parent process has to do when saving an RDB file is fork a child process; the child then handles all of the remaining save work, so the parent never performs any disk I/O. Finally, RDB is faster than AOF when restoring large datasets.
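
As a concrete illustration, here is a minimal sketch, assuming a local Redis on the default port and the redis-py client, of configuring RDB save points and triggering a background snapshot:

```python
# A minimal sketch, assuming a local Redis on the default port and the
# redis-py client (pip install redis).
import redis

r = redis.Redis(host="localhost", port=6379)

# Equivalent to `save 3600 1 300 100` in redis.conf: snapshot if at least
# 1 key changed within 3600 s, or at least 100 keys changed within 300 s.
r.config_set("save", "3600 1 300 100")

# Ask the server to fork a child that writes the RDB file; the parent
# keeps serving clients while the child does all the disk I/O.
r.bgsave()
print(r.lastsave())  # time of the last successful RDB save
```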

Disadvantages of RDB:
If you need to minimize data loss in the event of a server failure, RDB is not for you. Although Redis lets you set different save points to control how often the RDB file is written, the RDB file has to capture the state of the entire dataset, so saving it is not a cheap operation; you will likely save an RDB file at most once every 5 minutes. In that case, in the event of an outage, you could lose several minutes of data. Every time RDB is saved, Redis forks a child process, and the child does the actual persistence work. When the dataset is large, fork() can be very time-consuming, causing the server to stop processing clients for some milliseconds; if the dataset is very large and CPU time is tight, this pause may even last a full second. AOF rewrites also require fork(), but you can tune how often rewrites are executed without any trade-off on durability.
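
The fork() cost described above can be observed directly: the INFO stats section reports how long the most recent fork took. A small sketch, again assuming redis-py and a local server:

```python
# A small sketch (redis-py assumed): how long the most recent fork()
# took, i.e. the RDB pause discussed above.
import redis

r = redis.Redis()
stats = r.info("stats")
# latest_fork_usec is reported in microseconds; on large datasets this
# is the pause during which the server stops serving clients.
print("last fork took %d us" % stats["latest_fork_usec"])
```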

Advantages of AOF (similar to mysql bin-log real-time logging):
Using AOF persistence makes Redis much more durable: you can set different fsync policies, such as no fsync at all, fsync once per second, or fsync every time a write command is executed. The default AOF policy is fsync once per second. With this configuration Redis still performs well, and even in the event of a failure at most one second of data is lost (fsync runs in a background thread, so the main thread can keep working on command requests). The AOF file is an append-only log, so writing to it never requires a seek, and even if the log ends with a half-written command for some reason (for example, the disk filled up or the server went down mid-write), the redis-check-aof tool can easily repair it.
Redis can automatically rewrite the AOF in the background when the file grows too large: the rewritten AOF contains the minimal set of commands needed to restore the current dataset. The whole rewrite operation is safe, because Redis keeps appending commands to the existing AOF file while the new one is being created; even if the server shuts down during the rewrite, the existing AOF file is not lost. Once the new AOF file is ready, Redis switches from the old file to the new one and starts appending to it. The AOF file records all write operations performed against the database, in order and in the Redis protocol format, so its content is easy for people to read and easy to parse. Exporting an AOF file is also simple: for example, if you accidentally execute the FLUSHALL command, then as long as the AOF file has not yet been rewritten, you can stop the server, remove the FLUSHALL command from the end of the AOF file, and restart Redis to restore the dataset to the state it was in before FLUSHALL was executed.
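
A short sketch (same redis-py and local-server assumptions) of turning AOF on with the default once-per-second fsync policy and requesting a background rewrite:

```python
# A short sketch (redis-py assumed): enable AOF with the default
# once-per-second fsync policy, then ask for a background rewrite.
import redis

r = redis.Redis()
r.config_set("appendonly", "yes")        # turn AOF persistence on
r.config_set("appendfsync", "everysec")  # alternatives: always / no
r.bgrewriteaof()  # compact the log to the minimal command set
```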

Disadvantages of AOF:
For the same dataset, the AOF file is usually larger than the corresponding RDB file. Depending on the fsync policy used, AOF may also be slower than RDB. In general, fsync once per second still gives very good performance, and turning fsync off makes AOF as fast as RDB even under heavy load; however, under a huge write load RDB provides a better-guaranteed maximum latency. AOF has had bugs in the past where, because of individual commands, reloading the AOF file failed to restore the dataset to exactly the state it was in when saved. (The blocking command BRPOPLPUSH, for example, once caused such a bug.) Tests for this situation have been added to the test suite: they automatically generate random, complex datasets and reload them to make sure everything works. Although this kind of bug is rare in AOF files, it is, by comparison, nearly impossible for RDB to have this kind of bug.

RDB and AOF, which one should I use?
In general, if you want data safety comparable to what PostgreSQL can provide, you should use both persistence features. If you care a lot about your data but can tolerate a few minutes of data loss, you can use RDB persistence alone. Many users use only AOF persistence, but we do not recommend this: regularly generating RDB snapshots is very convenient for database backups, RDB restores datasets faster than AOF, and using RDB avoids the AOF bugs mentioned earlier. For the reasons above, we may in the future merge AOF and RDB into a single persistence model. (This is a long-term plan.)

mongodb (similar to redis aof)
MongoDB is different from MySQL: MySQL writes every update operation directly to disk, but MongoDB does not. Like an in-memory database, it applies writes to memory first and persists them to disk later. So how does MongoDB persist? When MongoDB starts, a dedicated thread is initialized that loops continuously (unless the process crashes), periodically taking the data to be persisted from a defer queue and writing it to the on-disk journal (log) and data files. Because records are not written to disk at the moment the user inserts them, the MongoDB developers claim this causes no performance loss: reading the code shows that for create/update/delete operations, the records (Record type) are all pushed into the defer queue for delayed batch (group commit) submission and writing. I believe, however, that the commit interval is a parameter that needs careful consideration; in the system described here it is 90 milliseconds. If the value is too low, it may cause frequent disk operations; if it is too high, it increases the amount of data lost when the system goes down.
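
This group-commit behavior is visible from the client through write concerns. A minimal sketch, assuming pymongo and a local mongod: a plain acknowledged write may sit in memory until the next journal commit, while j=True makes the client wait until the write has reached the on-disk journal.

```python
# A minimal sketch, assuming pymongo and a local mongod. A plain
# acknowledged write may still be only in memory until the next journal
# group commit; j=True makes the client wait for the on-disk journal.
from pymongo import MongoClient, WriteConcern

client = MongoClient("localhost", 27017)
db = client.test

fast = db.get_collection("events")
fast.insert_one({"kind": "acknowledged, not yet journaled"})

durable = db.get_collection("events", write_concern=WriteConcern(j=True))
durable.insert_one({"kind": "journaled"})  # waits for the journal fsync
```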


The LSM (Log-Structured Merge-Tree) idea adopted by hbase
hbase holds changes to the data in memory; after reaching a specified threshold, the batch of changes is merged and written to disk in one go, so that many single writes become one batch write, which greatly improves write speed. The trade-off is that reads become laborious: they must merge the data on disk with the modified data in memory, which obviously reduces read performance (see the toy sketch below). MongoDB, by contrast, adopts the mapfile + journal idea: if a record is not in memory, it is loaded into memory first, the change is made in memory and recorded in the journal, and the data files are then written in batches at intervals. This places high demands on memory; at the very least it must hold the hot data and indexes.
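
To make the LSM idea concrete, here is a toy sketch (illustrative only, not HBase's actual implementation): writes accumulate in an in-memory memtable and are flushed as one sorted batch once a threshold is reached; reads must consult memory first, then the flushed segments, newest first.

```python
# A toy LSM sketch (illustrative only, not HBase's implementation).
# Real systems also compact segments and use indexes/bloom filters to
# speed up reads, both of which this toy omits.
class TinyLSM:
    def __init__(self, threshold=4):
        self.memtable = {}    # recent writes, held in memory
        self.segments = []    # flushed, sorted batches (the "disk")
        self.threshold = threshold

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.threshold:
            # batch write: one sorted segment instead of N random writes
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:             # newest data wins
            return self.memtable[key]
        for seg in reversed(self.segments):  # then newest segment first
            for k, v in seg:
                if k == key:
                    return v
        return None

store = TinyLSM()
for i in range(10):
    store.put(f"k{i}", i)
print(store.get("k3"))  # -> 3, found in a flushed segment
```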


Replication (replica) in NoSQL systems is similar to the mysql bin-log mechanism.



