The basic idea
There has been growing interest in using MongoDB as an in-memory database, i.e. running it without ever letting it persist data to disk. This can be extremely useful for applications like:
- A write-intensive cache in front of a slower RDBMS
- Embedded systems
- PCI-compliant systems where no data should be persisted
- Unit testing, where you need a lightweight database and the data can be wiped easily
It would be very elegant if this could be pulled off: we would get MongoDB's query/retrieval features without any disk involvement at all. As you probably know, in 99% of cases disk I/O (especially random I/O) is the system bottleneck, and if you want to write data, disk access cannot be avoided.
MongoDB has a very cool design decision: it uses memory-mapped files (memory-mapped files) to handle reads and writes to the data files on disk. In other words, MongoDB does not treat RAM and disk differently; it just treats each file as one huge array, accesses the data in it by byte offset, and leaves the rest to the operating system (OS). This design decision is what allows MongoDB to run in RAM without any modification.
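To get a feel for what "treat the file as a byte array and let the OS do the rest" means, here is a tiny sketch using Python's standard-library mmap module. This illustrates the OS mechanism MongoDB relies on; it is not MongoDB code, and the file path is made up:

```python
import mmap
import os
import tempfile

# Create a small data file, then access it as if it were a byte array.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)  # preallocate one page of zeros

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)  # map the whole file into memory
    mm[0:5] = b"hello"             # write through the mapping, like an array
    mm.flush()                     # the OS decides when bytes reach disk
    mm.close()

# Bytes written via the mapping are visible through a normal read.
with open(path, "rb") as f:
    print(f.read(5))  # -> b'hello'
```

The program never issues an explicit disk write for the payload; it just pokes bytes into the mapping, exactly as MongoDB does with its data files. Whether those pages live on disk or purely in RAM is the OS's business, which is what the tmpfs trick below exploits.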
Implementation
All of this is achieved by using a special type of filesystem called tmpfs. On Linux it looks just like a regular filesystem (FS), but it lives entirely in RAM (unless it grows larger than the available RAM, in which case it can swap to disk, which turns out to be useful!). My server has 32GB of RAM, so let's create a 16GB tmpfs:
# mkdir /ramdata
# mount -t tmpfs -o size=16000M tmpfs /ramdata/
# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvde1             5905712   4973924    871792  86% /
none                  15344936         0  15344936   0% /dev/shm
tmpfs                 16384000         0  16384000   0% /ramdata
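As a quick sanity check (my own arithmetic, not part of the original output): `size=16000M` means 16000 MiB, which is exactly the 16384000 1K-blocks that df reports for the tmpfs mount:

```python
size_mib = 16000              # the size=16000M mount option
blocks_1k = size_mib * 1024   # df reports capacity in 1K-blocks
assert blocks_1k == 16384000  # matches the tmpfs line in df above
print(blocks_1k)              # -> 16384000
```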
Then start MongoDB with the appropriate settings. To reduce the amount of wasted RAM, smallfiles and noprealloc should be set to true. Since everything is RAM-based now, this does not sacrifice any performance at all. Using a journal would be pointless here too, so nojournal should be set to true.
dbpath = /ramdata
nojournal = true
smallfiles = true
noprealloc = true
After starting MongoDB, you will find that it runs perfectly well, and the files show up in the filesystem as expected:
# mongo
MongoDB shell version: 2.3.2
connecting to: test
> db.test.insert({a:1})
> db.test.find()
{ "_id" : ObjectId("51802115eafa5d80b5d2c145"), "a" : 1 }
# ls -l /ramdata/
total 65684
-rw-------. 1 root root 16777216 Apr 30 15:52 local.0
-rw-------. 1 root root 16777216 Apr 30 15:52 local.ns
-rwxr-xr-x. 1 root root        5 Apr 30 15:52 mongod.lock
-rw-------. 1 root root 16777216 Apr 30 15:52 test.0
-rw-------. 1 root root 16777216 Apr 30 15:52 test.ns
drwxr-xr-x. 2 root root       40 Apr 30 15:52 _tmp
Now let's add some data and make sure everything works. We first build a 1KB document, then insert it into MongoDB 4 million times:
> str = ""
> aaa = "aaaaaaaaaa"
aaaaaaaaaa
> for (var i = 0; i < 100; ++i) { str += aaa; }
> for (var i = 0; i < 4000000; ++i) { db.foo.insert({a: Math.random(), s: str}); }
> db.foo.stats()
{
        "ns" : "test.foo",
        "count" : 4000000,
        "size" : 4544000160,
        "avgObjSize" : 1136.00004,
        "storageSize" : 5030768544,
        "numExtents" : 26,
        "nindexes" : 1,
        "lastExtentSize" : 536600560,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 0,
        "totalIndexSize" : 129794000,
        "indexSizes" : {
                "_id_" : 129794000
        },
        "ok" : 1
}
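The stats are internally consistent (my own arithmetic, not part of the original output): each document is a 1000-character string plus BSON field overhead, averaging 1136 bytes, so 4 million of them come to roughly 4.5GB of data:

```python
count = 4_000_000
size = 4_544_000_160                 # total BSON size from db.foo.stats()
avg = size / count
print(avg)                           # -> 1136.00004, the reported avgObjSize
assert abs(avg - 1136.00004) < 1e-5
# 4 million ~1.1KB documents are roughly 4.5GB of raw data
assert round(size / 1e9, 1) == 4.5
```

Next, let's drop all OS caches and see how much RAM is actually in use: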
# echo 3 > /proc/sys/vm/drop_caches
# free
total used free shared buffers cached
Mem: 30689876 6292780 24397096 0 1044 5817368
-/+ buffers/cache: 474368 30215508
Swap: 0 0 0
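The free output above adds up (again my own check): total = used + free, and subtracting buffers and cache from "used" gives the 474368 KB shown on the -/+ buffers/cache line, i.e. the process itself uses well under 1GB while ~5.8GB of cache holds the data:

```python
# figures from the free output above, in KB
total, used, free_kb = 30_689_876, 6_292_780, 24_397_096
buffers, cached = 1_044, 5_817_368

assert used + free_kb == total
# the -/+ buffers/cache line: "real" usage once the FS cache is excluded
assert used - buffers - cached == 474_368
```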
You can see that about 6.3GB of RAM is in use, of which 5.8GB is filesystem cache (buffers/cache). Why is there still 5.8GB of filesystem cache even after dropping all caches?? The reason is that Linux is smart: it does not keep duplicate pages between tmpfs and the page cache. Awesome! That means you have only one copy of the data in RAM. Now let's access every document and verify that RAM usage does not change:
> db.foo.find().itcount()
4000000
# free
total used free shared buffers cached
Mem: 30689876 6327988 24361888 0 1324 5818012
-/+ buffers/cache: 508652 30181224
Swap: 0 0 0
# ls -l /ramdata/
total 5808780
-rw-------. 1 root root 16777216 Apr 30 15:52 local.0
-rw-------. 1 root root 16777216 Apr 30 15:52 local.ns
-rwxr-xr-x. 1 root root 5 Apr 30 15:52 mongod.lock
-rw-------. 1 root root 16777216 Apr 30 16:00 test.0
-rw-------. 1 root root  33554432 Apr 30 16:00 test.1
-rw-------. 1 root root 536608768 Apr 30 16:02 test.10
-rw-------. 1 root root 536608768 Apr 30 16:03 test.11
-rw-------. 1 root root 536608768 Apr 30 16:03 test.12
-rw-------. 1 root root 536608768 Apr 30 16:04 test.13
-rw-------. 1 root root 536608768 Apr 30 16:04 test.14
-rw-------. 1 root root  67108864 Apr 30 16:00 test.2
-rw-------. 1 root root 134217728 Apr 30 16:00 test.3
-rw-------. 1 root root 268435456 Apr 30 16:00 test.4
-rw-------. 1 root root 536608768 Apr 30 16:01 test.5
-rw-------. 1 root root 536608768 Apr 30 16:01 test.6
-rw-------. 1 root root 536608768 Apr 30 16:04 test.7
-rw-------. 1 root root 536608768 Apr 30 16:03 test.8
-rw-------. 1 root root 536608768 Apr 30 16:02 test.9
-rw-------. 1 root root  16777216 Apr 30 15:52 test.ns
drwxr-xr-x. 2 root root        40 Apr 30 16:04 _tmp
# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvde1             5905712   4973960    871756  86% /
none                  15344936         0  15344936   0% /dev/shm
tmpfs                 16384000   5808780  10575220  36% /ramdata
As predicted! :)
What about replication?
Since the data lives in RAM and is lost on reboot, you will probably want replication. A standard replica set gives you automatic failover (failover) and also increased read capacity. If a server has to be rebooted, it can rebuild its data by reading from another member of the same replica set (resynchronization, resync). Even with a large amount of data and indexes, this process is fast enough, because all index operations happen in RAM :)
One important point is that write operations are recorded in a special collection called the oplog, which lives in the local database. By default its size is 5% of the total data size. In my case the oplog would take 5% of 16GB, i.e. 800MB of space. When in doubt, the safer approach is to pick a fixed oplog size with the oplogSize option: if a secondary is down for longer than the oplog covers, it must be resynced from scratch. To set it to 1GB:
oplogSize = 1000
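The 800MB figure above is just 5% of the 16GB tmpfs (my arithmetic); oplogSize is specified in megabytes, so 1000 means roughly 1GB:

```python
tmpfs_mb = 16_000                    # the 16GB tmpfs, in MB
default_oplog_mb = tmpfs_mb * 0.05   # default oplog: 5% of the data size
assert default_oplog_mb == 800.0     # the 800MB mentioned above

oplog_size = 1000                    # oplogSize is in MB, so ~1GB
assert oplog_size > default_oplog_mb
```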
What about sharding?
Since you get all of MongoDB's query features, what about using this to build a large-scale service? You can use sharding to implement a large, scalable in-memory store. The config servers (which store the chunk distribution) should still be disk-based, though: there are only a few of them and they see little activity, and rebuilding the whole cluster from scratch is no fun.
Precautions
RAM is a scarce resource, and in this case you really want the entire data set to fit in RAM. Although tmpfs can swap to disk (swapping), the performance drop would be very noticeable. To make the most of your RAM, you should consider:
- Using the usePowerOf2Sizes option to normalize the storage buckets
- Running the compact command regularly, or resyncing nodes
- Keeping the schema fairly normalized (to avoid large numbers of big documents)
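With usePowerOf2Sizes, MongoDB allocates each record in a power-of-two-sized bucket, which makes freed space easy to reuse and limits fragmentation. A sketch of the rounding (a simplification of MongoDB's actual bucket scheme):

```python
def bucket_size(record_bytes: int) -> int:
    """Round a record size up to the next power of two (simplified)."""
    return 1 << (record_bytes - 1).bit_length()

# our ~1136-byte documents would land in 2048-byte buckets
assert bucket_size(1136) == 2048
# a record that is already a power of two keeps its size
assert bucket_size(1024) == 1024
```

The cost is some padding per record (here ~900 bytes per document), which is the trade-off behind the "schema fairly normalized" advice: many small, regular records waste less bucket space than a few huge ones.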
Conclusion
There you go: you can now use MongoDB as an in-memory database, with all of its features! As for performance, it should be quite amazing: in my tests with a single thread/core I got 20K writes per second, and the write rate should scale with the number of cores.