MongoDB 4.2 summary

What is MongoDB

MongoDB is a document-oriented NoSQL database designed for storing large volumes of data.

Each database in MongoDB contains collections, and each collection contains documents. Documents in the same collection can have different numbers of fields, and can differ from one another in size and content. No predefined schema is required; fields can be created dynamically.
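As a rough illustration of the schema-less model described above, the sketch below models a collection as a plain Python list of dictionaries. No MongoDB server or driver is involved, and the users collection and all field names are made up for the example.

```python
# A "collection" modelled as a list of dict documents (illustration only).
users = [
    {"_id": 1, "name": "Ann"},
    {"_id": 2, "name": "Bob", "email": "bob@example.com", "tags": ["admin", "dev"]},
    {"_id": 3, "name": "Cid", "address": {"city": "Oslo", "zip": "0150"}},
]

# A field can be added to one document without touching the others
# and without declaring it anywhere first.
users[0]["last_login"] = "2020-01-01"

# Collect the field names of each document to show that they differ.
field_sets = [sorted(doc) for doc in users]
for fields in field_sets:
    print(fields)
```

Each document ends up with its own set of fields, which is the behaviour a MongoDB collection allows by default.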

Why use MongoDB

1. Document-oriented: data is stored in documents, whose structure can adapt to actual business requirements.

2. Rich indexing: any field can be indexed, including fields inside embedded objects.

3. Replication and automatic failover are supported.

4. Data is stored in BSON, a binary format that extends JSON with additional data types.

Differences between MongoDB and an RDBMS

1. In an RDBMS, a table stores data in rows and columns. The MongoDB counterpart of a table is a collection. A collection contains documents, and each document in turn contains fields, which are key-value pairs.

2. An RDBMS normalizes data across tables; MongoDB does not require the data to be normalized first.

Transactions

For cases that require atomic updates to multiple documents, or consistent reads across multiple documents, MongoDB provides multi-document transactions. They were introduced for replica sets in version 4.0, and version 4.2 extended them to sharded clusters. Transactions are not available on a standalone (single-node) mongod.
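A minimal sketch of a multi-document transaction, written against the session API of the PyMongo driver (start_session and start_transaction). The transfer function, the "bank" database, the "accounts" collection, and the "balance" field are all hypothetical, and the client argument is assumed to be a pymongo.MongoClient connected to a replica set or sharded cluster.

```python
def transfer(client, from_id, to_id, amount):
    """Move `amount` between two account documents atomically.

    `client` is assumed to behave like a pymongo.MongoClient; the database,
    collection, and field names are hypothetical.
    """
    accounts = client["bank"]["accounts"]
    with client.start_session() as session:
        # The transaction commits when this block exits normally and
        # aborts if an exception propagates out of it.
        with session.start_transaction():
            accounts.update_one(
                {"_id": from_id}, {"$inc": {"balance": -amount}}, session=session
            )
            accounts.update_one(
                {"_id": to_id}, {"$inc": {"balance": amount}}, session=session
            )
```

Both updates carry the same session, so either both become visible or neither does. On a standalone server the same call would fail, because transactions need a replica set or sharded cluster.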

Indexes

Most indexes use the B-tree data structure, so readers who are not familiar with B-trees should study them first. The index types are as follows:

1. Single-field index

2. Compound index

3. Multikey index

4. Hashed index

5. Unique index

6. Text (full-text) index. Note: a collection can have at most one text index.
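The index types listed above can be sketched with the create_index method of the PyMongo driver. The create_example_indexes helper and every field name in it are hypothetical; the collection argument is assumed to behave like a pymongo Collection.

```python
def create_example_indexes(collection):
    """Create one index of each listed type on hypothetical fields.

    `collection` is assumed to behave like a pymongo Collection.
    Returns whatever create_index returns for each call (the index names
    when a real driver is used).
    """
    names = [
        # 1. Single-field index, ascending (1) on one field.
        collection.create_index([("name", 1)]),
        # 2. Compound index over two fields (-1 means descending).
        collection.create_index([("last_name", 1), ("age", -1)]),
        # 3. Multikey index: same call; the index becomes multikey
        #    automatically when the indexed field holds an array.
        collection.create_index([("tags", 1)]),
        # 4. Hashed index.
        collection.create_index([("user_id", "hashed")]),
        # 5. Unique index: rejects duplicate values for the field.
        collection.create_index([("email", 1)], unique=True),
        # 6. Text index (at most one per collection).
        collection.create_index([("body", "text")]),
    ]
    return names
```

Unique is an option on an index rather than a separate structure, which is why it is passed as a keyword argument instead of as a key direction.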

Storage engine

MongoDB uses the WiredTiger storage engine by default (it has been the default since version 3.2). The MMAPv1 storage engine is not recommended: it was deprecated in 4.0 and removed in 4.2.

WiredTiger uses as much of the available memory as it can as a cache, to reduce I/O bottlenecks. Two caches are involved: the WiredTiger internal cache and the file system cache. The WiredTiger cache stores uncompressed data and provides in-memory performance. The operating system's file system cache stores compressed data. When data is not found in the WiredTiger cache, WiredTiger looks for it in the file system cache. The cache size is configurable; by default, the WiredTiger internal cache size is the larger of 0.5 * (server memory - 1 GB) and 256 MB.
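The default cache size formula can be worked through in a few lines. This is only an arithmetic sketch: the function name is made up, the 256 MB lower bound comes from the MongoDB documentation, and treating 1 GB as 1024^3 bytes is an assumption.

```python
GIB = 1024 ** 3  # assumption: "1 GB" is taken as 1024^3 bytes
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Return the larger of 0.5 * (RAM - 1 GB) and 256 MB, in bytes."""
    half_of_rest = (total_ram_bytes - GIB) // 2
    return max(half_of_rest, 256 * MIB)


# A 16 GB server gets a 7.5 GB internal cache by default.
print(default_wiredtiger_cache_bytes(16 * GIB) / GIB)
# A 1 GB server falls back to the 256 MB lower bound.
print(default_wiredtiger_cache_bytes(1 * GIB) / MIB)
```

On small machines the lower bound dominates, which is why the formula alone understates the cache on servers with less than about 1.5 GB of memory.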

WiredTiger characteristics

Document-level concurrency (document-level locking): Multiple write operations can modify different documents in the same collection at the same time; writes to the same document must be executed serially.

Snapshots and checkpoints: MongoDB creates a checkpoint (that is, persists a snapshot) every 60 seconds or when the journal reaches 2 GB, whichever comes first. A snapshot presents a consistent view of the data in memory. When writing to disk, WiredTiger writes all the data in the snapshot to the data files in a consistent manner.
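The checkpoint trigger described above amounts to a simple "either condition" test. The sketch below is a toy model of that rule only, not WiredTiger's actual implementation; the function and constant names are made up.

```python
CHECKPOINT_INTERVAL_SECONDS = 60        # time-based trigger from the text
JOURNAL_LIMIT_BYTES = 2 * 1024 ** 3     # size-based trigger: 2 GB of journal data


def checkpoint_due(seconds_since_last: float, journal_bytes_since_last: int) -> bool:
    """Return True when either trigger for a new checkpoint has been reached."""
    return (
        seconds_since_last >= CHECKPOINT_INTERVAL_SECONDS
        or journal_bytes_since_last >= JOURNAL_LIMIT_BYTES
    )
```

Under light write load the 60-second timer fires first; under heavy load the journal size can trigger a checkpoint earlier.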

Journal: With journaling enabled, every write is recorded in the journal, so that written data can be reconstructed by replaying it. A single write modifies the data, the indexes, and the oplog, and these three modifications are recorded together in one journal record.

Compression: WiredTiger compresses collections and indexes. Compression reduces disk space consumption, but costs additional CPU to compress and decompress the data.

Note: The maximum size of a single document in MongoDB is 16 MB.