Understand the NoSQL database MongoDB in one article

1. Introduction to MongoDB

What is MongoDB?

MongoDB is an open source, document-oriented, non-relational database management system first released in 2009. It uses BSON to store data in JSON-like documents, rather than the traditional tabular form of rows and columns.

MongoDB is designed for high performance and horizontal scalability on large data sets, and for the flexible, evolving, and often complex data structures that modern applications work with.

Features and Benefits of MongoDB

  1. Document-oriented data model: MongoDB uses a binary representation called BSON (Binary JSON) to store data. A document is a JSON-like structure that can contain key-value pairs, arrays, and nested documents. This flexible data model enables MongoDB to easily store and process data of various types and structures.

  2. High performance and scalability: Since version 3.2 the default storage engine is WiredTiger, which keeps frequently used data in an in-memory cache and offers document-level concurrency control and compression (the older memory-mapped MMAPv1 engine was removed in version 4.2). MongoDB also supports horizontal scaling: adding more server nodes to a cluster increases processing power and storage capacity to handle large data volumes and highly concurrent access.

  3. Powerful queries: MongoDB provides a rich query language and flexible query methods. It supports range queries, sorting, aggregation, and grouping, as well as text search, geospatial queries, and graph traversal (through the $graphLookup aggregation stage), which covers data retrieval needs in many application scenarios.

  4. Data replication and fault recovery: MongoDB supports replication and automatic failover to ensure data reliability and high availability. In a replica set, data is replicated to multiple nodes; if the primary node fails, the remaining members elect a new primary, so the system can continue to serve requests.

  5. Sharding and elastic scaling: MongoDB scales horizontally through sharding. Sharding partitions data into chunks according to a shard key and distributes them across shards hosted on different servers, which balances load and lets the cluster grow with the data.

  6. Security and access control: MongoDB provides security mechanisms to protect the confidentiality and integrity of data. It supports features such as authentication, access control, and data encryption, which can limit user access to the database and protect the security of sensitive data.

2. MongoDB basic concepts and terminology

Basic concepts of document database

A document database is a non-relational database (NoSQL) that organizes and stores data in the form of documents.

  1. Document: A document is the basic unit of a document database. It is a structured record in JSON or a JSON-like format. Documents do not need to follow a fixed schema and can hold semi-structured data. A document is a set of key-value pairs, where keys are field names and values can be of various data types, such as strings, numbers, booleans, arrays, and nested objects.

  2. Collection: A collection is a container for documents that can organize multiple related documents together. A collection is similar to the concept of a table in a relational database, but in a document database, the documents in a collection can have different structures, so there is more flexibility.

  3. Schema free: Document databases are schema free, which means that different documents can have different fields and structures. This makes document databases very suitable for storing semi-structured and changing data, because there is no need for predefined table structures or schemas, and fields can be added or modified flexibly as needed.

  4. Nested documents: Document databases support nested documents, that is, nesting one document within another. This means that data can be organized in a hierarchical manner so that complex data structures can be mapped directly into the database.

  5. Query language: Document databases provide rich query languages and flexible query methods, including range queries, sorting, aggregation, and grouping. The syntax varies by product: some offer an SQL-like language, while MongoDB expresses queries as JSON-like documents with operators such as $gt and $in.

  6. Scalability: Document databases have good scalability, and storage capacity and processing power can be increased by adding more server nodes. Some document databases also support sharding technology, which divides data horizontally and stores them on different machines to achieve higher throughput and load balancing.

  7. High performance: Document databases keep frequently accessed data in memory (through a cache or, in older engines, memory-mapped files), and a whole document can usually be read or written in one operation, which makes reads and writes fast. Indexes and query optimization speed up queries further.

In general, document databases store data as documents and offer a flexible data model, schema freedom, nested documents, powerful queries, and good scalability. They suit semi-structured and changing data and are widely used across many kinds of applications.

Collections and Documents

In a document database, Collection and Document are two basic concepts used to organize and store data.

  1. Collection:
    A collection is a logical concept in a document database, similar to a table in a relational database. It is a container for a set of related documents and can contain multiple documents. Each collection has a unique name in the database that identifies the collection.
  • Collections have no fixed structure: Unlike tables in relational databases, collections do not require a fixed schema or list of fields to be defined in advance. Documents in a collection can have different structures, with the flexibility to add, modify, and remove fields as needed. This makes collections ideal for storing semi-structured and changing data.

  • Collections have independent permission control: collections can define independent permissions and access controls, and can precisely control the read and write permissions for data in the collection. This makes it easier to manage and protect data in a multi-user or multi-application environment.

  2. Document:
    A document is the basic unit stored in a document database, which can be regarded as a collection of key-value pairs, similar to a row in a relational database. Each document is a structured data object, usually represented using a JSON-like format.
  • Documents are represented using key-value pairs: a document consists of a set of key-value pairs, where the keys are field names and the values can be of various data types, such as strings, numbers, booleans, arrays, and nested objects. This makes documents very flexible and able to store complex data structures.

  • Documents do not have a fixed schema: unlike rows in relational databases, documents can have different fields and structures. Each document can define its own fields as needed, and fields can be added, deleted or modified at any time. This flexibility allows developers to dynamically adjust the data model to meet changing needs.

  • Documents can be nested: Documents support nested structures, that is, one document is nested within another document. This means that data can be organized and represented in a hierarchical manner, enabling complex data structures to be mapped directly into the database.

Collections and documents are the central concepts of a document database. Together they provide flexible, dynamic, and hierarchical data storage: a collection groups related documents, and a document is the basic unit that stores the actual data. Used together, they meet the storage needs of semi-structured and changing data with a high degree of flexibility and scalability.

BSON data format

BSON (Binary JSON) is a binary serialized data format for storing and exchanging data in document databases. It is a lightweight, efficient data representation format, similar to JSON, but stores data in binary form, with the following characteristics:

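  1. Binary format: BSON encodes data in binary. Each element carries a type tag, and strings, arrays, and nested documents carry a length prefix, so a parser can skip over fields without scanning them, which makes traversal faster than parsing plain-text JSON. The encoded size is not always smaller than the equivalent JSON text; the format favors speed of access over compactness.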

  2. Supported data types: BSON supports common data types including strings, integers, floating-point numbers, Boolean values, date and time, regular expressions, arrays, and nested documents. In addition, BSON also supports special types such as binary data, object ID, timestamp, long integer, etc.

  3. Embedded documents and arrays: Similar to JSON, BSON allows nesting of other documents and arrays within documents, making it possible to represent complex data structures. Nested documents and arrays are recursively encoded as nested BSON objects.

  4. Field order: The field order in BSON is meaningful, because BSON data is encoded in the order of the fields. This means that the order of the fields remains unchanged when storing and transferring data, ensuring data consistency.

  5. Indexing and querying: Because every BSON value is typed, values can be compared and ordered consistently, which lets document databases build indexes on fields and optimize queries. For example, an index on a field allows matching documents to be located quickly without scanning the whole collection.

  6. Language support: BSON is widely supported as a general data serialization format. Many programming languages and database systems provide libraries and drivers that read and write BSON, so developers can process BSON data in different environments.

In general, BSON is a binary data format for storing and exchanging data in document databases. It is efficient to traverse, supports multiple data types and nested structures, and works together with indexing and query optimization to provide fast data access. As one of the foundations of document databases, BSON plays an important role in scenarios with large data volumes and complex data structures.
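
To make the layout described above concrete, here is a minimal sketch of an encoder for a very small subset of BSON: flat documents whose values are strings or 32-bit integers. The class MiniBsonEncoder and its methods are hypothetical names chosen for this illustration and are not part of any MongoDB library; real applications should rely on the BSON classes shipped with the driver.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: encodes flat documents whose
// values are String or Integer, following the published BSON layout.
class MiniBsonEncoder {

    static byte[] encode(Map<String, Object> doc) {
        ByteArrayOutputStream elements = new ByteArrayOutputStream();
        // Fields are written in the map's iteration order, so field order is preserved.
        for (Map.Entry<String, Object> field : doc.entrySet()) {
            Object value = field.getValue();
            if (value instanceof String) {
                byte[] utf8 = ((String) value).getBytes(StandardCharsets.UTF_8);
                elements.write(0x02);                    // type tag: UTF-8 string
                writeCString(elements, field.getKey());
                writeInt32(elements, utf8.length + 1);   // length counts the trailing NUL
                elements.write(utf8, 0, utf8.length);
                elements.write(0x00);
            } else if (value instanceof Integer) {
                elements.write(0x10);                    // type tag: 32-bit integer
                writeCString(elements, field.getKey());
                writeInt32(elements, (Integer) value);
            } else {
                throw new IllegalArgumentException("unsupported value: " + value);
            }
        }
        // Document = 4-byte total length + elements + 1-byte terminator.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeInt32(out, 4 + elements.size() + 1);
        byte[] body = elements.toByteArray();
        out.write(body, 0, body.length);
        out.write(0x00);
        return out.toByteArray();
    }

    // BSON integers are little-endian.
    private static void writeInt32(ByteArrayOutputStream out, int value) {
        byte[] bytes = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
        out.write(bytes, 0, bytes.length);
    }

    // Field names are NUL-terminated strings without a length prefix.
    private static void writeCString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        out.write(utf8, 0, utf8.length);
        out.write(0x00);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("hello", "world");
        System.out.println(encode(doc).length + " bytes");
    }
}
```

For the one-field document {"hello": "world"} this produces 22 bytes, while the JSON text is 17 characters. For small documents the value of the format therefore lies in its typed, length-prefixed fields rather than in size.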

3. CRUD operation of MongoDB

Create document

import org.bson.Document;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;

public class DocumentCreationExample {

    public static void main(String[] args) {
        // Connect to the MongoDB server (MongoClients is the entry point of the current synchronous driver)
        MongoClient mongoClient = MongoClients.create("mongodb://localhost:27017");

        // Select the database
        MongoDatabase database = mongoClient.getDatabase("mydatabase");

        // Select the collection
        MongoCollection<Document> collection = database.getCollection("mycollection");

        // Build the document
        Document document = new Document();
        document.append("name", "John Doe");
        document.append("age", 30);
        document.append("email", "john.doe@example.com"); // placeholder address

        // Insert the document into the collection
        collection.insertOne(document);

        // Print the ID of the inserted document (the driver fills in _id if it is absent)
        System.out.println("Inserted document ID: " + document.get("_id"));

        // Close the connection
        mongoClient.close();
    }
}

Read document

import org.bson.Document;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.MongoDatabase;

public class DocumentReadExample {

    public static void main(String[] args) {
        // Connect to the MongoDB server
        MongoClient mongoClient = MongoClients.create("mongodb://localhost:27017");

        // Select the database
        MongoDatabase database = mongoClient.getDatabase("mydatabase");

        // Select the collection
        MongoCollection<Document> collection = database.getCollection("mycollection");

        // Build the query condition
        Document query = new Document();
        query.append("name", "John Doe");

        // Run the query
        MongoCursor<Document> cursor = collection.find(query).iterator();

        while (cursor.hasNext()) {
            Document document = cursor.next();

            // Read the field values (getInteger returns null if the field is missing)
            String name = document.getString("name");
            int age = document.getInteger("age");
            String email = document.getString("email");

            // Print the field values
            System.out.println("Name: " + name);
            System.out.println("Age: " + age);
            System.out.println("Email: " + email);
        }

        // Close the cursor
        cursor.close();

        // Close the connection
        mongoClient.close();
    }
}

In the above example, we create a Document object that holds the query condition: the name field must equal "John Doe". We then call the find method to run the query and iterate over the matching documents with a MongoCursor.

For each document, we read the field values using methods such as getString and getInteger and print them. Finally, the cursor and the database connection are closed.

Update document

import org.bson.Document;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;

public class DocumentUpdateExample {

    public static void main(String[] args) {
        // Connect to the MongoDB server
        MongoClient mongoClient = MongoClients.create("mongodb://localhost:27017");

        // Select the database
        MongoDatabase database = mongoClient.getDatabase("mydatabase");

        // Select the collection
        MongoCollection<Document> collection = database.getCollection("mycollection");

        // Define the update condition
        Document query = new Document();
        query.append("name", "John Doe");

        // Define the update operation: set age to 35
        Document update = new Document();
        update.append("$set", new Document("age", 35));

        // Run the update (changes at most one matching document)
        collection.updateOne(query, update);

        // Close the connection
        mongoClient.close();
    }
}

In the above example, we create a Document object for the update condition, which matches documents whose name field is "John Doe". We then define a second Document for the update operation, using the $set operator to set the age field to 35. Finally, we call the updateOne method to perform the update, which modifies at most one matching document.

Delete document

import org.bson.Document;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;

public class DocumentDeleteExample {

    public static void main(String[] args) {
        // Connect to the MongoDB server
        MongoClient mongoClient = MongoClients.create("mongodb://localhost:27017");

        // Select the database
        MongoDatabase database = mongoClient.getDatabase("mydatabase");

        // Select the collection
        MongoCollection<Document> collection = database.getCollection("mycollection");

        // Define the delete condition
        Document query = new Document();
        query.append("name", "John Doe");

        // Run the delete (removes at most one matching document)
        collection.deleteOne(query);

        // Close the connection
        mongoClient.close();
    }
}

In the above example, we use the deleteOne method to perform the delete operation. It removes at most one document that matches the query condition.

If you want to delete multiple documents, use the deleteMany method with an appropriate query condition. It deletes all documents that match the condition.

4. MongoDB query operations

Basic query

Query all documents

import org.bson.Document;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.MongoDatabase;

public class GetAllDocumentsExample {

    public static void main(String[] args) {
        // Connect to the MongoDB server
        MongoClient mongoClient = MongoClients.create("mongodb://localhost:27017");

        // Select the database
        MongoDatabase database = mongoClient.getDatabase("mydatabase");

        // Select the collection
        MongoCollection<Document> collection = database.getCollection("mycollection");

        // Run the query; find() without a filter matches every document
        MongoCursor<Document> cursor = collection.find().iterator();

        // Iterate over the results
        while (cursor.hasNext()) {
            Document document = cursor.next();
            System.out.println(document.toJson());
        }

        // Close the cursor
        cursor.close();

        // Close the connection
        mongoClient.close();
    }
}

Conditional query

The fragments below are independent of one another. Each assumes the collection variable from the previous example, plus imports of java.util.Arrays, java.util.List, and java.util.regex.Pattern where they are used.

  1. Equal: Matches documents whose field value equals the given value.
// Equality query
Document query = new Document("name", "Alice");
MongoCursor<Document> cursor = collection.find(query).iterator();
  2. Not Equal: Matches documents whose field value is not equal to the given value.
// Not-equal query
Document query = new Document("age", new Document("$ne", 30));
MongoCursor<Document> cursor = collection.find(query).iterator();
  3. Greater Than: Matches documents whose field value is greater than the given value.
// Greater-than query
Document query = new Document("age", new Document("$gt", 18));
MongoCursor<Document> cursor = collection.find(query).iterator();
  4. Less Than: Matches documents whose field value is less than the given value.
// Less-than query
Document query = new Document("age", new Document("$lt", 40));
MongoCursor<Document> cursor = collection.find(query).iterator();
  5. Greater Than or Equal: Matches documents whose field value is greater than or equal to the given value.
// Greater-than-or-equal query
Document query = new Document("age", new Document("$gte", 20));
MongoCursor<Document> cursor = collection.find(query).iterator();
  6. Less Than or Equal: Matches documents whose field value is less than or equal to the given value.
// Less-than-or-equal query
Document query = new Document("age", new Document("$lte", 50));
MongoCursor<Document> cursor = collection.find(query).iterator();
  7. In: Matches documents whose field value appears in the given list of values.
// $in query
List<String> names = Arrays.asList("Alice", "Bob");
Document query = new Document("name", new Document("$in", names));
MongoCursor<Document> cursor = collection.find(query).iterator();
  8. Not In: Matches documents whose field value does not appear in the given list of values.
// $nin query
List<String> names = Arrays.asList("Charlie", "Dave");
Document query = new Document("name", new Document("$nin", names));
MongoCursor<Document> cursor = collection.find(query).iterator();
  9. Regular Expression: Matches documents whose field value matches a regular expression.
// Regular expression query: names that start with "A" and end with "e"
Pattern pattern = Pattern.compile("^A.*e$");
Document query = new Document("name", pattern);
MongoCursor<Document> cursor = collection.find(query).iterator();
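
To illustrate what the comparison operators above select, the following sketch evaluates them in memory against documents modeled as plain Java maps. It is a simplified model and not the server's implementation: the class OperatorSemanticsSketch is hypothetical, and it assumes scalar field values of one type (it ignores arrays, explicit nulls, and cross-type comparison). One detail it does reproduce is that $ne and $nin also match documents in which the field is missing.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the comparison operators, for illustration only.
class OperatorSemanticsSketch {

    @SuppressWarnings({"unchecked", "rawtypes"})
    private static int compare(Object a, Object b) {
        return ((Comparable) a).compareTo(b);
    }

    static boolean matches(Map<String, Object> doc, String field, String op, Object operand) {
        Object value = doc.get(field);   // null stands for "field is missing"
        switch (op) {
            case "$eq":  return value != null && value.equals(operand);
            case "$ne":  return value == null || !value.equals(operand);
            case "$gt":  return value != null && compare(value, operand) > 0;
            case "$gte": return value != null && compare(value, operand) >= 0;
            case "$lt":  return value != null && compare(value, operand) < 0;
            case "$lte": return value != null && compare(value, operand) <= 0;
            case "$in":  return value != null && ((Collection<?>) operand).contains(value);
            case "$nin": return value == null || !((Collection<?>) operand).contains(value);
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> alice = new HashMap<>();
        alice.put("name", "Alice");
        alice.put("age", 30);

        System.out.println(matches(alice, "age", "$gt", 18));
        System.out.println(matches(alice, "age", "$ne", 30));
        System.out.println(matches(alice, "name", "$in", Arrays.asList("Alice", "Bob")));
    }
}
```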

Projection query

A projection query returns only the required fields instead of the entire document. This reduces the amount of data transmitted over the network and can improve query efficiency.

  1. Include: return only the specified fields.
// Inclusion projection
Document query = new Document();
Document projection = new Document("name", 1).append("age", 1);
MongoCursor<Document> cursor = collection.find(query).projection(projection).iterator();

In the above example, only the "name" and "age" fields are returned, together with "_id", which is included by default unless it is explicitly excluded. All other fields are left out of the results.

  2. Exclude: return all fields except the specified ones.
// Exclusion projection
Document query = new Document();
Document projection = new Document("name", 0).append("address", 0);
MongoCursor<Document> cursor = collection.find(query).projection(projection).iterator();

In the above example, all fields are returned except the "name" and "address" fields.

  3. Projection of nested fields: Dot notation (".") can be used in a projection to refer to nested fields.
// Nested field projection
Document query = new Document();
Document projection = new Document("name", 1).append("address.city", 1);
MongoCursor<Document> cursor = collection.find(query).projection(projection).iterator();

In the above example, the "name" field and the nested field "address.city" are returned; the other fields nested inside "address" are not.
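
As a rough illustration of how a dotted path such as "address.city" addresses a nested field, the sketch below walks the path through documents modeled as nested Java maps. The class DotPathSketch and its resolve method are hypothetical and only model the idea; array handling is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: resolves a dotted path in nested maps.
class DotPathSketch {

    // Returns the value at the given path, or null if any step is missing
    // or an intermediate value is not a nested document.
    static Object resolve(Map<String, Object> doc, String path) {
        Object current = doc;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Berlin");
        address.put("street", "Main Street 1");

        Map<String, Object> person = new LinkedHashMap<>();
        person.put("name", "Alice");
        person.put("address", address);

        System.out.println(resolve(person, "address.city"));
    }
}
```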

  4. Projection of array fields: The $slice projection operator limits the number of array elements returned. (An index path such as "hobbies.0" does not select an array element in a projection.)
// Array field projection: return only the first element of "hobbies"
Document query = new Document();
Document projection = new Document("name", 1).append("hobbies", new Document("$slice", 1));
MongoCursor<Document> cursor = collection.find(query).projection(projection).iterator();

In the above example, the "name" field and the first element of the array field "hobbies" are returned; the remaining array elements are not.

Sorting and paging queries

  1. Sorting query:
    A sorting query arranges the results in ascending or descending order of the specified field.
// Ascending sort
Document query = new Document();
Document sort = new Document("age", 1); // ascending by age
MongoCursor<Document> cursor = collection.find(query).sort(sort).iterator();

// Descending sort (variables renamed so that both fragments can share a scope)
Document descQuery = new Document();
Document descSort = new Document("name", -1); // descending by name
MongoCursor<Document> descCursor = collection.find(descQuery).sort(descSort).iterator();

In the above example, we use the sort method to sort the query results: 1 indicates ascending order and -1 descending order.

  2. Paging query:
    A paging query returns one page of the query results, given a page number and the number of records per page.
int pageNumber = 1; // first page
int pageSize = 10; // 10 records per page

Document query = new Document();
Document sort = new Document("name", 1); // ascending by name
MongoCursor<Document> cursor = collection.find(query).sort(sort)
                                       .skip((pageNumber-1) * pageSize)
                                       .limit(pageSize)
                                       .iterator();

In the above example, we use the skip method to specify the number of records to skip (that is, the total number of records on all previous pages), and the limit method to specify the number of records per page.
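
The skip arithmetic can be checked without a database. The sketch below applies the same (pageNumber - 1) * pageSize offset to an in-memory list with Java streams; PagingSketch, skipCount, and page are hypothetical names used only for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper, for illustration only: the same offset arithmetic on a list.
class PagingSketch {

    // Number of records to skip for a 1-based page number.
    static int skipCount(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be at least 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    // Returns the requested page of an already sorted list.
    static <T> List<T> page(List<T> sorted, int pageNumber, int pageSize) {
        return sorted.stream()
                .skip(skipCount(pageNumber, pageSize))
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> records = IntStream.rangeClosed(1, 25).boxed().collect(Collectors.toList());
        // Page 3 of 25 records with 10 per page holds the last 5 records.
        System.out.println(page(records, 3, 10));
    }
}
```

Because skip still walks past the skipped records, large offsets become slow on big collections; filtering on the last seen sort key is a common alternative.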

aggregation query

Aggregation query is a query method used to aggregate data in document databases. It can group, filter, calculate and sort data according to specified conditions to generate statistical results or output data according to specific aggregation rules.

To perform aggregation queries, you usually need to use an aggregation pipeline, which is a pipeline consisting of multiple aggregation operations, each of which processes data in sequence and passes the results to the next operation. The following are several common aggregation operations:

  1. $match: Filter documents based on specified criteria.
// Aggregation example: $match stage
// (Aggregates, Filters, Accumulators, Projections, and Sorts live in com.mongodb.client.model)
List<Bson> pipeline = Arrays.asList(
        Aggregates.match(Filters.eq("status", "active"))
);
AggregateIterable<Document> results = collection.aggregate(pipeline);

In the above example, the $match stage keeps only the documents whose "status" field has the value "active".

  2. $group: Group documents by the specified field and compute a value per group, such as a count or a sum.
// Aggregation example: $group stage
List<Bson> pipeline = Arrays.asList(
        Aggregates.group("$category", Accumulators.sum("totalQty", "$quantity"))
);
AggregateIterable<Document> results = collection.aggregate(pipeline);

In the above example, the $group stage groups documents by the "category" field, and the $sum accumulator adds up the "quantity" field within each group.
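
What this grouping computes can be reproduced in plain Java. The sketch below sums "quantity" per "category" over documents modeled as maps; it only mirrors the result of the stage, not how the server executes it, and GroupSumSketch with its helper methods is a hypothetical name for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only: group by category and sum quantity.
class GroupSumSketch {

    static Map<String, Integer> totalQtyByCategory(List<Map<String, Object>> docs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map<String, Object> doc : docs) {
            String category = (String) doc.get("category");
            int quantity = (Integer) doc.get("quantity");
            // Start the group at this quantity, or add to the running total.
            totals.merge(category, quantity, Integer::sum);
        }
        return totals;
    }

    // Builds a sample document with the two fields used above.
    static Map<String, Object> doc(String category, int quantity) {
        Map<String, Object> d = new HashMap<>();
        d.put("category", category);
        d.put("quantity", quantity);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = Arrays.asList(
                doc("fruit", 3), doc("vegetable", 2), doc("fruit", 4));
        System.out.println(totalQtyByCategory(docs));
    }
}
```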

  3. $project: Select the fields to include in the output.
// Aggregation example: $project stage
List<Bson> pipeline = Arrays.asList(
        Aggregates.match(Filters.eq("status", "active")),
        Aggregates.project(Projections.include("name", "price"))
);
AggregateIterable<Document> results = collection.aggregate(pipeline);

In the above example, the $project stage selects the output fields, so only the "name" and "price" fields are included.

  4. $sort: Sort the results by the specified field.
// Aggregation example: $sort stage
List<Bson> pipeline = Arrays.asList(
        Aggregates.match(Filters.eq("status", "active")),
        Aggregates.sort(Sorts.descending("price"))
);
AggregateIterable<Document> results = collection.aggregate(pipeline);

In the above example, the $sort stage sorts the results in descending order by the "price" field.

5. Data model design and indexing

Data Modeling Principles

  1. Document design: MongoDB uses documents to represent data, and a document is similar to a row of records in a relational database. When designing documents, you need to consider how to organize related data in a document to meet the query needs of your application.

  2. Redundant data: MongoDB's document model often favors denormalization, that is, keeping some redundant data in a document so that a query can be answered from one document instead of joining several collections. When deciding whether to duplicate data, weigh how often it is updated against the query performance gained.

  3. Embedding and References: In MongoDB, you can choose to embed related data into documents, or use references to relate other documents. Embedding data can improve query performance, but increases the size of the document. Using references can reduce the size of the document, but may require multiple queries to obtain complete information. Depending on the relationship between data and query requirements, there is a trade-off between using embedding and referencing.

  4. Data normalization: Unlike traditional relational databases, MongoDB does not mandate strict data normalization. Data can be organized into a form suitable for querying according to the needs of the application. This means that related information of different entities can be stored in one document to reduce the number of data accesses at query time.

  5. Query optimization: When designing your data model, you need to consider common query operations and optimize your data structures and indexes accordingly. Using appropriate field indexes can improve query performance, and putting frequently used data in one document can reduce the number of queries.

  6. Scalability: MongoDB scales horizontally through sharding. When modeling data, consider how the model will work in a sharded deployment, in particular which shard key distributes data and load evenly across the shards.

  7. Data integrity: MongoDB supports some data integrity constraints, such as unique indexes and schema validation rules. When modeling, you can use these constraints to ensure the integrity and consistency of your data.

  8. Consider query patterns: When designing your data model, you need to consider frequently performed query operations and optimize your data structures and query patterns accordingly. This may require creating different collections or using different indexes to support certain types of queries.

Embedding vs. Referencing

There are two common data organization methods in MongoDB: Embedding and Referencing. These two approaches have different application scenarios and advantages and disadvantages in data modeling.

  1. Embedding:

    • Embedding nests one document inside another. The parent document then contains a complete copy of the related data.
    • Advantages:
      • Better performance: Because related data is stored in the same document, it can all be retrieved with a single query, which avoids multiple round trips to the database.
      • Data locality: Related data is stored together, so loading a document does not require access to other collections or documents.
      • Fewer joins: Data that is read together can be copied into the documents that need it, which avoids frequent join-style lookups.
    • Disadvantages:
      • Redundant data: Embedding can store the same data in several places. If the embedded data changes, every document that contains a copy must be updated.
      • Data consistency: If the redundant copies in different documents diverge, additional application logic is needed to bring them back in line.
  2. Referencing:

    • Referencing stores the ID (usually the object ID) of another document in a field. References establish relationships between documents, in the same collection or in different collections.
    • Advantages:
      • Data consistency: Referencing avoids redundant copies. If the referenced data changes, only one document needs to be updated.
      • Storage space: Compared with embedding, referencing saves storage space because the related data is stored only once.
    • Disadvantages:
      • Query performance: Retrieving the complete information for a document requires additional queries to load the documents it references.
      • Data access latency: Each additional query adds a round trip, which increases the latency of data access.
      • Complexity: The application has to resolve references and combine the results itself, which can make the code more complex.

The choice between embedding and referencing depends on the application scenario and its requirements. In general, embedding suits related data that is usually read together, where it improves query performance and data locality; referencing suits data with stricter consistency requirements, data shared by many documents, or data that is often queried on its own.
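
The trade-off can be seen in a small in-memory model. In the sketch below, documents are plain Java maps and a map keyed by ID stands in for a customers collection; all names (EmbedVsReferenceSketch, the field names, the ID "c1") are hypothetical. The embedded order keeps its own copy of the customer, while the referenced order needs a second lookup but always sees the current data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model, for illustration only.
class EmbedVsReferenceSketch {

    // Embedding: the order carries a full copy of the customer document.
    static Map<String, Object> embeddedOrder(Map<String, Object> customer) {
        Map<String, Object> order = new HashMap<>();
        order.put("item", "book");
        order.put("customer", new HashMap<String, Object>(customer));
        return order;
    }

    // Referencing: the order stores only the customer's ID.
    static Map<String, Object> referencedOrder(String customerId) {
        Map<String, Object> order = new HashMap<>();
        order.put("item", "book");
        order.put("customerId", customerId);
        return order;
    }

    // One step for the embedded form: the data is already inside the order.
    @SuppressWarnings("unchecked")
    static Object cityOfEmbedded(Map<String, Object> order) {
        return ((Map<String, Object>) order.get("customer")).get("city");
    }

    // Two steps for the referenced form: read the ID, then look up the customer.
    static Object cityOfReferenced(Map<String, Object> order,
                                   Map<String, Map<String, Object>> customers) {
        Map<String, Object> customer = customers.get((String) order.get("customerId"));
        return customer == null ? null : customer.get("city");
    }

    public static void main(String[] args) {
        Map<String, Object> customer = new HashMap<>();
        customer.put("city", "Berlin");
        Map<String, Map<String, Object>> customers = new HashMap<>();
        customers.put("c1", customer);

        Map<String, Object> embedded = embeddedOrder(customer);
        Map<String, Object> referenced = referencedOrder("c1");

        customer.put("city", "Paris"); // the customer moves

        System.out.println(cityOfEmbedded(embedded));               // stale copy
        System.out.println(cityOfReferenced(referenced, customers)); // current value
    }
}
```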

Indexes

A MongoDB index is a data structure used to improve query performance, which can speed up read operations in the database. Indexes allow the database to locate and retrieve data more quickly by creating data structures on one or more fields of a collection.

  1. Index type:

    • Single-field index: Create an index on only one field.
    • Compound index: Create one index on multiple fields.
    • Text index: for full-text search.
    • Hashed index: Index the hash of a field's value (used mainly for hashed sharding; it supports equality matches but not range queries).
    • Geospatial index: used to process geographically related data.
  2. Index principle:

    • MongoDB indexes are B-tree structures that store the values of the indexed fields in sorted order, with each entry pointing to the document that holds the value. (A hashed index stores hashes of the values in the same kind of structure.)
    • Because the entries are ordered, matching documents can be located with a search similar to binary search, without scanning the entire collection.
  3. Advantages of indexes:

    • Improve query performance: Using indexes can reduce disk I/O during query and speed up data reading.
    • Faster sorting: Because index entries are stored in order, a query can return results sorted by an indexed field without a separate sort step.
    • Support for unique constraints: a unique index ensures that no two documents have the same value for the indexed field.
    • Support for covered queries: when an index contains every field a query needs, the query can be answered from the index alone, without reading the documents.
  4. Create an index:

    • In MongoDB, indexes are created using the createIndex() method.
    • Indexes can be managed with command-line tools such as the MongoDB Shell, or through the drivers.
  5. Notes on index use:

    • Indexes need to occupy a certain amount of storage space, so the number and size of indexes need to be weighed.
    • Appropriate indexes need to be designed according to the specific query mode and data access method.
    • Indexes have a certain performance impact on write operations, so read and write needs need to be balanced.
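
The idea behind an ordered index can be sketched with the Java standard library. The example below uses a TreeMap (a red-black tree, standing in here for the B-tree a database would use) to map the values of one field to the positions of the documents that hold them, so that a range query reads only the matching entries. OrderedIndexSketch and its methods are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of a single-field index, for illustration only.
class OrderedIndexSketch {

    // Indexed value -> positions of the documents that hold this value.
    private final TreeMap<Integer, List<Integer>> index = new TreeMap<>();

    OrderedIndexSketch(List<Map<String, Object>> docs, String field) {
        for (int pos = 0; pos < docs.size(); pos++) {
            Object value = docs.get(pos).get(field);
            if (value instanceof Integer) {
                index.computeIfAbsent((Integer) value, k -> new ArrayList<>()).add(pos);
            }
        }
    }

    // Positions of documents with low <= field <= high, in index (sorted) order.
    List<Integer> range(int low, int high) {
        List<Integer> result = new ArrayList<>();
        for (List<Integer> positions : index.subMap(low, true, high, true).values()) {
            result.addAll(positions);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (int age : new int[] {30, 18, 45, 25}) {
            Map<String, Object> doc = new HashMap<>();
            doc.put("age", age);
            docs.add(doc);
        }
        OrderedIndexSketch byAge = new OrderedIndexSketch(docs, "age");
        // Ages 25 and 30 fall in the range; their positions come back in age order.
        System.out.println(byAge.range(20, 40));
    }
}
```

The sketch also shows the write cost mentioned above: every inserted document has to be added to each index as well as to the collection.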
Java examples
  1. Create a single-field index: A single-field index is the simplest type of index; it indexes only one field.
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.IndexOptions;
import com.mongodb.client.model.Indexes;
import org.bson.Document;
// "database" is an existing MongoDatabase instance
MongoCollection<Document> collection = database.getCollection("myCollection");
collection.createIndex(Indexes.ascending("fieldName"));
  2. Create a compound index: A compound index covers multiple fields and supports queries that filter or sort on those fields together.
MongoCollection<Document> collection = database.getCollection("myCollection");
collection.createIndex(Indexes.compoundIndex(Indexes.ascending("field1"), Indexes.ascending("field2")));
  3. Create a text index: A text index is used for full-text search; create it on a text field to support text search queries.
MongoCollection<Document> collection = database.getCollection("myCollection");
collection.createIndex(Indexes.text("textField"));
  4. Create a hashed index: A hashed index indexes the hash of a field's value. It supports equality matches and hashed sharding, but not range queries.
MongoCollection<Document> collection = database.getCollection("myCollection");
collection.createIndex(Indexes.hashed("fieldName"));
  5. Create a geospatial index: Geospatial indexes support queries on geographic location data. Indexes.geo2d creates a 2d index for legacy coordinate pairs; for GeoJSON data, Indexes.geo2dsphere creates a 2dsphere index.
MongoCollection<Document> collection = database.getCollection("myCollection");
collection.createIndex(Indexes.geo2d("locationField"));
  6. Create a unique index: A unique index guarantees the uniqueness of the field and rejects the insertion of duplicate values.
MongoCollection<Document> collection = database.getCollection("myCollection");
collection.createIndex(Indexes.ascending("fieldName"), new IndexOptions().unique(true));
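For compound indexes, the order of the fields matters: entries are sorted by the first field, then by the second. The toy model below is not driver code. It uses a `TreeMap` with a two-part key, and the field names `country` and `age` (standing in for `field1` and `field2`) and the document labels are assumptions for illustration. It shows why a query on the leading field can use a contiguous slice of the index, while a query on the second field alone has to visit every entry.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CompoundIndexDemo {
    // Index key made of two fields, compared in declaration order.
    static final class Key {
        final String country;
        final int age;
        Key(String country, int age) { this.country = country; this.age = age; }
    }

    // Sort by country first, then by age: the order of a (country, age) compound index.
    static final Comparator<Key> ORDER =
            Comparator.comparing((Key k) -> k.country).thenComparingInt(k -> k.age);

    static TreeMap<Key, String> build() {
        TreeMap<Key, String> index = new TreeMap<>(ORDER);
        index.put(new Key("DE", 41), "doc-a");
        index.put(new Key("FR", 25), "doc-b");
        index.put(new Key("FR", 33), "doc-c");
        index.put(new Key("US", 25), "doc-d");
        return index;
    }

    // Query on the leading field: all matches sit next to each other in the tree.
    static List<String> byCountry(TreeMap<Key, String> index, String country) {
        Key from = new Key(country, Integer.MIN_VALUE);
        Key to = new Key(country, Integer.MAX_VALUE);
        return new ArrayList<>(index.subMap(from, true, to, true).values());
    }

    // Query on the second field only: matches are scattered, so every entry is visited.
    static List<String> byAge(TreeMap<Key, String> index, int age) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Key, String> e : index.entrySet()) {
            if (e.getKey().age == age) out.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<Key, String> index = build();
        System.out.println(byCountry(index, "FR")); // [doc-b, doc-c]
        System.out.println(byAge(index, 25));       // [doc-b, doc-d]
    }
}
```

In MongoDB terms, a compound index on field1 and field2 supports queries on field1 alone or on both fields, but a query on field2 alone generally cannot use it efficiently, so the most frequently filtered field is usually placed first.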

6. Transactions and high availability in MongoDB

MongoDB transaction management in Java

MongoDB 4.0 introduced native support for multi-document transactions, and the Java driver exposes them through the ClientSession API. A transaction allows developers to combine multiple operations (such as inserts, updates, and deletes) into one atomic unit: either all of them take effect, or all of them are rolled back.

  1. Create a session:
    Before using a transaction, create a session (ClientSession) object.
ClientSession session = mongoClient.startSession();
  2. Start a transaction:
    Call the startTransaction() method on the session object to start the transaction.
session.startTransaction();
  3. Execute transactional operations:
    Pass the session to every operation that should run inside the transaction. These operations are committed or rolled back as one atomic unit.
collection.insertOne(session, document);  // insert a document in the transaction
collection.updateOne(session, filter, update);  // update a document in the transaction
collection.deleteOne(session, filter);  // delete a document in the transaction
  4. Commit the transaction:
    If all operations in the transaction succeed, call the commitTransaction() method to commit it.
session.commitTransaction();
  5. Roll back the transaction:
    If an error occurs or the transaction needs to be canceled, call the abortTransaction() method to roll it back.
session.abortTransaction();
  6. Close the session:
    When the session is no longer needed, close it to release its resources.
session.close();

Note that transactions require MongoDB server 4.0 or later and a driver version that supports them. The deployment must be a replica set (sharded clusters are supported from MongoDB 4.2); a standalone server does not support transactions. A single transaction can span multiple documents, collections, and databases, but every operation that belongs to it must be passed the same ClientSession.
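The all-or-nothing behavior of the steps above can be sketched without a database. The following toy model is not the MongoDB driver: `ToySession` is a hypothetical in-memory class whose method names merely mirror `ClientSession`, and the account data is invented for illustration. Writes are buffered while the transaction is open, applied together on commit, and discarded on abort.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ToyTransactionDemo {
    // Hypothetical in-memory session: buffers writes until commit.
    static final class ToySession implements AutoCloseable {
        private final Map<String, Integer> store;
        private Map<String, Integer> pending; // null when no transaction is active

        ToySession(Map<String, Integer> store) { this.store = store; }

        void startTransaction() { pending = new LinkedHashMap<>(); }

        void put(String key, int value) {
            if (pending == null) throw new IllegalStateException("no active transaction");
            pending.put(key, value);
        }

        // Apply every buffered write at once.
        void commitTransaction() { store.putAll(pending); pending = null; }

        // Drop every buffered write; the store is left untouched.
        void abortTransaction() { pending = null; }

        @Override
        public void close() { pending = null; }
    }

    // Moves money between two accounts; returns true if committed, false if aborted.
    static boolean transfer(Map<String, Integer> accounts, String from, String to, int amount) {
        try (ToySession session = new ToySession(accounts)) {
            session.startTransaction();
            try {
                session.put(to, accounts.get(to) + amount); // buffered, not yet visible
                int remaining = accounts.get(from) - amount;
                if (remaining < 0) throw new IllegalStateException("insufficient funds");
                session.put(from, remaining);
                session.commitTransaction();
                return true;
            } catch (RuntimeException e) {
                session.abortTransaction(); // the buffered credit is discarded as well
                return false;
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> accounts = new LinkedHashMap<>();
        accounts.put("alice", 100);
        accounts.put("bob", 50);
        System.out.println(transfer(accounts, "alice", "bob", 30));  // true
        System.out.println(accounts);                                // {alice=70, bob=80}
        System.out.println(transfer(accounts, "alice", "bob", 500)); // false
        System.out.println(accounts);                                // {alice=70, bob=80}
    }
}
```

The same shape applies with the real driver: open the session in a try-with-resources block so it is always closed, commit at the end of the happy path, and abort in the error handler.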

Replica Sets

A replica set is the mechanism MongoDB uses to provide data redundancy and high availability. It stores data redundantly by replicating it across multiple MongoDB instances, and it automatically elects a new primary node when the current primary fails, thereby achieving high availability.

  1. The composition of the replica set:

    • Primary node (Primary): handles all write operations and always serves the latest data. A replica set can have only one primary node at a time.
    • Secondary node (Secondary): replicates the data of the primary node and can serve read requests. There can be multiple secondary nodes.
    • Arbiter: votes in elections but stores no data. Arbiters do not take part in data replication and cannot become primary.
  2. How replica sets work:

    • Data replication: the primary node records write operations in the oplog (operation log), and the secondary nodes replicate data by reading and applying the operations recorded in the oplog.
    • Failure recovery: if the primary node fails, the remaining members hold an election, and a secondary that receives votes from a majority of the voting members becomes the new primary, so the replica set stays available.
    • Client access: write operations always go to the primary node, and by default reads do as well. With a suitable read preference, clients can also read from secondary nodes; because replication is asynchronous, such reads may return slightly stale data.
  3. Configuration of the replica set:

    • Replica set initialization: every member's mongod instance is started with the same replica set name (the replSet option). Then rs.initiate() is run once, on one member, with a configuration that lists the host and port of each member.
    • Initial configuration process: once the replica set is initiated, the members contact each other, elect a primary, and the secondaries perform an initial sync of the data. Further members can be added later with rs.add().
  4. Application scenarios of replica sets:

    • Redundancy: by replicating data to different nodes, a replica set keeps multiple copies of the data, improving data safety and reliability.
    • High availability: when the primary node fails, the replica set automatically elects a new primary, keeping the system available and reducing downtime.
    • Read scaling: clients can perform read operations on secondary nodes, which takes load off the primary node and improves the concurrent processing capability of the system.
    • Disaster recovery: if a node in a replica set fails, it can be replaced or restored by resynchronizing its data from the other members.
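The oplog-based replication described above can be modeled in a few lines. The sketch below is a toy, not MongoDB's implementation: the `Primary`, `Secondary`, and `Op` classes and the `sync` method are assumptions for illustration. The primary appends each write to an operation log, and a secondary catches up by applying log entries in order from the position it reached last time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyOplogDemo {
    // One logged write: set key to value.
    static final class Op {
        final String key;
        final String value;
        Op(String key, String value) { this.key = key; this.value = value; }
    }

    static final class Primary {
        final Map<String, String> data = new HashMap<>();
        final List<Op> oplog = new ArrayList<>();

        // Apply the write locally and record it in the operation log.
        void write(String key, String value) {
            data.put(key, value);
            oplog.add(new Op(key, value));
        }
    }

    static final class Secondary {
        final Map<String, String> data = new HashMap<>();
        int applied = 0; // number of oplog entries applied so far

        // Apply at most maxOps further entries, in log order.
        void sync(List<Op> oplog, int maxOps) {
            int end = Math.min(oplog.size(), applied + maxOps);
            while (applied < end) {
                Op op = oplog.get(applied++);
                data.put(op.key, op.value);
            }
        }
    }

    public static void main(String[] args) {
        Primary primary = new Primary();
        Secondary secondary = new Secondary();
        primary.write("a", "1");
        primary.write("b", "2");
        primary.write("a", "3");

        secondary.sync(primary.oplog, 2);            // lagging: only two entries applied
        System.out.println(secondary.data.get("a")); // 1 (stale; the primary has 3)

        secondary.sync(primary.oplog, 10);           // caught up
        System.out.println(secondary.data.get("a")); // 3
        System.out.println(secondary.data.equals(primary.data)); // true
    }
}
```

The lagging step shows why a read from a secondary can return older data than a read from the primary, which is the trade-off to keep in mind when using secondaries for read scaling.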

Origin blog.csdn.net/u012581020/article/details/132432411