Depending on your requirements for your software, you might prioritize flexibility, scalability, performance, or speed. As a result, developers and businesses often get confused when picking a database for their needs. If you need a database that offers high flexibility and scalability, as well as data aggregation for customer analytics, then MongoDB may be your best choice!
In this article, we will discuss the structure of a MongoDB database and how to create, monitor, and manage your database!
MongoDB database structure
MongoDB is a schema-less NoSQL database. This means you don't have to define a fixed structure for your data up front, as you do for tables in SQL databases.
NoSQL databases can be faster than relational databases for many workloads, helped by features such as indexing, sharding, and aggregation pipelines. MongoDB is also known for its fast query execution. That's why it's favored by companies like Google, Toyota, and Forbes.
Below, we will explore some of the key features of MongoDB.
Documents
MongoDB has a document data model that stores data as JSON documents. Documents map naturally to objects in application code, making them more straightforward for developers to work with.
In a relational database table, you must add a column to add a new field. This is not the case for fields in JSON documents. Fields can vary from document to document, so a new field does not have to be added to every record in the database.
Documents can store structures like arrays, which can be nested to express hierarchical relationships. Additionally, MongoDB converts documents to the Binary JSON (BSON) type. This ensures faster access and adds support for various data types such as strings, integers, booleans, and more.
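As a rough illustration of the points above, here is a minimal sketch in plain JavaScript. The two objects stand in for documents in a hypothetical "employees" collection; the field names and values are invented for the example and do not come from a real database.

```javascript
// Two documents intended for the same (hypothetical) "employees" collection.
// They share some fields but not all, which a fixed table schema would not allow.
const chris = {
  name: "Chris",
  department: "Sales",
  skills: ["negotiation", "forecasting"], // array field
};

const ava = {
  name: "Ava",
  department: "Public Relations",
  // nested documents express a hierarchy inside a single record
  manager: { name: "Dana", office: { city: "Berlin", floor: 3 } },
};

// List the field names a document actually carries, in sorted order.
function fieldNames(doc) {
  return Object.keys(doc).sort();
}

console.log(fieldNames(chris));
console.log(fieldNames(ava));
console.log(ava.manager.office.city);
```

Neither document had to declare its fields in advance, and adding "manager" to one of them did not require touching the other.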
Replica sets
When you create a new database in MongoDB Atlas, the system automatically keeps at least two more copies of your data. Together, these copies form a "replica set", and data is continuously replicated between its members, ensuring increased availability of your data. Replica sets also provide protection against downtime during system failures or planned maintenance.
Collections
A collection is a set of documents associated with a database. Collections are similar to tables in a relational database.
However, collections are much more flexible. First, they do not depend on a schema. Second, their documents do not need to share the same fields or data types.
To view a list of collections belonging to a database, use the listCollections command.
Aggregation pipeline
You can use this framework to apply a wide range of operators and expressions. It is flexible because it allows you to process, transform, and analyze data of any structure.
Because of this, MongoDB supports fast data flows and features across 150 operators and expressions. It also has several stages, such as the $unionWith stage, which flexibly combines the results of multiple collections.
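To make the idea of chained stages concrete, here is a small sketch in plain JavaScript. It is not MongoDB's implementation: the order data and the helper names (match, groupSum, runPipeline) are made up for this example, and the two helpers only imitate what the $match and $group stages do.

```javascript
// Hypothetical order documents, standing in for a MongoDB collection.
const orders = [
  { customer: "ann", status: "paid", total: 40 },
  { customer: "bob", status: "paid", total: 15 },
  { customer: "ann", status: "open", total: 99 },
  { customer: "ann", status: "paid", total: 10 },
];

// Stage builder: keep only documents that satisfy a predicate (comparable to $match).
const match = (predicate) => (docs) => docs.filter(predicate);

// Stage builder: group by a field and sum another (comparable to $group with $sum).
const groupSum = (keyField, sumField) => (docs) => {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc[keyField], (totals.get(doc[keyField]) || 0) + doc[sumField]);
  }
  return [...totals].map(([key, total]) => ({ _id: key, total }));
};

// A pipeline is an ordered list of stages; each stage consumes the previous output.
const runPipeline = (docs, stages) => stages.reduce((acc, stage) => stage(acc), docs);

const paidPerCustomer = runPipeline(orders, [
  match((order) => order.status === "paid"),
  groupSum("customer", "total"),
]);

console.log(paidPerCustomer);
```

The design point is that each stage is independent, so stages can be reordered or swapped without rewriting the others.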
Indexes
You can index any field in a MongoDB document to improve query speed. An index saves time because the query scans the index to limit the documents that must be examined. This is far better than reading every document in the collection.
You can use a variety of indexing strategies, including compound indexes on multiple fields. For example, suppose you have several documents that store an employee's first and last name in separate fields. If you want queries to return both names, you can create one index that includes both "last name" and "first name". This is much better than having one index on "last name" and another on "first name".
You can leverage tools like Performance Advisor to learn more about which queries could benefit from indexing.
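The following plain JavaScript sketch shows why a compound index on last name and first name helps. It is a simplified in-memory model, not how MongoDB stores indexes: the employee data and function names are assumptions made for the example. In the shell, the matching index would be created with db.collection.createIndex({ lastName: 1, firstName: 1 }).

```javascript
// Hypothetical employee documents with first and last name in separate fields.
const employees = [
  { _id: 1, firstName: "Ava", lastName: "Stone" },
  { _id: 2, firstName: "Chris", lastName: "Reed" },
  { _id: 3, firstName: "Ben", lastName: "Stone" },
  { _id: 4, firstName: "Dana", lastName: "Reed" },
];

const cmp = (a, b) => (a < b ? -1 : a > b ? 1 : 0);

// Build a compound index: entries sorted by lastName, then firstName,
// each pointing back to the position of its document in the collection.
function buildCompoundIndex(docs) {
  return docs
    .map((doc, position) => ({ lastName: doc.lastName, firstName: doc.firstName, position }))
    .sort((a, b) => cmp(a.lastName, b.lastName) || cmp(a.firstName, b.firstName));
}

// Find everyone with a given last name without scanning the whole collection.
function findByLastName(index, docs, lastName) {
  // Binary search for the first index entry whose lastName is >= the target.
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (cmp(index[mid].lastName, lastName) < 0) lo = mid + 1;
    else hi = mid;
  }
  // Walk forward while the last name still matches; entries arrive sorted by firstName.
  const results = [];
  for (let i = lo; i < index.length && index[i].lastName === lastName; i += 1) {
    results.push(docs[index[i].position]);
  }
  return results;
}

const nameIndex = buildCompoundIndex(employees);
console.log(findByLastName(nameIndex, employees, "Stone").map((e) => e.firstName));
```

Because the entries are ordered by last name first, one index serves both "find by last name" and "find by last name and first name" lookups.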
Sharding
Sharding distributes a single data set across multiple databases. It splits larger data sets into smaller chunks and stores them on different data nodes, so the data set can be spread over multiple machines to increase the total storage capacity of the system.
MongoDB shards data at the collection level, distributing documents in the collection across shards in the cluster. This ensures scalability, allowing the architecture to handle the largest applications.
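Here is a minimal sketch of the routing idea behind sharding, written in plain JavaScript. It is a deliberate simplification: MongoDB assigns ranges of shard key values (chunks) to shards rather than taking a simple modulo, and it uses its own hash function for hashed shard keys. The hash below, the sample documents, and the function names are all assumptions for illustration.

```javascript
// Simple deterministic string hash (djb2-style), used only as a stand-in.
function hashKey(value) {
  let hash = 5381;
  for (const ch of String(value)) {
    hash = ((hash * 33) ^ ch.codePointAt(0)) >>> 0; // keep an unsigned 32-bit integer
  }
  return hash;
}

// Pick the shard that owns a given shard key value.
function shardFor(shardKeyValue, shardCount) {
  return hashKey(shardKeyValue) % shardCount;
}

// Distribute the documents of one collection across the shards by their shard key.
function distribute(docs, shardKeyField, shardCount) {
  const shards = Array.from({ length: shardCount }, () => []);
  for (const doc of docs) {
    shards[shardFor(doc[shardKeyField], shardCount)].push(doc);
  }
  return shards;
}

// Hypothetical documents, sharded on the "userId" field across three shards.
const users = [
  { userId: "u1", name: "Ava" },
  { userId: "u2", name: "Chris" },
  { userId: "u3", name: "Dana" },
  { userId: "u4", name: "Ben" },
];
const shards = distribute(users, "userId", 3);
console.log(shards.map((shard) => shard.length));
```

The key property is that the same shard key value always routes to the same shard, so a query that includes the shard key only needs to contact one machine.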
Create MongoDB database
You first need to install the appropriate MongoDB package for your operating system. Go to the "Download MongoDB Community Server" page. From the available options, select the latest "version", the "package" format as a zip file, and the "platform" as your operating system, then click "Download".
Download process of MongoDB Community Server (Picture source: MongoDB Community Server )
The process is very simple, so you'll have MongoDB installed on your system in no time!
Once you have completed the installation, open your command prompt and type mongod --version to verify it. If you see a string of errors instead of the version information, you may want to reinstall it.
Verify MongoDB version (Image source: configserverfirewall )
Using MongoDB Shell
Before we begin, please make sure:
- Your client has Transport Layer Security and is in your IP allow list.
- You have a user account and password on the desired MongoDB cluster.
- You have MongoDB installed on your device.
Step 1: Access the MongoDB shell
Before you can open MongoDB's shell, the MongoDB server must be running. On Windows, where MongoDB is installed as a service, type the following command.
net start MongoDB
This should produce the following output:
MongoDB server initialization (Image source: c-sharpcorner )
The previous command starts the MongoDB server. To open the shell, type mongo in the command prompt (on newer installations, the shell is named mongosh).
Running the MongoDB server (Image source: bmc )
In the MongoDB shell, we can execute commands to create the database, insert data, edit data, issue management commands and delete data.
Step 2: Create your database
Unlike SQL, MongoDB does not have a database creation command. Instead, there is a keyword called use, which switches to a specific database. If the database does not exist, it creates a new one; otherwise, it connects to the existing database.
For example, to start a database named "company", type
use company
Create database in MongoDB
You can type db to confirm which database you are connected to. If the name of the database you just created appears, you have successfully connected to it.
If you want to check for existing databases, enter show dbs and it will return all databases in your system:
View databases in MongoDB
By default, installing MongoDB creates the admin, config, and local databases.
Did you notice that the database we created is not showing up? This is because we haven't saved any data to it yet! We will cover insertion in the database management section.
Using Atlas UI
You can also start using MongoDB’s database service Atlas. While you may have to pay to use some features of Atlas, most database features are available through the free tier. The features of the free tier are more than enough for creating a MongoDB database.
Before we begin, please make sure:
- Your IP is in the allowed list.
- You have a user account and password on the MongoDB cluster you want to use.
To create a MongoDB database with the Atlas UI, open a browser window and log in to https://cloud.mongodb.com. From your cluster page, click Browse Collections. If there is no database in the cluster, you can create your database by clicking the Add My Own Data button.
The prompt will ask you to provide a database and collection name. Once you've named them, click Create and you're done! Now you can insert new documents or connect to the database using a driver.
Manage MongoDB database
In this section, we will introduce some clever ways to manage your MongoDB database effectively. You can do this by using MongoDB Compass or by working with collections directly.
Use collections
While relational databases have well-defined tables with specified data types and columns, NoSQL has collections rather than tables. These collections don't have any structure, and the documents can vary - you can have different data types and fields without matching the format of another document in the same collection.
To demonstrate, let's create a collection called "Employee" and add a document to it.
db.Employee.insert(
{
"Employeename" : "Chris",
"EmployeeDepartment" : "Sales"
}
)
If the insertion is successful, it will return WriteResult({ "nInserted" : 1 }):
Insert successfully in MongoDB
Here, "db" refers to the currently connected database. "Employee" is a newly created collection in the company database.
We do not set a primary key here because MongoDB automatically creates a primary key field named "_id" and sets a default value for it.
Run the command below to check the collection in JSON format.
db.Employee.find().forEach(printjson)
Output:
{
"_id" : ObjectId("63151427a4dd187757d135b8"),
"Employeename" : "Chris",
"EmployeeDepartment" : "Sales"
}
Although the "_id" value is automatically assigned, you can override the default primary key value. This time, we will insert another document into the "Employee" collection with an "_id" value of 1.
db.Employee.insert(
{
"_id" : 1,
"EmployeeName" : "Ava",
"EmployeeDepartment" : "Public Relations"
}
)
Running the db.Employee.find().forEach(printjson) command gives us the following output:
Documents in the collection with their primary keys
In the above output, the "_id" value of "Ava" is set to "1" instead of being automatically assigned a value.
Now that we have successfully added the value to the database, we can check if it shows up under the existing database in our system using the following command:
show dbs
Show list of databases
And that's it! You have successfully created a database in your system!
Using MongoDB Compass
Although we can work with the MongoDB server through the mongo shell, it can be tedious at times, especially in a production environment.
However, MongoDB provides a tool, appropriately named Compass, that makes this easier. It has a GUI and adds features such as data visualization, performance profiling, and CRUD (create, read, update, delete) access to data, databases, and collections.
You can download Compass for your operating system and install it through its simple process.
Next, open the application and create a connection to the server by pasting the connection string. If you can't find it, you can click Fill in connection fields individually . If you didn't change the port number when you installed MongoDB, just click the Connect button and you're good to go. Otherwise, just enter the value you set and click Connect .
MongoDB’s new connection window (Image source: mongodb )
Next, provide the hostname, port, and authentication in the new connection window.
In MongoDB Compass, you can simultaneously create a database and add its first collection. Here's how you do it.
- Click Create Database to open the prompt.
- Enter the name of the database and its first collection.
- Click Create Database .
You can insert more documents into your database by clicking on your database name, then clicking on the name of the collection to see the Documents tab. You can then click the Add Data button to insert one or more documents into your collection.
When adding your documents, you can enter them one at a time or multiple documents in an array. If you are adding multiple documents, make sure the comma-separated documents are enclosed in square brackets. For example:
[
{ _id: 1, item: { name: "apple", code: "123" }, qty: 15, tags: [ "A", "B", "C" ] },
{ _id: 2, item: { name: "banana", code: "123" }, qty: 20, tags: [ "B" ] },
{ _id: 3, item: { name: "spinach", code: "456" }, qty: 25, tags: [ "A", "B" ] },
{ _id: 4, item: { name: "lentils", code: "456" }, qty: 30, tags: [ "B", "A" ] },
{ _id: 5, item: { name: "pears", code: "000" }, qty: 20, tags: [ [ "A", "B" ], "C" ] },
{ _id: 6, item: { name: "strawberry", code: "123" }, tags: [ "B" ] }
]
Finally, click "Insert" to add the documents to your collection. This is what the body of a basic document looks like:
{
"StudentID" : 1,
"StudentName" : "JohnDoe"
}
Here, the field names are "StudentID" and "StudentName". The field values are 1 and "JohnDoe" respectively.
Useful commands
You can manage these collections through role management and user management commands.
User management commands
MongoDB's user management commands let us create, update, and delete users.
(1)dropUser
This command deletes a user from the specified database. Below is its syntax.
db.dropUser(username, writeConcern)
Here, username is a required field that specifies the name of the user to remove. The optional writeConcern field sets the level of write concern for the removal operation.
Before dropping a user who has the userAdminAnyDatabase role, make sure that at least one other user has user administration privileges.
In this example, we will drop the user "user26" in the test database:
use test
db.dropUser("user26", {w: "majority", wtimeout: 4000})
Output:
> db.dropUser("user26", {w: "majority", wtimeout: 4000});
true
(2)createUser
This command creates a new user for the specified database, as shown below:
db.createUser(user, writeConcern)
Here, user is a required field that contains the document with authentication and access information for the user to be created. The optional writeConcern field sets the level of write concern for the creation operation.
If the user already exists in the database, createUser returns a duplicate user error.
You can create a new user in the test database as follows:
use test
db.createUser(
{
user: "user26",
pwd: "myuser123",
roles: [ "readWrite" ]
}
);
The output is as follows:
Successfully added user: { "user" : "user26", "roles" : [ "readWrite" ] }
(3)grantRolesToUser
You can use this command to grant additional roles to users. To use it, you need to remember the following syntax:
db.runCommand(
{
grantRolesToUser: "<user>",
roles: [ <roles> ],
writeConcern: { <write concern> },
comment: <any>
}
)
In the roles field, you can specify both user-defined and built-in roles. To specify a role that exists in the same database where grantRolesToUser runs, you can use a document to specify the role as follows:
{ role: "<role>", db: "<database>" }
Alternatively, you can simply specify the role by its name. For example:
"readWrite"
If you want to specify a role that exists in a different database, you must use the document form.
To grant a role on a database, you need the grantRole action on the specified database.
Here is an example to make this clear. The user productUser00 in the products database has the following roles.
"roles" : [
{
"role" : "assetsWriter",
"db" : "assets"
}
]
The following grantRolesToUser operation gives "productUser00" the readWrite role on the stock database and the read role on the products database.
use products
db.runCommand({
grantRolesToUser: "productUser00",
roles: [
{ role: "readWrite", db: "stock"},
"read"
],
writeConcern: { w: "majority" , wtimeout: 2000 }
})
The user productUser00 in the products database now has the following roles:
"roles" : [
{
"role" : "assetsWriter",
"db" : "assets"
},
{
"role" : "readWrite",
"db" : "stock"
},
{
"role" : "read",
"db" : "products"
}
]
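The two ways of naming a role (a bare string or a role document) can be modeled in a few lines of plain JavaScript. This is only a sketch of the rule described above, not the server's logic; the function names normalizeRole and grantRoles are invented for the example.

```javascript
// Normalize a role given either as a bare name or as a { role, db } document.
// A bare name refers to a role in the database where the command runs.
function normalizeRole(spec, currentDb) {
  return typeof spec === "string"
    ? { role: spec, db: currentDb }
    : { role: spec.role, db: spec.db };
}

// Return the user's roles after granting newRoles, skipping roles already held.
function grantRoles(existingRoles, newRoles, currentDb) {
  const result = existingRoles.map((r) => normalizeRole(r, currentDb));
  for (const spec of newRoles) {
    const role = normalizeRole(spec, currentDb);
    const alreadyHeld = result.some((r) => r.role === role.role && r.db === role.db);
    if (!alreadyHeld) result.push(role);
  }
  return result;
}

// Same inputs as the productUser00 example, run "in" the products database.
const rolesBefore = [{ role: "assetsWriter", db: "assets" }];
const rolesAfter = grantRoles(
  rolesBefore,
  [{ role: "readWrite", db: "stock" }, "read"],
  "products"
);
console.log(rolesAfter);
```

The bare "read" entry picks up the products database because that is where the command runs, which matches the role list shown above.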
(4)usersInfo
You can use the usersInfo command to return information about one or more users. Here is the syntax:
db.runCommand(
{
usersInfo: <various>,
showCredentials: <Boolean>,
showCustomData: <Boolean>,
showPrivileges: <Boolean>,
showAuthenticationRestrictions: <Boolean>,
filter: <document>,
comment: <any>
}
)
{ usersInfo: <various> }
In terms of access, users can always view their own information. To see another user's information, the user running the command must have privileges that include the viewUser action on the other user's database.
When running the usersInfo command, depending on the options specified, you can get the following information.
{
"users" : [
{
"_id" : "<db>.<username>",
"userId" : <UUID>, // Starting in MongoDB 4.0.9
"user" : "<username>",
"db" : "<db>",
"mechanisms" : [ ... ], // Starting in MongoDB 4.0
"customData" : <document>,
"roles" : [ ... ],
"credentials": { ... }, // only if showCredentials: true
"inheritedRoles" : [ ... ], // only if showPrivileges: true or showAuthenticationRestrictions: true
"inheritedPrivileges" : [ ... ], // only if showPrivileges: true or showAuthenticationRestrictions: true
"inheritedAuthenticationRestrictions" : [ ] // only if showPrivileges: true or showAuthenticationRestrictions: true
"authenticationRestrictions" : [ ... ] // only if showAuthenticationRestrictions: true
},
],
"ok" : 1
}
Now that you have a general idea of what you can accomplish with the usersInfo command, the obvious question is which form of the command to use for viewing a specific user versus multiple users.
Here are two handy examples to illustrate this point.
To view the specific privileges and information, but not the credentials, for the user "Anthony" defined in the "office" database, execute the following command.
db.runCommand(
{
usersInfo: { user: "Anthony", db: "office" },
showPrivileges: true
}
)
If you want to view a user in the current database, you can refer to the user by name alone. For example, if you are in the home database and a user named "Timothy" exists in the home database, you can run the following command.
db.getSiblingDB("home").runCommand(
{
usersInfo: "Timothy",
showPrivileges: true
}
)
Next, if you want to see information for several users, you can use an array. You can include the optional fields showCredentials and showPrivileges, or you can choose to leave them out. This is what the command looks like.
db.runCommand({
usersInfo: [ { user: "Anthony", db: "office" }, { user: "Timothy", db: "home" } ],
showPrivileges: true
})
(5)revokeRolesFromUser
You can use the revokeRolesFromUser command to remove one or more roles from a user on the database where the roles exist. The syntax of the revokeRolesFromUser command is as follows:
db.runCommand(
{
revokeRolesFromUser: "<user>",
roles: [
{ role: "<role>", db: "<database>" } | "<role>",
],
writeConcern: { <write concern> },
comment: <any>
}
)
In the syntax above, you can specify both user-defined and built-in roles in the roles field. As with the grantRolesToUser command, you can specify the role you want to revoke as a document or by its name.
To run the revokeRolesFromUser command successfully, you need the revokeRole action on the specified database.
Here's an example to illustrate. The productUser00 user in the products database has the following roles:
"roles" : [
{
"role" : "assetsWriter",
"db" : "assets"
},
{
"role" : "readWrite",
"db" : "stock"
},
{
"role" : "read",
"db" : "products"
}
]
The following revokeRolesFromUser command removes two of the user's roles: the read role on the products database and the assetsWriter role on the assets database:
use products
db.runCommand( { revokeRolesFromUser: "productUser00",
roles: [
{ role: "assetsWriter", db: "assets" },
"read"
],
writeConcern: { w: "majority" }
} )
The user "productUser00" in the products database now has only one remaining role:
"roles" : [
{
"role" : "readWrite",
"db" : "stock"
}
]
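Revoking follows the same naming rule in reverse. The plain JavaScript sketch below models it under the assumption that a role is identified by its exact name and database; toRoleDocument and revokeRoles are hypothetical helper names, not MongoDB functions.

```javascript
// Turn a bare role name into a { role, db } document for the current database.
function toRoleDocument(spec, currentDb) {
  return typeof spec === "string"
    ? { role: spec, db: currentDb }
    : { role: spec.role, db: spec.db };
}

// Return the roles that remain after removing every role listed in rolesToRevoke.
function revokeRoles(existingRoles, rolesToRevoke, currentDb) {
  const revoked = rolesToRevoke.map((spec) => toRoleDocument(spec, currentDb));
  return existingRoles.filter(
    (held) => !revoked.some((r) => r.role === held.role && r.db === held.db)
  );
}

// Same inputs as the productUser00 example, run "in" the products database.
const heldRoles = [
  { role: "assetsWriter", db: "assets" },
  { role: "readWrite", db: "stock" },
  { role: "read", db: "products" },
];
const remainingRoles = revokeRoles(
  heldRoles,
  [{ role: "assetsWriter", db: "assets" }, "read"],
  "products"
);
console.log(remainingRoles);
```

Only the readWrite role on the stock database survives, as in the result shown above.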
Role management commands
Roles grant users access to resources. Administrators can use several built-in roles to control access to the MongoDB system. If these roles do not cover the required permissions, you can even go a step further and create new roles within a database.
(1)dropRole
With the dropRole command, you can delete a user-defined role from the database on which the command is run. To execute this command, use the following syntax:
db.runCommand(
{
dropRole: "<role>",
writeConcern: { <write concern> },
comment: <any>
}
)
To execute it successfully, you must have the dropRole action on the specified database. The following operation removes the writeTags role from the products database:
use products
db.runCommand(
{
dropRole: "writeTags",
writeConcern: { w: "majority" }
}
)
(2)createRole
You can use the createRole command to create a role and specify its privileges. The role applies to the database on which you run the command. If the role already exists in the database, createRole returns a duplicate role error.
To execute this command, follow the given syntax.
db.adminCommand(
{
createRole: "<new role>",
privileges: [
{ resource: { <resource> }, actions: [ "<action>", ... ] },
],
roles: [
{ role: "<role>", db: "<database>" } | "<role>",
],
authenticationRestrictions: [
{
clientSource: ["<IP>" | "<CIDR range>", ...],
serverAddress: ["<IP>" | "<CIDR range>", ...]
},
],
writeConcern: <write concern document>,
comment: <any>
}
)
The privileges of a role apply to the database in which the role is created. A role can inherit privileges from other roles in its database. A role created on the "admin" database can additionally include privileges that apply to the cluster or to all databases, and it can inherit privileges from roles in other databases.
To create a role in a database, you need two things:
- The grantRole action on this database, to specify the privileges of the new role and the roles it inherits from.
- The createRole action on this database resource.
The following createRole command creates the clusterAdmin role. Because it is issued through db.adminCommand, the role is created on the admin database.
db.adminCommand({ createRole: "clusterAdmin",
privileges: [
{ resource: { cluster: true }, actions: [ "addShard" ] },
{ resource: { db: "config", collection: "" }, actions: [ "find", "remove" ] },
{ resource: { db: "users", collection: "usersCollection" }, actions: [ "update", "insert" ] },
{ resource: { db: "", collection: "" }, actions: [ "find" ] }
],
roles: [
{ role: "read", db: "user" }
],
writeConcern: { w: "majority" , wtimeout: 5000 }
})
(3)grantRolesToRole
With the grantRolesToRole command, you can grant roles to a user-defined role. The command affects roles on the database where it is executed.
The syntax of the grantRolesToRole command is as follows:
db.runCommand(
{
grantRolesToRole: "<role>",
roles: [
{ role: "<role>", db: "<database>" },
],
writeConcern: { <write concern> },
comment: <any>
}
)
The access requirements are similar to those of the grantRolesToUser command: you need the grantRole action on a database to run the command successfully.
In the following example, the grantRolesToRole command updates the productsReader role in the "products" database so that it inherits the privileges of the productsWriter role:
use products
db.runCommand(
{
grantRolesToRole: "productsReader",
roles: [
"productsWriter"
],
writeConcern: { w: "majority" , wtimeout: 5000 }
}
)
(4)revokePrivilegesFromRole
You can use revokePrivilegesFromRole to remove specified privileges from a user-defined role on the database where the command is executed. To execute it correctly, use the following syntax:
db.runCommand(
{
revokePrivilegesFromRole: "<role>",
privileges: [
{ resource: { <resource> }, actions: [ "<action>", ... ] },
],
writeConcern: <write concern document>,
comment: <any>
}
)
To revoke a privilege, the "resource" document pattern must match that privilege's "resource" field exactly. The "actions" field can be either an exact match or a subset.
For example, consider the manageRole role in the managers database, which has the following privilege specifying the "managers" database as a resource:
{
"resource" : {
"db" : "managers",
"collection" : ""
},
"actions" : [
"insert",
"remove"
]
}
You cannot revoke "insert" or "remove" from just one collection in the managers database. The following operations do not change the role.
use managers
db.runCommand(
{
revokePrivilegesFromRole: "manageRole",
privileges: [
{
resource : {
db : "managers",
collection : "kiosks"
},
actions : [
"insert",
"remove"
]
}
]
}
)
db.runCommand(
{
revokePrivilegesFromRole: "manageRole",
privileges:
[
{
resource : {
db : "managers",
collection : "kiosks"
},
actions : [
"insert"
]
}
]
}
)
To revoke the "insert" and/or "remove" actions from the manageRole role, you must match the resource document exactly. For example, the following operation revokes only the "remove" action from the existing privilege.
use managers
db.runCommand(
{
revokePrivilegesFromRole: "manageRole",
privileges:
[
{
resource : {
db : "managers",
collection : ""
},
actions : [ "remove" ]
}
]
}
)
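The matching rule (exact resource, exact or partial actions) can be sketched in plain JavaScript as follows. This models only the behavior described in this section; sameResource and revokePrivileges are hypothetical names, and resources are assumed to be simple { db, collection } documents.

```javascript
// True when two resource documents name the same database and collection.
function sameResource(a, b) {
  return a.db === b.db && a.collection === b.collection;
}

// Remove the requested actions from privileges whose resource matches exactly.
// Privileges with a different resource document are left untouched.
function revokePrivileges(rolePrivileges, toRevoke) {
  return rolePrivileges
    .map((privilege) => {
      const request = toRevoke.find((r) => sameResource(r.resource, privilege.resource));
      if (!request) return privilege; // no exact resource match: unchanged
      const remaining = privilege.actions.filter((a) => !request.actions.includes(a));
      return { resource: privilege.resource, actions: remaining };
    })
    .filter((privilege) => privilege.actions.length > 0);
}

// The manageRole privilege from the example above.
const managePrivileges = [
  { resource: { db: "managers", collection: "" }, actions: ["insert", "remove"] },
];

// Naming a single collection does not match the database-wide resource.
const unchanged = revokePrivileges(managePrivileges, [
  { resource: { db: "managers", collection: "kiosks" }, actions: ["insert", "remove"] },
]);

// An exact resource match with a subset of the actions revokes just that subset.
const afterRevoke = revokePrivileges(managePrivileges, [
  { resource: { db: "managers", collection: "" }, actions: ["remove"] },
]);

console.log(unchanged[0].actions, afterRevoke[0].actions);
```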
The following operation removes multiple privileges from the "executive" role in the managers database:
use managers
db.runCommand(
{
revokePrivilegesFromRole: "executive",
privileges: [
{
resource: { db: "managers", collection: "" },
actions: [ "insert", "remove", "find" ]
},
{
resource: { db: "managers", collection: "partners" },
actions: [ "update" ]
}
],
writeConcern: { w: "majority" }
}
)
(5)rolesInfo
The rolesInfo command returns privilege and inheritance information for specified roles, including both built-in and user-defined roles. You can also use the rolesInfo command to retrieve all roles scoped to a database.
For correct execution, please follow this syntax:
db.runCommand(
{
rolesInfo: { role: <name>, db: <db> },
showPrivileges: <Boolean>,
showBuiltinRoles: <Boolean>,
comment: <any>
}
)
To return information about a role from the current database, you can specify its name as follows.
{ rolesInfo: "<rolename>" }
To return information about a role from another database, you can specify the role with a document that names both the role and the database:
{ rolesInfo: { role: "<rolename>", db: "<database>" } }
For example, the following command returns role inheritance information for the executive role defined in the managers database.
db.runCommand(
{
rolesInfo: { role: "executive", db: "managers" }
}
)
The next command returns role inheritance information for the accountManager role on the database where the command is executed.
db.runCommand(
{
rolesInfo: "accountManager"
}
)
The following command returns both the privileges and the role inheritance of the "executive" role defined in the managers database.
db.runCommand(
{
rolesInfo: { role: "executive", db: "managers" },
showPrivileges: true
}
)
To specify multiple roles, you can use an array. Each role in the array can be given as a string or as a document.
You should use a string only if the role exists in the database where the command is executed.
{
rolesInfo: [
"<rolename>",
{ role: "<rolename>", db: "<database>" },
]
}
For example, the following command will return information for three roles in three different databases.
db.runCommand(
{
rolesInfo: [
{ role: "executive", db: "managers" },
{ role: "accounts", db: "departments" },
{ role: "administrator", db: "products" }
]
}
)
You can retrieve both the privileges and the role inheritance for these roles as follows.
db.runCommand(
{
rolesInfo: [
{ role: "executive", db: "managers" },
{ role: "accounts", db: "departments" },
{ role: "administrator", db: "products" }
],
showPrivileges: true
}
)
Embed MongoDB documents to improve performance
Document databases like MongoDB let you define your schema according to your needs. To create the best schema in MongoDB, you can nest documents. Therefore, instead of matching your application to a data model, you build a data model that matches your use case.
Embedded documents let you store related data that you access together. When designing a schema for MongoDB, it is recommended that you embed documents by default. Use database-side or application-side joins and references only when they are worthwhile.
Ensure that the workload can retrieve a document as often as it needs to, and that the document contains all the data it needs. This is critical to your application's performance.
Below, you'll find some different patterns for embedding documents.
Embedded document pattern
You can use this pattern to embed even complex sub-structures in the documents they are used with. Embedding connected data in a single document reduces the number of read operations required to obtain the data. In general, you should structure your schema so that your application receives all the required information in a single read operation. So the rule to remember here is that things used together should be stored together.
Embedded subset pattern
The embedded subset pattern is a hybrid case. You use it when a separate collection holds a long list of related items, and you want to keep some of those items at hand for display.
Here is an example that lists movie reviews.
> db.movie.findOne()
{
_id: 321475,
title: "The Dark Knight"
}
> db.review.find({movie_id: 321475})
{
_id: 264579,
movie_id: 321475,
stars: 4,
text: "Amazing"
}
{
_id: 375684,
movie_id: 321475,
stars: 5,
text: "Mindblowing"
}
Now, imagine there are a thousand similar reviews, but you only plan to show the most recent two when the movie is shown. In this case, it makes sense to store this subset as a list in the movie document.
> db.movie.findOne({_id: 321475})
{
_id: 321475,
title: "The Dark Knight",
recent_reviews: [
{_id: 264579, stars: 4, text: "Amazing"},
{_id: 375684, stars: 5, text: "Mindblowing"}
]
}
Simply put, if you frequently access a subset of related items, make sure you embed it.
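Keeping the embedded subset up to date is the part that needs care. The plain JavaScript sketch below shows one way to do it in application code, using in-memory objects instead of a real database; RECENT_LIMIT, addReview, and the third review are assumptions made for the example. In MongoDB itself, a comparable effect can be achieved with the $push update operator combined with $each and $slice.

```javascript
const RECENT_LIMIT = 2; // hypothetical: how many reviews the movie page shows

// Store the full review in the reviews collection and refresh the embedded subset.
function addReview(movie, reviews, review) {
  reviews.push({ ...review, movie_id: movie._id });
  const { _id, stars, text } = review;
  movie.recent_reviews = [...(movie.recent_reviews || []), { _id, stars, text }]
    .slice(-RECENT_LIMIT); // keep only the newest entries
  return movie;
}

const movie = { _id: 321475, title: "The Dark Knight" };
const reviews = [];

addReview(movie, reviews, { _id: 264579, stars: 4, text: "Amazing" });
addReview(movie, reviews, { _id: 375684, stars: 5, text: "Mindblowing" });
addReview(movie, reviews, { _id: 400001, stars: 3, text: "Decent" }); // invented third review

console.log(movie.recent_reviews.map((r) => r._id));
console.log(reviews.length);
```

The full history stays in the separate collection, while the movie document never holds more than the few reviews it needs for display.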
Independent access
You may want to store sub-documents in their own collection, separate from their parent collection.
For example, take a company's product line. If your company sells a small set of products, you may want to store them in the company document. But you'll also want to store them in their own collection if you want to reuse them across different companies, or access them directly by their stock-keeping units (SKUs).
If you manipulate or access an entity independently, the best practice is to store it in its own collection.
Unbounded lists
Storing short lists of related information in their parent document has a drawback. If a list keeps growing without limit, you shouldn't put it in a single document, because you won't be able to support it for very long.
There are two reasons for this. First, MongoDB has a limit on the size of a single document. Second, if you access the document too frequently, you will see the negative effects of uncontrolled memory usage.
Simply put, if a list starts growing without limit, store it in its own collection.
Extended reference pattern
The extended reference pattern is similar to the subset pattern. It also optimizes access to information that you read frequently from a document.
Here, instead of a list, it is used when a document refers to another document that exists in the same collection. Alongside the reference, it also stores some fields of that other document for ready access.
For example:
> db.movie.findOne({_id: 245434})
{
_id: 245434,
title: "Mission Impossible 4 - Ghost Protocol",
studio_id: 924935,
studio_name: "Paramount Pictures"
}
As you can see, "studio_id" is stored so that you can look up more information about the studio that created the movie. But for the sake of simplicity, the name of the studio has also been copied into this document.
When you embed information from documents that are modified regularly, remember to update every document where you copied that information whenever it changes. In other words, if you frequently access certain fields from a referenced document, embed them.
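The cost of copying a field is that every copy must be updated when the source changes. Here is a minimal plain JavaScript sketch of that bookkeeping, using in-memory arrays in place of collections. The renameStudio helper, the second movie, and the new studio name are invented for the example.

```javascript
// Rename a studio and propagate the copied name to every movie that references it.
// Returns the number of movie documents that were updated.
function renameStudio(studios, movies, studioId, newName) {
  const studio = studios.find((s) => s._id === studioId);
  if (!studio) return 0;
  studio.name = newName;
  let updated = 0;
  for (const film of movies) {
    if (film.studio_id === studioId) {
      film.studio_name = newName; // keep the embedded copy in sync
      updated += 1;
    }
  }
  return updated;
}

const studios = [{ _id: 924935, name: "Paramount Pictures" }];
const movies = [
  {
    _id: 245434,
    title: "Mission Impossible 4 - Ghost Protocol",
    studio_id: 924935,
    studio_name: "Paramount Pictures",
  },
  { _id: 1, title: "Some Other Film", studio_id: 2, studio_name: "Other Studio" }, // invented
];

const updatedCount = renameStudio(studios, movies, 924935, "Paramount");
console.log(updatedCount, movies[0].studio_name);
```

This trade-off works best for fields that are read often but change rarely, since each change fans out to every document that holds a copy.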
Monitor MongoDB database
You can use some monitoring tools to debug long API calls, slow database queries, long external URL requests, etc. You can even use commands to improve database performance. You can also use them to check the health of your database instance.
Why monitor MongoDB database?
A key aspect of database management planning is monitoring the performance and health of your cluster. MongoDB Atlas handles most of the management work through its fault tolerance/scalability capabilities.
Nonetheless, users need to know how to track the cluster. They should also know how to scale or adapt what they need before they hit a crisis.
By monitoring a MongoDB database, you can:
- Observe resource utilization.
- Know the current capacity of your database.
- Detect and react to issues in real time to improve your application stack.
- Watch for performance issues and unusual behavior.
- Align with your governance/data protection and service level agreement (SLA) requirements.
Key indicators to monitor
Keep the following key aspects in mind when monitoring MongoDB.
1. MongoDB hardware indicators
The following are the main metrics for monitoring hardware.
(1) Normalized process CPU
This is the percentage of time the CPU spends on application software servicing the MongoDB process.
You can scale it to a range of 0-100% by dividing it by the number of CPU cores. It includes the CPU used in modes such as kernel and user.
High kernel CPU may indicate CPU exhaustion caused by operating system operations, while high user CPU, which is tied to MongoDB operations, may be the root cause of CPU exhaustion.
(2) Normalized system CPU
This is the percentage of time the CPU spent servicing system calls for this MongoDB process. You can scale it to a range of 0-100% by dividing it by the number of CPU cores. It also covers the CPU used in modes such as iowait, user, kernel, and steal.
High user or kernel CPU may indicate CPU exhaustion caused by MongoDB operations (software). High iowait may point to storage exhaustion that in turn causes CPU exhaustion.
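The normalization used by both CPU metrics is a simple division. Here is a minimal sketch; the function name normalizeCpu and the sample numbers are assumptions for illustration.

```javascript
// Scale a raw CPU percentage (which can exceed 100% on multi-core hosts)
// to a 0-100% range by dividing by the number of cores.
function normalizeCpu(rawPercent, coreCount) {
  if (!Number.isInteger(coreCount) || coreCount < 1) {
    throw new RangeError("coreCount must be a positive integer");
  }
  return rawPercent / coreCount;
}

// Example: 260% raw usage on a 4-core host.
console.log(normalizeCpu(260, 4)); // 65
```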
(3) Disk IOPS
Disk IOPS is the average number of IO operations consumed per second on a MongoDB disk partition.
(4) Disk latency
This is the read and write disk latency of disk partitions in MongoDB, measured in milliseconds. High values (>500ms) indicate that the storage layer may impact MongoDB performance.
(5) System memory
The system memory metric shows the bytes of physical memory in use versus the bytes that are free.
The available metric approximates the number of bytes of system memory that can be used to start new applications without swapping.
(6) Available disk space
This is defined as the total number of bytes of free disk space on a MongoDB disk partition. MongoDB Atlas provides automatic scaling based on this metric.
(7) Swap usage
The swap usage graph shows how much memory has been placed on the swap device. A high used value in this graph means that swap is being utilized, which indicates that memory is under-provisioned for the current workload.
2. Connection and operation indicators of MongoDB cluster
Below are the key operation and connection metrics.
(1) Operation execution time
This is the average time taken by operations (reads and writes) during the selected sample period.
(2) Number of operations
This is the average rate of operations performed per second during the selected sample period. The number of operations graph/metric shows the breakdown of operation types and the operation rate for the instance.
(3) Number of connections
This metric refers to the number of open connections to the instance. High spikes or numbers may point to a suboptimal connection strategy, caused either by an unresponsive server or by the client side.
(4) Query targeting and query executor
This is the average rate per second, over the selected sample period, of documents scanned. For the query executor, this covers scanning during query plan evaluation and during query execution. Query targeting shows the ratio between the number of documents scanned and the number of documents returned.
A high ratio points to suboptimal operations, which scan a large number of documents to return only a small portion.
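As a rough illustration, the ratio can be computed from two counts. The function name queryTargetingRatio and the sample numbers are hypothetical; for a real query, the counts could come from the totalDocsExamined and nReturned fields of explain("executionStats").

```javascript
// Ratio of documents scanned to documents returned for a query.
// A value near 1 means the query examined little more than it returned.
function queryTargetingRatio(docsScanned, docsReturned) {
  if (docsReturned === 0) {
    // Nothing returned: report Infinity if any scanning happened, else 0.
    return docsScanned === 0 ? 0 : Infinity;
  }
  return docsScanned / docsReturned;
}

// Example: 50,000 documents scanned to return 25.
console.log(queryTargetingRatio(50000, 25)); // 2000
```

A ratio this high usually suggests that the query is missing a suitable index.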
(5) Scan and order
This is the average rate per second, over the selected sample period, of queries that return sorted results but cannot use an index to perform the sort.
(6) Queue
Queues describe the number of operations, reads or writes, waiting for a lock. High queue values may indicate a suboptimal schema design. They may also indicate conflicting write paths that drive high contention for database resources.
3. MongoDB replica set metrics
The following are the main metrics for replica set monitoring.
(1) Replica set Oplog window
This metric lists the approximate number of hours available in the primary's replication oplog. If a secondary lags by more than this amount, it can't catch up and will need a full resync.
(2) Replication lag
Replication lag is defined as the approximate number of seconds a secondary node is behind the primary node in write operations. High replication lag points to a secondary that has difficulty replicating. Depending on the read/write concern of your connections, it may affect the latency of your operations.
(3) Replication headroom
This metric refers to the difference between the primary's replication oplog window and the secondary's replication lag. If this value drops to zero, the secondary may go into RECOVERING mode.
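The relationship between these three metrics can be written as a short calculation. This is a sketch under the assumption that the oplog window is given in hours and the lag in seconds, as described above; the function name and sample values are made up.

```javascript
// Difference between the oplog window and the replication lag, in seconds.
function replicationHeadroomSeconds(oplogWindowHours, replicationLagSeconds) {
  return oplogWindowHours * 3600 - replicationLagSeconds;
}

// Example: a 24-hour oplog window and a secondary lagging by 90 seconds.
const headroom = replicationHeadroomSeconds(24, 90);
console.log(headroom); // 86310

// A result at or below zero means the secondary has fallen out of the oplog window.
console.log(headroom <= 0 ? "resync likely needed" : "secondary can still catch up");
```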
(4) Opcounters - repl
Opcounters - repl is defined as the average rate of replication operations performed per second over the selected sample period. With the opcounters - repl graph/metric, you can look at the breakdown of operation rate and operation type for a given instance.
(5) Oplog GB/hour
This is defined as the average rate, in gigabytes per hour, of oplog generated by the primary. Unexpectedly high oplog volumes may point to a highly inefficient write workload or a schema design issue.
MongoDB performance monitoring tool
MongoDB has built-in user interface tools in Cloud Manager, Atlas, and Ops Manager for performance tracking. It also provides some standalone commands and tools for looking at more raw data. We'll cover a few that you can run from a host that has access and the appropriate roles to inspect your environment.
mongotop
You can use this command to track the time a MongoDB instance spends writing and reading data in each collection. Use the following syntax:
mongotop <options> <connection-string> <polling-interval in seconds>
rs.status()
This command returns the status of the replica set. The status is reported from the point of view of the member on which the method is executed.
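As an illustration of what you can do with that output, the sketch below estimates how far each secondary is behind the primary. It is plain JavaScript that reads only the members, name, stateStr, and optimeDate fields of an rs.status()-shaped object; the function name and the sample data are made up.

```javascript
// Compute how many seconds each secondary is behind the primary, given an
// object shaped like the output of rs.status().
function secondaryLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no primary found in replica set status");
  }
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      lagSeconds: (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000,
    }));
}

// Made-up sample containing only the fields the function reads.
const sampleStatus = {
  set: "rs0",
  members: [
    { name: "db1:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-01-01T00:00:10Z") },
    { name: "db2:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:07Z") },
  ],
};

console.log(secondaryLagSeconds(sampleStatus));
```

In the shell, you could pass the real result of rs.status() to the same function instead of the sample object.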
mongostat
You can use the mongostat command to get a quick overview of the status of your MongoDB server instance. For best results, use it to watch a single instance for a specific event, as it provides a real-time view.
Use this command to monitor basic server statistics such as lock queues, operation breakdown, MongoDB memory statistics, and connections/network.
mongostat <options> <connection-string> <polling interval in seconds>
dbStats
This command returns storage statistics for a specific database, such as the number of indexes and their size, total collection data versus storage size, and collection-related statistics (number of collections and documents).
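To make the output easier to read, you might reduce it to a handful of fields. The sketch below assumes an object with the db, collections, objects, indexes, dataSize, storageSize, and indexSize fields that dbStats reports, with sizes in bytes; the summarizeDbStats helper and the sample values are hypothetical.

```javascript
// Summarize an object shaped like dbStats output (sizes in bytes).
function summarizeDbStats(stats) {
  const toMiB = (bytes) => bytes / (1024 * 1024);
  return {
    database: stats.db,
    collections: stats.collections,
    documents: stats.objects,
    indexes: stats.indexes,
    dataMiB: toMiB(stats.dataSize),
    storageMiB: toMiB(stats.storageSize),
    indexMiB: toMiB(stats.indexSize),
  };
}

// Made-up sample values.
const sampleDbStats = {
  db: "Engineers",
  collections: 3,
  objects: 1200,
  dataSize: 4194304,    // 4 MiB
  storageSize: 2097152, // 2 MiB
  indexes: 5,
  indexSize: 1048576,   // 1 MiB
};

console.log(summarizeDbStats(sampleDbStats));
```

In the shell, the input could come from db.stats() or db.runCommand({ dbStats: 1 }).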
db.serverStatus()
You can use the db.serverStatus() command to get an overview of the database's state. It returns a document containing the current instance's metric counters. Run this command at regular intervals to collect statistics about the instance.
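Because the counters are cumulative, comparing two samples taken some time apart gives a rate. The following is a minimal sketch that reads only the uptime (seconds) and opcounters fields of two serverStatus-shaped objects; the opRates helper and the sample numbers are assumptions for illustration.

```javascript
// Per-second operation rates between two serverStatus-shaped samples.
function opRates(earlier, later) {
  const elapsedSeconds = later.uptime - earlier.uptime;
  if (elapsedSeconds <= 0) {
    // Also catches a restart between samples, which resets uptime and counters.
    throw new RangeError("samples must be taken at increasing uptime");
  }
  const rates = {};
  for (const [op, count] of Object.entries(later.opcounters)) {
    rates[op] = (count - (earlier.opcounters[op] ?? 0)) / elapsedSeconds;
  }
  return rates;
}

// Made-up samples taken 60 seconds apart.
const sampleA = { uptime: 1000, opcounters: { insert: 500, query: 2000, update: 100, delete: 10 } };
const sampleB = { uptime: 1060, opcounters: { insert: 800, query: 5000, update: 160, delete: 10 } };

console.log(opRates(sampleA, sampleB));
// { insert: 5, query: 50, update: 1, delete: 0 }
```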
collStats
The collStats command collects statistics similar to those provided by dbStats, but at the collection level. Its output includes the number of objects in the collection, the disk space consumed by the collection, the size of the collection, and information about the indexes for a specific collection.
Together, these commands give you real-time reporting and monitoring of your database server, so you can track performance and errors and make informed decisions about how to improve your database.
Delete MongoDB database
To drop a database you created in MongoDB, you need to connect to it via the use keyword.
Suppose you create a database called "Engineers". In order to connect to this database, you will use the following command:
use Engineers
Next, enter db.dropDatabase() to get rid of this database. After execution, here's what you can expect.
{ "dropped" : "Engineers", "ok" : 1 }
You can run the show dbs command to verify whether the database still exists.
summary
To squeeze every drop of value out of MongoDB, you need a solid understanding of the basics, which makes it crucial to know the MongoDB database well. That starts with being familiar with how to create a database.
In this article, we clarified the different ways to create a database in MongoDB and then detailed some handy MongoDB commands to keep you on top of your database. Finally, we wrapped up by discussing how to leverage embedded documents and performance monitoring tools in MongoDB to keep your workflows running at peak efficiency.