1. MongoDB Sharding technology
Sharding is the method MongoDB uses to split a large collection across multiple servers (a cluster). Although the idea originates from partitioning in relational databases, MongoDB's sharding works quite differently.
Compared with MySQL's partitioning schemes, the biggest difference is that MongoDB does almost everything automatically. Once you tell MongoDB to distribute the data, it keeps the data balanced across the servers on its own.
2. Introduction to MongoDB sharding
Database applications with high data volume and throughput put heavy pressure on a single machine. A high query rate can exhaust the machine's CPU, and a large data set strains its storage capacity. Eventually the data outgrows the system's memory and the pressure shifts to disk I/O.
There are two basic ways to solve these problems: vertical expansion and horizontal expansion.
Vertical expansion: add more CPU and storage resources to a single machine to increase its capacity.
Horizontal expansion: distribute the data set across multiple servers. Horizontal expansion means sharding.
3. Shard design ideas
Sharding provides a way to cope with high throughput and large data volumes. It reduces the number of requests each shard has to handle, so by scaling horizontally the cluster can increase both its storage capacity and its throughput.
For example, when inserting a document, the application only needs to access the shard that stores it. Sharding also reduces the amount of data stored on each shard.
For example, if the database holds a 1 TB data set and there are 4 shards, each shard may hold only 256 GB of data. With 40 shards, each shard may hold only about 25 GB.
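The arithmetic above can be sketched in a few lines of JavaScript. This is only an illustration that assumes the data is spread perfectly evenly; the function name perShardGB is made up for this example.

```javascript
// Approximate amount of data each shard holds, assuming an even distribution.
function perShardGB(totalGB, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount <= 0) {
    throw new RangeError("shardCount must be a positive integer");
  }
  return totalGB / shardCount;
}

const totalGB = 1024; // 1 TB expressed in GB
console.log(perShardGB(totalGB, 4));  // 256
console.log(perShardGB(totalGB, 40)); // 25.6
```

In a real cluster the split is rarely this even, so the numbers should be read as rough targets.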
4. Sharding provides the following three advantages
1. It abstracts the cluster and makes it "invisible". MongoDB comes with a dedicated routing process called mongos. mongos acts as the single entry point to the cluster: it accurately routes the requests sent by the client to one or more servers in the cluster, then assembles the responses it receives and sends them back to the client.
2. It keeps the cluster readable and writable at all times. MongoDB has several ways to ensure the availability and reliability of the cluster. Using sharding together with replication distributes the data across multiple servers while also ensuring that every piece of data has a corresponding backup, so that when a server fails, another member of the replica set can immediately take over and continue the work.
3. It makes the cluster easy to expand. When the system needs more space and resources, MongoDB lets us expand the system capacity as needed.
5. Sharded cluster architecture
A sharded cluster consists of three components:
1. mongos: the module that routes data and talks to clients. mongos itself holds no data and does not know on its own where the data belongs; it looks that up from the config server.
2. config server: stores how all data is stored and retrieved, information about all shard nodes, and some configuration for the sharding features. It can be understood as the metadata of the real data.
3. shard: the real data storage location, storing data in chunks.
mongos itself does not persist data. All the metadata of the sharded cluster is stored in the config server, and the user's data is distributed across the shards.
After mongos starts, it loads the metadata from the config server, begins serving requests, and routes each user request to the correct shard.
6. mongos routing function
When data is written, the MongoDB cluster places it according to the shard key design. When an external statement issues a query, MongoDB automatically routes it to the appropriate node based on the data distribution and returns the data.
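The division of labour between the three components can be sketched as a toy in-memory model. It is not MongoDB's actual implementation: the class names, the use of age as a range shard key, and the split point at 250 are all assumptions chosen for illustration.

```javascript
// Toy in-memory model of the three roles; not MongoDB's actual implementation.

// Config server: holds only metadata (which key range lives on which shard).
class ConfigServer {
  constructor(chunks) {
    this.chunks = chunks; // [{ min, max, shard }], max is exclusive
  }
  loadMetadata() {
    return this.chunks.map((chunk) => ({ ...chunk }));
  }
}

// Shard: the only place where documents are stored.
class Shard {
  constructor(name) {
    this.name = name;
    this.docs = [];
  }
}

// Router: keeps no documents, only a routing table loaded at startup.
class Mongos {
  constructor(configServer, shards) {
    this.routingTable = configServer.loadMetadata();
    this.shards = new Map(shards.map((shard) => [shard.name, shard]));
  }
  shardFor(key) {
    const chunk = this.routingTable.find((c) => key >= c.min && key < c.max);
    if (!chunk) throw new Error(`no chunk covers shard key ${key}`);
    return this.shards.get(chunk.shard);
  }
  insert(doc) {
    this.shardFor(doc.age).docs.push(doc);
  }
  findByAge(age) {
    return this.shardFor(age).docs.filter((doc) => doc.age === age);
  }
}

// Hypothetical layout: ages below 250 on testdata1, the rest on testdata2.
const config = new ConfigServer([
  { min: -Infinity, max: 250, shard: "testdata1" },
  { min: 250, max: Infinity, shard: "testdata2" },
]);
const shardA = new Shard("testdata1");
const shardB = new Shard("testdata2");
const router = new Mongos(config, [shardA, shardB]);

for (let age = 1; age <= 500; age++) router.insert({ name: "test1", age });
console.log(shardA.docs.length, shardB.docs.length); // 249 251
console.log(router.findByAge(300).length);           // 1
```

The router object never stores a document itself; it only consults its copy of the metadata and forwards the operation, which mirrors the roles described above.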
7. Deploying a sharded cluster
Instructions for building a hands-on MongoDB sharded cluster environment
Environment preparation:
System version: CentOS 7
Software version: MongoDB 4.0
Disable the firewall and SELinux
mkdir -p /data/mongodb/{28017..28019}
mkdir -p /data/mongodb/{27017..27019}
mkdir -p /data/mongodb/{29017..29020}
1. configsvr configuration
1.) Create a configuration file
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/28017/mongodb.log
storage:
  dbPath: /data/mongodb/28017
  journal:
    enabled: true
processManagement:
  fork: true
net:
  port: 28017
  bindIp: 127.0.0.1
replication:
  replSetName: testconf
sharding:
  clusterRole: configsvr
For the other two nodes, change the port accordingly, then start each instance
mongod -f /data/mongodb/28017/mongodb.conf
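Since the three config server files differ only in the port, one way to avoid copy-and-paste mistakes is to render them from a template. The sketch below only prints the text and writes nothing to disk; the function name renderConfigsvrConf is made up for this example, and the directory layout is assumed to match the directories created earlier.

```javascript
// Render the mongod config text for one config server instance.
// Assumes the /data/mongodb/<port>/ layout used in this walkthrough.
function renderConfigsvrConf(port) {
  return [
    "systemLog:",
    "  destination: file",
    "  logAppend: true",
    `  path: /data/mongodb/${port}/mongodb.log`,
    "storage:",
    `  dbPath: /data/mongodb/${port}`,
    "  journal:",
    "    enabled: true",
    "processManagement:",
    "  fork: true",
    "net:",
    `  port: ${port}`,
    "  bindIp: 127.0.0.1",
    "replication:",
    "  replSetName: testconf",
    "sharding:",
    "  clusterRole: configsvr",
  ].join("\n") + "\n";
}

// Print one config per port; redirect the output to a file as needed.
for (const port of [28017, 28018, 28019]) {
  console.log(`# /data/mongodb/${port}/mongodb.conf`);
  console.log(renderConfigsvrConf(port));
}
```

Each rendered file would then be started with mongod -f in the same way as the 28017 instance.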
2.) Configure the config servers as a replica set
mongo 127.0.0.1:28017
config = {
  _id: "testconf",
  configsvr: true,
  members: [
    {_id: 0, host: "127.0.0.1:28017"},
    {_id: 1, host: "127.0.0.1:28018"},
    {_id: 2, host: "127.0.0.1:28019"}
  ]
}
rs.initiate(config)
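The replica set configuration objects in this walkthrough all have the same shape, so they can be built by a small helper. This is a sketch only: buildReplSetConfig is a hypothetical name, and it assumes all members run on 127.0.0.1 as in this test environment.

```javascript
// Build the object that is passed to rs.initiate().
// configsvr: true is added only for the config server replica set.
function buildReplSetConfig(name, ports, { configsvr = false, host = "127.0.0.1" } = {}) {
  const config = {
    _id: name,
    members: ports.map((port, index) => ({ _id: index, host: `${host}:${port}` })),
  };
  if (configsvr) config.configsvr = true;
  return config;
}

const confServers = buildReplSetConfig("testconf", [28017, 28018, 28019], { configsvr: true });
console.log(JSON.stringify(confServers, null, 2));
```

The resulting object has the same fields as the hand-written config, so it could be pasted into the mongo shell and passed to rs.initiate().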
2. Router configuration
The router node in MongoDB only provides an entry point and does not store any data.
Create the configuration file mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/27017/mongodb.log
processManagement:
  fork: true
net:
  port: 27017
  bindIp: 127.0.0.1
sharding:
  configDB: testconf/127.0.0.1:28017,127.0.0.1:28018,127.0.0.1:28019
For other nodes, change the port accordingly. The most important setting for the router is the address of the configsvr, given as the replica set ID followed by the ip:port list. Multiple routers can be configured, and the data can be read normally through any one of them.
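The configDB value follows a simple pattern: replica set name, a slash, then a comma-separated host list. A minimal sketch of building it is shown below; replSetSeedList is a hypothetical helper name. The same name/host:port,... form is also what sh.addShard receives later in this walkthrough.

```javascript
// Join a replica set name and its members into the "name/host1,host2" form.
function replSetSeedList(replSetName, hosts) {
  if (hosts.length === 0) throw new Error("at least one host is required");
  return `${replSetName}/${hosts.join(",")}`;
}

const configDB = replSetSeedList("testconf", [
  "127.0.0.1:28017",
  "127.0.0.1:28018",
  "127.0.0.1:28019",
]);
console.log(configDB); // testconf/127.0.0.1:28017,127.0.0.1:28018,127.0.0.1:28019
```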
Start up:
mongos -f /data/mongodb/27017/mongodb.conf
The router can only be verified after the data nodes have been set up
3. shardsvr configuration
The data nodes of the sharded cluster store the real data, so each data node should be a replica set with multiple members.
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/29017/mongodb.log
storage:
  dbPath: /data/mongodb/29017
  journal:
    enabled: true
processManagement:
  fork: true
net:
  port: 29017
  bindIp: 127.0.0.1
replication:
  replSetName: testdata1
sharding:
  clusterRole: shardsvr
Create the second configuration file (the paths are relative to /data/mongodb)
cp 29017/mongodb.conf 29018/
sed -i "s/29017/29018/g" 29018/mongodb.conf
For other nodes, change the port in the same way
Create the configuration file 29019/mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/29019/mongodb.log
storage:
  dbPath: /data/mongodb/29019
  journal:
    enabled: true
processManagement:
  fork: true
net:
  port: 29019
  bindIp: 127.0.0.1
replication:
  replSetName: testdata2
sharding:
  clusterRole: shardsvr
For the other node, change the port in the same way
cp 29019/mongodb.conf 29020/
sed -i "s/29019/29020/g" 29020/mongodb.conf
Start the four instances
mongod -f /data/mongodb/29017/mongodb.conf
mongod -f /data/mongodb/29018/mongodb.conf
mongod -f /data/mongodb/29019/mongodb.conf
mongod -f /data/mongodb/29020/mongodb.conf
4. Configure the data nodes
Create a replica set from 29017 and 29018
mongo 127.0.0.1:29017
config = {
  _id: "testdata1",
  members: [
    {_id: 0, host: "127.0.0.1:29017"},
    {_id: 1, host: "127.0.0.1:29018"}
  ]
}
rs.initiate(config)
Create a replica set from 29019 and 29020
mongo 127.0.0.1:29019
config = {
  _id: "testdata2",
  members: [
    {_id: 0, host: "127.0.0.1:29019"},
    {_id: 1, host: "127.0.0.1:29020"}
  ]
}
rs.initiate(config)
5. Using the MongoDB sharded cluster
mongo 127.0.0.1:27017
sh.addShard("testdata1/127.0.0.1:29017,127.0.0.1:29018")
sh.addShard("testdata2/127.0.0.1:29019,127.0.0.1:29020")
sh.status()
6. Verify the data (run on 27017)
1. Try to add data
use mytest;
switched to db mytest
for (i = 1; i <= 500; i++) { db.myuser.insert({name: "test1", age: i}) }
db.myuser.find().count()
View the result
The documents can be found in only one replica set: because sharding has not been enabled yet, all the data lives on a single shard
2. Enable sharding correctly
use admin
db.runCommand({enablesharding:"mytest"})
db.runCommand({shardcollection:"mytest.test1",key:{_id:"hashed"}})
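To see why a hashed shard key spreads sequential values across shards, the idea can be simulated outside MongoDB. The sketch below is a simplification under stated assumptions: it uses the first 32 bits of an MD5 digest as a stand-in hash and splits the hash space into equal ranges, which is not MongoDB's actual hash function or chunk layout.

```javascript
const crypto = require("crypto");

// Map a key to a 32-bit unsigned integer via MD5 (stand-in hash, not MongoDB's).
function hashKey(key) {
  return crypto.createHash("md5").update(String(key)).digest().readUInt32BE(0);
}

// Split the hash space [0, 2^32) into equal ranges, one range per shard.
function shardForKey(key, shardNames) {
  const rangeSize = 2 ** 32 / shardNames.length;
  return shardNames[Math.floor(hashKey(key) / rangeSize)];
}

// Count how 500 sequential keys would be distributed over two shards.
const shardNames = ["testdata1", "testdata2"];
const counts = { testdata1: 0, testdata2: 0 };
for (let i = 1; i <= 500; i++) {
  counts[shardForKey(i, shardNames)]++;
}
console.log(counts);
```

Because neighbouring keys hash to unrelated values, both shards should receive a share of the documents, in contrast to the unsharded collection where everything landed on one replica set.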
Add data to verify
Add data on 27017
View the data
8. MongoDB monitoring commands
mongostat
mongostat is the status tool that ships with MongoDB; it monitors the state of MongoDB in real time.
View the help
mongostat --help
Usage: -n specifies the number of rows to print
mongostat --host 127.0.0.1 --port 27017 -n 3
9. serverStatus
https://docs.mongodb.com/manual/reference/command/serverStatus/#dbcmd.serverStatus
Calling serverStatus in the mongo shell returns the status information of MongoDB.
It is recommended to focus on the network traffic, the connections, and the counters for inserts, deletes, updates, and queries.
> db.serverStatus()              // view all monitoring status
> db.serverStatus().network      // view network traffic information
> db.serverStatus().opcounters   // counts of inserts, queries, updates, and deletes
{
  "insert": NumberLong(189556),
  "query": NumberLong(14),
  "update": NumberLong(0),
  "delete": NumberLong(0),
  "getmore": NumberLong(0),
  "command": NumberLong(190063)
}
> db.serverStatus().connections  // connection statistics
{"current": 3, "available": 816, "totalCreated": 8, "active": 1}
Getting the status in non-interactive mode
echo 'db.serverStatus().opcounters' | mongo --host 127.0.0.1:27017
MongoDB shell version v4.2.3
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("b1f33ce1-34ec-4fbf-b8f1-f5943302ce50") }
MongoDB server version: 4.2.3
{
  "insert" : NumberLong(189556),
  "query" : NumberLong(14),
  "update" : NumberLong(0),
  "delete" : NumberLong(0),
  "getmore" : NumberLong(0),
  "command" : NumberLong(190190)
}
bye
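The opcounters values are cumulative, so a single sample says little about the current load; comparing two samples gives a rate. The sketch below uses the two opcounters outputs shown in this section as plain numbers. The 10-second interval between them is an assumption made for illustration, and opRates is a hypothetical helper name.

```javascript
// Turn two cumulative opcounters samples into per-second rates.
function opRates(previous, current, intervalSeconds) {
  if (intervalSeconds <= 0) {
    throw new RangeError("intervalSeconds must be positive");
  }
  const rates = {};
  for (const op of Object.keys(current)) {
    rates[op] = (current[op] - (previous[op] || 0)) / intervalSeconds;
  }
  return rates;
}

// Samples copied from the outputs above, with NumberLong values as plain numbers.
const firstSample = { insert: 189556, query: 14, update: 0, delete: 0, getmore: 0, command: 190063 };
const secondSample = { insert: 189556, query: 14, update: 0, delete: 0, getmore: 0, command: 190190 };

// Assumed interval of 10 seconds between the two samples.
console.log(opRates(firstSample, secondSample, 10));
// { insert: 0, query: 0, update: 0, delete: 0, getmore: 0, command: 12.7 }
```

A non-interactive call like the echo command above could be run on a schedule to collect such samples.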