MongoDB Sharding technology

1. MongoDB Sharding technology

            Sharding is the method MongoDB uses to split a large collection across multiple servers (a cluster). Although the idea derives from partitioning in relational databases, MongoDB's sharding works quite differently.

            The biggest difference from the MySQL partitioning scheme is that MongoDB does almost everything automatically. Once you tell MongoDB to distribute the data, it maintains the balance of data between the servers by itself.

 

2. Introduction to MongoDB sharding

               Why shard: a database application with a large data volume and high throughput puts heavy pressure on a single machine. A heavy query load can exhaust the CPU of a single machine, and a large data set strains its storage. Eventually the system's memory is exhausted and the pressure shifts to disk I/O.

          There are two basic ways to solve these problems: vertical scaling and horizontal scaling.

         Vertical scaling: add more CPU and storage resources to a single machine to expand its capacity.

         Horizontal scaling: distribute the data set across multiple servers. Horizontal scaling is sharding.

 

3. Shard design ideas

            Sharding provides a way to cope with high throughput and large data volumes. Using shards reduces the number of requests each shard has to process, so by scaling horizontally the cluster can increase its storage capacity and throughput.

           For example, when inserting a document, the application only needs to access the shard that stores that document. Using shards also reduces the amount of data stored on each shard.

           For example, if the database has a 1 TB data set and there are 4 shards, each shard may hold only 256 GB of data. If there are 40 shards, each shard may hold only about 25 GB of data.
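The arithmetic above can be sketched in a few lines of JavaScript. This is only a back-of-the-envelope illustration: it assumes 1 TB = 1024 GB and a perfectly even distribution, and `perShardGB` is a hypothetical helper name, not a MongoDB API.

```javascript
// Hypothetical helper: average data per shard, assuming a perfectly even distribution.
function perShardGB(totalGB, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount <= 0) {
    throw new RangeError("shardCount must be a positive integer");
  }
  return totalGB / shardCount;
}

const totalGB = 1024; // 1 TB, assuming 1 TB = 1024 GB
console.log(perShardGB(totalGB, 4));  // 256
console.log(perShardGB(totalGB, 40)); // 25.6
```

In a real cluster the balance is only approximate, so these figures are averages rather than guarantees.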

 

4. Sharding provides the following three advantages

             1. It abstracts the cluster and makes the cluster "invisible". MongoDB comes with a dedicated routing process called mongos. mongos acts as a single, unified router: it accurately routes the requests sent by clients to one or a group of servers in the cluster, then assembles the responses it receives and sends them back to the client.

             2. It keeps the cluster readable and writable at all times. MongoDB has several ways to ensure the availability and reliability of the cluster. Using MongoDB's sharding and replication features together distributes the data across multiple servers while also ensuring that each shard of data has a corresponding backup, so that when a server fails, another replica can immediately take over for the failed one and continue working.

             3. It makes the cluster easy to expand. When the system needs more space and resources, MongoDB allows us to expand the system capacity as needed.

5. Sharded cluster architecture

        

           Sharded cluster structure

              1. mongos: the data-routing module that deals with clients. mongos itself holds no data and does not know on its own how to handle a request; it looks that up on the config server.

             2. config server: stores how all data is stored and accessed, the information about all shard nodes, and some configuration of the sharding feature. It can be understood as the metadata of the real data.

             3. shard: the real data storage location, storing data in chunks.

                mongos itself does not persist data. All metadata of the sharded cluster is stored on the config server, and the user's data is distributed across the shards.

                After mongos starts, it loads the metadata from the config server, starts providing service, and correctly routes user requests to the corresponding shards.

 

6. mongos routing function

         When data is written, the MongoDB cluster places it according to the shard key design. When an external statement issues a query, MongoDB automatically routes it to the appropriate nodes, based on how the data is distributed, and returns the data.
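The routing decision described above can be illustrated with a toy model. This is a simplified sketch, not MongoDB's implementation: the `chunkTable`, its key ranges, the shard key field `age`, and the function `targetShards` are all made up for illustration, and only equality matches on the shard key are handled. The shard names are borrowed from the deployment later in this article.

```javascript
// Toy chunk table: ranges [min, max) of a hypothetical shard key "age" mapped to shards.
const chunkTable = [
  { min: -Infinity, max: 250, shard: "testdata1" },
  { min: 250, max: Infinity, shard: "testdata2" },
];

// Decide which shards a query must visit.
// - Equality match on the shard key: only the shard owning that key value.
// - No shard key in the query: every shard (scatter-gather).
// - Anything this toy cannot resolve also falls back to every shard.
function targetShards(query, shardKeyField, table) {
  const allShards = [...new Set(table.map((c) => c.shard))];
  if (!(shardKeyField in query)) {
    return allShards;
  }
  const value = query[shardKeyField];
  const owner = table.find((c) => value >= c.min && value < c.max);
  return owner ? [owner.shard] : allShards;
}

console.log(targetShards({ age: 42 }, "age", chunkTable).join(", "));       // testdata1
console.log(targetShards({ name: "test1" }, "age", chunkTable).join(", ")); // testdata1, testdata2
```

The point of the sketch is that a query containing the shard key can be sent to a single shard, while a query without it has to be sent to all shards and the results merged.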

 

7. Deploying a sharded cluster

                Notes on building a hands-on MongoDB sharded cluster environment

                     Environment preparation:

                          System version: CentOS 7

                          Software version: MongoDB 4.0

                          Disable the firewall and SELinux

                   

                   mkdir -p /data/mongodb/{28017..28019}   # config servers

                   mkdir -p /data/mongodb/{27017..27019}   # routers (mongos)

                   mkdir -p /data/mongodb/{29017..29020}   # shard data nodes

 1. configsvr configuration

                 1.) Create the configuration file /data/mongodb/28017/mongodb.conf

systemLog:
 destination: file
 logAppend: true
 path: /data/mongodb/28017/mongodb.log
storage:
 dbPath: /data/mongodb/28017
 journal:
   enabled: true
processManagement:
 fork: true
net:
 port: 28017
 bindIp: 127.0.0.1
replication:
 replSetName: testconf
sharding:
 clusterRole: configsvr

    For the other two nodes (28018 and 28019), change the port accordingly, then start each instance.
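Since only the port (and the port-named paths) differ between the three configsvr nodes, the files for the other nodes can be derived from one template. The following is a minimal sketch of that idea; `renderConfigsvrConf` is a hypothetical helper, and the script only prints the text rather than writing any file.

```javascript
// Render the configsvr configuration shown above for a given port.
// Only the port and the port-named paths vary between the three nodes.
function renderConfigsvrConf(port) {
  const dir = "/data/mongodb/" + port;
  return [
    "systemLog:",
    " destination: file",
    " logAppend: true",
    " path: " + dir + "/mongodb.log",
    "storage:",
    " dbPath: " + dir,
    " journal:",
    "   enabled: true",
    "processManagement:",
    " fork: true",
    "net:",
    " port: " + port,
    " bindIp: 127.0.0.1",
    "replication:",
    " replSetName: testconf",
    "sharding:",
    " clusterRole: configsvr",
    "",
  ].join("\n");
}

// Print the configuration for the two remaining config server nodes.
for (const port of [28018, 28019]) {
  console.log("# /data/mongodb/" + port + "/mongodb.conf");
  console.log(renderConfigsvrConf(port));
}
```

Copying the file and replacing the port with sed, as done for the shard nodes later in this article, achieves the same result.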

       

       mongod -f /data/mongodb/28017/mongodb.conf

 

    2.) Configure the sharded cluster and build the config server replica set (run in a mongo shell connected to one of the members, for example 127.0.0.1:28017)

config={_id:"testconf",
configsvr:true,members:[
{_id:0,host:"127.0.0.1:28017"},
{_id:1,host:"127.0.0.1:28018"},
{_id:2,host:"127.0.0.1:28019"}]
}

   

   rs.initiate(config)
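The config document above follows a regular pattern, so it can also be built from a list of ports. The sketch below is an illustration under assumptions: `buildReplSetConfig` is a hypothetical helper, and all members are assumed to run on 127.0.0.1 as in this article. It is plain JavaScript and only builds and prints the object.

```javascript
// Hypothetical helper: build a replica set config document for local members.
function buildReplSetConfig(name, ports, isConfigServer) {
  const config = {
    _id: name,
    members: ports.map((port, index) => ({ _id: index, host: "127.0.0.1:" + port })),
  };
  if (isConfigServer) {
    config.configsvr = true; // set only for a config server replica set
  }
  return config;
}

const config = buildReplSetConfig("testconf", [28017, 28018, 28019], true);
console.log(JSON.stringify(config, null, 2));
// In a mongo shell connected to one of the members, an object like this
// could then be passed to rs.initiate(config).
```

The same helper would produce the shard replica set documents by passing a different name, different ports, and `false` for the last argument.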

         

 

 2. router configuration

          The router node (mongos) in MongoDB only provides an entry point and does not store any data. Create the configuration file mongodb.conf:

systemLog:
 destination: file
 logAppend: true
 path: /data/mongodb/27017/mongodb.log
processManagement:
 fork: true
net:
 port: 27017
 bindIp: 127.0.0.1
sharding:
 configDB: testconf/127.0.0.1:28017,127.0.0.1:28018,127.0.0.1:28019

          For other nodes, change the port accordingly. The most important router setting is the address of the configsvr, specified as replica set id + IP:port. Multiple routers can be configured, and any one of them can retrieve the data normally.
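The "replica set id + IP:port" format used by configDB can be made explicit with a small parser. This is only an illustrative sketch; `parseConfigDB` is a hypothetical helper and not part of MongoDB.

```javascript
// Split a "<replSetName>/<host:port>,<host:port>,..." string into its parts.
function parseConfigDB(value) {
  const slash = value.indexOf("/");
  if (slash <= 0) {
    throw new Error("expected <replSetName>/<host:port>[,...]");
  }
  const replSetName = value.slice(0, slash);
  const hosts = value.slice(slash + 1).split(",").map((entry) => {
    const [host, port] = entry.trim().split(":");
    return { host, port: Number(port) };
  });
  return { replSetName, hosts };
}

const parsed = parseConfigDB("testconf/127.0.0.1:28017,127.0.0.1:28018,127.0.0.1:28019");
console.log(parsed.replSetName);  // testconf
console.log(parsed.hosts.length); // 3
```

The replica set name before the slash has to match the replSetName in the configsvr configuration files.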

           Start:

 mongos -f /data/mongodb/27017/mongodb.conf

               The router can only be verified after the data nodes have been built.

 

  3. shardsvr configuration

         Data nodes: the data nodes of the sharded cluster store the real data, so each of them must be set up as a replica set; multiple replica sets are used. Create the configuration file 29017/mongodb.conf:

systemLog:
 destination: file
 logAppend: true
 path: /data/mongodb/29017/mongodb.log
storage:
 dbPath: /data/mongodb/29017
 journal:
    enabled: true
processManagement:
 fork: true
net:
 port: 29017
 bindIp: 127.0.0.1
replication:
 replSetName: testdata1
sharding:
 clusterRole: shardsvr

       Create the configuration file for the second node, 29018, by copying the first one (run from /data/mongodb):

cp 29017/mongodb.conf  29018/
sed -i "s/29017/29018/g" 29018/mongodb.conf

         The sed command changes the port (and the port-named paths) for the other node.

    Create the configuration file 29019/mongodb.conf

systemLog:
 destination: file
 logAppend: true
 path: /data/mongodb/29019/mongodb.log
storage:
 dbPath: /data/mongodb/29019
 journal:
    enabled: true
processManagement:
 fork: true
net:
 port: 29019
 bindIp: 127.0.0.1
replication:
 replSetName: testdata2
sharding:
 clusterRole: shardsvr

  For the other node (29020), copy the file and change the port:

cp 29019/mongodb.conf  29020/
sed -i "s/29019/29020/g" 29020/mongodb.conf

          Start the four instances

mongod -f /data/mongodb/29017/mongodb.conf
mongod -f /data/mongodb/29018/mongodb.conf
mongod -f /data/mongodb/29019/mongodb.conf
mongod -f /data/mongodb/29020/mongodb.conf

  

 4. Configure the data nodes

    Create a replica set from 29017 and 29018

mongo 127.0.0.1:29017
 config={_id:"testdata1",
members:[
{_id:0,host:"127.0.0.1:29017"},
{_id:1,host:"127.0.0.1:29018"}
]
}

  

rs.initiate(config)

  

 Create a replica set from 29019 and 29020

 mongo 127.0.0.1:29019
config={_id:"testdata2",
members:[
{_id:0,host:"127.0.0.1:29019"},
{_id:1,host:"127.0.0.1:29020"}
]
}

          

 rs.initiate(config)

  

 

5. Using the MongoDB sharded cluster

 mongo 127.0.0.1:27017
 sh.addShard("testdata1/127.0.0.1:29017,127.0.0.1:29018")

  

 sh.addShard("testdata2/127.0.0.1:29019,127.0.0.1:29020")

  

sh.status()

 

6. Verify the data (run on 27017)

    1. Try to add data

use mytest;
switched to db mytest
for (i=1;i<=500;i++){db.myuser.insert({name:"test1",age:i})}
db.myuser.find().count()

    

           View the information

             The documents can be seen in only one replica set: because sharding has not been enabled, the data exists on only one shard.

    2. Enable sharding correctly

 use admin
db.runCommand({enablesharding:"mytest"})

  

 db.runCommand({shardcollection:"mytest.test1",key:{_id:"hashed"}})
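To see why a hashed shard key helps spread sequential values, here is a toy simulation. It is a sketch under loud assumptions: it uses the 32-bit FNV-1a hash and a simple modulo to pick a shard, whereas MongoDB uses its own hash function and assigns ranges of hashed values (chunks) to shards. The key values 1 to 500 and the names `fnv1a` and `shardFor` are made up for illustration.

```javascript
// 32-bit FNV-1a string hash: a stand-in for illustration, NOT MongoDB's hash function.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // convert to an unsigned 32-bit value
}

// Toy placement rule: hash the shard key value, then pick a shard by modulo.
function shardFor(keyValue, shards) {
  return shards[fnv1a(String(keyValue)) % shards.length];
}

const shards = ["testdata1", "testdata2"];
const counts = { testdata1: 0, testdata2: 0 };
for (let i = 1; i <= 500; i++) {
  counts[shardFor(i, shards)] += 1;
}
console.log(counts); // how many of the 500 sequential keys landed on each shard
```

Because neighbouring key values hash to unrelated numbers, consecutive inserts do not all land on the same shard, which is the motivation for choosing a hashed key.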

  

 

  Add data and verify

      Add data on 27017, then view the data.

8. MongoDB monitoring commands

             mongostat

                mongostat is MongoDB's built-in status monitoring tool, which can monitor the status of MongoDB in real time. View the help:

 mongostat --help

  Command usage: -n specifies the number of rows to print

 mongostat --host 127.0.0.1 --port 27017 -n 3

  

 

9. serverStatus

      https://docs.mongodb.com/manual/reference/command/serverStatus/#dbcmd.serverStatus

      Calling serverStatus in the MongoDB Shell can get the status information of MongoDB.

      It is recommended that you focus on network traffic, connection information, and the insert, delete, update, and query counts.

> db.serverStatus()              // View all monitoring status
> db.serverStatus().network      // View network traffic information
> db.serverStatus().opcounters   // Counts of inserts, queries, updates, and deletes
{
     "insert" : NumberLong(189556),
     "query" : NumberLong(14),
     "update" : NumberLong(0),
     "delete" : NumberLong(0),
     "getmore" : NumberLong(0),
     "command" : NumberLong(190063)
}
> db.serverStatus().connections  // Connection statistics
{ "current" : 3, "available" : 816, "totalCreated" : 8, "active" : 1 }

  Getting the status in non-interactive mode

   echo 'db.serverStatus().opcounters' |mongo --host 127.0.0.1:27017
MongoDB shell version v4.2.3
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("b1f33ce1-34ec-4fbf-b8f1-f5943302ce50") }
MongoDB server version: 4.2.3
{
     "insert" : NumberLong(189556),
     "query" : NumberLong(14),
     "update" : NumberLong(0),
     "delete" : NumberLong(0),
     "getmore" : NumberLong(0),
     "command" : NumberLong(190190)
}
bye
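The opcounters are cumulative, so what matters for monitoring is usually the difference between two samples. The sketch below computes that difference from the two outputs shown in this section. It is an illustration only: `opcounterDelta` is a hypothetical helper, and the values are written as plain numbers rather than NumberLong.

```javascript
// Compute the per-counter difference between two opcounters snapshots.
function opcounterDelta(before, after) {
  const delta = {};
  for (const name of Object.keys(after)) {
    delta[name] = after[name] - (before[name] || 0);
  }
  return delta;
}

// Values taken from the two serverStatus outputs shown in this section.
const first = { insert: 189556, query: 14, update: 0, delete: 0, getmore: 0, command: 190063 };
const second = { insert: 189556, query: 14, update: 0, delete: 0, getmore: 0, command: 190190 };

const delta = opcounterDelta(first, second);
console.log(delta.command); // 127
console.log(delta.insert);  // 0
```

Dividing each difference by the time between the two samples would give an approximate rate per second.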

  


Origin www.cnblogs.com/mo-xiao-tong/p/12755488.html