Introduction and construction of MongoDB Sharding Cluster

Introduction to MongoDB Sharding

Sharding is the method MongoDB uses to split a large collection across multiple servers (a cluster). Although sharding has its roots in relational database partitioning, MongoDB sharding is a different thing entirely.
Compared with a MySQL partitioning scheme, MongoDB's biggest advantage is that it does almost everything automatically: once you tell MongoDB how to distribute the data, it keeps the data balanced across servers on its own.

The purpose of sharding

Database applications with high data volume and throughput put a lot of pressure on a single machine. A large query volume exhausts the CPU of a single machine, while a large data set strains its storage, eventually exhausts memory, and shifts the pressure to disk I/O.
To solve these problems there are two basic approaches: vertical scaling and horizontal scaling.

  • Vertical scaling: add more CPU and storage resources to a single machine.
  • Horizontal scaling: distribute the data set across multiple servers. Horizontal scaling is sharding.

Several basic concepts of MongoDB

The concepts, from small to large:

  • Shard key: a field in the document used to distribute data
  • Document: a row of data containing the shard key
  • Chunk: contains n documents
  • Shard: contains n chunks
  • Cluster: contains n shards
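
For example, a document in the test collection used later in this article might look like the following (illustrative values only; the id field acts as the shard key):

{ "_id" : ObjectId("..."), "id" : 12345, "name" : "shenzheng", "age" : 70, "date" : ISODate("...") }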

Let me focus on chunks. Within a shard server, MongoDB further divides the data into chunks, each representing a portion of that shard's data. Chunks serve the following two purposes:

  • Splitting: when a chunk grows beyond the configured chunk size, MongoDB's background process splits it into smaller chunks to avoid oversized chunks.
  • Balancing: in MongoDB, the balancer is a background process responsible for migrating chunks, thereby balancing the load across shard servers. A collection initially has a single chunk, and the default chunk size is 64 MB; on a production database it is best to choose a chunk size suited to the workload (see the sketch after this list). MongoDB splits and migrates chunks automatically.
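
The chunk size and balancer state can be inspected or adjusted from a mongos; a minimal sketch (64 MB is the default mentioned above):

mongos> use config
mongos> db.settings.save({ _id: "chunksize", value: 64 })  // chunk size in MB
mongos> sh.getBalancerState()                              // true means the balancer is enabled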

Sharded cluster architecture

Component descriptions:

  • Config Server: stores routing metadata for all nodes and shards in the cluster. Three config server nodes are deployed by default.
  • Mongos: the access point for external applications; all operations go through mongos. There are usually multiple mongos nodes. It also coordinates data migration and automatic balancing.
  • Mongod: stores the application's data records. There are usually multiple mongod nodes so that the data is actually sharded.
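
Applications connect only to the mongos routers, never to shard or config servers directly. For the deployment built below, the client connection string is simply:

mongodb://11.111.24.4:38017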

The official architecture diagram is as follows:
[Figure: official MongoDB sharded cluster architecture diagram]

Configuration process

Planning:

10 instances, ports 38017-38026

  1. Config server:
    one replica set of three members (one primary and two secondaries; arbiters are not supported), ports 38018-38020, replica set name configReplSet

  2. Shard nodes:
    sh1: 38021-38023 (one primary, one secondary, one arbiter; replica set name sh1)
    sh2: 38024-38026 (one primary, one secondary, one arbiter; replica set name sh2)
  3. Router (mongos) node:
    one router on port 38017

    Shard replica set configuration:

  4. Directory creation
    mkdir -p /mongodb/38021/conf  /mongodb/38021/log  /mongodb/38021/data
    mkdir -p /mongodb/38022/conf  /mongodb/38022/log  /mongodb/38022/data
    mkdir -p /mongodb/38023/conf  /mongodb/38023/log  /mongodb/38023/data
    mkdir -p /mongodb/38024/conf  /mongodb/38024/log  /mongodb/38024/data
    mkdir -p /mongodb/38025/conf  /mongodb/38025/log  /mongodb/38025/data
    mkdir -p /mongodb/38026/conf  /mongodb/38026/log  /mongodb/38026/data
  5. Create the configuration files
    sh1:

    cat > /mongodb/38021/conf/mongodb.conf <<EOF
    systemLog:
      destination: file
      path: /mongodb/38021/log/mongodb.log
      logAppend: true
    storage:
      journal:
        enabled: true
      dbPath: /mongodb/38021/data
      directoryPerDB: true
      #engine: wiredTiger
      wiredTiger:
        engineConfig:
          cacheSizeGB: 1
          directoryForIndexes: true
        collectionConfig:
          blockCompressor: zlib
        indexConfig:
          prefixCompression: true
    net:
      bindIp: 11.111.24.4,127.0.0.1
      port: 38021
    replication:
      oplogSizeMB: 2048
      replSetName: sh1  # replica set name
    sharding:
      clusterRole: shardsvr  # fixed value for shard members
    processManagement:
      fork: true
    EOF
    
    cp  /mongodb/38021/conf/mongodb.conf  /mongodb/38022/conf/
    cp  /mongodb/38021/conf/mongodb.conf  /mongodb/38023/conf/
    sed 's#38021#38022#g' /mongodb/38022/conf/mongodb.conf -i
    sed 's#38021#38023#g' /mongodb/38023/conf/mongodb.conf -i

    sh2:

    cat > /mongodb/38024/conf/mongodb.conf <<EOF
    systemLog:
      destination: file
      path: /mongodb/38024/log/mongodb.log
      logAppend: true
    storage:
      journal:
        enabled: true
      dbPath: /mongodb/38024/data
      directoryPerDB: true
      wiredTiger:
        engineConfig:
          cacheSizeGB: 1
          directoryForIndexes: true
        collectionConfig:
          blockCompressor: zlib
        indexConfig:
          prefixCompression: true
    net:
      bindIp: 11.111.24.4,127.0.0.1
      port: 38024
    replication:
      oplogSizeMB: 2048
      replSetName: sh2
    sharding:
      clusterRole: shardsvr
    processManagement:
      fork: true
    EOF
    
    cp  /mongodb/38024/conf/mongodb.conf  /mongodb/38025/conf/
    cp  /mongodb/38024/conf/mongodb.conf  /mongodb/38026/conf/
    sed 's#38024#38025#g' /mongodb/38025/conf/mongodb.conf -i
    sed 's#38024#38026#g' /mongodb/38026/conf/mongodb.conf -i
  6. Start all nodes and build the replica sets:

    # start the nodes
    mongod -f  /mongodb/38021/conf/mongodb.conf 
    mongod -f  /mongodb/38022/conf/mongodb.conf 
    mongod -f  /mongodb/38023/conf/mongodb.conf 
    mongod -f  /mongodb/38024/conf/mongodb.conf 
    mongod -f  /mongodb/38025/conf/mongodb.conf 
    mongod -f  /mongodb/38026/conf/mongodb.conf  
    # configure replica set sh1
    mongo --port 38021 admin
    config = {_id: 'sh1', members: [
                            {_id: 0, host: '11.111.24.4:38021'},
                            {_id: 1, host: '11.111.24.4:38022'},
                            {_id: 2, host: '11.111.24.4:38023',"arbiterOnly":true}]
            }
    
    rs.initiate(config)
    
    # configure replica set sh2
    mongo --port 38024  admin
    config = {_id: 'sh2', members: [
                            {_id: 0, host: '11.111.24.4:38024'},
                            {_id: 1, host: '11.111.24.4:38025'},
                            {_id: 2, host: '11.111.24.4:38026',"arbiterOnly":true}]
            }
    
    rs.initiate(config)
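
    After rs.initiate() returns, elections take a few seconds; a minimal sanity check (run against each replica set, e.g. ports 38021 and 38024) prints each member's state:

    mongo --port 38021
    sh1:PRIMARY> rs.status().members.forEach(function(m){ print(m.name, m.stateStr) })
    # expect one PRIMARY, one SECONDARY, and one ARBITER, matching the configs above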

    Config server configuration:

  7. Directory creation
    mkdir -p /mongodb/38018/conf  /mongodb/38018/log  /mongodb/38018/data
    mkdir -p /mongodb/38019/conf  /mongodb/38019/log  /mongodb/38019/data
    mkdir -p /mongodb/38020/conf  /mongodb/38020/log  /mongodb/38020/data
  8. Create the configuration file

    cat > /mongodb/38018/conf/mongodb.conf <<EOF
    systemLog:
      destination: file
      path: /mongodb/38018/log/mongodb.log
      logAppend: true
    storage:
      journal:
        enabled: true
      dbPath: /mongodb/38018/data
      directoryPerDB: true
      #engine: wiredTiger
      wiredTiger:
        engineConfig:
          cacheSizeGB: 1
          directoryForIndexes: true
        collectionConfig:
          blockCompressor: zlib
        indexConfig:
          prefixCompression: true
    net:
      bindIp: 11.111.24.4,127.0.0.1
      port: 38018
    replication:
      oplogSizeMB: 2048
      replSetName: configReplSet
    sharding:
      clusterRole: configsvr  # fixed value for config servers
    processManagement:
      fork: true
    EOF
    
    cp /mongodb/38018/conf/mongodb.conf /mongodb/38019/conf/
    cp /mongodb/38018/conf/mongodb.conf /mongodb/38020/conf/
    sed 's#38018#38019#g' /mongodb/38019/conf/mongodb.conf -i
    sed 's#38018#38020#g' /mongodb/38020/conf/mongodb.conf -i
  9. Start the nodes and configure the replica set

    mongod -f /mongodb/38018/conf/mongodb.conf 
    mongod -f /mongodb/38019/conf/mongodb.conf 
    mongod -f /mongodb/38020/conf/mongodb.conf 
    
    mongo --port 38018 admin
    config = {_id: 'configReplSet', members: [
                            {_id: 0, host: '11.111.24.4:38018'},
                            {_id: 1, host: '11.111.24.4:38019'},
                            {_id: 2, host: '11.111.24.4:38020'}]
            }
    rs.initiate(config) 
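
    As before, confirm the config replica set came up; one minimal check from the same session:

    configReplSet:PRIMARY> rs.isMaster().ismaster  // true once this node has become primary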

    Mongos node configuration:

  10. Create a directory
    mkdir -p /mongodb/38017/conf  /mongodb/38017/log 
  11. Configuration file
    cat > /mongodb/38017/conf/mongos.conf <<EOF
    systemLog:
      destination: file
      path: /mongodb/38017/log/mongos.log
      logAppend: true
    net:
      bindIp: 11.111.24.4,127.0.0.1
      port: 38017
    sharding:
      configDB: configReplSet/11.111.24.4:38018,11.111.24.4:38019,11.111.24.4:38020
    processManagement:
      fork: true
    EOF
  12. Start mongos
    mongos -f /mongodb/38017/conf/mongos.conf

    It is recommended to run multiple routers in production to avoid a single point of failure; the mongos configuration is identical on every node.
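
    Before adding shards, it is worth confirming the router responds; a minimal check against the address above:

    mongo 11.111.24.4:38017/admin
    mongos> db.adminCommand({ ping: 1 })  // { "ok" : 1 } means the mongos is up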

Sharded cluster operation

Connect to one of the mongos instances (11.111.24.4) and perform the following configuration.
(1) Connect to the admin database of the mongos

# su - mongod
$ mongo 11.111.24.4:38017/admin

(2) Add shards

db.runCommand( { addshard : "sh1/11.111.24.4:38021,11.111.24.4:38022,11.111.24.4:38023",name:"shard1"} )
db.runCommand( { addshard : "sh2/11.111.24.4:38024,11.111.24.4:38025,11.111.24.4:38026",name:"shard2"} )
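
The sh.addShard() helper is equivalent; note it does not accept a custom shard name (the replica set name is used instead):

sh.addShard("sh1/11.111.24.4:38021,11.111.24.4:38022,11.111.24.4:38023")
sh.addShard("sh2/11.111.24.4:38024,11.111.24.4:38025,11.111.24.4:38026")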

(3) List shards

mongos> db.runCommand( { listshards : 1 } )

(4) View the overall status

mongos> sh.status();

At this point, the MongoDB Sharding Cluster configuration is complete

Using the sharded cluster

Range sharding configuration and testing

Manually shard the vast collection in the test database.

1. Enable sharding for the database

mongo --port 38017 admin
admin> db.runCommand( { enablesharding : "<database name>" } )

e.g.:
admin> db.runCommand( { enablesharding : "test" } )

2. Specify the shard key and shard the collection

e.g., a range shard key:
-- create the index
use test
> db.vast.ensureIndex( { id: 1 } )
-- enable sharding for the collection
use admin
> db.runCommand( { shardcollection : "test.vast", key : { id: 1 } } )
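
The same two steps can also be written with the sh helpers, which are equivalent to the runCommand forms above:

mongos> sh.enableSharding("test")
mongos> sh.shardCollection("test.vast", { id: 1 })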

3. Verify sharding with test data

admin> use test
test> for(i=1;i<1000000;i++){ db.vast.insert({"id":i,"name":"shenzheng","age":70,"date":new Date()}); }
test> db.vast.stats()
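
In addition to stats(), the getShardDistribution() helper, run through the mongos, summarizes document counts and data size per shard:

test> db.vast.getShardDistribution()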

4. Check the sharding results

shard1:
mongo --port 38021
db.vast.count();

shard2:
mongo --port 38024
db.vast.count();

Hash sharding example:

Shard the vast collection in the test2 database using a hashed shard key.
(1) Enable sharding for test2

mongo --port 38017 admin
use admin
admin> db.runCommand( { enablesharding : "test2" } )

(2) Create a hashed index on the vast collection in the test2 database

use test2
test2> db.vast.ensureIndex( { id: "hashed" } )

(3) Enable sharding for the collection

use admin
admin> sh.shardCollection( "test2.vast", { id: "hashed" } )

(4) Insert 100,000 rows of test data

use test2
for(i=1;i<100000;i++){ db.vast.insert({"id":i,"name":"shenzheng","age":70,"date":new Date()}); }

(5) Check the hash sharding results

mongo --port 38021
use test2
db.vast.count();

mongo --port 38024
use test2
db.vast.count();

Sharding operations

  1. Determine whether this is a sharded cluster
    admin> db.runCommand({ isdbgrid: 1 })

  2. List all shard information
    admin> db.runCommand({ listshards: 1 })

  3. List the databases with sharding enabled
    admin> use config
    config> db.databases.find( { "partitioned": true } )
    or:
    config> db.databases.find()  // list sharding status for all databases

  4. View the shard key of a sharded collection
    config> db.collections.find().pretty()
    {
        "_id" : "test.vast",
        "lastmodEpoch" : ObjectId("58a599f19c898bbfb818b63c"),
        "lastmod" : ISODate("1970-02-19T17:02:47.296Z"),
        "dropped" : false,
        "key" : {
            "id" : 1
        },
        "unique" : false
    }

  5. View detailed sharding information
    admin> db.printShardingStatus()
    or
    admin> sh.status()

  6. Remove a shard node (use with caution)
    (1) Confirm whether the balancer is running
    sh.getBalancerState()
    (2) Remove the shard2 node (use with caution)
    mongos> db.runCommand( { removeShard: "shard2" } )
    Note: the removal immediately triggers the balancer to migrate the shard's chunks away; see the progress check below.
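
    Running the same removeShard command again reports draining progress; repeat it until the returned "state" is "completed":

    mongos> db.runCommand( { removeShard: "shard2" } )  // "state": "started" -> "ongoing" -> "completed"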

Reference: https://www.cnblogs.com/duanxz/p/10730121.html
Official documentation: https://docs.mongodb.com/manual/sharding/
