MongoDB study notes (6): MongoDB replica sets and sharding

Table of contents:

  • MongoDB deployment models
  • MongoDB replica sets
  • MongoDB read/write separation
  • Sharded cluster deployment
  • Best Practices

MongoDB deployment models:

Standalone -> replica set -> sharded cluster

MongoDB replica sets:

A replica set distributes and maintains MongoDB data across multiple nodes: data written on one node is replicated to the other nodes, and modifications are kept in sync.

Before version 3.0 this was called master-slave replication; from 3.0 onward, replica sets are the recommended approach.

 

1. Why use a replica set, and what are the benefits?

  • It helps avoid data loss and protects data safety, improving system reliability. (At least 3 nodes, at most 50 members)
  • It provides automatic failover: after the primary goes down, a new primary is elected, improving the robustness of the system. (At most 7 voting members)
  • It enables read/write separation, improving system performance.

2. Building a replica set:

a. Install MongoDB on three or more machines.

b. Configure mongodb.conf:

replication:
  replSetName: name   # replica set name
  oplogSizeMB: 50     # oplog size (MB)
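
For reference, a fuller mongodb.conf sketch; the port and paths below are illustrative assumptions, not taken from the original setup:

# minimal example config; adjust the port and paths to your environment
net:
  port: 27017
storage:
  dbPath: /usr/local/mongodb/data/db
systemLog:
  destination: file
  path: /usr/local/mongodb/logs/mongodb.log
  logAppend: true
replication:
  replSetName: name
  oplogSizeMB: 50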

c. Run the replica set initialization command on the primary node:

// initialize the replica set
rs.initiate({
    '_id': 'name',
    'version': 1,
    'members': [{ '_id': 0, 'host': 'IP:Port' }]
})

// add secondary nodes
rs.add('IP:Port')
rs.add('IP2:Port')

d. Run rs.status() or rs.isMaster() to check the status of the replica set.
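
For example, a quick way to see each member's state from the shell:

// print each member's name and state (PRIMARY / SECONDARY / ...)
rs.status().members.forEach(function (m) {
    print(m.name + ' -> ' + m.stateStr)
})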

3. Replica set principles and architecture:

a. oplog: records each operation along with its timestamp.

b. Data synchronization: each secondary keeps long-polling the primary (a hand-run sketch of the first two steps follows this list):

  • The secondary reads the latest timestamp from its own local oplog.
  • It queries the primary's oplog for documents later than that timestamp.
  • It loads those documents and applies them according to the log.
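
As a rough illustration, the first two steps can be reproduced by hand from the shell; the loop itself is internal to mongod, and local.oplog.rs is the actual oplog collection:

use local
// 1) latest timestamp in this node's own oplog
var last = db.oplog.rs.find().sort({ $natural: -1 }).limit(1).next().ts
// 2) on the primary, fetch the entries newer than that timestamp
db.oplog.rs.find({ ts: { $gt: last } })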

c. Heartbeat mechanism: members heartbeat each other every 2 seconds, which is how failures are discovered for elections and failover.

d. Elections: when the primary fails, the remaining nodes elect a new primary based on priority using a bully election algorithm; while the election is in progress, the cluster is read-only.
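
Since priority influences the outcome, it can be tuned through the replica set configuration; a sketch (member index 0 is an assumed example):

// give one member a higher election priority
var cfg = rs.conf()
cfg.members[0].priority = 2   // higher priority is preferred as primary
rs.reconfig(cfg)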

MongoDB read/write separation:

In MongoDB, the primary node is generally used for writes and the secondary nodes for reads, and the primary is not fixed (a new primary is elected after the current one goes down), so in a production environment clients must not connect directly to the primary.

Instead, the client is configured with the cluster's nodes:

<mongo:mongo-client replica-set="ip1:port1,ip2:port2,ip3:port3">
    <mongo:client-options read-preference="SECONDARY_PREFERRED"/>
</mongo:mongo-client>

The read-preference parameter controls the read/write separation mode; the available types are:

  • PRIMARY (default): read from the primary; errors if the primary is unavailable.
  • PRIMARY_PREFERRED: prefer the primary; if it is unavailable, read from the secondary nodes.
  • SECONDARY: read from secondary nodes; errors if none is available.
  • SECONDARY_PREFERRED (recommended): prefer secondary nodes; read from the primary only in special cases (for example, when no secondary is available).
  • NEAREST: read from the lowest-latency node, whether primary or secondary.
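
The same preference can also be set per query in the mongo shell; for example (collectionName is a placeholder):

// prefer secondaries for this query only
db.collectionName.find().readPref('secondaryPreferred')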

Sharded cluster deployment:

Why use a sharded architecture:

  • Massive data growth demands greater read/write throughput.
  • A single server's memory and CPU will always hit a bottleneck.

 

The three main roles in a sharded architecture:

1. Shard: the only role in the architecture that stores data. A shard can be a single server or a replica set (replica sets are recommended in production); each shard stores only part of the data.

2. Router: since each shard stores only part of the data, a tool is needed to route each request to the corresponding shard; mongos fills this role.

3. Config servers: store the cluster's metadata (databases, collections, shard key ranges, logging information, and so on); a cluster needs at least three config servers.

 

Building the sharded cluster:

1. Start the shard servers (this needs to be run on each of the three shard servers):

./mongod --port 27010 --dbpath /usr/local/mongodb/data/db/27010 --logpath /usr/local/mongodb/logs/mongodb0.log --fork --shardsvr

2. Configure the config server replica set:

./mongod --port 27011 --dbpath /usr/local/mongodb/data/db/27011 --logpath /usr/local/mongodb/logs/mongodb1.log --fork --logappend --configsvr --replSet cfrs1

rs.initiate({
    '_id': 'cfrs1',
    'version': 1,
    'members': [{
        '_id': 0,
        'host': 'primary IP:primary port'
    }]
})
rs.add('secondary-1 IP:secondary-1 port')
rs.add('secondary-2 IP:secondary-2 port')

3. Start the mongos router:

./mongos --configdb cfrs1/host1,host2,host3 --port 27016 --logpath /usr/local/mongodb/logs/mongodb6.log --fork --logappend

4. Configure sharding.

Connect to mongos and add the shards:

use admin
sh.addShard('host1')
sh.addShard('host2')
sh.addShard('host3')
// enable sharding for the database
sh.enableSharding('dbName')
// configure the collection's shard key
sh.shardCollection(
    'dbName.collectionName',
    { 'shardKeyField': 1 }
)
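
Afterwards, the cluster state and the data distribution can be checked from mongos; for example:

// overall cluster status: shards, databases, chunk distribution
sh.status()
// per-shard data size and document counts for one sharded collection
db.collectionName.getShardDistribution()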

 

Some suggestions for choosing a shard key:

1. Not recommended:

  • Don't use an auto-incrementing field as the shard key, to avoid write hotspot problems.
  • Don't use a coarse-grained field as the shard key, to avoid chunks that can no longer be split.
  • Don't use a completely random shard key; it leads to poor query performance.

2. Recommended:

  • Use a field involved in routine queries as the shard key, combined with a field whose values are unique (e.g., a natural key or an id); a compound-key sketch follows this list.
  • Indexes also matter for sharding: keep the same indexes on every shard; the shard key becomes a default index.
  • A sharded collection only allows unique indexes on _id and the shard key.
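
Combining these recommendations, a compound shard key often works well; a hypothetical example, where customerId stands in for a commonly queried field:

// routine-query field first, unique field second
sh.shardCollection(
    'dbName.collectionName',
    { 'customerId': 1, '_id': 1 }
)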

Best Practices:

1. Prefer a recent, stable, 64-bit version of MongoDB.

2. Design the data model carefully: favor the single-document design, expressing relationships as embedded documents or embedded arrays; when the amount of related data is large, consider implementing the association through separate collections, or a custom DBRef association.

3. Avoid using skip to page through large amounts of data (a paging sketch follows this list):

  • Narrow the data range with query conditions wherever possible.
  • Use a value from the previous page's results as the query condition for the next page.
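
A sketch of the second point, using _id as the paging cursor (collectionName and the page size are placeholders):

// first page, ordered by _id
var page = db.collectionName.find().sort({ _id: 1 }).limit(10).toArray()
var lastId = page[page.length - 1]._id
// next page: start after the last _id instead of calling skip()
db.collectionName.find({ _id: { $gt: lastId } }).sort({ _id: 1 }).limit(10)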

4. Avoid the query operators that cannot use an index on their own ($ne, $nin, $where, etc.).

5. Choose an appropriate write strategy for the business scenario, striking a balance between data safety and performance.
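
For instance, a write that favors safety over speed might look like this; the values shown are illustrative:

// wait for a majority of nodes and the journal, with a 5s timeout
db.collectionName.insertOne(
    { item: 'example' },
    { writeConcern: { w: 'majority', j: true, wtimeout: 5000 } }
)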

6. Indexing is very important.

7. Enabling the profiler in production is recommended; it makes performance tuning easier.
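
A minimal way to turn it on and inspect the results (the 100 ms threshold is an example value):

// level 1 = record only operations slower than the threshold (in ms)
db.setProfilingLevel(1, 100)
// most recent captured operations
db.system.profile.find().sort({ ts: -1 }).limit(5)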

8. Enabling auth mode in production is recommended, to keep the system secure.
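
A sketch of the first step, creating an administrator before restarting mongod with --auth; the user name and password are placeholders:

use admin
db.createUser({
    user: 'admin',
    pwd: 'changeme',
    roles: [{ role: 'root', db: 'admin' }]
})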

9. Don't deploy MongoDB on the same machine as other services (even though MongoDB's maximum memory usage is configurable).

10. On a standalone node, the journal must be enabled; for business scenarios where the data volume is not too large, a multi-machine replica set with read/write separation is recommended.

11. Mind the shard key considerations above.
