MongoDB · Best Practices · Analysis of Inaccurate count Results

Background

Besides the stale reads that replication lag can cause when querying a secondary node, there are further questions about the accuracy of count. The MongoDB 4.0 official documentation says:
On a sharded cluster, db.collection.count() without a query predicate can result in an inaccurate count if orphaned documents exist or if a chunk migration is in progress.
To avoid these situations, on a sharded cluster, use the db.collection.aggregate() method

The MongoDB 3.6 official documentation, however, describes it this way:
On a sharded cluster, db.collection.count() can result in an inaccurate count if orphaned documents exist or if a chunk migration is in progress.
To avoid these situations, on a sharded cluster, use the db.collection.aggregate() method

In other words, on a MongoDB 4.0 sharded cluster, a count without a query predicate, i.e. a full-collection count, can return an inaccurate result in the following two scenarios. In versions before 4.0, the count value can be inaccurate in these two scenarios even with a query predicate.
1 Orphaned documents exist
2 A moveChunk operation is in progress in the sharded cluster
This article analyzes, for both scenarios, the causes of the inaccurate count and the measures that avoid it.

Inaccurate counts caused by orphaned documents

Definition and causes of orphaned documents

An orphaned document arises during a moveChunk when the migration fails because of an abnormal process shutdown, or when the post-migration cleanup on the source shard fails, leaving the affected records present on both the source and the target shard. By the definition of a MongoDB sharded cluster, however, a document must belong to exactly one chunk and one shard.
Obviously, orphaned documents can make count inaccurate, and if there are many of them they also waste disk storage.

In general, a moveChunk operation consists of the following steps:

  • The balancer sends a moveChunk command to the source shard
  • The source shard starts the internal data migration; for the duration of the migration, the source continues to receive all access requests, both reads and writes
  • The target shard builds the corresponding indexes
  • The target shard begins copying the chunk's data from the source
  • After the target receives the last document of the chunk, it starts a synchronization process to apply the incremental changes the source generated during the chunk migration
  • When the incremental synchronization is complete, the source shard connects to the config database and modifies the metadata, i.e. changes which shard the chunk belongs to
  • Once the config metadata has been modified, the source shard starts deleting the chunk data it migrated
    (Figure: the moveChunk process)
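A single chunk migration can also be triggered by hand to observe these steps. A minimal sketch, assuming test.user is sharded on { _id: 1 } and a shard named shard0001 exists (both names are just for illustration):

// Move the chunk containing { _id: 12345 } of test.user to shard0001
mongos> sh.moveChunk("test.user", { _id: 12345 }, "shard0001")
// Inspect chunk ownership in the config metadata before and after
mongos> use config
mongos> db.chunks.find({ ns: "test.user" }, { min: 1, max: 1, shard: 1 })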

In MongoDB's design, the copy of data from source to target is serial, i.e. chunks are migrated one at a time. For the sake of migration efficiency, however, the last step, cleaning up the leftover data on the source shard, is asynchronous: once the config metadata has been modified, the migration of the next chunk can start immediately, without waiting for the source shard to finish its cleanup.
The source shard puts the old chunk data to be cleaned up into a queue; in some scenarios the cleanup is slow and the queue builds up, and if the primary node crashes at that moment, orphaned documents are produced.

Simulation and analysis

As can be seen from the chunk migration process, if the migration itself fails, we do not know for certain whether the partially copied data on the target shard gets cleaned up; in theory this could also leave orphaned documents on the target shard. But because moveChunk is processed serially, even if that happens, at most one chunk is affected. If the final step of cleaning up the old chunk data on the source fails, however, orphaned documents at the source are guaranteed, and in the worst case a large number of chunks' worth of orphaned documents may be produced.

Simulating orphaned documents is therefore simple: just kill -9 the primary mongod process during heavy moveChunk activity. For example, when one shard already holds data, adding another shard to the cluster inevitably triggers a large number of moveChunk operations; that is the approach used here.
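A minimal sketch of that setup, assuming an existing single-shard cluster and a new shard backed by a replica set named rs1 (all names and hosts below are hypothetical):

// Add a second shard; the balancer then starts migrating chunks to it
mongos> sh.addShard("rs1/host1:27018,host2:27018,host3:27018")
// While the migrations run, kill -9 the primary mongod of the source shard,
// then restart the shard's members and re-check the counts as below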

// Before adding the shard, confirm that sh.isBalancerRunning() is false, since count is inaccurate during moveChunk anyway
mongos> db.user.count({_id:{$gte:0}})
43937296
mongos> db.user.count()
43937296
// During the add-shard process, kill -9 the mongod process, then restart each shard
mongos> db.user.count({_id:{$gte:0}})
43937296
mongos> db.user.count()
51028273
mongos> db.user.aggregate([{ $count:"myCount"}])
{ "myCount" : 43937296 }

As can be seen above, only the full-collection count without a query predicate returns an inaccurate result. In sharded-cluster mode, that kind of count takes its value directly from metadata: mongos obtains the table's count value from each shard one by one and returns the sum. Because orphaned documents exist, the returned result is larger than the exact one. In this case, 7,090,977 orphaned documents were produced (51028273 - 43937296).
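Where those per-shard numbers come from can be seen with db.collection.stats(), which on mongos reports each shard's own document count; the predicate-less count is effectively the sum of these values, orphans included. A minimal sketch:

mongos> var s = db.user.stats()
mongos> Object.keys(s.shards).forEach(function (name) {
...       print(name, s.shards[name].count)
... })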

How to avoid and eliminate orphaned documents

On the one hand, we can reduce the number of orphaned documents that get produced. By default the cleanup of the source shard's data is an asynchronous call, but a setting can make it a synchronous call; with that setting in place, a primary node crash can produce orphans for at most one chunk. This is not recommended, though, as it is of limited value. The setting is:

use config
db.settings.update( { "_id" : "balancer" },{ $set : { "_waitForDelete" : true } },{ upsert : true })

On the other hand, if orphaned documents have already been produced, the MongoDB documentation provides a method to clean up all orphaned documents on a shard; it is executed directly on the primary of each shard, as follows:

var nextKey = { };
var result;
// Walk the shard's ranges: each cleanupOrphaned call deletes one range of
// orphaned documents and reports where to continue from
while ( nextKey != null ) {
  result = db.adminCommand( { cleanupOrphaned: "test.user", startingFromKey: nextKey } );
  if (result.ok != 1)
    print("Unable to complete at this time: failure or timeout.")
  printjson(result);
  nextKey = result.stoppedAtKey;
}
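Note that this loop matches the cleanupOrphaned interface of MongoDB 4.2 and earlier; from 4.4 on, the command itself waits until the namespace's orphaned ranges have been cleaned up, and startingFromKey, if given, is ignored.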

Inaccurate counts during moveChunk

Symptom

That a table is in the moveChunk stage can be confirmed from the mongod log or with the sh.isBalancerRunning() command.
For ease of observation we set _waitForDelete to true, i.e. the source deletes the chunk immediately after the migration completes. Once a shard starts migrating a chunk's data, the observed count value first grows quickly and then shrinks relatively slowly;
this cycle repeats for every chunk migration until sh.isBalancerRunning() returns false, after which the value stabilizes at the accurate count.
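A minimal sketch for watching this fluctuation from mongos:

// Print the count once per second until balancing stops
mongos> while (sh.isBalancerRunning()) {
...       print(new Date(), db.user.count())
...       sleep(1000)
... }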

Cause Analysis

  • During a moveChunk, until the migration completes, the data of the chunk being migrated exists on both the source and the target shard
  • When a count without a predicate is executed at that moment, the data of the chunk whose migration has not finished is counted on both the source and the target, so the observed value rises
  • After the data copy finishes and the metadata is modified, the source shard starts cleaning up its copy of the data, so during this phase the count value gradually decreases
  • The decrease is relatively slow, presumably because deleting the chunk data on the source shard takes longer than copying it

During a moveChunk, non-count operations, i.e. ordinary queries, obviously cannot tolerate this kind of error. Per the earlier analysis of the migration process, while the chunk data is being copied the source receives all access requests, and while the source data is being deleted after the metadata change, the target receives all access requests. In other words, an ordinary query verifies that the chunk it touches really belongs to exactly one shard, in full agreement with the config server metadata, so its result is accurate.

In versions before MongoDB 4.0, even a count with a predicate could return an inaccurate value. That is because, in those versions, the implementation of count with a predicate differed from that of an ordinary query: it did not check whether each chunk it traversed really belonged to only that shard. From version 4.0 on, its implementation is the same as an ordinary query's, which eliminates this source of inaccuracy.

Looked at from the design side, the fact that version 4.0 still does not guarantee the accuracy of count without a predicate can be seen as a trade-off in favour of performance and efficiency: in this scenario most applications do not need a very precise count and care more about a "fast count", i.e. returning a value directly from the metadata without traversing the data. Where an exact count is needed, the aggregate method can be used instead. So this cannot be called a bug; if there is anything to optimize, it is perhaps only that the two kinds of count are exposed through the same command, which is misleading.

Improvements and workarounds

1 To get both efficiency and accuracy, set a balancing window for the balancer, so that moveChunk is forbidden outside the window (see the sketch after this list)
2 For scenarios that emphasize data accuracy, use the db.collection.aggregate() method instead of count
3 For count operations with a predicate, upgrade MongoDB to 4.0 or later
4 When a large number of orphaned documents exist, clean them up
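
For item 1, a minimal sketch of setting a balancing window in the config database (the window times are just examples):

// Allow the balancer, and therefore moveChunk, only between 02:00 and 06:00
use config
db.settings.update(
   { _id: "balancer" },
   { $set : { activeWindow : { start : "02:00", stop : "06:00" } } },
   { upsert: true }
)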

Origin: yq.aliyun.com/articles/704434