_id and ObjectId in MongoDB

_id and ObjectId
Every document stored in MongoDB must have an "_id" key. Its value can be of any type; if none is supplied, it defaults to an ObjectId.
Within a collection, each document must have a unique "_id" value, so that every document in the collection can be uniquely identified. If there are
two collections, each of them can contain a document whose "_id" is 123, but neither collection can contain more than one
document whose "_id" is 123.
1. ObjectId
ObjectId is the default type for "_id". It is designed to be lightweight, and to be easy for different machines to generate in a way that is still globally unique.
This is the main reason MongoDB uses ObjectIds rather than a more conventional approach such as an auto-incrementing primary key: synchronizing auto-incrementing values across multiple
servers is laborious and time-consuming. MongoDB was designed from the ground up to be a distributed database, and handling multiple
nodes is a core requirement. The ObjectId type is much easier to generate in a sharded environment.

An ObjectId uses 12 bytes of storage. It is usually displayed as a string of 24 hexadecimal digits, two digits per byte. Because this looks long,
many people find ObjectIds unwieldy. Keep in mind, though, that the 24-character string is twice as long as the 12 bytes actually stored.
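The relationship between the printed form and the stored form can be checked with a few lines of Python. This is only an illustration using the standard library; the hexadecimal value is an arbitrary example, not one taken from the text above.

```python
# Two hexadecimal characters encode one byte, so the printed form
# of a 12-byte value is 24 characters long.
hex_id = "507f1f77bcf86cd799439011"  # arbitrary example value (assumption)

raw = bytes.fromhex(hex_id)

print(len(hex_id))  # length of the printed form, in characters
print(len(raw))     # length of the stored form, in bytes

# Converting back gives the same string, so nothing is lost either way.
assert raw.hex() == hex_id
```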

If you create several ObjectIds in quick succession, you will see that only the last few digits change each time. If you pause for a
few seconds between creations, a few digits earlier in the string change as well, because the timestamp has advanced. This follows from how an ObjectId is built. The 12 bytes are laid out as follows:
      0  1  2  3 | 4  5  6 | 7  8 | 9  10 11
      timestamp  | machine | PID  | counter
The first 4 bytes are a timestamp: the number of seconds since the Unix epoch. This brings some useful properties.

The timestamp, combined with the following 5 bytes, provides uniqueness at the granularity of one second.
Since the timestamp comes first, ObjectIds sort roughly in the order in which they were inserted. This is useful for things like indexing them
efficiently, but the ordering is only approximate, not guaranteed. These 4 bytes also carry an implicit record of when the document was created. Most drivers expose
a method to get this information from an ObjectId.
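As a rough sketch of what such a driver method does, the creation time can be recovered by reading the first 4 bytes as a big-endian integer. The function name `objectid_creation_time` and the example value are assumptions made for illustration; a real driver provides its own accessor.

```python
from datetime import datetime, timezone


def objectid_creation_time(hex_id: str) -> datetime:
    """Return the creation time encoded in a 24-character ObjectId string.

    Hypothetical helper for illustration, not a driver API.
    """
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes")
    # Bytes 0-3 hold the seconds since the Unix epoch, most significant byte first.
    seconds = int.from_bytes(raw[0:4], "big")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


example = "507f1f77bcf86cd799439011"  # arbitrary example value (assumption)
created = objectid_creation_time(example)
print(created.isoformat())  # 2012-10-17T21:13:27+00:00
```

Because the timestamp occupies the leading bytes, sorting the hexadecimal strings sorts by creation second first, which is where the "roughly insertion order" property comes from.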

Because the current time is used, many users worry that the clocks of their servers must be synchronized. This is not necessary: the actual value of the timestamp is not
important, only that it changes often (once per second) and keeps increasing.

The next 3 bytes are a unique identifier of the machine on which the ObjectId is generated, usually a hash of the machine's hostname. This ensures that different machines generate different
ObjectIds without conflict.

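To ensure that ObjectIds generated by multiple concurrent processes on the same machine are unique, the next 2 bytes are taken from the process identifier (PID) of the generating process. The last 3 bytes are an automatically incremented counter, which
ensures that ObjectIds generated by the same process in the same second are also different. Up to 256^3 (16,777,216) distinct ObjectIds can be generated per process in a single second.
Note that current driver versions replace the machine and PID fields with a 5-byte random value generated once per process; the timestamp and counter fields are unchanged.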
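The 4 + 3 + 2 + 3 byte layout described in this section can be sketched as a small generator. This is a minimal illustration, not the code of any actual driver: the class name `LegacyObjectIdGenerator`, the choice of SHA-256 for hashing the hostname, and the random starting value of the counter are all assumptions.

```python
import hashlib
import os
import socket
import threading
import time


class LegacyObjectIdGenerator:
    """Hypothetical generator following the timestamp|machine|PID|counter layout."""

    def __init__(self) -> None:
        # 3 bytes: hash of the hostname (SHA-256 is an assumption; any hash would do).
        hostname = socket.gethostname().encode("utf-8")
        self._machine = hashlib.sha256(hostname).digest()[:3]
        # 2 bytes: the process id, truncated to 16 bits.
        self._pid = (os.getpid() & 0xFFFF).to_bytes(2, "big")
        # 3-byte counter; starting at a random value is an assumption.
        self._counter = int.from_bytes(os.urandom(3), "big")
        self._lock = threading.Lock()

    def next_id(self) -> bytes:
        # The lock keeps the counter consistent across threads in this process.
        with self._lock:
            self._counter = (self._counter + 1) % (1 << 24)
            counter = self._counter
        # 4 bytes: seconds since the Unix epoch, most significant byte first.
        timestamp = int(time.time()) & 0xFFFFFFFF
        return (
            timestamp.to_bytes(4, "big")
            + self._machine
            + self._pid
            + counter.to_bytes(3, "big")
        )


gen = LegacyObjectIdGenerator()
ids = [gen.next_id().hex() for _ in range(3)]
for value in ids:
    print(value)
```

Printing a few values in a row shows the behaviour described earlier: the machine and PID digits stay fixed, the trailing counter digits change on every call, and the leading timestamp digits change only when the second rolls over.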

2. Automatic generation of _id
As mentioned earlier, if a document has no "_id" key when it is inserted, one is created for it automatically. The MongoDB server can do this, but
it is usually done by the driver on the client side, for the following reasons:
Although ObjectIds are designed to be lightweight and easy to generate, generating them still has some overhead. Generating them on the client side reflects MongoDB's design
philosophy: work should be pushed from the server to the driver whenever possible. The reasoning is that even with a scalable database like
MongoDB, it is much easier to scale the application layer than the database layer. Moving work to the client reduces the load that the database has to scale for.

When ObjectIds are generated on the client side, the driver can provide a richer API. For example, a driver's insert method can return the
generated ObjectId, or add it directly to the document that was passed in. If the driver left ObjectId generation to the server, a separate query would be required to
find out the "_id" value of the inserted document.
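The benefit of client-side generation can be sketched with a toy in-memory collection. Everything here is hypothetical: `FakeCollection` is not a real driver class, and a random 12-byte value stands in for a proper ObjectId. The point is only that the insert method already knows the "_id" before anything is sent to the server.

```python
import os
from typing import Any, Dict


class FakeCollection:
    """Hypothetical in-memory stand-in for a collection; not a real driver API."""

    def __init__(self) -> None:
        self._docs: Dict[Any, Dict[str, Any]] = {}

    def _store(self, doc: Dict[str, Any]) -> None:
        # Plays the role of the server: "_id" must be unique within the collection.
        key = doc["_id"]
        if key in self._docs:
            raise KeyError(f"duplicate _id: {key!r}")
        self._docs[key] = dict(doc)

    def insert(self, doc: Dict[str, Any]) -> Any:
        # Client side: fill in "_id" before sending, if the caller did not.
        if "_id" not in doc:
            doc["_id"] = os.urandom(12).hex()  # placeholder for a real ObjectId
        self._store(doc)
        # The id is known locally, so no follow-up query is needed.
        return doc["_id"]


users = FakeCollection()
document = {"name": "example"}
new_id = users.insert(document)
print(new_id == document["_id"])  # the caller's document now carries the id
```

The same sketch also shows the uniqueness rule from the start of the article: inserting a second document with an "_id" that already exists in `users` raises an error, while another `FakeCollection` instance could hold the same "_id" without conflict.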
