Grouping documents and taking the Top N from each group is a common requirement in real-world development. A shopping site, for example, often displays a list of stores, with each entry showing several of that store's products. The simplest approach is to fetch the store list and then query the specified number of products for each store, but issuing a separate query per store consumes a lot of resources.
In this article, we will use a simple example to show how to implement grouping and get Top N data in MongoDB.
Example
First, we have a MongoDB collection named user that stores the following user documents.
[
  { "name": "刘大", "age": 28, "status": "active" },
  { "name": "陈二", "age": 25, "status": "active" },
  { "name": "张三", "age": 25, "status": "active" },
  { "name": "李四", "age": 25, "status": "active" },
  { "name": "王五", "age": 23, "status": "active" },
  { "name": "赵六", "age": 23, "status": "active" },
  { "name": "孙七", "age": 23, "status": "inactive" },
  { "name": "周八", "age": 23, "status": "active" }
]
Based on this data, we want to take the first two active users at each age (the documents added earliest come first) and output the groups ordered by age from youngest to oldest.
First, we use the $match operator to filter out the documents whose status is not active. Because the groups must be ordered from youngest to oldest, we sort by age in ascending order (ascending order is represented by 1 in MongoDB). In addition, so that each group yields the two documents that were added first, we also sort by createdAt in ascending order; the sample listing above does not show this field. The age sort could also be performed after $group, but here we combine it directly with the time sort.
After filtering and sorting, we use the $group operator to group by the specified field. Since age is the grouping key, we set _id to $age. We want an array of documents in each group, so we use the $push accumulator to save each document (referenced by $$ROOT, which represents the root document) into persons. Once grouping is complete, the persons array of each group holds all the documents in that group. To keep only the Top N elements, we use $slice inside a $project stage to limit the number of documents returned per group.
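db.user.aggregate([
  // Keep only active users
  {
    $match: {
      status: 'active',
    },
  },
  // Youngest first; within one age, earliest created first
  {
    $sort: {
      age: 1,
      createdAt: 1,
    },
  },
  // Collect every document of the same age into persons
  {
    $group: {
      _id: '$age',
      persons: {
        $push: '$$ROOT',
      },
    },
  },
  // Keep only the first two documents of each group
  {
    $project: {
      _id: 0,
      age: '$_id',
      persons: {
        $slice: [
          '$persons',
          2,
        ],
      },
    },
  },
]);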
Executing the query returns the following results:
[
  {
    "age": 23,
    "persons": [
      { "name": "王五", "age": 23, "status": "active" },
      { "name": "赵六", "age": 23, "status": "active" }
    ]
  },
  {
    "age": 25,
    "persons": [
      { "name": "陈二", "age": 25, "status": "active" },
      { "name": "张三", "age": 25, "status": "active" }
    ]
  },
  {
    "age": 28,
    "persons": [
      { "name": "刘大", "age": 28, "status": "active" }
    ]
  }
]
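To sanity-check the logic without a running database, the four stages can be mimicked in plain JavaScript. This is only an in-memory sketch and not how MongoDB executes the pipeline. The function name topNPerAge is hypothetical, and the createdAt values are an assumption: the sample data does not list them, so they are assigned here in array order.

```javascript
// Sample documents from the article. createdAt is hypothetical: it is
// assigned from the array index, assuming insertion order matches array order.
const users = [
  { name: "刘大", age: 28, status: "active" },
  { name: "陈二", age: 25, status: "active" },
  { name: "张三", age: 25, status: "active" },
  { name: "李四", age: 25, status: "active" },
  { name: "王五", age: 23, status: "active" },
  { name: "赵六", age: 23, status: "active" },
  { name: "孙七", age: 23, status: "inactive" },
  { name: "周八", age: 23, status: "active" },
].map((doc, index) => ({ ...doc, createdAt: index }));

// Hypothetical helper that mirrors $match, $sort, $group and $project/$slice.
function topNPerAge(docs, n) {
  // $match: keep only active users.
  const active = docs.filter((doc) => doc.status === "active");

  // $sort: age ascending, then createdAt ascending.
  const sorted = [...active].sort(
    (a, b) => a.age - b.age || a.createdAt - b.createdAt
  );

  // $group with $push: collect documents per age, keeping the sorted order.
  const groups = new Map();
  for (const doc of sorted) {
    if (!groups.has(doc.age)) {
      groups.set(doc.age, []);
    }
    groups.get(doc.age).push(doc);
  }

  // $project with $slice: keep the first n documents of each group.
  return [...groups].map(([age, persons]) => ({
    age,
    persons: persons.slice(0, n),
  }));
}

const result = topNPerAge(users, 2);
for (const group of result) {
  console.log(group.age, group.persons.map((person) => person.name));
}
```

A Map is used for the groups because it iterates in insertion order, so the groups come out in the same age order that the sort produced.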
Return results without grouping
The output above is still grouped. If you need a flat array of documents instead, you can append the $unwind and $replaceRoot operators, as in the following example:
db.user.aggregate([
  // ...the $match, $sort, $group and $project stages from the previous query
  {
    "$unwind": "$persons"
  },
  {
    "$replaceRoot": {
      "newRoot": "$persons"
    }
  }
])
Executing this query returns:
[
  { "name": "王五", "age": 23, "status": "active" },
  { "name": "赵六", "age": 23, "status": "active" },
  { "name": "陈二", "age": 25, "status": "active" },
  { "name": "张三", "age": 25, "status": "active" },
  { "name": "刘大", "age": 28, "status": "active" }
]
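The effect of the two extra stages can be illustrated in plain JavaScript as well. The sketch below is an in-memory approximation under the assumption that the input is the grouped result shown earlier; the variable names grouped, unwound and flat are chosen here for illustration.

```javascript
// Grouped result from the previous query, as listed in the article.
const grouped = [
  {
    age: 23,
    persons: [
      { name: "王五", age: 23, status: "active" },
      { name: "赵六", age: 23, status: "active" },
    ],
  },
  {
    age: 25,
    persons: [
      { name: "陈二", age: 25, status: "active" },
      { name: "张三", age: 25, status: "active" },
    ],
  },
  {
    age: 28,
    persons: [{ name: "刘大", age: 28, status: "active" }],
  },
];

// $unwind: emit one document per array element, so that persons
// holds a single document instead of an array.
const unwound = grouped.flatMap((group) =>
  group.persons.map((person) => ({ age: group.age, persons: person }))
);

// $replaceRoot: promote the embedded persons document to the top level.
const flat = unwound.map((doc) => doc.persons);

console.log(flat.map((person) => person.name));
```

Splitting the work into two steps mirrors the pipeline: after the first step each document still carries its group's age next to a single embedded person, and only the second step discards that wrapper.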