MongoDB aggregation operations
Commonly used pipeline stages in MongoDB aggregation are:
- $match: filter stage; passes on only the documents that meet the given conditions
- $group: groups the documents in the collection; used to compute statistics per group
- $project: mapping stage; reshapes the output by selecting which fields appear
- $sort: sort stage; sorts the input documents and outputs them in order
- $limit: limits the number of documents passed on by the pipeline
- $skip: skips the specified number of documents and passes on the remaining ones
1. Prepare a set of data
db.data.insertMany([{name:"Tom", city:"cityA",type:"aaa",num:609,age:18},
{name : "allen", city :"cityC", type: "bbb", num : 549,age:20},
{name :"jerry", city :"cityA", type :"bbb", num : 593,age:22},
{name :"frank", city : "cityB", type:"aaa", num : 657,age:21},
{name :"jack", city : "cityC", type:"aaa", num : 620,age:18},
{name :"alice", city : "cityB", type:"ccc", num : 584,age:20},
{name :"marry", city:"cityA", type:"bbb", num : 599,age:22}
])
db.data.find()
2. $group grouping pipeline
2.1 A single statistic per group
Group the documents by city and compute the average of num for each group.
db.data.aggregate({$group:{_id:'$city',avg_num:{$avg:'$num'}}})
2.2 Multiple statistics per group
db.data.aggregate({$group:{_id:'$city',avg_num:{$avg:'$num'},avg_age:{$avg:'$age'}}})
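To make the grouping step concrete, here is a minimal sketch in plain JavaScript that mimics what the two $group commands compute on the sample data. It runs in memory without a MongoDB server; the helper name groupAvg and the avg_ prefix logic are hypothetical and only illustrate the idea, not how MongoDB implements $group.

```javascript
// In-memory copy of the sample data (only the fields used here).
const docs = [
  { name: "Tom",   city: "cityA", num: 609, age: 18 },
  { name: "allen", city: "cityC", num: 549, age: 20 },
  { name: "jerry", city: "cityA", num: 593, age: 22 },
  { name: "frank", city: "cityB", num: 657, age: 21 },
  { name: "jack",  city: "cityC", num: 620, age: 18 },
  { name: "alice", city: "cityB", num: 584, age: 20 },
  { name: "marry", city: "cityA", num: 599, age: 22 },
];

// Hypothetical helper: group by keyField and average every field in avgFields.
function groupAvg(input, keyField, avgFields) {
  // Step 1: put each document into the bucket of its grouping key (the _id).
  const buckets = new Map();
  for (const doc of input) {
    const key = doc[keyField];
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(doc);
  }
  // Step 2: emit one result document per bucket.
  const out = [];
  for (const [key, members] of buckets) {
    const row = { _id: key };
    for (const field of avgFields) {
      const total = members.reduce((sum, d) => sum + d[field], 0);
      row["avg_" + field] = total / members.length;
    }
    out.push(row);
  }
  return out;
}

const groupResult = groupAvg(docs, "city", ["num", "age"]);
console.log(groupResult);
// cityB, for example, gets avg_num 620.5 and avg_age 20.5
```

The sketch emits groups in first-seen order; MongoDB makes no promise about the order of $group output unless a $sort stage follows.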
3. $match filter pipeline
The output of $match is passed on to the next stage in the pipeline.
Keep only the documents whose city is not "cityC" (that is, filter out "cityC"), group them by city, and compute the average of num for each group.
db.data.aggregate({$match:{city:{$ne:"cityC"}}},{$group:{_id:'$city',avg_num:{$avg:'$num'}}})
Keep only the documents with age ≥ 20, group them by city, and compute the average of num for each group.
db.data.aggregate({$match:{age:{$gte:20}}},{$group:{_id:'$city',avg_num:{$avg:'$num'}}})
In these commands:
- _id is the grouping key
- avg_num is the newly defined output field name
- $avg is the accumulator expression, here the one that computes an average
- '$city' specifies the field to group by
- '$num' specifies the field whose values are averaged
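The filter-then-group flow can be traced by hand with a small plain-JavaScript sketch. It mirrors the age ≥ 20 command on an in-memory copy of the sample data; the variable names are illustrative assumptions, and the code only imitates the behaviour of $match and $group rather than calling MongoDB.

```javascript
// In-memory copy of the sample data (only the fields used here).
const docs = [
  { name: "Tom",   city: "cityA", num: 609, age: 18 },
  { name: "allen", city: "cityC", num: 549, age: 20 },
  { name: "jerry", city: "cityA", num: 593, age: 22 },
  { name: "frank", city: "cityB", num: 657, age: 21 },
  { name: "jack",  city: "cityC", num: 620, age: 18 },
  { name: "alice", city: "cityB", num: 584, age: 20 },
  { name: "marry", city: "cityA", num: 599, age: 22 },
];

// $match analogue: keep only documents with age >= 20.
const matched = docs.filter((d) => d.age >= 20);

// $group analogue: accumulate a running total and count per city.
const sums = {};
for (const d of matched) {
  const g = sums[d.city] || (sums[d.city] = { total: 0, count: 0 });
  g.total += d.num;
  g.count += 1;
}

// Build one output document per group: _id is the key, avg_num the average.
const matchResult = Object.entries(sums).map(([city, g]) => ({
  _id: city,
  avg_num: g.total / g.count,
}));
console.log(matchResult);
// Tom and jack (both 18) are filtered out before grouping,
// so cityC is averaged over allen alone: avg_num 549.
```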
Extension: counting the documents in each group
To count the number of documents in each group, use $sum.
$sum computes the sum of an expression over the group. To count with $sum, for example the number of people in each city, write:
db.data.aggregate({$group:{_id:'$city',count:{$sum:1}}})
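That is, the constant 1 is added once for every document in the group, which yields the document count. If written as {$sum:2}, each document contributes 2, so for this data the results are 6 for cityA and 4 each for cityB and cityC.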
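The effect of the constant passed to $sum can be checked with a short plain-JavaScript sketch. The helper groupSum is hypothetical; it adds a fixed increment per document, which is what {$sum:1} and {$sum:2} amount to on the sample data.

```javascript
// Only the city field matters for counting.
const docs = [
  { name: "Tom",   city: "cityA" },
  { name: "allen", city: "cityC" },
  { name: "jerry", city: "cityA" },
  { name: "frank", city: "cityB" },
  { name: "jack",  city: "cityC" },
  { name: "alice", city: "cityB" },
  { name: "marry", city: "cityA" },
];

// Hypothetical helper: add `increment` to the group's total for every document.
function groupSum(input, keyField, increment) {
  const totals = {};
  for (const d of input) {
    totals[d[keyField]] = (totals[d[keyField]] || 0) + increment;
  }
  return totals;
}

const countBy1 = groupSum(docs, "city", 1); // analogue of {$sum:1}
const countBy2 = groupSum(docs, "city", 2); // analogue of {$sum:2}
console.log(countBy1); // cityA: 3, cityC: 2, cityB: 2
console.log(countBy2); // cityA: 6, cityC: 4, cityB: 4
```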
4. $project mapping pipeline
db.data.aggregate({$group:{_id:'$city',avg_num:{$avg:'$num'},avg_age:{$avg:'$age'}}},{$project:{avg_num:1}})
As shown in the figure, the result no longer includes avg_age; only _id and avg_num are displayed. The _id field is included by default unless it is explicitly excluded with _id:0.
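A rough plain-JavaScript sketch of the inclusion behaviour of $project follows. The helper project is hypothetical and handles only the simple 1/0 inclusion form used here; the input array stands in for the output of the preceding $group stage on the sample data.

```javascript
// Stand-in for the $group output (averages computed from the sample data).
const grouped = [
  { _id: "cityA", avg_num: 1801 / 3, avg_age: 62 / 3 },
  { _id: "cityB", avg_num: 620.5, avg_age: 20.5 },
  { _id: "cityC", avg_num: 584.5, avg_age: 19 },
];

// Hypothetical helper: keep fields flagged with 1; keep _id unless flagged 0.
function project(input, spec) {
  return input.map((doc) => {
    const out = {};
    if (spec._id !== 0 && "_id" in doc) out._id = doc._id;
    for (const [field, flag] of Object.entries(spec)) {
      if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
    }
    return out;
  });
}

const projected = project(grouped, { avg_num: 1 });
console.log(projected);
// Each result document now has only _id and avg_num.

const withoutId = project(grouped, { _id: 0, avg_num: 1 });
console.log(withoutId);
// Each result document now has only avg_num.
```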
5. $sort, $skip, $limit
- Sort by age in descending order, skip the first document, and take the next three
db.data.aggregate({$sort:{age:-1}},{$skip:1},{$limit:3})
- Sort by age in descending order, take the first three documents, then skip the first of them
db.data.aggregate({$sort:{age:-1}},{$limit:3},{$skip:1})
- Take the first three documents, skip the first one, and then sort the rest by age in descending order
db.data.aggregate({$limit:3},{$skip:1},{$sort:{age:-1}})
Pipeline stages have no precedence among themselves; they are applied in the order written, from left to right, so the three commands above return different results.
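The three orderings can be compared side by side with a plain-JavaScript sketch that treats each stage as a function and applies the stages from left to right. The names runPipeline, skip, limit and sortByAgeDesc are hypothetical. The sketch also assumes that documents with equal age keep their insertion order, which MongoDB does not guarantee.

```javascript
// In-memory copy of the sample data (only the fields used here).
const docs = [
  { name: "Tom",   age: 18 },
  { name: "allen", age: 20 },
  { name: "jerry", age: 22 },
  { name: "frank", age: 21 },
  { name: "jack",  age: 18 },
  { name: "alice", age: 20 },
  { name: "marry", age: 22 },
];

// Each stage takes an array of documents and returns a new array.
const sortByAgeDesc = (input) => [...input].sort((a, b) => b.age - a.age);
const skip = (n) => (input) => input.slice(n);
const limit = (n) => (input) => input.slice(0, n);

// Apply the stages strictly in the order given.
const runPipeline = (input, stages) =>
  stages.reduce((current, stage) => stage(current), input);

const names = (input) => input.map((d) => d.name);

const sortSkipLimit = names(runPipeline(docs, [sortByAgeDesc, skip(1), limit(3)]));
const sortLimitSkip = names(runPipeline(docs, [sortByAgeDesc, limit(3), skip(1)]));
const limitSkipSort = names(runPipeline(docs, [limit(3), skip(1), sortByAgeDesc]));

console.log(sortSkipLimit); // three documents
console.log(sortLimitSkip); // two documents
console.log(limitSkipSort); // [ 'jerry', 'allen' ]
```

Only the last pipeline sorts after truncating, so it orders just the two documents that survive $limit and $skip instead of the whole collection.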
6. Common expressions supplement
$sum: calculate the sum; $sum:1 adds 1 per document, that is, it counts the documents in the group
$avg: calculate the average
$min: get the minimum value
$max: get the maximum value
$push: append the value to an array in the result document
$first: get the value from the first document in each group, according to the order of the source documents
$last: get the value from the last document in each group, according to the order of the source documents
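As an illustration of the remaining accumulators, the plain-JavaScript sketch below computes per-city analogues of $min, $max, $push, $first and $last over the sample data in a single pass. The output field names (min_num, max_num, names, first_name, last_name) are made up for this example. In MongoDB itself, $first and $last are only meaningful when the input order is defined, typically by a preceding $sort; the sketch simply uses insertion order.

```javascript
// In-memory copy of the sample data (only the fields used here).
const docs = [
  { name: "Tom",   city: "cityA", num: 609 },
  { name: "allen", city: "cityC", num: 549 },
  { name: "jerry", city: "cityA", num: 593 },
  { name: "frank", city: "cityB", num: 657 },
  { name: "jack",  city: "cityC", num: 620 },
  { name: "alice", city: "cityB", num: 584 },
  { name: "marry", city: "cityA", num: 599 },
];

const summary = {};
for (const d of docs) {
  const g = summary[d.city];
  if (!g) {
    // First document seen for this city initialises every accumulator.
    summary[d.city] = {
      min_num: d.num,       // $min analogue
      max_num: d.num,       // $max analogue
      names: [d.name],      // $push analogue
      first_name: d.name,   // $first analogue
      last_name: d.name,    // $last analogue
    };
  } else {
    g.min_num = Math.min(g.min_num, d.num);
    g.max_num = Math.max(g.max_num, d.num);
    g.names.push(d.name);
    g.last_name = d.name;   // overwritten until the last document of the group
  }
}

console.log(summary);
// cityA: min_num 593, max_num 609, names [Tom, jerry, marry],
//        first_name Tom, last_name marry
```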