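db.t_user_task.aggregate([
    {
        // Group by (uid, taskId); count the members and collect their _id values
        $group: {
            _id: { uid: '$uid', taskId: '$taskId' },
            count: { $sum: 1 },
            dups: { $addToSet: '$_id' }
        }
    },
    {
        // Keep only the groups that contain duplicates
        $match: {
            count: { $gt: 1 }
        }
    }
]).forEach(function (doc) {
    // Drop one _id from the array so that one document of each group is kept
    doc.dups.shift();
    // Delete the remaining duplicates by _id
    db.t_user_task.remove({
        _id: { $in: doc.dups }
    });
});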
1. Group by uid and taskId and count the documents in each group. $group outputs only the grouping key (_id) and the accumulator fields, so $addToSet is used to collect the _id of every document in the group into the dups array.
2. Use $match to keep only the groups whose count is greater than 1, that is, the duplicates.
3. doc.dups.shift() removes the first element of the array. This takes one _id out of each group of duplicates, so that the subsequent delete statement does not remove every copy.
4. Use a forEach loop to delete the remaining documents by _id.
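The four steps above can be mimicked on a plain in-memory array, which makes it easier to see which documents survive. This is only a sketch in plain JavaScript, not MongoDB code: the helper idsToDelete and the sample documents are hypothetical, and a Map stands in for the $group stage.

```javascript
// Hypothetical helper: returns the _id values that the shell script would delete.
function idsToDelete(docs) {
  // Step 1: group by (uid, taskId) and collect the _id of every member.
  const groups = new Map();
  for (const doc of docs) {
    const key = JSON.stringify([doc.uid, doc.taskId]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc._id);
  }
  const result = [];
  for (const dups of groups.values()) {
    // Step 2: keep only groups with more than one member.
    if (dups.length <= 1) continue;
    // Step 3: drop the first _id so that one document is kept.
    dups.shift();
    // Step 4: everything left in the array is marked for deletion.
    result.push(...dups);
  }
  return result;
}

// Made-up sample data: documents 1, 2 and 4 share the same uid and taskId.
const sample = [
  { _id: 1, uid: 'u1', taskId: 't1' },
  { _id: 2, uid: 'u1', taskId: 't1' },
  { _id: 3, uid: 'u2', taskId: 't1' },
  { _id: 4, uid: 'u1', taskId: 't1' },
];
console.log(idsToDelete(sample)); // [ 2, 4 ]
```

Document 1 is kept here only because the sketch preserves insertion order; in the real pipeline the order of the dups array is not something to rely on.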
Inside a $group stage, $addToSet returns an array of the distinct values collected from the group: a value is added only if it is not already in the array. The order of the elements in that array is unspecified, so which document of each duplicate group survives is arbitrary.
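The "add only if absent" behavior can be pictured with a small plain JavaScript stand-in. The function addToSet below is hypothetical and is not the MongoDB operator; unlike the real one, it happens to keep insertion order.

```javascript
// Hypothetical stand-in for $addToSet: collect each distinct value once.
function addToSet(values) {
  const collected = [];
  for (const v of values) {
    // Skip values that are already in the array.
    if (!collected.includes(v)) collected.push(v);
  }
  return collected;
}

const ids = addToSet(['a', 'b', 'a', 'c', 'b']);
console.log(ids.length); // 3
```

Because _id values are unique within a collection, every _id in a group is distinct anyway, so no entry is dropped from dups.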
Note: forEach and $addToSet must be written in exactly this camel case and cannot be lowercased, because MongoDB is strictly case-sensitive.
db.t_user_task.aggregate([{ $match: { startTime: { $gt: 20180205 } } }, { $group: { _id: { uid: '$uid', taskId: '$taskId' }, count: { $sum: 1 }, dups: { $addToSet: '$_id' } } }, { $match: { count: { $gt: 1 } } }])
db.t_user_task.aggregate([{ $group: { _id: { uid: '$uid', taskId: '$taskId' }, count: { $sum: 1 }, dups: { $addToSet: '$_id' } } }, { $match: { count: { $gt: 1 } } }]).forEach(function (doc) { doc.dups.shift(); db.t_user_task.remove({ _id: { $in: doc.dups } }); })
If the collection is fairly large, the deletion can take a while; just wait patiently for it to finish.