I'm new to Mongo. I've found lots of examples of removing dupes from arrays of strings using the aggregation framework, but I'm wondering if it's possible to remove dupes from an array of objects based on a field in the object. E.g.:
{
    "_id" : ObjectId("5e82661d164941779c2380ca"),
    "name" : "something",
    "values" : [
        {
            "id" : 1,
            "val" : "x"
        },
        {
            "id" : 1,
            "val" : "x"
        },
        {
            "id" : 2,
            "val" : "y"
        },
        {
            "id" : 1,
            "val" : "xxxxxx"
        }
    ]
}
Here I'd like to remove dupes based on the id field, so I'd end up with:
{
    "_id" : ObjectId("5e82661d164941779c2380ca"),
    "name" : "something",
    "values" : [
        {
            "id" : 1,
            "val" : "x"
        },
        {
            "id" : 2,
            "val" : "y"
        }
    ]
}
Picking the first/any object with a given id works; I just want to end up with one per id. Is this doable in the aggregation framework? Or even outside the aggregation framework, I'm just looking for a clean way to do this. I need to do this type of thing across many documents in the collection, which seems like a good use case for the aggregation framework, but as I mentioned, I'm a newbie here... thanks.
You can get the desired result in two ways.
Classic
Flatten - Remove duplicates (pick first occurrence) - Group by
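db.collection.aggregate([
    // Flatten: emit one document per element of "values"
    {
        $unwind: "$values"
    },
    // Keep the first element for each (document, values.id) pair.
    // Grouping on values.id alone would merge elements from different documents.
    {
        $group: {
            _id: { docId: "$_id", valueId: "$values.id" },
            values: { $first: "$values" },
            name: { $first: "$name" }
        }
    },
    // Rebuild one document per original _id.
    // $group does not guarantee the order in which elements are pushed.
    {
        $group: {
            _id: "$_id.docId",
            name: { $first: "$name" },
            values: { $push: "$values" }
        }
    }
])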
Modern
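We need to use the $reduce operator.
Pseudocode (the same logic written as plain JavaScript):
function dedupe(values) {
    var tmp = [];
    for (var value of values) {
        // keep the element only if no element with the same id was kept already
        if (!tmp.some(function (kept) { return kept.id === value.id; }))
            tmp.push(value);
    }
    return tmp;
}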
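To see how the accumulator works before reading the pipeline, here is a minimal sketch of the same idea using JavaScript's Array.prototype.reduce on the sample array from the question. The helper name dedupeById is hypothetical (it is not a MongoDB API); acc stands in for $$value and item for $$this.

```javascript
// Sample array copied from the question.
const values = [
  { id: 1, val: "x" },
  { id: 1, val: "x" },
  { id: 2, val: "y" },
  { id: 1, val: "xxxxxx" },
];

// dedupeById is a hypothetical helper name, used only for illustration.
// acc plays the role of $$value, item the role of $$this.
function dedupeById(items) {
  return items.reduce(
    (acc, item) =>
      // Mirrors $in: ["$$this.id", "$$value.id"]: is this id already kept?
      acc.some((kept) => kept.id === item.id)
        ? acc // "then" branch of $cond: append nothing
        : acc.concat([item]), // "else" branch: append the current element
    [] // mirrors initialValue: []
  );
}

console.log(JSON.stringify(dedupeById(values)));
// prints [{"id":1,"val":"x"},{"id":2,"val":"y"}]
```

Because the accumulator only grows when an unseen id arrives, the first occurrence of each id wins and the original order is preserved.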
db.collection.aggregate([
    {
        $addFields: {
            values: {
                $reduce: {
                    input: "$values",
                    initialValue: [],
                    in: {
                        $concatArrays: [
                            "$$value",
                            {
                                $cond: [
                                    {
                                        $in: [
                                            "$$this.id",
                                            "$$value.id"
                                        ]
                                    },
                                    [],
                                    [
                                        "$$this"
                                    ]
                                ]
                            }
                        ]
                    }
                }
            }
        }
    }
])
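Note that aggregate only returns the deduplicated documents; it does not change what is stored. If the goal is to clean up the collection itself, one possible sketch is to reuse the same $reduce expression in an update. This assumes MongoDB 4.2 or newer (where updateMany accepts an aggregation pipeline) and keeps the placeholder collection name db.collection from above; the variable names dedupedValues and updatePipeline are made up for this example.

```javascript
// Same $reduce expression as in the pipeline above.
const dedupedValues = {
  $reduce: {
    input: "$values",
    initialValue: [],
    in: {
      $concatArrays: [
        "$$value",
        { $cond: [{ $in: ["$$this.id", "$$value.id"] }, [], ["$$this"]] },
      ],
    },
  },
};

// Assumption: MongoDB 4.2+, where the update argument may be a pipeline.
const updatePipeline = [{ $set: { values: dedupedValues } }];

// `db` only exists inside the mongo shell; the guard lets the snippet load elsewhere.
if (typeof db !== "undefined") {
  // Only touch documents where "values" is an array, so documents
  // without the field are left unchanged.
  db.collection.updateMany({ values: { $type: "array" } }, updatePipeline);
}
```

It is worth running the aggregate version first and checking the output on a few documents before applying the update, since the update overwrites the stored arrays.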