MongoDB aggregation statistics: the $sum expression

Totals are usually computed with the $sum expression. Because MongoDB documents can contain array fields, there are two kinds of sum: 1, the sum of one field across all documents that match a condition; 2, for each document, the sum of the values inside one of its array fields. Both can be computed with $sum. In the aggregation framework these two cases correspond to the $group stage and the $project stage respectively.

1. $group stage

Let's go straight to an example.

Case 1

The test collection mycol contains the following documents:

{
  title: 'MongoDB Overview',
  description: 'MongoDB is no sql database',
  by_user: 'runoob.com',
  url: 'http://www.runoob.com',
  tags: ['mongodb', 'database', 'NoSQL'],
  likes: 100
},
{
  title: 'NoSQL Overview',
  description: 'No sql database is very fast',
  by_user: 'runoob.com',
  url: 'http://www.runoob.com',
  tags: ['mongodb', 'database', 'NoSQL'],
  likes: 10
},
{
  title: 'Neo4j Overview',
  description: 'Neo4j is no sql database',
  by_user: 'Neo4j',
  url: 'http://www.neo4j.com',
  tags: ['neo4j', 'database', 'NoSQL'],
  likes: 750
}

Now we use aggregate() to count how many articles each author has written in this collection:

db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])

Query results are as follows:

/* 1 */
{
    "_id" : "Neo4j",
    "num_tutorial" : 1
},

/* 2 */
{
    "_id" : "runoob.com",
    "num_tutorial" : 2
}
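
To make clear what {$sum : 1} does inside $group, here is a minimal plain-JavaScript sketch that mimics the count on an in-memory array. It only illustrates the logic and is not how MongoDB executes the stage; the helper name countByField and the array docs are assumptions made for this example.

```javascript
// Sample documents mirroring the by_user field of the mycol data above.
const docs = [
  { title: 'MongoDB Overview', by_user: 'runoob.com', likes: 100 },
  { title: 'NoSQL Overview', by_user: 'runoob.com', likes: 10 },
  { title: 'Neo4j Overview', by_user: 'Neo4j', likes: 750 },
];

// Hypothetical helper: adds 1 to a per-key counter for every document,
// which is what {$sum : 1} amounts to within each group.
function countByField(documents, field) {
  const counts = new Map();
  for (const doc of documents) {
    const key = doc[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const numTutorials = countByField(docs, 'by_user');
console.log(numTutorials.get('runoob.com')); // 2
console.log(numTutorials.get('Neo4j'));      // 1
```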

Case 2

To compute the total number of likes for each author, use the following expression:

db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}])

Query results are as follows:

/* 1 */
{
    "_id" : "Neo4j",
    "num_tutorial" : 750
},

/* 2 */
{
    "_id" : "runoob.com",
    "num_tutorial" : 110
}
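
Passing a field path such as "$likes" instead of the constant 1 adds up that field's value within each group. A rough plain-JavaScript equivalent is sketched below. The helper sumByField is hypothetical, and the fourth document is made up (it is not in the original data) to show that, as the MongoDB documentation describes, $sum ignores non-numeric values.

```javascript
const articles = [
  { by_user: 'runoob.com', likes: 100 },
  { by_user: 'runoob.com', likes: 10 },
  { by_user: 'Neo4j', likes: 750 },
  // Made-up document, not in the original data: a non-numeric likes value.
  { by_user: 'Neo4j', likes: 'n/a' },
];

// Hypothetical helper: sums valueField per distinct groupField value,
// treating anything that is not a number as contributing nothing.
function sumByField(documents, groupField, valueField) {
  const totals = new Map();
  for (const doc of documents) {
    const value = doc[valueField];
    const increment = typeof value === 'number' ? value : 0;
    const key = doc[groupField];
    totals.set(key, (totals.get(key) || 0) + increment);
  }
  return totals;
}

const likesPerAuthor = sumByField(articles, 'by_user', 'likes');
console.log(likesPerAuthor.get('runoob.com')); // 110
console.log(likesPerAuthor.get('Neo4j'));      // 750
```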

Case 3

The examples above are simple, so let's look at a richer one. The sales test collection contains the following documents:

{ "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2, "date" : ISODate("2014-01-01T08:00:00Z") }
{ "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1, "date" : ISODate("2014-02-03T09:00:00Z") }
{ "_id" : 3, "item" : "xyz", "price" : 5, "quantity" : 5, "date" : ISODate("2014-02-03T09:05:00Z") }
{ "_id" : 4, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-02-15T08:00:00Z") }
{ "_id" : 5, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-02-15T09:05:00Z") }

The goal is to group the documents by date and compute the sales total and the number of sales for each day. The aggregation is:

db.sales.aggregate(
  [
    {
      $group:
        {
          _id: { day: { $dayOfYear: "$date"}, year: { $year: "$date" } },
          totalAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
          count: { $sum: 1 }
        }
    }
  ]
)

Query results are:

{ "_id" : { "day" : 46, "year" : 2014 }, "totalAmount" : 150, "count" : 2 }
{ "_id" : { "day" : 34, "year" : 2014 }, "totalAmount" : 45, "count" : 2 }
{ "_id" : { "day" : 1, "year" : 2014 }, "totalAmount" : 20, "count" : 1 }
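
The arithmetic behind these results can be checked with a small plain-JavaScript sketch. It assumes the day of the year is computed in UTC (the default for $dayOfYear when no timezone is given); the helper names dayOfYearUTC and dailyTotals are invented for this illustration.

```javascript
const sales = [
  { _id: 1, item: 'abc', price: 10, quantity: 2, date: new Date('2014-01-01T08:00:00Z') },
  { _id: 2, item: 'jkl', price: 20, quantity: 1, date: new Date('2014-02-03T09:00:00Z') },
  { _id: 3, item: 'xyz', price: 5, quantity: 5, date: new Date('2014-02-03T09:05:00Z') },
  { _id: 4, item: 'abc', price: 10, quantity: 10, date: new Date('2014-02-15T08:00:00Z') },
  { _id: 5, item: 'xyz', price: 5, quantity: 10, date: new Date('2014-02-15T09:05:00Z') },
];

// 1-based day of the year, computed in UTC.
function dayOfYearUTC(date) {
  const startOfYear = Date.UTC(date.getUTCFullYear(), 0, 1);
  const startOfDay = Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate());
  return (startOfDay - startOfYear) / 86400000 + 1;
}

// Groups by (year, day) and accumulates price * quantity plus a document count.
function dailyTotals(documents) {
  const groups = new Map();
  for (const doc of documents) {
    const year = doc.date.getUTCFullYear();
    const day = dayOfYearUTC(doc.date);
    const key = `${year}-${day}`;
    const group = groups.get(key) || { _id: { day, year }, totalAmount: 0, count: 0 };
    group.totalAmount += doc.price * doc.quantity; // $sum of $multiply
    group.count += 1;                              // $sum: 1
    groups.set(key, group);
  }
  return [...groups.values()];
}

const daily = dailyTotals(sales);
console.log(daily);
```

Day 46 is February 15 (10*10 + 5*10 = 150), day 34 is February 3 (20*1 + 5*5 = 45), and day 1 is January 1 (10*2 = 20).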

Case 4

In every $group example above we used _id to define the grouping. What should we do when no grouping is needed?

For example, suppose we want the total number of items sold across the whole sales collection.

If we simply remove _id from the $group stage, as follows:

db.sales.aggregate(
  [
    {
      $group:
        {
         
          totalAmount: { $sum: "$quantity" }
        }
    }
  ]
)
   

the query fails with this error:

{
    "message" : "a group specification must include an _id",
    "ok" : 0,
    "code" : 15955,
    "codeName" : "Location15955",
    "name" : "MongoError"
}

So _id is still required, but it can be a constant. Grouping by a constant puts every document into the same single group. Any constant works: _id: "0", _id: "a", _id: "b", _id: "x", _id: "y" and so on (_id: null is also commonly used for this purpose).

For example:

db.sales.aggregate(
  [
    {
      $group:
        {
          _id : "Total",
          totalAmount: { $sum: "$quantity" }
        }
    }
  ]
)

Query results:

{
    "_id" : "Total",
    "totalAmount" : 28
}
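
Why the choice of constant does not matter can be seen in a plain-JavaScript sketch: with a constant key there is only ever one bucket, so the constant merely becomes the label of that bucket. The helper totalQuantity and the array salesDocs are assumptions for illustration only.

```javascript
// Only the quantity field of the sales documents is needed here.
const salesDocs = [
  { _id: 1, quantity: 2 },
  { _id: 2, quantity: 1 },
  { _id: 3, quantity: 5 },
  { _id: 4, quantity: 10 },
  { _id: 5, quantity: 10 },
];

// Hypothetical helper: every document maps to the same constant key,
// so the result always contains exactly one group.
function totalQuantity(documents, constantId) {
  const groups = new Map();
  for (const doc of documents) {
    const group = groups.get(constantId) || { _id: constantId, totalAmount: 0 };
    group.totalAmount += doc.quantity;
    groups.set(constantId, group);
  }
  return [...groups.values()];
}

console.log(totalQuantity(salesDocs, 'Total')); // [ { _id: 'Total', totalAmount: 28 } ]
console.log(totalQuantity(salesDocs, null));    // [ { _id: null, totalAmount: 28 } ]
```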

2. $project stage

Case 5

Suppose there is a students collection with the following documents:

{ "_id": 1, "quizzes": [ 10, 6, 7 ], "labs": [ 5, 8 ], "final": 80, "midterm": 75 }
{ "_id": 2, "quizzes": [ 9, 10 ], "labs": [ 8, 8 ], "final": 95, "midterm": 80 }
{ "_id": 3, "quizzes": [ 4, 5, 5 ], "labs": [ 6, 5 ], "final": 78, "midterm": 70 }

The requirement is to compute, for each student, the sum of the quiz scores, the sum of the lab scores, and the sum of the final and midterm exam scores.

db.students.aggregate([
  {
    $project: {
      quizTotal: { $sum: "$quizzes"},
      labTotal: { $sum: "$labs" },
      examTotal: { $sum: [ "$final", "$midterm" ] }
    }
  }
])

Query results are as follows:

{ "_id" : 1, "quizTotal" : 23, "labTotal" : 13, "examTotal" : 155 }
{ "_id" : 2, "quizTotal" : 19, "labTotal" : 16, "examTotal" : 175 }
{ "_id" : 3, "quizTotal" : 14, "labTotal" : 11, "examTotal" : 148 }
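
In $project, $sum works per document: given a single array field it adds up the array's elements, and given a list of expressions it adds up those values. The plain-JavaScript sketch below mimics both forms. The helper sum is a hypothetical stand-in, not a MongoDB API, and it assumes non-numeric values contribute nothing.

```javascript
const students = [
  { _id: 1, quizzes: [10, 6, 7], labs: [5, 8], final: 80, midterm: 75 },
  { _id: 2, quizzes: [9, 10], labs: [8, 8], final: 95, midterm: 80 },
  { _id: 3, quizzes: [4, 5, 5], labs: [6, 5], final: 78, midterm: 70 },
];

// Hypothetical stand-in for $sum in $project:
//   sum(array)      adds the elements of that array
//   sum(a, b, ...)  adds the listed values
function sum(...operands) {
  const values = operands.length === 1 && Array.isArray(operands[0]) ? operands[0] : operands;
  return values.reduce((acc, v) => acc + (typeof v === 'number' ? v : 0), 0);
}

const totals = students.map((s) => ({
  _id: s._id,
  quizTotal: sum(s.quizzes),
  labTotal: sum(s.labs),
  examTotal: sum(s.final, s.midterm),
}));

console.log(totals[0]); // { _id: 1, quizTotal: 23, labTotal: 13, examTotal: 155 }
```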

Origin www.linuxidc.com/Linux/2019-08/160341.htm