Group and count with a condition

32

I'm trying to group a set of documents and count them based on their value:

{ item: "abc1", value: 1 }
{ item: "abc1", value: 1 }
{ item: "abc1", value: 11 }
{ item: "xyz1", value: 2 }
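I would like to group by item and get back, for each item, how many times value is greater than 10 and how many times it is smaller than 10: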
{ item: "abc1", countSmaller: 2, countBigger: 1 }
{ item: "xyz1", countSmaller: 1, countBigger: 0 }
5 Answers

47

What you need is the $cond operator of the aggregation framework. One way to get what you want is:

db.foo.aggregate([
    {
        $project: {
            item: 1,
            lessThan10: {  // Set to 1 if value < 10
                $cond: [ { $lt: ["$value", 10 ] }, 1, 0]
            },
            moreThan10: {  // Set to 1 if value > 10
                $cond: [ { $gt: [ "$value", 10 ] }, 1, 0]
            }
        }
    },
    {
        $group: {
            _id: "$item",
            countSmaller: { $sum: "$lessThan10" },
            countBigger: { $sum: "$moreThan10" }
        }
    }
])

Note: I have assumed value to be numeric rather than a string.

Output:

{
        "result" : [
                {
                        "_id" : "xyz1",
                        "countSmaller" : 1,
                        "countBigger" : 0
                },
                {
                        "_id" : "abc1",
                        "countSmaller" : 2,
                        "countBigger" : 1
                }
        ],
        "ok" : 1
}  
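
To make the two stages concrete, here is a minimal plain-JavaScript sketch of what the pipeline computes, run against the four documents from the question. It does not talk to MongoDB; the `docs` array and the `countByItem` helper are illustrative names, not part of any driver API.

```javascript
// Sample documents from the question (assumed to hold numeric values).
const docs = [
  { item: "abc1", value: 1 },
  { item: "abc1", value: 1 },
  { item: "abc1", value: 11 },
  { item: "xyz1", value: 2 },
];

// Hypothetical helper mirroring the $project + $group stages.
function countByItem(documents) {
  const groups = new Map();
  for (const { item, value } of documents) {
    // $project step: turn each comparison into a 1/0 flag.
    const lessThan10 = value < 10 ? 1 : 0;
    const moreThan10 = value > 10 ? 1 : 0;
    // $group step: sum the flags per item.
    const g = groups.get(item) || { _id: item, countSmaller: 0, countBigger: 0 };
    g.countSmaller += lessThan10;
    g.countBigger += moreThan10;
    groups.set(item, g);
  }
  return [...groups.values()];
}

const result = countByItem(docs);
console.log(result);
```

A value of exactly 10 sets neither flag, which matches the strict $lt and $gt comparisons in the pipeline.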

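Note that the value field is a string, so you may want to convert that key's value to a number. - chridam
1
@chridam, thanks for your comment. I have added a note to my answer about my assumption that the value field is numeric. I'll leave that part as an exercise for the OP :) - Anand Jayabalan
My bad, I didn't see that note; it was hidden rather uglily between the code blocks :P - chridam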

4
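You need to use the $cond operator. Here, 0 marks values of 10 or less and 1 marks values greater than 10. This doesn't give you exactly the output you expect, though. Perhaps someone else will post a better answer.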
db.collection.aggregate(
    [
        {
            "$project": 
                {
                    "item": 1, 
                    "value": 
                        {
                            "$cond": [ { "$gt": [ "$value", 10 ] }, 1, 0 ] 
                        }
                 }
         }, 
         {
             "$group": 
                 {
                     "_id": { "item": "$item", "value": "$value" },                       
                     "count": { "$sum": 1 }
                 }
         }, 
         {
             "$group": 
                 { 
                     "_id": "$_id.item", 
                     "stat": { "$push": { "value": "$_id.value", "count": "$count" }}
                 }
          }
    ]
)

Output:

{
        "_id" : "abc1",
        "stat" : [
                {
                        "value" : 1,
                        "count" : 1
                },
                {
                        "value" : 0,
                        "count" : 2
                }
        ]
}
{ "_id" : "xyz1", "stat" : [ { "value" : 0, "count" : 1 } ] }
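
If the flat countSmaller/countBigger shape from the question is needed, the stat array can be reshaped on the client. The following is a rough sketch in plain JavaScript; `flattenStats` is a hypothetical helper, and the sample input assumes the four documents from the question.

```javascript
// Shape produced by the pipeline: one document per item with a "stat" array,
// where value 0 marks "not greater than 10" and value 1 marks "greater than 10".
// These sample results assume the four documents from the question.
const grouped = [
  { _id: "abc1", stat: [{ value: 0, count: 2 }, { value: 1, count: 1 }] },
  { _id: "xyz1", stat: [{ value: 0, count: 1 }] },
];

// Hypothetical helper: flatten each "stat" array into two named counters.
function flattenStats(results) {
  return results.map(({ _id, stat }) => {
    const flat = { item: _id, countSmaller: 0, countBigger: 0 };
    for (const { value, count } of stat) {
      if (value === 1) flat.countBigger += count;
      else flat.countSmaller += count;
    }
    return flat;
  });
}

const flattened = flattenStats(grouped);
console.log(flattened);
```

Because the pipeline only tests $gt, a value of exactly 10 would land in countSmaller here; a second $cond would be needed to separate it.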

You need to convert your values to integers or floats; the original answer links to instructions for doing so.
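
One way to handle string values, sketched here as an assumption rather than the approach the answer links to, is to convert inside the pipeline with `$convert` (available in MongoDB 4.0 and later). The collection name `foo` is borrowed from the first answer, and `numericValue` and `num` are illustrative names. The pipeline is built as a plain array so it can be passed to `db.foo.aggregate(pipeline)` in the shell.

```javascript
// Convert "value" to a double before comparing; unparsable or missing
// values become null so they can be excluded explicitly below.
const numericValue = {
  $convert: { input: "$value", to: "double", onError: null, onNull: null },
};

const pipeline = [
  { $project: { item: 1, num: numericValue } },
  {
    $group: {
      _id: "$item",
      // null sorts below every number in MongoDB's comparison order,
      // so it has to be ruled out before the $lt test.
      countSmaller: {
        $sum: { $cond: [{ $and: [{ $ne: ["$num", null] }, { $lt: ["$num", 10] }] }, 1, 0] },
      },
      countBigger: {
        $sum: { $cond: [{ $gt: ["$num", 10] }, 1, 0] },
      },
    },
  },
];

// In the mongo shell: db.foo.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Setting `onError` and `onNull` to null keeps the pipeline from failing on bad input; whether such documents should be skipped or reported is a choice for the application.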

3
In case someone is looking for Java code for this scenario (field names adapted to my needs):
Aggregation aggregation = Aggregation.newAggregation(
                Aggregation.project("environment").and("success").applyCondition(ConditionalOperators.when(Criteria.where("deploymentStatus").is("SUCCESS"))
                        .then(1)
                        .otherwise(0)).and("failed").applyCondition(ConditionalOperators.when(Criteria.where("deploymentStatus").is("FAILURE"))
                        .then(1)
                        .otherwise(0)),
                Aggregation.group("environment").sum("success").as("success").sum("failed").as("failed"));

2

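To count data based on a condition in the $group stage, we can use the "$accumulator" operator (available since MongoDB 4.4), which was changed in MongoDB 5.0.

So, based on your requirements, we can implement it with the following aggregation stage -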

db.products.aggregate([
  {
    $group: {
      _id: "$item",
      totalCounts: { $sum: 1 },
      countsMeta: {
        $accumulator: {
          init: function () {
            // Set the initial state
            return { countSmaller: 0, countBigger: 0 };
          },
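          accumulate: function (state, value) {
            // Update the state: values below 10 count as smaller, values above 10
            // as bigger; a value of exactly 10 is counted in neither bucket
            if (value < 10) return { ...state, countSmaller: state.countSmaller + 1 };
            if (value > 10) return { ...state, countBigger: state.countBigger + 1 };
            return state;
          },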
          accumulateArgs: ["$value"], // Pass the desired argument to the accumulate function
          merge: function (state1, state2) {
            /* 
                Executed when the operator performs a merge,
                Merge may happen in two cases : 
                 1). $accumulator is run on a sharded cluster. The operator needs to merge the 
                     results from each shard to obtain the final result.
                 2). A single $accumulator operation exceeds its specified memory limit. 
                     If you specify the allowDiskUse option, the operator stores the 
                     in-progress operation on disk and finishes the operation in memory. 
                     Once the operation finishes, the results from disk and memory are 
                     merged together using the merge function.
                The merge function always merges two states at a time. In the event that more 
                than two states must be merged, the resulting merge of two states is merged 
                with a single state. This process repeats until all states are merged.     
            */

            return {
              countSmaller: state1.countSmaller + state2.countSmaller,
              countBigger: state1.countBigger + state2.countBigger,
            };
          },
          finalize: function (state) {
            // After collecting the results from all documents, return the final state unchanged
            return state;
          },
          lang: "js",
        },
      },
    },
  },
]);

Running this produces the following result:

[Image: result snapshot]

For more information about the stage and the operator, refer to the following link:

https://www.mongodb.com/docs/manual/reference/operator/aggregation/accumulator/
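
To see how the init, accumulate, and merge callbacks fit together, they can be exercised in plain JavaScript outside MongoDB. This is only a sketch: the callbacks are modeled on the stage above (here a value of exactly 10 is counted in neither bucket), and the split of values across two "shards" is an assumption made purely for illustration.

```javascript
// Standalone versions of the $accumulator callbacks.
const init = () => ({ countSmaller: 0, countBigger: 0 });

const accumulate = (state, value) => {
  if (value < 10) return { ...state, countSmaller: state.countSmaller + 1 };
  if (value > 10) return { ...state, countBigger: state.countBigger + 1 };
  return state; // exactly 10: neither bucket
};

const merge = (s1, s2) => ({
  countSmaller: s1.countSmaller + s2.countSmaller,
  countBigger: s1.countBigger + s2.countBigger,
});

// Pretend the "abc1" values live on two shards (illustrative assumption).
const shardA = [1, 1].reduce((s, v) => accumulate(s, v), init());
const shardB = [11].reduce((s, v) => accumulate(s, v), init());

// Merging the partial states should match a single pass over all values.
const merged = merge(shardA, shardB);
const singlePass = [1, 1, 11].reduce((s, v) => accumulate(s, v), init());
console.log(merged, singlePass);
```

The point of the check is that merge must combine partial states so the result does not depend on how the documents were partitioned.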

Hope this helps you or somebody else. Thanks!
Happy coding :-)

0

Very similar to Anand's answer, but in a single stage:

db.foo.aggregate([
    {
        $group: {
            _id: "$item",
            countSmaller: { $sum: { $cond: [ { $lt: [ "$value", 10 ] }, 1, 0 ] } },
            countBigger: { $sum: { $cond: [ { $gt: [ "$value", 10 ] }, 1, 0 ] } }
        }
    }
])
