Merging columns into one in the MongoDB aggregation framework

Question

4

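Is it possible to group by values across multiple columns?

Let's say I'm storing interactions between people by day, tracking the sender and the recipient along with a count, as follows.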

db.collection = 
[
    { from : 'bob',   to : 'mary',   day : 1,  count : 2 },
    { from : 'bob',   to : 'steve',  day : 2,  count : 1 },
    { from : 'mary',  to : 'bob',    day : 1,  count : 3 },
    { from : 'mary',  to : 'steve',  day : 3,  count : 1 },
    { from : 'steve', to : 'bob',    day : 2,  count : 2 },
    { from : 'steve', to : 'mary',   day : 1,  count : 1 }
]

This lets me get all the interactions that 'bob' had with anyone as the sender, by grouping on from: and summing count:.
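For reference, that per-sender total could be sketched roughly as follows. The pipeline is an ordinary $match plus $group pair intended for db.collection.aggregate(); `sumSentBy` is a hypothetical helper, not a MongoDB API, added only to show in plain JavaScript what the pipeline computes on the sample data.

```javascript
// Sample documents from the question.
const docs = [
  { from: "bob",   to: "mary",  day: 1, count: 2 },
  { from: "bob",   to: "steve", day: 2, count: 1 },
  { from: "mary",  to: "bob",   day: 1, count: 3 },
  { from: "mary",  to: "steve", day: 3, count: 1 },
  { from: "steve", to: "bob",   day: 2, count: 2 },
  { from: "steve", to: "mary",  day: 1, count: 1 }
];

// Pipeline for the mongo shell: db.collection.aggregate(sentByBob)
const sentByBob = [
  { $match: { from: "bob" } },
  { $group: { _id: "$from", count: { $sum: "$count" } } }
];

// Plain-JavaScript equivalent (hypothetical helper, for illustration only).
function sumSentBy(name, documents) {
  return documents
    .filter((d) => d.from === name)
    .reduce((total, d) => total + d.count, 0);
}

console.log(sumSentBy("bob", docs)); // prints 3 (2 to mary + 1 to steve)
```

This only covers interactions where the name is the sender, which is exactly the limitation the rest of the question is about.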

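Now I want to get all interactions for a user, so basically group by values across both from: and to:. In other words, sum count: for each name, regardless of whether the name appears in from: or to:. [Update] The desired output would be: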

[
    { name : 'bob',   count : 8 },
    { name : 'mary',  count : 7 },
    { name : 'steve', count : 3 }
]

The easiest approach would be to create a new column names: that stores both from: and to:, and then use $unwind, but that seems wasteful.
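A rough sketch of that idea, assuming each stored document carries an extra `names` array holding both participants. The pipeline is meant for db.collection.aggregate() on a collection that actually stores `names`; `withNames` and `unwindAndGroup` are hypothetical names that imitate the two stages in plain JavaScript so the totals can be checked without a server.

```javascript
// Sample documents from the question.
const docs = [
  { from: "bob",   to: "mary",  day: 1, count: 2 },
  { from: "bob",   to: "steve", day: 2, count: 1 },
  { from: "mary",  to: "bob",   day: 1, count: 3 },
  { from: "mary",  to: "steve", day: 3, count: 1 },
  { from: "steve", to: "bob",   day: 2, count: 2 },
  { from: "steve", to: "mary",  day: 1, count: 1 }
];

// Assumed schema change: every document also stores names: [from, to].
const withNames = docs.map((d) => ({ ...d, names: [d.from, d.to] }));

// Pipeline for the mongo shell, run against the collection that stores names.
const pipeline = [
  { $unwind: "$names" },
  { $group: { _id: "$names", count: { $sum: "$count" } } }
];

// Plain-JavaScript imitation of $unwind followed by $group.
function unwindAndGroup(documents) {
  const totals = {};
  for (const d of documents) {
    for (const name of d.names) {
      totals[name] = (totals[name] || 0) + d.count;
    }
  }
  return totals;
}

console.log(unwindAndGroup(withNames));
```

The cost is the duplicated data: both names are stored twice in every document.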

Any hints?

Thanks

- user99168

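Thanks. I've updated my question with sample output. - user99168

First, in the desired output, 'steve' should have a count of 5, right? And you can do this in the aggregation framework without changing your schema (though it isn't pretty). - Asya Kamsky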

2 Answers

0

$unwind can be expensive. Wouldn't this be easier to query?

db.collection = 
[
    // each interaction is stored twice, once per participant
    { name : 'bob',   to : 'mary',    day : 1,  count : 2 },
    { name : 'mary',  from : 'bob',   day : 1,  count : 2 },
    { name : 'bob',   to : 'steve',   day : 2,  count : 1 },
    { name : 'steve', from : 'bob',   day : 2,  count : 1 },
    { name : 'mary',  to : 'bob',     day : 1,  count : 3 },
    { name : 'bob',   from : 'mary',  day : 1,  count : 3 },
    { name : 'mary',  to : 'steve',   day : 3,  count : 1 },
    { name : 'steve', from : 'mary',  day : 3,  count : 1 },
    { name : 'steve', to : 'bob',     day : 2,  count : 2 },
    { name : 'bob',   from : 'steve', day : 2,  count : 2 },
    { name : 'steve', to : 'mary',    day : 1,  count : 1 },
    { name : 'mary',  from : 'steve', day : 1,  count : 1 }
]

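[Update]

With your existing structure, here's how you could do this with Map-Reduce, though it isn't really meant for real-time results. It will likely be slower overall, but probably more efficient than a large number of $unwind operations in the aggregation framework (AF).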

db.so.drop();
db.so.insert(
[
    { from: 'bob', to: 'mary', day: 1, count: 2 },
    { from: 'bob', to: 'steve', day: 2, count: 1 },
    { from: 'mary', to: 'bob', day: 1, count: 3 },
    { from: 'mary', to: 'steve', day: 3, count: 1 },
    { from: 'steve', to: 'bob', day: 2, count: 2 },
    { from: 'steve', to: 'mary', day: 1, count: 1 }
]);

db.runCommand(
    {
        "mapreduce": "so", // don't need the collection name here if it's above
        "map": function(){
            emit(this.from, {count: this.count});
            emit(this.to, {count: this.count});
        },
        "reduce": function (name, values) {
            var result = { count: 0 };
            values.forEach(function (v) {
                result.count += v.count;
            });

            return result;
        },
        query: {},
        out: { inline: 1 },
    }
);

It produces:

{
    "results" : [
            {
                "_id" : "bob",
                "value" : {
                    "count" : 8
                }
            },
            {
                "_id" : "mary",
                "value" : {
                    "count" : 7
                }
            },
            {
                "_id" : "steve",
                "value" : {
                    "count" : 5
                }
            }
    ],
    "timeMillis" : 1,
    "counts" : {
        "input" : 6,
        "emit" : 12,
        "reduce" : 3,
        "output" : 3
    },
        "ok" : 1
}

- cirrus

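Yes, that would work, but we don't store the data that way; some records would be duplicated, and I'd rather not change the data model. It would be a last resort, though. - user99168

I can't see a way to do this with the AF on your structure, but you could do it with Map-Reduce, assuming you don't need real-time, on-demand results? - cirrus

Other than that, the next simplest approach is probably to make two passes over the data with the AF, one grouping on 'from' and one on 'to', and add the sums from the two passes together. - cirrus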
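The two-pass idea could look roughly like this. The two pipelines are meant for db.collection.aggregate(); `groupSum` and `mergeTotals` are hypothetical helpers that imitate the two passes and the client-side addition in plain JavaScript, so this is a sketch of the approach rather than a tested production recipe.

```javascript
// Sample documents from the question.
const docs = [
  { from: "bob",   to: "mary",  day: 1, count: 2 },
  { from: "bob",   to: "steve", day: 2, count: 1 },
  { from: "mary",  to: "bob",   day: 1, count: 3 },
  { from: "mary",  to: "steve", day: 3, count: 1 },
  { from: "steve", to: "bob",   day: 2, count: 2 },
  { from: "steve", to: "mary",  day: 1, count: 1 }
];

// Pass 1 and pass 2 for the mongo shell.
const byFrom = [{ $group: { _id: "$from", count: { $sum: "$count" } } }];
const byTo   = [{ $group: { _id: "$to",   count: { $sum: "$count" } } }];

// Imitates one $group pass keyed on the given field.
function groupSum(documents, field) {
  const totals = {};
  for (const d of documents) {
    totals[d[field]] = (totals[d[field]] || 0) + d.count;
  }
  return totals;
}

// Adds the per-name totals of both passes on the client side.
function mergeTotals(a, b) {
  const merged = { ...a };
  for (const [name, count] of Object.entries(b)) {
    merged[name] = (merged[name] || 0) + count;
  }
  return merged;
}

const totals = mergeTotals(groupSum(docs, "from"), groupSum(docs, "to"));
console.log(totals);
```

Because the merge starts from one pass and adds the other, a name that shows up in only one of the two fields still ends up in the result.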


- Asya Kamsky · Accepted Answer

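Is it possible to group by values across multiple columns?

Yes, in MongoDB it is possible to group by values across different columns.

It's easy to do with MapReduce. But it can also be done with the aggregation framework, even if you don't store an array of participants (if you had a names array containing both participants, it would be just a $unwind and a $group: very simple, and I think more elegant than either MapReduce or the pipeline you have to use with the current schema).

Here is the pipeline that works with your current schema: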

db.collection.aggregate( [
// total sent per sender, remembering each recipient and count
{
    "$group" : {
        "_id" : "$from",
        "sum" : {
            "$sum" : "$count"
        },
        "tos" : {
            "$push" : {
                "to" : "$to",
                "count" : "$count"
            }
        }
    }
},
{ "$unwind" : "$tos" },
// carry the sender total along as "prev"
{
    "$project" : {
        "prev" : {
            "id" : "$_id",
            "sum" : "$sum"
        },
        "tos" : 1
    }
},
// total received per recipient, collecting the distinct sender totals
{
    "$group" : {
        "_id" : "$tos.to",
        "count" : {
            "$sum" : "$tos.count"
        },
        "prev" : {
            "$addToSet" : "$prev"
        }
    }
},
{ "$unwind" : "$prev" },
// one document holding all received totals (t) and all sent totals (f)
{
    "$group" : {
        "_id" : "1",
        "t" : {
            "$addToSet" : {
                "id" : "$_id",
                "c" : "$count"
            }
        },
        "f" : {
            "$addToSet" : {
                "id" : "$prev.id",
                "c" : "$prev.sum"
            }
        }
    }
},
{ "$unwind" : "$t" },
{ "$unwind" : "$f" },
// add the two totals; pairs with different names are labelled "nobody"
{
    "$project" : {
        "name" : {
            "$cond" : [
                {
                    "$eq" : [
                        "$t.id",
                        "$f.id"
                    ]
                },
                "$t.id",
                "nobody"
            ]
        },
        "count" : {
            "$add" : [
                "$t.c",
                "$f.c"
            ]
        },
        "_id" : 0
    }
},
// drop the mismatched pairs
{ "$match" : { "name" : { "$ne" : "nobody" } } }
]);

On your sample input, the output is:

{
    "result" : [
        {
            "name" : "bob",
            "count" : 8
        },
        {
            "name" : "mary",
            "count" : 7
        },
        {
            "name" : "steve",
            "count" : 5
        }
    ],
    "ok" : 1
}
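
To see why the last stages of the pipeline give these numbers: after the single-document $group, `t` holds the total received per name and `f` holds the total sent per name. Unwinding both produces every (t, f) pair, and only pairs with the same name survive the final $match. A plain-JavaScript sketch of just that step, with the intermediate totals worked out by hand from the sample data (the names `received`, `sent`, and `joined` are illustrative, not part of the pipeline):

```javascript
// Totals per recipient (the "t" set) and per sender (the "f" set),
// computed by hand from the six sample documents.
const received = [
  { id: "bob",   c: 5 },
  { id: "mary",  c: 3 },
  { id: "steve", c: 2 }
];
const sent = [
  { id: "bob",   c: 3 },
  { id: "mary",  c: 4 },
  { id: "steve", c: 3 }
];

// The two $unwind stages build the cross product of t and f;
// $project plus $match keep only the pairs whose names agree.
const joined = [];
for (const t of received) {
  for (const f of sent) {
    if (t.id === f.id) {
      joined.push({ name: t.id, count: t.c + f.c });
    }
  }
}

console.log(joined);
```

One consequence of this join, as far as I can tell: a name that appears only as a sender or only as a recipient has no matching pair and would drop out of the result.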