MongoDB管道聚合:对子文档进行排序

3

在使用MongooseJs时,我尝试按嵌套数组进行排序时遇到了一些小问题。

a)一个产品包含多个任务,每个任务都有子任务。
b)任务和每个子任务都有顺序(task.order和task.subtask.order)。

以下是一个示例产品文档:

db.products.find({_id: ObjectId("554a13d4b692088a38f01f3b")})

结果:

{
"_id" : ObjectId("554a13d4b692088a38f01f3b"),
"title" : "product title",
"order" : 3,
"description" : "Description here ",
"status" : "live",
"tasks" : [ 
    {
        "title" : "task 1",
        "description" : "task 1 desc",
        "order" : 10,
        "_id" : ObjectId("554a13d4b692088a38f01f3a"),
        "status" : "live",
        "subTasks" : [ 
            {
                "title" : "task 1 sub 1",
                "content" : "aaa",
                "order" : -2,
                "_id" : ObjectId("554a13d4b692088a38f01f5a"),
                "status" : "live"
            }, 
            {
                "title" : "task 1 sub 2",
                "content" : "aaa",
                "order" : 1,
                "_id" : ObjectId("554a13d4b692088a38f01f3a"),
                "status" : "live"
            }, 
            {
                "title" : "task 1 sub 4",
                "content" : "aaa",
                "order" : 8,
                "_id" : ObjectId("554a13d4b692088a38f01f4a"),
                "status" : "live"
            }, 
            {
                "title" : "task 1 sub 3 ",
                "content" : "aaa",
                "order" : 2,
                "_id" : ObjectId("5550d0a61662211332d9a973"),
                "status" : "live"
            }
        ]
    }, 
    {
        "title" : "task 2",
        "description" : "task desc 2",
        "order" : 1,
        "_id" : ObjectId("5550855f9ee2db4e3958d299"),
        "status" : "live",
        "subTasks" : [ 
            {
                "title" : "task 2 sub 1",
                "content" : "bbb",
                "order" : 1,
                "_id" : ObjectId("55508f459ee2db4e3958d29a"),
                "status" : "live"
            }
        ]
    }, 
    {
        "title" : "task 3",
        "description" : "task 3 desc",
        "order" : 2,
        "_id" : ObjectId("5551b844bb343a620f85f323"),
        "status" : "live",
        "subTasks" : [ 
            {
                "title" : "task 3 sub 2",
                "content" : "cccc",
                "order" : 0,
                "_id" : ObjectId("5551b88abb343a620f85f324"),
                "status" : "live"
            }, 
            {
                "title" : "task 3 sub 4",
                "content" : "cccc",
                "order" : 1,
                "_id" : ObjectId("5551b8f1bb343a620f85f325"),
                "status" : "hidden"
            }, 
            {
                "title" : "task 3 sub 3",
                "content" : "ccc",
                "order" : 2,
                "_id" : ObjectId("5551ba40bb343a620f85f327"),
                "status" : "hidden"
            }, 
            {
                "title" : "task 3 sub 1",
                "content" : "cccc",
                "order" : -1,
                "_id" : ObjectId("5551bcb8c31283c051d30b7c"),
                "status" : "hidden"
            }
        ]
    }
]

我正在使用Mongodb聚合管道来对任务及其子任务进行排序。以下是我的部分代码:

}

    db.products.aggregate([
    {
        $project: {
            "tasks" : 1
        }
    },
    {
        $match: {
            _id: ObjectId("554a13d4b692088a38f01f3b")
        }
    },
    {
        $unwind: "$tasks"
    },
    {
        $project: {
            "tasks": 1,
            "subTasks": 1
        }
    },
    {
        $unwind: "$tasks.subTasks"
    },
    {
        $sort: {
            "tasks.subTasks.order": 1
        }
    },
    {
        $sort: {
            "tasks.order": 1
        }
    }
])

结果:

{
"result": [
    {
        "_id": ObjectId("554a13d4b692088a38f01f3b"),
        "tasks": {
            "title": "task 2",
            "description": "task desc 2",
            "order": 1,
            "_id": ObjectId("5550855f9ee2db4e3958d299"),
            "status": "live",
            "subTasks": {
                "title": "task 2 sub 1",
                "content": "bbb",
                "order": 1,
                "_id": ObjectId("55508f459ee2db4e3958d29a"),
                "status": "live"
            }
        }
    },
    {
        "_id": ObjectId("554a13d4b692088a38f01f3b"),
        "tasks": {
            "title": "task 3",
            "description": "task 3 desc",
            "order": 2,
            "_id": ObjectId("5551b844bb343a620f85f323"),
            "status": "live",
            "subTasks": {
                "title": "task 3 sub 1",
                "content": "cccc",
                "order": -1,
                "_id": ObjectId("5551bcb8c31283c051d30b7c"),
                "status": "hidden"
            }
        }
    },
    {
        "_id": ObjectId("554a13d4b692088a38f01f3b"),
        "tasks": {
            "title": "task 3",
            "description": "task 3 desc",
            "order": 2,
            "_id": ObjectId("5551b844bb343a620f85f323"),
            "status": "live",
            "subTasks": {
                "title": "task 3 sub 2",
                "content": "cccc",
                "order": 0,
                "_id": ObjectId("5551b88abb343a620f85f324"),
                "status": "live"
            }
        }
    },
    {
        "_id": ObjectId("554a13d4b692088a38f01f3b"),
        "tasks": {
            "title": "task 3",
            "description": "task 3 desc",
            "order": 2,
            "_id": ObjectId("5551b844bb343a620f85f323"),
            "status": "live",
            "subTasks": {
                "title": "task 3 sub 4",
                "content": "cccc",
                "order": 1,
                "_id": ObjectId("5551b8f1bb343a620f85f325"),
                "status": "hidden"
            }
        }
    },
    {
        "_id": ObjectId("554a13d4b692088a38f01f3b"),
        "tasks": {
            "title": "task 3",
            "description": "task 3 desc",
            "order": 2,
            "_id": ObjectId("5551b844bb343a620f85f323"),
            "status": "live",
            "subTasks": {
                "title": "task 3 sub 3",
                "content": "ccc",
                "order": 2,
                "_id": ObjectId("5551ba40bb343a620f85f327"),
                "status": "hidden"
            }
        }
    },
    {
        "_id": ObjectId("554a13d4b692088a38f01f3b"),
        "tasks": {
            "title": "task 1",
            "description": "task 1 desc",
            "order": 10,
            "_id": ObjectId("554a13d4b692088a38f01f3a"),
            "status": "live",
            "subTasks": {
                "title": "task 1 sub 1",
                "content": "aaa",
                "order": -2,
                "_id": ObjectId("554a13d4b692088a38f01f5a"),
                "status": "live"
            }
        }
    },
    {
        "_id": ObjectId("554a13d4b692088a38f01f3b"),
        "tasks": {
            "title": "task 1",
            "description": "task 1 desc",
            "order": 10,
            "_id": ObjectId("554a13d4b692088a38f01f3a"),
            "status": "live",
            "subTasks": {
                "title": "task 1 sub 2",
                "content": "aaa",
                "order": 1,
                "_id": ObjectId("554a13d4b692088a38f01f3a"),
                "status": "live"
            }
        }
    },
    {
        "_id": ObjectId("554a13d4b692088a38f01f3b"),
        "tasks": {
            "title": "task 1",
            "description": "task 1 desc",
            "order": 10,
            "_id": ObjectId("554a13d4b692088a38f01f3a"),
            "status": "live",
            "subTasks": {
                "title": "task 1 sub 3 ",
                "content": "aaa",
                "order": 2,
                "_id": ObjectId("5550d0a61662211332d9a973"),
                "status": "live"
            }
        }
    },
    {
        "_id": ObjectId("554a13d4b692088a38f01f3b"),
        "tasks": {
            "title": "task 1",
            "description": "task 1 desc",
            "order": 10,
            "_id": ObjectId("554a13d4b692088a38f01f3a"),
            "status": "live",
            "subTasks": {
                "title": "task 1 sub 4",
                "content": "aaa",
                "order": 8,
                "_id": ObjectId("554a13d4b692088a38f01f4a"),
                "status": "live"
            }
        }
    }
],
"ok": 1

}

Expected result:

{
"_id": ObjectId("554a13d4b692088a38f01f3b"),
"title": "product title",
"order": 3,
"description": "Description here ",
"status": "live",
"tasks": [
    {
        "title": "task 2",
        "description": "task desc 2",
        "order": 1,
        "_id": ObjectId("5550855f9ee2db4e3958d299"),
        "status": "live",
        "subTasks": [
            {
                "title": "task 2 sub 1",
                "content": "bbb",
                "order": 1,
                "_id": ObjectId("55508f459ee2db4e3958d29a"),
                "status": "live"
            }
        ]
    },
    {
        "title": "task 3",
        "description": "task 3 desc",
        "order": 2,
        "_id": ObjectId("5551b844bb343a620f85f323"),
        "status": "live",
        "subTasks": [
            {
                "title": "task 3 sub 1",
                "content": "cccc",
                "order": -1,
                "_id": ObjectId("5551bcb8c31283c051d30b7c"),
                "status": "hidden"
            },
            {
                "title": "task 3 sub 2",
                "content": "cccc",
                "order": 0,
                "_id": ObjectId("5551b88abb343a620f85f324"),
                "status": "live"
            },
            {
                "title": "task 3 sub 3",
                "content": "ccc",
                "order": 2,
                "_id": ObjectId("5551ba40bb343a620f85f327"),
                "status": "hidden"
            }{
                "title": "task 3 sub 4",
                "content": "cccc",
                "order": 1,
                "_id": ObjectId("5551b8f1bb343a620f85f325"),
                "status": "hidden"
            }
        ]
    }{
        "title": "task 1",
        "description": "task 1 desc",
        "order": 10,
        "_id": ObjectId("554a13d4b692088a38f01f3a"),
        "status": "live",
        "subTasks": [
            {
                "title": "task 1 sub 1",
                "content": "aaa",
                "order": -2,
                "_id": ObjectId("554a13d4b692088a38f01f5a"),
                "status": "live"
            },
            {
                "title": "task 1 sub 2",
                "content": "aaa",
                "order": 1,
                "_id": ObjectId("554a13d4b692088a38f01f3a"),
                "status": "live"
            },
            {
                "title": "task 1 sub 3 ",
                "content": "aaa",
                "order": 2,
                "_id": ObjectId("5550d0a61662211332d9a973"),
                "status": "live"
            },
            {
                "title": "task 1 sub 4",
                "content": "aaa",
                "order": 8,
                "_id": ObjectId("554a13d4b692088a38f01f4a"),
                "status": "live"
            }
        ]
    }
]

我离成功很近了,所有的排序似乎都正常工作。我只需要帮助将子任务放回父任务中。非常感谢任何帮助。

谢谢


你期望或希望从聚合中得到什么输出? - chridam
嗨,已将预期结果添加到问题中。谢谢。 - Sharry
1个回答

5

你在聚合管道的开始就犯了一个错误。

$project: {
            "tasks" : 1
        }

如果你不备份数据,那么你可能会失去所有的数据。因此,首先需要进行备份:

$project: {
             tasks: 1,
             doc: {
                     title: "$title", 
                     order: "$order", 
                     description: "$description", 
                     status: "$status"
                  }
          }

接下来,按照你在问题中所做的方式执行$unwind

{$unwind: "$tasks"}, {$unwind: "$tasks.subTasks"}

然后进行排序。您需要使用复合键进行排序,否则按tasks. subTasks. order排序将不会保持当您按tasks.order排序时。

{$sort: {"tasks.order": -1, "tasks.subTasks.order": 1}}

然后就是困难的部分了。你需要将结果进行 $group ,第一步是将 subTasks 进行 $push,但首先需要保留任务属性:

$project: {
             doc: 1, 
             task_id: "$tasks._id", 
             tasks_doc: {
                           title: "$tasks.title", 
                           description: "$tasks.description", 
                           order: "$tasks.order", 
                           status: "$tasks.status"
                        }, 
             subTasks: "$tasks.subTasks"
          }

...收集子任务:

$group: {
           _id: {
                   _id: "$_id", 
                   task_id: "$task_id", 
                   doc: "$doc", 
                   task_doc: "$tasks_doc"
                }, 
           subTasks: {
                        $push: "$subTasks"
                     }
        }

对于tasks也是如此。请注意,在进行$group操作期间,您还需要将task_doc属性投影回来:

$group: {
           _id: {
                   _id: "$_id._id", 
                   doc: "$_id.doc"
                }, 
           tasks: {
                     $push: {
                               _id: "$_id.task_id", 
                               title: "$_id.task_doc.title", 
                               description: "$_id.task_doc.description",
                               order: "$_id.task_doc.order", 
                               status: "$_id.task_doc.status" 
                               subTasks: "$subTasks"
                            }
                  }
        }

然后将根doc属性投射回去:

$project: {
             _id: "$_id._id", 
             title: "$_id.doc.title", 
             description: "$_id.doc.description", 
             order: "$_id.doc.order", 
             status: "$_id.doc.status", 
             tasks: 1
          }

基本上就是这样了。这里是完整的原始聚合管道,您可以测试并查看是否得到所需结果:

[
 {$match: {_id: ObjectId("554a13d4b692088a38f01f3b")}}, 
 {$project: {tasks: 1, doc: {title: "$title", order: "$order", description: "$description", status: "$status"}}}, 
 {$unwind: "$tasks"}, 
 {$unwind: "$tasks.subTasks"}, 
 {$sort: {"tasks.order": -1, "tasks.subTasks.order": 1}}, 
 {$project: {doc: 1, task_id: "$tasks._id", tasks_doc: {title: "$tasks.title", description: "$tasks.description", order: "$tasks.order", status: "$tasks.status"}, subTasks: "$tasks.subTasks"}}, 
 {$group: {_id: {_id: "$_id", task_id: "$task_id", doc: "$doc", task_doc: "$tasks_doc"}, subTasks: {$push: "$subTasks"}}}, 
 {$group: {_id: {_id: "$_id._id", doc: "$_id.doc"}, tasks: {$push: {_id: "$_id.task_id", title: "$_id.task_doc.title", description: "$_id.task_doc.description", order: "$_id.task_doc.order", status: "$_id.task_doc.status", subTasks: "$subTasks"}}}}, 
 {$project: {_id: "$_id._id", title: "$_id.doc.title", description: "$_id.doc.description", order: "$_id.doc.order", status: "$_id.doc.status", tasks: 1}}
]

更新

如果一个数组字段为空或不存在(为null),则对该字段执行的$unwind操作将返回空结果。解决这种情况的方法是最初将null/空字段设置为某个值,例如"<empty-array>"。请注意,在对其进行$unwind之前,必须为每个数组进行此$projection。

查看这个答案了解如何使用$ifNull运算符。还可以在此处查看$size运算符。

处理完这部分后,需要使用$cond运算符检查"<empty-array>"值来$group回结果。


2
虽然这是一个很好的答案(+1),但我想补充一些东西。首个错误不在管道中,而是在建模上。嵌入的过度使用,在 OP 的数据模型中有很好的例证,可能会导致产品达到 BSON 文档的 16MB 大小限制,而且频繁和昂贵的文档迁移非常可能,甚至可以确定。 - Markus W Mahlberg
谢谢您指出这个问题。由于我们保证初始文档小于16MB,我认为所有聚合管道的中间文档也将符合大小限制,因为我们不会添加任何数据,而且我们还会拆分初始文档。所以我认为大小限制没有问题,或者我有什么遗漏吗? - bagrat
你是指最初的数据模型吗? - bagrat
我指的是最初的数据模型 ;) - Markus W Mahlberg
3
我会有一个产品集合和一个任务集合,并使用物化路径来引用父级,以及一个字段来引用相应的产品。 - Markus W Mahlberg
显示剩余8条评论

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接