Simple MongoDB prefix query with regex and sort is slow

Question

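I'm stuck on this simple prefix query. The Mongo docs say you can get fairly good performance with the prefix regex form (/^a/), but as soon as I sort the results the query becomes very slow:

940 ms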

db.posts.find({hashtags: /^noticias/ }).limit(15).sort({rank : -1}).hint('hashtags_1_rank_-1').explain()

{
"cursor" : "BtreeCursor hashtags_1_rank_-1 multi",
"isMultiKey" : true,
"n" : 15,
"nscannedObjects" : 142691,
"nscanned" : 142692,
"nscannedObjectsAllPlans" : 142691,
"nscannedAllPlans" : 142692,
"scanAndOrder" : true,
"indexOnly" : false,
"nYields" : 1,
"nChunkSkips" : 0,
"millis" : 934,
"indexBounds" : {
    "hashtags" : [
        [
            "noticias",
            "noticiat"
        ],
        [
            /^noticias/,
            /^noticias/
        ]
    ],
    "rank" : [
        [
            {
                "$maxElement" : 1
            },
            {
                "$minElement" : 1
            }
        ]
    ]
},
"server" : "XRTZ048.local:27017"
}

However, the unsorted version of the same query is very fast:

0 ms

db.posts.find({hashtags: /^noticias/ }).limit(15).hint('hashtags_1_rank_-1').explain()

{
"cursor" : "BtreeCursor hashtags_1_rank_-1 multi",
"isMultiKey" : true,
"n" : 15,
"nscannedObjects" : 15,
"nscanned" : 15,
"nscannedObjectsAllPlans" : 15,
"nscannedAllPlans" : 15,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
    "hashtags" : [
        [
            "noticias",
            "noticiat"
        ],
        [
            /^noticias/,
            /^noticias/
        ]
    ],
    "rank" : [
        [
            {
                "$maxElement" : 1
            },
            {
                "$minElement" : 1
            }
        ]
    ]
},
"server" : "XRTZ048.local:27017"
}

If I drop the regex and keep the sort, the query is also fast:

0 ms

db.posts.find({hashtags: 'noticias' }).limit(15).sort({rank : -1}).hint('hashtags_1_rank_-1').explain()

{
"cursor" : "BtreeCursor hashtags_1_rank_-1",
"isMultiKey" : true,
"n" : 15,
"nscannedObjects" : 15,
"nscanned" : 15,
"nscannedObjectsAllPlans" : 15,
"nscannedAllPlans" : 15,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
    "hashtags" : [
        [
            "noticias",
            "noticias"
        ]
    ],
    "rank" : [
        [
            {
                "$maxElement" : 1
            },
            {
                "$minElement" : 1
            }
        ]
    ]
},
"server" : "XRTZ048.local:27017"
}

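It seems that using the regex and the sort together makes Mongo scan a large number of records, whereas sorting without the regex scans only 15. What is going wrong here?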

- jaime

Jaime, I think "scanAndOrder" is what is slowing this down. You may want to look at Andre's answer, which may address a problem similar to yours, if not exactly the same. - slee

1 Answer

- Andre de Frere · Accepted Answer

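scanAndOrder: true in the explain output means the query has to retrieve the documents and sort them in memory before returning the results. That is an expensive operation and it hurts query performance. Together with the gap between nscanned (142692) and n (15), it indicates that the query cannot use the index to satisfy the sort. The cursor is still a BtreeCursor, so this is not a collection scan, but the query appears to have to read every index entry in the prefix range before it can sort and apply the limit. You can mitigate this by including the index key in the sort. In my testing: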

db.posts.find({hashtags: /^noticias/ }).limit(15).sort({hashtags:1, rank : -1}).explain()

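This needs no scan-and-order, and n and nscanned both come back equal to the number of records you asked for. It also means the results are sorted on the hashtags key first, which may or may not be useful to you, but it should improve the query's performance.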
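
To see why the prefix range forces an in-memory sort while the exact match does not, here is a rough sketch in plain JavaScript. It is a toy model, not MongoDB's implementation: the documents, the `index` array and the helpers `findExact` and `findPrefix` are all made up for illustration, and each document carries a single tag rather than an array. The index is modelled as a list ordered by hashtags ascending, then rank descending, like hashtags_1_rank_-1.

```javascript
// Hypothetical sample data: one tag per document to keep the model small.
const docs = [
  { _id: 1, hashtags: "noticias", rank: 5 },
  { _id: 2, hashtags: "noticias", rank: 9 },
  { _id: 3, hashtags: "noticiasdeportes", rank: 7 },
  { _id: 4, hashtags: "noticiasdeportes", rank: 10 },
  { _id: 5, hashtags: "noticiasmundo", rank: 8 },
  { _id: 6, hashtags: "musica", rank: 99 },
];

// Toy stand-in for the index: hashtags ascending, then rank descending.
const index = [...docs].sort((a, b) =>
  a.hashtags < b.hashtags ? -1 : a.hashtags > b.hashtags ? 1 : b.rank - a.rank
);

// Equality on hashtags: the matching entries are adjacent and already in
// rank order, so the scan can stop after `limit` entries.
function findExact(tag, limit) {
  const results = [];
  let nscanned = 0;
  for (const entry of index) {
    if (entry.hashtags !== tag) continue; // a real B-tree would seek here
    nscanned++;
    results.push(entry);
    if (results.length === limit) break;
  }
  return { nscanned, scanAndOrder: false, results };
}

// Prefix range on hashtags: rank order restarts for every distinct tag,
// so every entry in the range is read and then sorted in memory.
function findPrefix(prefix, limit) {
  const inRange = index.filter((e) => e.hashtags.startsWith(prefix));
  const results = [...inRange].sort((a, b) => b.rank - a.rank).slice(0, limit);
  return { nscanned: inRange.length, scanAndOrder: true, results };
}

const exact = findExact("noticias", 2);
const prefix = findPrefix("noticias", 2);
console.log(exact.nscanned, exact.results.map((d) => d.rank));   // 2 [ 9, 5 ]
console.log(prefix.nscanned, prefix.results.map((d) => d.rank)); // 5 [ 10, 9 ]
```

In this model the first two index entries inside the prefix range have ranks 9 and 5, while the true top two by rank are 10 and 9, so stopping early would return the wrong documents. Sorting on hashtags first, as in the query above, asks for exactly the order the index already provides, which is why the scan can stop at the limit.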