How can I retrieve object counts faster in Django?

Question

python django postgresql

3

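My goal is to optimize retrieving the number of objects in a Django model.

I have two models:

User
Prospect

It is a one-to-many relationship. A User can create many Prospects, and a Prospect can only be created by one User.

I am trying to get the Prospects a user created in the last 24 hours.

In my PostgreSQL database, the Prospect model has about 7 million rows, while User has only 2,000.

My current code takes too long to get the desired result.

I tried using filter() and count():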

import datetime

# Cutoff: the current time minus 24 hours
date_example = datetime.datetime.now() - datetime.timedelta(days=1)

# Filter Prospects created by user_id_example whose create_date is
# greater than or equal to date_example (i.e. at that moment or later)
today_prospects = Prospect.objects.filter(user_id='user_id_example', create_date__gte=date_example)

# Count the Prospects created in the past 24 hours by user_id_example;
# this is the problematic call that takes too long to process
count_total_today_prospects = today_prospects.count()

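My code works, but it takes far too long (5 minutes), because it scans the whole database instead of only what I expected it to check: the Prospects the user created in the last 24 hours.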

I also tried annotate, but it is just as slow, because it ends up doing the same work as a regular .count():

from django.db.models import Count

today_prospects.annotate(Count('id'))

How can I get the count in a more optimized way?

- RobZ

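What happens if you add db_index=True to the create_date field? After adding it, you first need to migrate the database. - Willem Van Onsem

Right now my code is: create_date = models.DateTimeField(auto_now = True). Should I change it to create_date = models.DateTimeField(auto_now = True, db_index = True)? Will the same .count() call then be faster? @WillemVanOnsem - RobZ

@RobZ No, you still need to update your database with the appropriate column index. - freakish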

2 Answers

-2

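From the Django documentation:

A count() call performs a SELECT COUNT(*) behind the scenes, so you should always use count() rather than loading all of the records into Python objects and calling len() on the result (unless you need to load the objects into memory anyway, in which case len() will be faster). Note that if you want the number of items in a QuerySet and are also retrieving model instances from it (for example, by iterating over it), it is probably more efficient to use len(queryset), which won't cause an extra database query like count() would. If the queryset has already been fully retrieved, count() will use that length rather than perform an extra database query.

See this link: https://docs.djangoproject.com/en/3.2/ref/models/querysets/#count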
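To make the quoted distinction concrete, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Django's ORM or PostgreSQL (both the engine and the tiny prospect table are assumptions for illustration). It contrasts counting inside the database with SELECT COUNT(*) against fetching every matching row and calling len() on the result:

```python
import sqlite3

# Hypothetical stand-in table; only loosely mirrors the Prospect model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prospect (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany(
    "INSERT INTO prospect (user_id) VALUES (?)",
    [(i % 10,) for i in range(1000)],
)

# Counting in the database: one row holding one integer comes back.
count_in_db = conn.execute(
    "SELECT COUNT(*) FROM prospect WHERE user_id = ?", (3,)
).fetchone()[0]

# Counting in Python: every matching row is transferred first, then counted.
fetched_rows = conn.execute(
    "SELECT id, user_id FROM prospect WHERE user_id = ?", (3,)
).fetchall()
count_in_python = len(fetched_rows)

print(count_in_db, count_in_python)
```

Both approaches return the same number; the difference is how much data has to leave the database to get it.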

Try using len().

- MatiYo

- Ralf · Accepted Answer

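Assuming you don't have one already, I suggest adding an index that includes both the user and the date fields (make sure they are in this order, first the user and then the date, because for the user you need an exact match, while for the date you only have a starting point). That should speed up the query.

For example: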

from django.db import models


class Prospect(models.Model):
    ...

    class Meta:
        ...
        indexes = [
            models.Index(fields=['user', 'create_date']),
        ]
        ...

This should generate a new migration file (run makemigrations and migrate) that adds the index to the database.

After that, your code should run noticeably faster:

count_total_today_prospects = (
    Prospect.objects
    .filter(user_id='user_id_example', create_date__gte=date_example)
    .count()
)
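
One way to check that a composite index of this shape is actually used is to look at the query plan before and after creating it. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of PostgreSQL, and the table, the index name prospect_user_date_idx, and the sample data are all hypothetical. The idea (equality column first, range column second) carries over, but the planner output on PostgreSQL will look different.

```python
import sqlite3

# Hypothetical stand-in for the Prospect table (SQLite, not PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prospect ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, create_date TEXT)"
)

# 50 users, one row per user for each of 28 days.
rows = [
    (user, f"2021-06-{day:02d} 12:00:00")
    for user in range(1, 51)
    for day in range(1, 29)
]
conn.executemany(
    "INSERT INTO prospect (user_id, create_date) VALUES (?, ?)", rows
)

# Same shape as the Django query: exact match on user, lower bound on date.
query = "SELECT COUNT(*) FROM prospect WHERE user_id = ? AND create_date >= ?"
params = (7, "2021-06-27 00:00:00")


def query_plan(connection):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    cursor = connection.execute("EXPLAIN QUERY PLAN " + query, params)
    return " | ".join(row[3] for row in cursor)


plan_before = query_plan(conn)  # no usable index yet: full table scan

# Composite index: equality column (user_id) first, range column second.
conn.execute(
    "CREATE INDEX prospect_user_date_idx ON prospect (user_id, create_date)"
)
plan_after = query_plan(conn)  # the planner can now search the index

count = conn.execute(query, params).fetchone()[0]

print("before:", plan_before)
print("after: ", plan_after)
print("count: ", count)
```

In a Django project, a comparable check is to call .explain() on the filtered queryset (available since Django 2.1), which returns the database's own execution plan for the query.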