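The following code implements Erwin's answer above (using BETWEEN) for a Django QuerySet.
A utility function that works on any Django QuerySet is shown below. By default it assumes that 'id' is a suitable field for the between clause.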
from django.db import models

def chunked_queryset(qs, batch_size, index='id'):
    """
    Yields a queryset split into batches of maximum size 'batch_size'.
    Any ordering on the queryset is discarded.
    """
    qs = qs.order_by()  # clear any ordering, including Meta.ordering
    min_max = qs.aggregate(min=models.Min(index), max=models.Max(index))
    min_id, max_id = min_max['min'], min_max['max']
    if min_id is None:  # empty queryset: nothing to yield
        return
    for i in range(min_id, max_id + 1, batch_size):
        # __range is inclusive on both ends, like SQL BETWEEN
        filter_args = {'{0}__range'.format(index): (i, i + batch_size - 1)}
        yield qs.filter(**filter_args)
It is used like this:
for chunk in chunked_queryset(SomeModel.objects.all(), 20):
    for item in chunk:
        pass  # process item here
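To see which id ranges the generator asks the database for, the boundary arithmetic can be isolated in plain Python. This is only an illustrative sketch: `batch_bounds` and the sample `ids` list are hypothetical and not part of Django or of the answer. It also shows why each batch has at most `batch_size` rows: when the ids have gaps, a range can match fewer rows, or none at all.

```python
def batch_bounds(min_id, max_id, batch_size):
    """Yield inclusive (low, high) id pairs, mirroring the __range arguments."""
    for low in range(min_id, max_id + 1, batch_size):
        yield (low, low + batch_size - 1)

# Hypothetical ids with gaps, e.g. after some rows were deleted.
ids = [1, 2, 3, 7, 8, 45]
bounds = list(batch_bounds(min(ids), max(ids), 20))

# Count how many of the sample ids fall into each inclusive range.
counts = [sum(low <= i <= high for i in ids) for low, high in bounds]

print(bounds)  # [(1, 20), (21, 40), (41, 60)]
print(counts)  # [5, 0, 1]
```

The middle batch is empty here, so code that consumes the batches should not assume every one contains rows.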
You can change the interface so that the extra nested loop is no longer needed and you can write for item in chunked_queryset(qs, batch_size) instead:
def chunked_queryset(qs, batch_size, index='id'):
    """
    Yields the items of a queryset, evaluating it in batches of
    maximum size 'batch_size'. Any ordering on the queryset is discarded.
    """
    qs = qs.order_by()  # clear any ordering, including Meta.ordering
    min_max = qs.aggregate(min=models.Min(index), max=models.Max(index))
    min_id, max_id = min_max['min'], min_max['max']
    if min_id is None:  # empty queryset: nothing to yield
        return
    for i in range(min_id, max_id + 1, batch_size):
        filter_args = {'{0}__range'.format(index): (i, i + batch_size - 1)}
        for item in qs.filter(**filter_args):
            yield item
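The flattened variant can be tried without a Django project. The sketch below is an assumption-laden stand-in, not the code from the answer: it uses Python's built-in sqlite3 module instead of the Django ORM, and the table `item` and the helper `chunked_rows` are hypothetical. It issues the same kind of BETWEEN query per batch and yields individual rows.

```python
import sqlite3

def chunked_rows(conn, batch_size):
    """Yield rows of the hypothetical table `item`, fetched one id range at a time."""
    min_id, max_id = conn.execute("SELECT MIN(id), MAX(id) FROM item").fetchone()
    if min_id is None:  # empty table: nothing to yield
        return
    for low in range(min_id, max_id + 1, batch_size):
        # BETWEEN is inclusive on both ends, like Django's __range lookup.
        yield from conn.execute(
            "SELECT id, name FROM item WHERE id BETWEEN ? AND ?",
            (low, low + batch_size - 1),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (id, name) VALUES (?, ?)",
    [(i, "name %d" % i) for i in (1, 2, 3, 7, 8, 45)],
)

rows = list(chunked_rows(conn, 20))
print(len(rows))  # 6
```

Because the ranges do not overlap and together span the minimum to the maximum id, every row is returned exactly once, as long as no rows are inserted or deleted while iterating.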
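... returned in the same order without a sort clause. Is that correct? Also, if I have a default ordering in my Meta class, can I remove it in the query? - Joe
The between clause only works well if the IDs are already sorted, or else a full table scan is performed each time, is that right? - Joe
... interested in CLUSTER. If id is indexed, my query will result in an index scan, but only if the chunk you read is a small portion of the table. The query planner decides which is faster: an index scan or a sequential scan. Just test with EXPLAIN ANALYZE to see for yourself. - Erwin Brandstetter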
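As a rough illustration of inspecting the plan for one batch query, the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's built-in sqlite3 module. This is a swapped-in stand-in: the comment above refers to PostgreSQL's EXPLAIN ANALYZE, whose output and planner behave differently, and the table `item` is hypothetical. In a Django project on PostgreSQL, calling `explain(analyze=True)` on the filtered queryset should give the PostgreSQL plan instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (id, name) VALUES (?, ?)",
    [(i, "row %d" % i) for i in range(1, 1001)],
)

# Ask SQLite how it would run one batch query. The last column of each
# returned row is a human-readable description of the chosen strategy.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id, name FROM item WHERE id BETWEEN ? AND ?",
    (1, 20),
).fetchall()

for row in plan:
    print(row[-1])
```

The exact wording of the plan depends on the SQLite version, so read it for whether it describes a search on the key or a scan of the whole table.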