EntityFramework's .Take() gets slower over time

Question

4

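I wrote a function that lets me run an action over a set number of entities at a time, adjusting the batch size dynamically based on how long each query takes. However, as it works through the entities, each query takes progressively longer, even when it fetches only a single entity.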

public async Task Work(Expression<Func<dbase, bool>> predicate, Action<CollectionsMax, dbase> action)
{
    try
    {
        using (var cmax = _cmax)
        {
            cmax.Configuration.AutoDetectChangesEnabled = false;
            double count = await cmax.dbases.CountAsync(predicate);
            var takeAmount = 1;
            var taken = 0;

            var takeTimer = new Stopwatch();
            while (taken != (int) count)
            {
                cmax.Configuration.AutoDetectChangesEnabled = false;
                takeTimer.Reset();

                takeTimer.Start();
                IQueryable<dbase> query = cmax.dbases.Where(predicate)
                    .OrderBy(o => o.id)
                    .Skip(taken)
                    .Take(Math.Min(takeAmount, (int) count - taken));

                var take = await query.ToListAsync();

                takeTimer.Stop();
                Console.WriteLine("Took {0} and that took {1}ms", take.Count, takeTimer.ElapsedMilliseconds);
                taken += take.Count;
                if (takeTimer.ElapsedMilliseconds < 2000)
                {
                    takeAmount = takeAmount + 5;
                }
                if (takeTimer.ElapsedMilliseconds > 2000)
                {
                    takeAmount = takeAmount - 5;
                }
                if (takeAmount < 1)
                    takeAmount = 1;

                Parallel.ForEach(take, obj => action(_max, obj));

            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine("Error: " + ex.Message);
    }
}

Output: http://puu.sh/hxMSD/68dbf4f079.png

- Adam Reed

Please show the SQL statements involved. - Rick James

3 Answers

1

The problem is in this part:

IQueryable<dbase> query = cmax.dbases.Where(predicate)
    .OrderBy(o => o.id)
    .Skip(taken)
    .Take(Math.Min(takeAmount, (int) count - taken));

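When taken != 0, the server has to execute the query, then walk the ordered result and throw away the first taken rows. The problem is not in LINQ but in the SQL engine, because it cannot use an index to jump straight past the skipped rows. The problem is not Take, it is Skip. Joel on Software explains this issue more generally using C's strlen function.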

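Usually only the first few pages are read, so this is not a problem. The more pages you read, though, the worse it gets. That is why systems that need to fetch later pages do not use Skip; they sort and filter instead. In your case, since you order by id, you could write: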

IQueryable<dbase> query = cmax.dbases.Where(predicate)
    .OrderBy(o => o.id)
    .Where(o => o.id > LastTakenId)
    .Take(Math.Min(takeAmount, (int) count - taken));


    ...
    LastTakenId = take.Max(o => o.id);

PS: You can also speed things up a little by using the following Take overload:

int toTake = Math.Min(takeAmount, (int) count - taken);
IQueryable<dbase> query = cmax.dbases.Where(predicate)
    .OrderBy(o => o.id)
    .Where(o => o.id > LastTakenId)
    .Take(() => toTake);

This lets the EF engine cache the generated SQL and substitute only the new take count each time. See section 4.2 of Performance Considerations.
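
As a rough, self-contained sketch of the filter-instead-of-Skip loop described above (not the poster's actual code): an in-memory list stands in for cmax.dbases, and Row, FetchPage and WalkAll are hypothetical names made up for illustration. It assumes id is unique and the rows are read in ascending id order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeysetPagingSketch
{
    // Hypothetical stand-in for the dbase entity; only the id matters here.
    public sealed class Row
    {
        public int id { get; set; }
    }

    // Returns the next page of rows whose id is greater than lastTakenId.
    // Against a real context, `source` would be cmax.dbases.Where(predicate).
    public static List<Row> FetchPage(IQueryable<Row> source, int lastTakenId, int pageSize)
    {
        return source
            .Where(o => o.id > lastTakenId)  // filter on the key instead of Skip
            .OrderBy(o => o.id)
            .Take(pageSize)
            .ToList();
    }

    // Walks the whole source page by page and returns the ids in the order visited.
    public static List<int> WalkAll(IQueryable<Row> source, int pageSize)
    {
        var visited = new List<int>();
        var lastTakenId = int.MinValue;  // starts below every possible id
        while (true)
        {
            var page = FetchPage(source, lastTakenId, pageSize);
            if (page.Count == 0)
                break;  // nothing left past lastTakenId
            visited.AddRange(page.Select(o => o.id));
            lastTakenId = page[page.Count - 1].id;  // page is ordered, so the last id is the max
        }
        return visited;
    }

    public static void Main()
    {
        // Ids with gaps, as you would see after deletions.
        var rows = new[] { 3, 4, 9, 10, 15, 21, 22 }
            .Select(i => new Row { id = i })
            .AsQueryable();
        Console.WriteLine(string.Join(",", WalkAll(rows, 3)));  // prints 3,4,9,10,15,21,22
    }
}
```

Stopping on an empty page, rather than comparing against a count taken up front, also means the loop still terminates if rows are added or removed while it runs.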

- felipe

0

The more entities you load from the database, the more entities your context is tracking, and that slows everything down as well.

- jjj

- mellamokb · Accepted Answer

In this code:

IQueryable<dbase> query = cmax.dbases.Where(predicate)
  .OrderBy(o => o.id)
  .Skip(taken)
  .Take(Math.Min(takeAmount, (int) count - taken));

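Skip is not free. It does not remember that the earlier entities were already processed the last time this query was evaluated.

So a better way to think of your output is: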

Read 1 and that took 457ms
Read 7, skipping the first 1 and that took 172ms
Read 18, skipping the first 7 and that took 266ms
Read 34, skipping the first 18 and that took 378ms
etc..

Then it makes a lot more sense that it keeps taking longer.
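
To put rough numbers on this, here is a back-of-the-envelope model, not a measurement. It assumes the server walks past every skipped row before returning a page, and that a query filtered on id can seek directly to its first row. SkipCostSketch and its methods are hypothetical names; real costs depend on the engine, the indexes and the predicate.

```csharp
using System;

public static class SkipCostSketch
{
    // Rows walked when every page is fetched with Skip(taken).Take(pageSize):
    // each query passes over the `taken` skipped rows, then reads its page.
    public static long RowsWalkedWithSkip(int totalRows, int pageSize)
    {
        long walked = 0;
        for (int taken = 0; taken < totalRows; taken += pageSize)
        {
            int page = Math.Min(pageSize, totalRows - taken);
            walked += taken + page;
        }
        return walked;
    }

    // Rows walked when every page is fetched with a filter on id > last seen id:
    // each query reads only its own page.
    public static long RowsWalkedWithSeek(int totalRows, int pageSize)
    {
        long walked = 0;
        for (int taken = 0; taken < totalRows; taken += pageSize)
        {
            walked += Math.Min(pageSize, totalRows - taken);
        }
        return walked;
    }

    public static void Main()
    {
        Console.WriteLine(RowsWalkedWithSkip(1000, 10));  // prints 50500
        Console.WriteLine(RowsWalkedWithSeek(1000, 10));  // prints 1000
    }
}
```

Under this model the Skip version grows roughly with the square of the row count, which matches the steadily rising timings in the question.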