Speeding up a LINQ group by statement

Question

3

I have a table like this:

UserID   Year   EffectiveDate   Type    SpecialExpiryDate
     1   2015   7/1/2014        A   
     1   2016   7/1/2015        B       10/1/2015

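The table has no ExpiryDate column because each record is valid for only one year, so the expiry date can be calculated by adding one year to the effective date.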

The result I want looks like this (the effective date from the current year and the expiry date from the next year):

UserID   EffectiveDate   ExpiryDate
     1    7/1/2014        7/1/2016

If the user's type is B, there is a special expiry date, so for this person the result would be:

UserID   EffectiveDate   ExpiryDate
     1    7/1/2014        10/1/2015
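
The rule described above can be sketched in memory with LINQ to Objects. This is only an illustration: the `Row`, `UserSpan`, and `ExpiryRule.Summarize` names are hypothetical, and the reading of the rule (take the earliest effective date; a special expiry date wins if any row has one, otherwise use the latest effective date plus one year) is an assumption inferred from the two sample rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type mirroring the columns of the sample table.
public class Row
{
    public int UserID { get; set; }
    public int Year { get; set; }
    public DateTime EffectiveDate { get; set; }
    public string Type { get; set; }
    public DateTime? SpecialExpiryDate { get; set; }
}

// Hypothetical result type: one line per user.
public class UserSpan
{
    public int UserID { get; set; }
    public DateTime EffectiveDate { get; set; }
    public DateTime ExpiryDate { get; set; }
}

public static class ExpiryRule
{
    // Assumed rule: earliest effective date, plus either the special expiry
    // date (if any row has one) or the latest effective date + 1 year.
    public static List<UserSpan> Summarize(IEnumerable<Row> rows)
    {
        return rows
            .GroupBy(r => r.UserID)
            .Select(g => new UserSpan
            {
                UserID = g.Key,
                EffectiveDate = g.Min(r => r.EffectiveDate),
                // Max over DateTime? skips nulls and returns null if all are null.
                ExpiryDate = g.Max(r => r.SpecialExpiryDate)
                             ?? g.Max(r => r.EffectiveDate).AddYears(1)
            })
            .ToList();
    }
}
```

Passing the two sample rows (reading the dates as month/day/year) to `ExpiryRule.Summarize` gives one entry for user 1 whose expiry date is the special one; without the special date, the expiry falls one year after the latest effective date.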

This is the code I wrote:

var result = db.Table1
            .Where(x => x.Year>= 2015 && (x.Type == "A" || x.Type == "B"))
            .GroupBy(y => y.UserID)
            .OrderByDescending(x => x.FirstOrDefault().Year)
            .Select(t => new
                         {
                             ID = t.Key,
                             Type = t.FirstOrDefault().Type,
                             EffectiveDate = t.FirstOrDefault().EffectiveDate,
                             ExpiryDate = t.FirstOrDefault().SpecialExpiryDate != null ? t.FirstOrDefault().SpecialExpiryDate : (t.Count() >= 2 ? NextExpiryDate : CurrentExpiryDate)
                          }
                    );

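The code returns the result I need, but the result set has about 10,000 records and the query takes 5 to 6 seconds. This project is a web search API, so I want to speed it up. Is there a better way to write this query?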

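Edit

Sorry, I made a mistake. In the select clause it should be

EffectiveDate = t.LastOrDefault().EffectiveDate, but LINQ in C# cannot translate LastOrDefault into SQL, which raises a new problem: what is the simplest way to get the second item in a group?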

- pita

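@Habib, this is an API, so I think I have to return all the data at once. - pita

How much of this logic can you push into your database? LINQ is not as good at crunching large amounts of data as most servers/mainframes. - ryanyuyu

@EZI, then how do I get the count for each group? - pita

Then return an anonymous object in the Select, e.g. new { Item = x.FirstOrDefault(), Count = x.Count() }. - EZI

Add .AsNoTracking() to the end of the query. That may speed it up. - Bill Blankenship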

2 Answers

2

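You can use a view in the database to generate the calculated data on the fly. Here is a pseudocode-style sketch (DATEADD is SQL Server syntax, which is an assumption about your database):

Create View vwUsers AS 
    Select 
        UserID, 
        Year, 
        EffectiveDate, 
        DATEADD(year, 1, EffectiveDate) as ExpiryDate,   -- <-- effective date plus one year
        Type, 
        SpecialExpiryDate
    From 
        tblUsers

Then just connect your LINQ query to that view.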

- oɔɯǝɹ

1

Thanks. I actually created a stored procedure in the SQL database, with code along the lines of select ..... from Users join (select MIN(EffectiveDate), MAX(some calculation to get ExpiryDate)). - pita

@pita - Do the OrderByDescending before the grouping. A .Take(1) after that should then work. - Enigmativity
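
As a rough illustration of that suggestion, here is a minimal LINQ to Objects sketch. The `YearRow` type and `OrderThenGroup.NewestPerUser` helper are hypothetical names. It relies on the in-memory `GroupBy` keeping each group's elements in source order; whether a database provider preserves the pre-grouping order is an assumption you would need to check against the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Table1 record.
public class YearRow
{
    public int UserID { get; set; }
    public int Year { get; set; }
    public DateTime EffectiveDate { get; set; }
}

public static class OrderThenGroup
{
    // Sort first, then group: in LINQ to Objects each group keeps the
    // source order, so Take(1) yields the row with the highest Year.
    public static List<YearRow> NewestPerUser(IEnumerable<YearRow> rows)
    {
        return rows
            .OrderByDescending(r => r.Year)
            .GroupBy(r => r.UserID)
            .SelectMany(g => g.Take(1))
            .ToList();
    }
}
```

Swapping `OrderByDescending` for `OrderBy` would pick the oldest row per user instead, which is one way to get at the item that `LastOrDefault()` was meant to return.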

- Enigmativity · Accepted Answer

Try this:

var result =
    db
        .Table1
        .Where(x => x.Year>= 2015 && (x.Type == "A" || x.Type == "B"))
        .GroupBy(y => y.UserID)
        .SelectMany(y => y.Take(1), (y, z) => new
        {
            ID = y.Key,
            z.Type,
            z.EffectiveDate,
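            // y is the group for this user; NextExpiryDate and CurrentExpiryDate
            // are the placeholders from the question's code.
            ExpiryDate = z.SpecialExpiryDate != null
                ? z.SpecialExpiryDate
                : (y.Count() >= 2 ? NextExpiryDate : CurrentExpiryDate),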
            z.Year,
        })
        .OrderByDescending(x => x.Year);

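The .SelectMany(y => y.Take(1), ...) call effectively does the .FirstOrDefault() part of your code. Doing it this way means the first row is fetched only once per group rather than once per property, which can improve the speed immensely.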
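
To see that the two shapes produce the same rows, here is a small in-memory sketch using LINQ to Objects. The `Entry` type and the `FirstPerGroup` helpers are hypothetical. It only demonstrates that the results are equivalent; the speed difference described in this answer comes from how a LINQ provider translates each shape to SQL, which this sketch does not measure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Table1 record.
public class Entry
{
    public int UserID { get; set; }
    public string Type { get; set; }
    public int Year { get; set; }
}

public static class FirstPerGroup
{
    // Question's shape: every projected property calls FirstOrDefault() again.
    public static List<string> WithFirstOrDefault(IEnumerable<Entry> entries)
    {
        return entries
            .GroupBy(e => e.UserID)
            .Select(g => g.Key + ":" + g.FirstOrDefault().Type
                               + ":" + g.FirstOrDefault().Year)
            .ToList();
    }

    // Answer's shape: Take(1) selects the first row once, and the result
    // selector receives both the group (g) and that row (z).
    public static List<string> WithSelectMany(IEnumerable<Entry> entries)
    {
        return entries
            .GroupBy(e => e.UserID)
            .SelectMany(g => g.Take(1),
                        (g, z) => g.Key + ":" + z.Type + ":" + z.Year)
            .ToList();
    }
}
```

Because the result selector still receives the group, aggregates such as a per-group `Count()` remain available alongside the single row.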

In a test I ran on a similarly structured query, your approach caused the following sub-queries to be run:

SELECT t0.increment_id
FROM sales_flat_order AS t0
GROUP BY t0.increment_id

SELECT t0.hidden_tax_amount
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000001]

SELECT t0.customer_email
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000001]

SELECT t0.hidden_tax_amount
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000002]

SELECT t0.customer_email
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000002]

That is two sub-queries for every record number.

With my approach, I get this single query:

SELECT t0.increment_id, t1.hidden_tax_amount, t1.customer_email
FROM (
  SELECT t2.increment_id
  FROM sales_flat_order AS t2
  GROUP BY t2.increment_id
  ) AS t0
CROSS APPLY (
  SELECT t3.customer_email, t3.hidden_tax_amount
  FROM sales_flat_order AS t3
  WHERE ((t3.increment_id IS NULL AND t0.increment_id IS NULL) OR (t3.increment_id = t0.increment_id))
  LIMIT 0, 1
  ) AS t1

My approach should be much faster.