Linq to SQL performance with grouping

3
My question is about Linq to SQL performance. I have a SQL string and converted it to Linq to SQL:
SQL query:
SELECT CONVERT(VARCHAR(10), ClockIn, 103) AS ClockDate, MIN(ClockIn) AS ClockIn, MAX(ClockOut) AS ClockOut, SUM(DATEDIFF(MINUTE, ClockIn, ClockOut)) AS [TotalTime]
FROM TimeLog
WHERE (EmployeeId = 10)
GROUP BY CONVERT(VARCHAR(10), ClockIn, 103)
ORDER BY ClockIn DESC

LINQ query:
From u In objDC.TimeLogs
Where u.EmployeeId = 10
Group By Key = New With {u.ClockIn.Year, u.ClockIn.Month, u.ClockIn.Day} Into G = Group
Order By G.First.ClockIn Descending
Select New With {.ClockDate = Key.Day & "/" & Key.Month & "/" & Key.Year,
 .ClockIn = G.Min(Function(p) p.ClockIn),
 .ClockOut = G.Max(Function(p) p.ClockOut),
 .TotalTime = G.Sum(Function(p) SqlMethods.DateDiffMinute(p.ClockIn, p.ClockOut))}

The query string that LINQ generates, as captured in SQL Profiler, looks like this:
SELECT [t4].[value] AS [ClockDate], [t4].[value2] AS [ClockIn2], [t4].[value22] AS [ClockOut], [t4].[value3] AS [TotalTime]
 FROM (
 SELECT ((((CONVERT(NVarChar,[t3].[value32])) + '/') + (CONVERT(NVarChar,[t3].[value222]))) + '/') + (CONVERT(NVarChar,[t3].[value22])) AS [value], [t3].[value] AS [value2], [t3].[value2] AS [value22], [t3].[value3], [t3].[value22] AS [value222], [t3].[value222] AS [value2222], [t3].[value32]
 FROM (
 SELECT MIN([t2].[ClockIn]) AS [value], MAX([t2].[ClockOut]) AS [value2], SUM([t2].[value]) AS [value3], [t2].[value2] AS [value22], [t2].[value22] AS [value222], [t2].[value3] AS [value32]
 FROM (
 SELECT DATEDIFF(Minute, [t1].[ClockIn], [t1].[ClockOut]) AS [value], [t1].[EmployeeId], [t1].[value] AS [value2], [t1].[value2] AS [value22], [t1].[value3], [t1].[ClockIn], [t1].[ClockOut]
 FROM (
 SELECT DATEPART(Year, [t0].[ClockIn]) AS [value], DATEPART(Month, [t0].[ClockIn]) AS [value2], DATEPART(Day, [t0].[ClockIn]) AS [value3], [t0].[ClockIn], [t0].[ClockOut], [t0].[EmployeeId]
 FROM [dbo].[TimeLog] AS [t0]
 ) AS [t1]
 ) AS [t2]
 WHERE [t2].[EmployeeId] = 10
 GROUP BY [t2].[value2], [t2].[value22], [t2].[value3]
 ) AS [t3]
 ) AS [t4]
 ORDER BY (
 SELECT [t6].[ClockIn]
 FROM (
 SELECT TOP (1) [t5].[ClockIn]
 FROM [dbo].[TimeLog] AS [t5]
 WHERE ((([t4].[value222] IS NULL) AND (DATEPART(Year, [t5].[ClockIn]) IS NULL)) OR (([t4].[value222] IS NOT NULL) AND (DATEPART(Year, [t5].[ClockIn]) IS NOT NULL) AND ((([t4].[value222] IS NULL) AND (DATEPART(Year, [t5].[ClockIn]) IS NULL)) OR (([t4].[value222] IS NOT NULL) AND (DATEPART(Year, [t5].[ClockIn]) IS NOT NULL) AND ([t4].[value222] = DATEPART(Year, [t5].[ClockIn])))))) AND ((([t4].[value2222] IS NULL) AND (DATEPART(Month, [t5].[ClockIn]) IS NULL)) OR (([t4].[value2222] IS NOT NULL) AND (DATEPART(Month, [t5].[ClockIn]) IS NOT NULL) AND ((([t4].[value2222] IS NULL) AND (DATEPART(Month, [t5].[ClockIn]) IS NULL)) OR (([t4].[value2222] IS NOT NULL) AND (DATEPART(Month, [t5].[ClockIn]) IS NOT NULL) AND ([t4].[value2222] = DATEPART(Month, [t5].[ClockIn])))))) AND ((([t4].[value32] IS NULL) AND (DATEPART(Day, [t5].[ClockIn]) IS NULL)) OR (([t4].[value32] IS NOT NULL) AND (DATEPART(Day, [t5].[ClockIn]) IS NOT NULL) AND ((([t4].[value32] IS NULL) AND (DATEPART(Day, [t5].[ClockIn]) IS NULL)) OR (([t4].
 [value32] IS NOT NULL) AND (DATEPART(Day, [t5].[ClockIn]) IS NOT NULL) AND ([t4].[value32] = DATEPART(Day, [t5].[ClockIn])))))) AND ([t5].[EmployeeId] = 10)
 ) AS [t6]
 ) DESC

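The LINQ to SQL query is far too slow. Comparing execution plans, the hand-written SQL query accounts for 7% of the cost, while the LINQ-generated query accounts for 97%.

What is wrong with my Linq to SQL query? Or is this a performance limitation of Linq itself?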


Welcome to the world of leaky abstractions. - Robaticus
3
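With a performance difference that large, have you considered using a stored procedure for the query? How to use stored procedures that return rowsets (LINQ to SQL): http://msdn.microsoft.com/en-us/library/bb386975.aspx - Andrew Morton
Thanks @AndrewMorton. I can run the hand-written query directly through the DataContext without a stored procedure, but my question is about how useful Linq is and whether it can replace ordinary queries. It seems it cannot: you have to monitor the generated queries and sometimes replace Linq with a string query. - Sameh
Oops, according to the last answer in my link you can optimize the query. I think the OrderBy G.First... is the problem (it accesses a subquery). - Guillaume86
Consider the purpose of LINQ. It is an abstraction layer that simplifies database access code and speeds up application development. It is not a replacement for SQL. For the 95% of common cases (simple CRUD operations) it performs well and saves a lot of time, but in some special cases using SQL directly is preferable. - mellamokb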
2 Answers

4
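I think the problem is that the OrderBy G.First statement accesses the rows of each group, which triggers N+1 behavior in Linq-to-SQL. You can try the following to work around it: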
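var query = objDC.TimeLogs
            .Where(c => c.EmployeeId == 10)
            .GroupBy(c => c.ClockIn.Date)
            // Order by the group key rather than G.First, so no group rows are accessed.
            // This sorts ascending; use OrderByDescending to match ORDER BY ... DESC.
            .OrderBy(g => g.Key)
            .Select(g => new
            {
                Date = g.Key,
                ClockIn = g.Min(c => c.ClockIn),
                ClockOut = g.Max(c => c.ClockOut),
            })
            .Select(g => new 
            {
                g.Date,
                g.ClockIn,
                g.ClockOut,
                // Span from first clock-in to last clock-out of the day,
                // not the sum of per-row durations used in the question.
                TotalTime = g.ClockOut - g.ClockIn
            });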
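
To make the difference concrete, here is a rough sketch that runs both query shapes against an in-memory SQLite database through Python's sqlite3 module. This is only an illustration under assumptions: the question uses SQL Server, so date() and strftime('%s', ...) stand in for CONVERT and DATEDIFF, and the sample rows and the aliases FirstIn, LastOut and TotalMinutes are made up. The first query orders by a correlated subquery that goes back to TimeLog for every group, like the generated SQL; the second orders by the group key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TimeLog (EmployeeId INTEGER, ClockIn TEXT, ClockOut TEXT)")
conn.executemany(
    "INSERT INTO TimeLog VALUES (?, ?, ?)",
    [
        # Hypothetical sample data, not taken from the question.
        (10, "2011-05-01 09:00:00", "2011-05-01 12:00:00"),
        (10, "2011-05-01 13:00:00", "2011-05-01 17:30:00"),
        (10, "2011-05-02 08:30:00", "2011-05-02 16:30:00"),
        (11, "2011-05-02 09:00:00", "2011-05-02 10:00:00"),
    ],
)

# Per-day aggregate shared by both variants (SQLite dialect).
AGGREGATE = """
    SELECT date(ClockIn)  AS ClockDate,
           MIN(ClockIn)   AS FirstIn,
           MAX(ClockOut)  AS LastOut,
           SUM((strftime('%s', ClockOut) - strftime('%s', ClockIn)) / 60) AS TotalMinutes
    FROM TimeLog
    WHERE EmployeeId = 10
    GROUP BY date(ClockIn)
"""

# Shape of the slow query: ORDER BY a subquery that looks up one row
# of the group in the base table, once per group.
slow_sql = (
    "SELECT g.* FROM (" + AGGREGATE + ") AS g "
    "ORDER BY (SELECT x.ClockIn FROM TimeLog AS x "
    "          WHERE x.EmployeeId = 10 AND date(x.ClockIn) = g.ClockDate "
    "          LIMIT 1) DESC"
)

# Shape of the fix: ORDER BY the group key, no extra lookup.
fast_sql = AGGREGATE + " ORDER BY ClockDate DESC"

slow_rows = conn.execute(slow_sql).fetchall()
fast_rows = conn.execute(fast_sql).fetchall()

for row in fast_rows:
    print(row)
```

Both variants return the same rows in the same order; only the second avoids revisiting the table for each group, which is the effect the answer attributes to replacing G.First with the key.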

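Thanks a lot, Guillaume, that solved the problem. I agree with you that the problem is related to G.First. I changed my Linq query based on your answer and will post the new one. I get the same result, but the query is faster: the profiler gives 55% to the hand-written query and 45% to the newly generated query, so it is even faster than the original string query. Thank you very much for your help. - Sameh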

0

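Thanks again to Guillaume for the suggestion; this is the Linq query based on it.

Thanks a lot, Guillaume, that solved the problem. I agree with you that the problem is related to G.First.

Based on your answer, I changed my Linq query as follows: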

From u In objDC.TimeLogs
Where u.EmployeeId = 10
Group By key = New With {u.ClockIn.Date} Into G = Group
Order By key.Date Descending
Select New With {
    .ClockDate = key.Date,
    .ClockIn = G.Min(Function(p) p.ClockIn),
    .ClockOut = G.Max(Function(p) p.ClockOut),
    .TotalTime = G.Sum(Function(p) SqlMethods.DateDiffMinute(p.ClockIn, p.ClockOut)) / 60}

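I get the same result, but the query is faster: the profiler gives 55% to the hand-written query and 45% to the newly generated query, so it is even faster than the original string query.

Thank you very much for your help.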

