SQL: Last_Value() returns wrong result (but First_Value() works fine)

15

I have a table in SQL Server 2012, as shown in the snapshot:

[image: snapshot of the table]

Then I use Last_Value() and First_Value() to get the AverageAmount of each EmpID for different YearMonth values. The script is as follows:

SELECT A.EmpID,  
       First_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS  '200901AvgAmount', 
       Last_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS '201112AvgAmount'

FROM  Emp_Amt  AS A

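However, the result of this query is:

[image: result]

The "201112AvgAmount" column shows different values for each EmpID, while the "200901AvgAmount" column shows the correct values.

Is there anything wrong with my SQL script? I have done a lot of research online but still cannot find the answer...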

3 Answers

22
Here is a quick query to illustrate the behavior:
select 
  v,

  -- FIRST_VALUE() and LAST_VALUE()
  first_value(v) over(order by v) f1,
  first_value(v) over(order by v rows between unbounded preceding and current row) f2,
  first_value(v) over(order by v rows between unbounded preceding and unbounded following) f3,
  last_value (v) over(order by v) l1,
  last_value (v) over(order by v rows between unbounded preceding and current row) l2,
  last_value (v) over(order by v rows between unbounded preceding and unbounded following) l3,

  -- For completeness' sake, let's also compare the above with MAX()
  max        (v) over() m1,
  max        (v) over(order by v) m2,
  max        (v) over(order by v rows between unbounded preceding and current row) m3,
  max        (v) over(order by v rows between unbounded preceding and unbounded following) m4
from (values(1),(2),(3),(4)) t(v)

The output of the above query can be seen here (SQLFiddle here):
| V | F1 | F2 | F3 | L1 | L2 | L3 | M1 | M2 | M3 | M4 |
|---|----|----|----|----|----|----|----|----|----|----|
| 1 |  1 |  1 |  1 |  1 |  1 |  4 |  4 |  1 |  1 |  4 |
| 2 |  1 |  1 |  1 |  2 |  2 |  4 |  4 |  2 |  2 |  4 |
| 3 |  1 |  1 |  1 |  3 |  3 |  4 |  4 |  3 |  3 |  4 |
| 4 |  1 |  1 |  1 |  4 |  4 |  4 |  4 |  4 |  4 |  4 |

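Few people think about the implicit frame that is applied to window functions that take an ORDER BY clause. In this case, the window defaults to the frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. (RANGE is not exactly the same as ROWS, but that is another story.) Think of it this way:
  • On the row with v = 1, the ordered window's frame spans v IN (1)
  • On the row with v = 2, the ordered window's frame spans v IN (1, 2)
  • On the row with v = 3, the ordered window's frame spans v IN (1, 2, 3)
  • On the row with v = 4, the ordered window's frame spans v IN (1, 2, 3, 4)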
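
The growing frame can be made visible by counting the rows in it. The following is a minimal sketch, not part of the original query; the sample values (including the duplicate 2) and the column aliases are assumptions chosen to also hint at how RANGE and ROWS differ on ties:

```sql
-- Hypothetical sample data: the value 2 appears twice to create a tie.
SELECT
  v,
  -- Implicit frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  COUNT(*) OVER (ORDER BY v) AS default_frame,
  -- The same frame written out explicitly; tied rows (peers) are included
  COUNT(*) OVER (ORDER BY v RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_frame,
  -- ROWS counts physical rows only, so peers after the current row are excluded
  COUNT(*) OVER (ORDER BY v ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_frame
FROM (VALUES (1), (2), (2), (3)) AS t(v)
ORDER BY v;
```

With this data, default_frame and range_frame should agree on every row, and both rows with v = 2 should report 3, because RANGE includes all peers of the current row. rows_frame should report 2 for one of the tied rows and 3 for the other.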

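If you want to prevent that behavior, you have two options:

  • Use an explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clause for ordered window functions
  • Use no ORDER BY clause in those window functions that allow omitting it (such as MAX(v) OVER())

More details are explained in this article about LEAD(), LAG(), FIRST_VALUE() and LAST_VALUE()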


16

There is nothing wrong with your script; this is the way partitioning works in SQL Server :/. If you change LAST_VALUE to MAX, the result will be the same. The solution is:

SELECT A.EmpID,  
       First_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS  '200901AvgAmount', 
       Last_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS '201112AvgAmount'  
FROM  Emp_Amt  AS A

There is a great post about it, link. Good luck!


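Thank you so much! You not only solved the problem but also told me why. I had browsed the linked post before but did not consider the partitioning mechanism. - Echo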

0

The simplest way is to repeat your query using first_value, ordering ascending in the first case and descending in the second case.

SELECT A.EmpID,  
       First_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey asc) AS  '200901AvgAmount', 
       First_Value(A.AverageAmount) OVER (PARTITION BY A.EmpID Order by A.DimYearMonthKey desc) AS '201112AvgAmount'

FROM  Emp_Amt  AS A
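
To see why the descending FIRST_VALUE works even with the default frame, it may help to run the variants side by side. This is only a sketch: the CTE below is a hypothetical stand-in for Emp_Amt with made-up values, and the column aliases are illustrative.

```sql
-- Hypothetical stand-in for Emp_Amt; the rows are invented for illustration.
WITH Emp_Amt (EmpID, DimYearMonthKey, AverageAmount) AS (
    SELECT * FROM (VALUES
        (1, 200901, 10.0),
        (1, 201006, 20.0),
        (1, 201112, 30.0),
        (2, 200901,  5.0),
        (2, 201112,  7.0)
    ) AS v (EmpID, DimYearMonthKey, AverageAmount)
)
SELECT A.EmpID,
       A.DimYearMonthKey,
       -- Default frame ends at the current row, so this returns the current row's value
       LAST_VALUE(A.AverageAmount) OVER (PARTITION BY A.EmpID
                                         ORDER BY A.DimYearMonthKey) AS LastDefaultFrame,
       -- Explicit frame spanning the whole partition
       LAST_VALUE(A.AverageAmount) OVER (PARTITION BY A.EmpID
                                         ORDER BY A.DimYearMonthKey
                                         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastFullFrame,
       -- Reversed ordering: the frame always starts at the latest month
       FIRST_VALUE(A.AverageAmount) OVER (PARTITION BY A.EmpID
                                          ORDER BY A.DimYearMonthKey DESC) AS FirstDesc
FROM Emp_Amt AS A
ORDER BY A.EmpID, A.DimYearMonthKey;
```

With this sample data, LastDefaultFrame should simply repeat each row's own AverageAmount, while LastFullFrame and FirstDesc should both show 30.0 on every row of EmpID 1 and 7.0 on every row of EmpID 2. The default frame always begins at the first row of the ordered partition, which is why FIRST_VALUE is unaffected by where the frame ends.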
