我正在处理一些以1分钟间隔存储的数据,数据看起来像这样:
CREATE TABLE #MinuteData
(
[Id] INT ,
[MinuteBar] DATETIME ,
[Open] NUMERIC(12, 6) ,
[High] NUMERIC(12, 6) ,
[Low] NUMERIC(12, 6) ,
[Close] NUMERIC(12, 6)
);
INSERT INTO #MinuteData
( [Id], [MinuteBar], [Open], [High], [Low], [Close] )
VALUES ( 1, '2015-01-01 17:00:00', 1.557870, 1.557880, 1.557870, 1.557880 ),
( 2, '2015-01-01 17:01:00', 1.557900, 1.557900, 1.557880, 1.557880 ),
( 3, '2015-01-01 17:02:00', 1.557960, 1.558070, 1.557960, 1.558040 ),
( 4, '2015-01-01 17:03:00', 1.558080, 1.558100, 1.558040, 1.558050 ),
( 5, '2015-01-01 17:04:00', 1.558050, 1.558100, 1.558020, 1.558030 ),
( 6, '2015-01-01 17:05:00', 1.558580, 1.558710, 1.557870, 1.557950 ),
( 7, '2015-01-01 17:06:00', 1.557910, 1.558120, 1.557910, 1.557990 ),
( 8, '2015-01-01 17:07:00', 1.557940, 1.558250, 1.557940, 1.558170 ),
( 9, '2015-01-01 17:08:00', 1.558140, 1.558200, 1.558080, 1.558120 ),
( 10, '2015-01-01 17:09:00', 1.558110, 1.558140, 1.557970, 1.557970 );
SELECT *
FROM #MinuteData;
DROP TABLE #MinuteData;
这些值跟踪货币汇率,因此对于每个一分钟间隔(柱形图),有一个表示该分钟开始的开盘价
和该分钟结束的收盘价
。最高价
和最低价
分别表示每个单独分钟内的最高和最低汇率。
期望输出
我希望将这些数据重新格式化为5分钟间隔,以产生以下输出:
MinuteBar Open Close Low High
2015-01-01 17:00:00.000 1.557870 1.558030 1.557870 1.558100
2015-01-01 17:05:00.000 1.558580 1.557970 1.557870 1.558710
这会从5分钟内的第一分钟中取出Open
值,从5分钟内的最后一分钟中取出Close
值。High
和Low
的值代表了在5分钟内最高的high
和最低的low
汇率。
当前解决方案
我有一个解决方案可以实现这个功能(如下所示),但它似乎不够优雅,因为它依赖于id
值和自连接。此外,我打算在更大的数据集上运行它,因此如果可能的话,我希望以更有效的方式来实现它:
-- Create a column to allow grouping in 5 minute Intervals
SELECT Id, MinuteBar, [Open], High, Low, [Close],
DATEDIFF(MINUTE, '2015-01-01T00:00:00', MinuteBar)/5 AS Interval
INTO #5MinuteData
FROM #MinuteData
ORDER BY minutebar
-- Group by inteval and aggregate prior to self join
SELECT Interval ,
MIN(MinuteBar) AS MinuteBar ,
MIN(Id) AS OpenId ,
MAX(Id) AS CloseId ,
MIN(Low) AS Low ,
MAX(High) AS High
INTO #DataMinMax
FROM #5MinuteData
GROUP BY Interval;
-- Self join to get the Open and Close values
SELECT t1.Interval ,
t1.MinuteBar ,
tOpen.[Open] ,
tClose.[Close] ,
t1.Low ,
t1.High
FROM #DataMinMax t1
INNER JOIN #5MinuteData tOpen ON tOpen.Id = OpenId
INNER JOIN #5MinuteData tClose ON tClose.Id = CloseId;
DROP TABLE #DataMinMax
DROP TABLE #5MinuteData
重新尝试
我不再使用上述查询,而是尝试使用FIRST_VALUE和LAST_VALUE,因为它似乎是我需要的,但我无法通过正在进行的分组操作使其正常工作。可能有比我正在尝试做的更好的解决方案,所以我愿意听取建议。目前我正在尝试这样做:
SELECT MIN(MinuteBar) MinuteBar5 ,
FIRST_VALUE([Open]) OVER (ORDER BY MinuteBar) AS Opening,
MAX(High) AS High ,
MIN(Low) AS Low ,
LAST_VALUE([Close]) OVER (ORDER BY MinuteBar) AS Closing ,
DATEDIFF(MINUTE, '2015-01-01 00:00:00', MinuteBar) / 5 AS Interval
FROM #MinuteData
GROUP BY DATEDIFF(MINUTE, '2015-01-01 00:00:00', MinuteBar) / 5
如果我删除这些行,查询将与FIRST_VALUE
和LAST_VALUE
相关,并出现以下错误:
由于未包含在聚合函数或GROUP BY子句中,所以选择列表中的列“#MinuteData.MinuteBar”无效。
开盘价
和收盘价
是你遇到困难的地方,因为它们应该只是区间内的第一个和最后一个值,而不管它们的具体数值如何?我理解得对吗? - Xedni