假设我有两个表格 (SQL Fiddle)。其中一个记录了各种时间戳的值,另一个指示要采样最近值的ID和日期时间。使用类似于Kevin Meade的NEAREST NEIGHBOR PREFERENCE LOW(但在SQL Server 2008中),我想找到最接近目标(人口普查)日期但不是在(人口普查)日期之后的指定ID的(非空值)。如果有匹配人口普查日期的行,则使用该行(除非它具有空值)。如果没有在人口普查日期之前的行,则找到最接近人口普查日期但不在其之前的行,并使用该行。
这里是Meade先生的Oracle代码,用于在给定ID的一个表中实现类似功能:
第一个表:
CREATE TABLE Recorded_Vent_Types
([PAT_ENC_CSN_ID] int, [RECORDED_TIME] datetime, [MEAS_VALUE] varchar(9));
INSERT INTO Recorded_Vent_Types
([PAT_ENC_CSN_ID], [RECORDED_TIME], [MEAS_VALUE])
VALUES
(11117777, '2013-06-08 19:36:00.000', 'SIMV/PRVC'),
(11117777, '2013-06-08 22:21:00.000', 'PRVC/AC'),
(11117777, '2013-06-09 00:10:00.000', NULL),
(11117777, '2013-06-09 03:00:00.000', 'SIMV/PRVC'),
(11117777, '2013-06-09 23:59:00.000', 'SIMV/PRVC'),
(11117777, '2013-06-10 00:00:00.000', 'NAVA'),
(11117777, '2013-06-10 00:20:00.000', 'PS'),
(11117777, '2013-06-10 00:25:00.000', NULL),
(555999, '2013-06-08 00:36:00.000', NULL),
(555999, '2013-06-08 22:21:00.000', 'PRVC/AC'),
(555999, '2013-06-09 00:10:00.000', 'SIMV/PRVC'),
(555999, '2013-06-11 23:15:00.000', 'BIVENT'),
(555999, '2013-06-12 00:00:00.000', NULL),
(555999, '2013-06-12 00:20:00.000', 'PS');
第二个表格:
CREATE TABLE Census
([PAT_ENC_CSN_ID] int, [CENSUS_TIME] datetime);
INSERT INTO Census
([PAT_ENC_CSN_ID], [CENSUS_TIME])
VALUES
(11117777, '2013-06-08 00:00:00'),
(11117777, '2013-06-09 00:00:00'),
(11117777, '2013-06-10 00:00:00'),
(11117777, '2013-06-11 00:00:00'),
(555999, '2013-06-08 00:00:00'),
(555999, '2013-06-09 00:00:00'),
(555999, '2013-06-11 00:00:00'),
(555999, '2013-06-12 00:00:00');
这里是Meade先生的Oracle代码,用于在给定ID的一个表中实现类似功能:
select *
from claim_history
where claim_id = 1
and status_date =
(
select min(status_date)
from (
select max(status_date) status_date
from claim_history
where claim_id = 1
and status_date <= sysdate-3
union all
select min(status_date)
from claim_history
where claim_id = 1
and status_date > sysdate-3
)
)
/
我的预期结果集:
PAT_ENC_CSN_ID CENSUS_TIME RECORDED_TIME MEAS_VALUE
555999 June, 08 2013 00:00:00+0000 June, 08 2013 22:21:00+0000 PRVC/AC
555999 June, 09 2013 00:00:00+0000 June, 08 2013 22:21:00+0000 PRVC/AC
555999 June, 11 2013 00:00:00+0000 June, 09 2013 00:10:00+0000 SIMV/PRVC
555999 June, 12 2013 00:00:00+0000 June, 11 2013 23:15:00+0000 BIVENT
11117777 June, 08 2013 00:00:00+0000 June, 08 2013 19:36:00+0000 SIMV/PRVC
11117777 June, 09 2013 00:00:00+0000 June, 08 2013 22:21:00+0000 PRVC/AC
11117777 June, 10 2013 00:00:00+0000 June, 10 2013 00:00:00+0000 NAVA
11117777 June, 11 2013 00:00:00+0000 June, 10 2013 00:20:00+0000 PS
@Gordon Linoff给了我一个想法,使用人口普查时间和记录时间之间的日期差的绝对值。这使我修改了@bobs的解决方案here。
SELECT * FROM
(
SELECT rvt.PAT_ENC_CSN_ID, CENSUS_TIME, RECORDED_TIME, MEAS_VALUE, ABS(DATEDIFF(s, c.CENSUS_TIME, RECORDED_TIME)) diff,
ROW_NUMBER() OVER (PARTITION BY rvt.PAT_ENC_CSN_ID, c.CENSUS_TIME ORDER BY ABS(DATEDIFF(s, c.CENSUS_TIME, RECORDED_TIME))) AS SEQUENCE
FROM Recorded_Vent_Types rvt join Census c on rvt.PAT_ENC_CSN_ID=c.PAT_ENC_CSN_ID
WHERE MEAS_VALUE IS NOT NULL
) as m
WHERE SEQUENCE = 1
ORDER BY PAT_ENC_CSN_ID,CENSUS_TIME
;
但这将返回最接近的记录时间(绝对),不会优先考虑人口普查时间之前的记录时间。
PAT_ENC_CSN_ID CENSUS_TIME RECORDED_TIME MEAS_VALUE
555999 June, 08 2013 00:00:00+0000 June, 08 2013 22:21:00+0000 PRVC/AC
555999 June, 09 2013 00:00:00+0000 June, 09 2013 00:10:00+0000 SIMV/PRVC
555999 June, 11 2013 00:00:00+0000 June, 11 2013 23:15:00+0000 BIVENT
555999 June, 12 2013 00:00:00+0000 June, 12 2013 00:20:00+0000 PS
11117777 June, 08 2013 00:00:00+0000 June, 08 2013 19:36:00+0000 SIMV/PRVC
11117777 June, 09 2013 00:00:00+0000 June, 08 2013 22:21:00+0000 PRVC/AC
11117777 June, 10 2013 00:00:00+0000 June, 10 2013 00:00:00+0000 NAVA
11117777 June, 11 2013 00:00:00+0000 June, 10 2013 00:20:00+0000 PS