我想使用
例如:我有以下数据表:
输出应该包含2行,1. ID = 2且colID =“temperature”,2. ID = 3且colID =“pressure”,因为在最近的3天中,ID = 2的温度重复了3次,而ID = 3的压力重复了。
我可以想到用基础R和循环来实现这个,但是对于正确的
data.table
检测“冻结”的传感器数据。为此,我可以检查最近 3 天的数据,并查看给定列的数据是否相同。输出应该是冻结的传感器的 UID,以及冻结的传感器流的名称(例如“温度”,“压力”等)。例如:我有以下数据表:
library(data.table)
dt <- data.table(UID = c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3),
report_date = rep(seq.Date(as.Date("2019/1/1"), as.Date("2019/1/6"), 'days'), 3),
temperature = c(101,103,105,101,105,104,105,103,101,102,102, 102,103,101,102,105,106,111),
pressure = c(10.2,10.4,10.7,10.5,10.1,10.2,10.1,10.5,10.11,10.22,10.11,10.11,10.1,10.1,10.1,10.3,10.3,10.3),
voltage = c(0.023,0.025,0.011,0.013,0.0254,0.028,0.028,0.077,0.026,0.0236,0.0258,0.0214,0.089,0.034,0.0387,0.0654,0.0246,0.02133))
dt
UID report_date temperature pressure voltage
1: 1 2019-01-01 101 10.20 0.02300
2: 1 2019-01-02 103 10.40 0.02500
3: 1 2019-01-03 105 10.70 0.01100
4: 1 2019-01-04 101 10.50 0.01300
5: 1 2019-01-05 105 10.10 0.02540
6: 1 2019-01-06 104 10.20 0.02800
7: 2 2019-01-01 105 10.10 0.02800
8: 2 2019-01-02 103 10.50 0.07700
9: 2 2019-01-03 101 10.11 0.02600
10: 2 2019-01-04 102 10.22 0.02360
11: 2 2019-01-05 102 10.11 0.02580
12: 2 2019-01-06 102 10.11 0.02140
13: 3 2019-01-01 103 10.10 0.08900
14: 3 2019-01-02 101 10.10 0.03400
15: 3 2019-01-03 102 10.10 0.03870
16: 3 2019-01-04 105 10.30 0.06540
17: 3 2019-01-05 106 10.30 0.02460
18: 3 2019-01-06 111 10.30 0.02133
输出应该包含2行,1. ID = 2且colID =“temperature”,2. ID = 3且colID =“pressure”,因为在最近的3天中,ID = 2的温度重复了3次,而ID = 3的压力重复了。
UID colID
1. 2 temperature
2. 3 pressure
我可以想到用基础R和循环来实现这个,但是对于正确的
data.table
方法感到困惑。