How do I apply a function only to the first row of each group in dplyr?

3

Data

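I have 3 variables. Vehicle.ID2 is the unique identifier of a pair of vehicles, dV is the speed difference between the leading and following vehicle, and dA is the difference in acceleration, which stays constant for some period of time. My grouping variables are therefore Vehicle.ID2 and dA. Here are a few rows of the original data for just one Vehicle.ID2: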

veh <- structure(list(Vehicle.ID2 = c("907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904"
), dA = c(0.43024, 0.43024, 0.43024, 0.43024, 0.43024, 0.43024, 
0.43024, 0.43024, 0.43024, 0.43024, 0.43024, -0.3162, -0.3162, 
-0.3162, -0.3162, -0.3162, -0.3162, -0.3162, -0.3162, -0.3162, 
-0.3162), dV = c(-0.0427200000000001, 0.11031, 0.22627, 0.30058, 
0.33838, 0.35264, 0.35803, 0.36481, 0.37677, 0.39292, 0.40961, 
0.42206, 0.42557, 0.416090000000001, 0.39003, 0.34668, 0.296580000000001, 
0.268000000000001, 0.29681, 0.399859999999999, 0.554639999999999
)), class = "data.frame", .Names = c("Vehicle.ID2", "dA", "dV"
), row.names = c(NA, -21L))

Goal

I want to create a new column, OC_DV. Initially, every value of OC_DV is "no", which I can set up with:

veh$OC_DV <- "no"  

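Next, I want to split the data by Vehicle.ID2 and dA. Then, for each group, I want to check whether the sign of the first value of dV matches the sign of the last value of dV. Depending on whether the signs match, I want to modify only the first value of OC_DV. Here is the code:
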
OC_DV[1] <- ifelse(sign(head(dV,1))== sign(tail(dV,1)),  "no",
                     ifelse(sign(head(dV,1))==-1 & sign(tail(dV,1))==1, "OPDV",
                            ifelse(sign(head(dV,1))==1 & sign(tail(dV,1))==-1,"CLDV","no")))  

Problem

I tried using mutate and do, but both produce an error:

veh <- veh %>% 
  group_by(Vehicle.ID2, dA) %>%
  mutate(OC_DV[1] = ifelse(sign(head(dV,1))== sign(tail(dV,1)),  "no",
                           ifelse(sign(head(dV,1))==-1 & sign(tail(dV,1))==1, "OPDV",
                                  ifelse(sign(head(dV,1))==1 & sign(tail(dV,1))==-1,"CLDV","no")))
  )
Error: unexpected '=' in:
"  group_by(Vehicle.ID2, dA) %>%
  mutate(OC_DV[1] ="



veh <- veh %>% 
  group_by(Vehicle.ID2, dA) %>%
  do(OC_DV[1] = ifelse(sign(head(dV,1))== sign(tail(dV,1)),  "no",
                           ifelse(sign(head(dV,1))==-1 & sign(tail(dV,1))==1, "OPDV",
                                  ifelse(sign(head(dV,1))==1 & sign(tail(dV,1))==-1,"CLDV","no")))
  )
Error: unexpected '=' in:
"  group_by(Vehicle.ID2, dA) %>%
  do(OC_DV[1] ="

If I remove the [1], there is no error, but every value in the group is changed:

veh %>% 
  group_by(Vehicle.ID2, dA) %>%
  mutate(OC_DV = ifelse(sign(head(dV,1))== sign(tail(dV,1)),  "no",
                                ifelse(sign(head(dV,1))==-1 & sign(tail(dV,1))==1, "OPDV",
                                       ifelse(sign(head(dV,1))==1 & sign(tail(dV,1))==-1,"CLDV","no")))
  )

What should I do to change only the first value?

Desired output:

structure(list(Vehicle.ID2 = c("907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904"
), dA = c(0.43024, 0.43024, 0.43024, 0.43024, 0.43024, 0.43024, 
0.43024, 0.43024, 0.43024, 0.43024, 0.43024, -0.3162, -0.3162, 
-0.3162, -0.3162, -0.3162, -0.3162, -0.3162, -0.3162, -0.3162, 
-0.3162), dV = c(-0.0427200000000001, 0.11031, 0.22627, 0.30058, 
0.33838, 0.35264, 0.35803, 0.36481, 0.37677, 0.39292, 0.40961, 
0.42206, 0.42557, 0.416090000000001, 0.39003, 0.34668, 0.296580000000001, 
0.268000000000001, 0.29681, 0.399859999999999, 0.554639999999999
), OC_DV = c("OPDV", "no", "no", "no", "no", "no", "no", "no", 
"no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", 
"no", "no")), class = "data.frame", .Names = c("Vehicle.ID2", 
"dA", "dV", "OC_DV"), row.names = c(NA, -21L))

1
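Your dput throws an error... - Sotos
1
Also, please remove the leading > and + from the lines. We can't copy and paste the code that way. - Axeman
1
@Sotos and @Axeman, I have fixed the dput and removed the > and +. - umair durrani
2 Answers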

4

This works.

First, a slightly cleaner version of the conditional function:

fun <- function(x) {
  switch(
    paste(sign(head(x,1)), sign(tail(x,1))),
    '-1 1' = 'OPDV',
    '1 -1' = 'CLDV',
    'no'
   )
}

Then we apply that function only to the first row of each group:

veh %>% 
  group_by(Vehicle.ID2, dA) %>% 
  mutate(OC_DV = if_else(row_number() == 1, fun(dV), 'no'))
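As a quick check of the approach above, here is a minimal self-contained sketch. The data frame `toy` and its values are made up for illustration (they are not the asker's data), and it assumes dplyr is installed. The `fun` helper is the one defined above, repeated so the block runs on its own.

```r
library(dplyr)

# Hypothetical toy data (not from the question): three dA groups whose
# first/last dV signs cover the "OPDV", "no" and "CLDV" cases.
toy <- data.frame(
  Vehicle.ID2 = "907-904",
  dA = c(0.43, 0.43, 0.43, -0.32, -0.32, -0.32, 0.10, 0.10),
  dV = c(-0.04, 0.11, 0.23, 0.42, 0.39, 0.55, 0.20, -0.10)
)

# Classify a group by the signs of its first and last dV values.
fun <- function(x) {
  switch(
    paste(sign(head(x, 1)), sign(tail(x, 1))),
    "-1 1" = "OPDV",
    "1 -1" = "CLDV",
    "no"
  )
}

res <- toy %>%
  group_by(Vehicle.ID2, dA) %>%
  # fun(dV) returns a single string per group; only row 1 of the group keeps it.
  mutate(OC_DV = if_else(row_number() == 1, fun(dV), "no")) %>%
  ungroup()

res$OC_DV
```

Because `fun(dV)` is evaluated once per group and yields a single string, `if_else()` recycles it against the row-wise condition, so only the first row of each group can receive a value other than "no".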

4
Another idea, using several mutate calls:
library(dplyr) 
veh %>% 
   group_by(Vehicle.ID2, dA) %>% 
   mutate(id = seq_along(dV)) %>%  # seq_along() stays correct for one-row groups, unlike seq()
   mutate(OC_DV = fun1(dV)) %>% 
   mutate(OC_DV = ifelse(id == 1, OC_DV, 'no'))

where

fun1 <- function(x){ifelse(sign(head(x,1))== sign(tail(x,1)),  "no",
                           ifelse(sign(head(x,1))==-1 & sign(tail(x,1))==1, "OPDV",
                                  ifelse(sign(head(x,1))==1 & sign(tail(x,1))==-1,"CLDV","no")))}

1
All of those mutate calls can be merged into one; within a single mutate you can refer to variables created on earlier lines. - Axeman
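The suggestion in the comment above might look roughly like the sketch below. This is one possible reading of it, not code posted in the thread. The data frame `toy2` is made up for illustration, `row_number()` stands in for the `id` helper column, and `fun1` is repeated from the answer so the block runs on its own.

```r
library(dplyr)

# Hypothetical toy data (not from the question): two dA groups.
toy2 <- data.frame(
  Vehicle.ID2 = "907-904",
  dA = c(0.43, 0.43, 0.43, -0.32, -0.32, -0.32),
  dV = c(-0.04, 0.11, 0.23, 0.42, 0.39, 0.55)
)

# Same nested-ifelse helper as in the answer above.
fun1 <- function(x) {
  ifelse(sign(head(x, 1)) == sign(tail(x, 1)), "no",
         ifelse(sign(head(x, 1)) == -1 & sign(tail(x, 1)) == 1, "OPDV",
                ifelse(sign(head(x, 1)) == 1 & sign(tail(x, 1)) == -1, "CLDV", "no")))
}

res2 <- toy2 %>%
  group_by(Vehicle.ID2, dA) %>%
  mutate(
    id = row_number(),                        # position within the group
    OC_DV = ifelse(id == 1, fun1(dV), "no")   # id was created one line earlier
  ) %>%
  ungroup()

res2$OC_DV
```

Arguments to `mutate()` are evaluated in order, which is why `OC_DV` can use `id` inside the same call.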
