Create the values of a final column in a data frame based on multiple columns

3

I have a data frame that looks like this (but with many more variables/columns):

set.seed(5)
id<-seq(5)*floor(runif(5,min=1000, max=10000))
vals1<-c("Y","N","N","N","N")
vals2<-c("N","N","N","N","N")
vals3<-c("N","N","N","Y","N")
df<-data.frame(id,vals1,vals2,vals3)

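I would like to create a final column in the data frame that holds a flag with the following logic: if any of the values for an id is "Y", the final flag is "Y"; otherwise it is "N". So for this data frame, the first and fourth ids (2801, 14236) would have a "Y" in the final column and the rest would be "N". I tried a few approaches such as apply and if...else, but none of them worked.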

3 Answers

3

Initialize the column by assigning "N" to every row. In the next step, assign "Y" to the rows that contain a "Y" (checked with apply).

df$final = "N"
df$final[apply(df, 1, function(a) "Y" %in% a)] = "Y"
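Note that apply(df, 1, ...) looks at every column, including id. A possible variant restricts the check to the flag columns and avoids the row-wise loop. This is only a sketch: it assumes the flag columns all share the prefix "vals" as in the question's example, and flag_cols is a name introduced here for illustration.

```r
# Rebuild the example data from the question
set.seed(5)
id <- seq(5) * floor(runif(5, min = 1000, max = 10000))
vals1 <- c("Y", "N", "N", "N", "N")
vals2 <- c("N", "N", "N", "N", "N")
vals3 <- c("N", "N", "N", "Y", "N")
df <- data.frame(id, vals1, vals2, vals3)

# Assumption: every flag column name starts with "vals"
flag_cols <- grep("^vals", names(df), value = TRUE)

# df[flag_cols] == "Y" is a logical matrix; rowSums counts the "Y"s per row,
# and ifelse turns "at least one" into the letter flag
df$final <- ifelse(rowSums(df[flag_cols] == "Y") > 0, "Y", "N")
```

With many more columns than in the example, selecting them by name pattern (or by position, e.g. df[, -1]) keeps the id column out of the comparison.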

This is perfect! Thank you. - user2900006

2
Here is a solution that works with your letter encoding.
set.seed(5)
id <- seq(5) * floor(runif(5, min=1000, max=10000))
vals1 <- c("Y","N","N","N","N")
vals2 <- c("N","N","N","N","N")
vals3 <- c("N","N","N","Y","N")

df <- data.frame(id, vals1, vals2, vals3)

# If you really want to use the letter encoding, my solution works as below
df$Final <- apply(df[,2:4], MARGIN = 1, FUN = function(x) {any(x == 'Y')})
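The apply call above returns TRUE/FALSE rather than letters. If the final column should hold "Y"/"N" as the question asks, one option is to map the logical result back with ifelse. A minimal sketch follows; the ids are placeholders rather than the values generated in the question, and has_y is a name introduced here for illustration.

```r
# Stand-in data: the ids are placeholders, not the ones from the question
df <- data.frame(id = 1:5,
                 vals1 = c("Y", "N", "N", "N", "N"),
                 vals2 = c("N", "N", "N", "N", "N"),
                 vals3 = c("N", "N", "N", "Y", "N"))

# One logical value per row: does any flag column equal "Y"?
has_y <- apply(df[, 2:4], MARGIN = 1, FUN = function(x) any(x == "Y"))

# Map TRUE/FALSE back to the letter encoding
df$Final <- ifelse(has_y, "Y", "N")
```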

However, I think you should use booleans (TRUE/FALSE) for this.

Combining apply and any handles it nicely.

set.seed(5)
id <- seq(5) * floor(runif(5, min=1000, max=10000))
vals1 <- c("Y","N","N","N","N")
vals2 <- c("N","N","N","N","N")
vals3 <- c("N","N","N","Y","N")

df <- data.frame(id, vals1, vals2, vals3)

# Convert your labels into booleans:
df[,2:4] <- df[,2:4] == 'Y'

# Then summarise across rows
df$Final <- apply(df[,2:4], MARGIN = 1, FUN = function(x) {any(x)})
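Once the columns are logical, the row-wise any can also be written as an element-wise OR across the columns, which avoids apply. This is a sketch of that alternative, not part of the original answer; the ids are placeholders.

```r
# Stand-in data: the ids are placeholders, not the ones from the question
df <- data.frame(id = 1:5,
                 vals1 = c("Y", "N", "N", "N", "N"),
                 vals2 = c("N", "N", "N", "N", "N"),
                 vals3 = c("N", "N", "N", "Y", "N"))

# Convert the labels into booleans, as in the answer
df[, 2:4] <- df[, 2:4] == "Y"

# Fold the logical columns together with `|` (vals1 | vals2 | vals3)
df$Final <- Reduce(`|`, df[, 2:4])
```

Reduce walks over the selected columns one at a time, so the same line works unchanged if more flag columns are added to the selection.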

1
Somewhat similar to @d.b's answer:
df$final <- apply(df, 1, function(x) c("N","Y")[any(x == "Y")+1])
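The indexing in the line above relies on logical values coercing to 0/1: adding 1 to FALSE gives index 1 ("N") and adding 1 to TRUE gives index 2 ("Y"). A small standalone illustration, where labels, flags and out are names made up for this sketch:

```r
labels <- c("N", "Y")

# FALSE + 1 is 1, TRUE + 1 is 2, so the logical picks the label
first  <- labels[FALSE + 1]
second <- labels[TRUE + 1]

# The same trick works on a whole logical vector at once
flags <- c(TRUE, FALSE, FALSE, TRUE, FALSE)
out <- labels[flags + 1]
```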
