Row-wise check of a condition across several columns

Question

4

Sample data:

df <- data.frame("a" = c(1,2,3,4), "b" = c(4,3,2,1), "x_ind" = c(1,0,1,1), "y_ind" = c(0,0,1,1), "z_ind" = c(0,1,1,1) )

> df
  a b x_ind y_ind z_ind
1 1 4     1     0     0
2 2 3     0     0     1
3 3 2     1     1     1
4 4 1     1     1     1

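I would like to add a new column that checks, for each row, whether all columns ending in "_ind" are equal to 1. If they are, it should return 1, otherwise 0. The resulting data frame would look like this: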

  a b x_ind y_ind z_ind keep
1 1 4     1     0     0    0
2 2 3     0     0     1    0
3 3 2     1     1     1    1
4 4 1     1     1     1    1

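I can select the columns with df %>% select(contains("_ind")), but I am not sure how to do the row-wise check that every value in a row equals 1 and then attach the resulting column to the original data frame.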

Any help is appreciated! I am using dplyr, but any solution is welcome.

- user33484

3 Answers

2

You can use rowSums on the result of comparing the selected columns with 1, i.e.:

rowSums(df[grepl('_ind', names(df))] == 1) == ncol(df[grepl('_ind', names(df))])
#[1] FALSE FALSE  TRUE  TRUE
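
Broken into steps, the one-liner above can be read as follows. This is only an illustrative sketch: the intermediate names ind, is_one and all_one are not part of the original answer, and the last line (storing the result as a 0/1 column named keep) is an assumption based on the output requested in the question.

```r
# Sample data from the question
df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 3, 2, 1),
                 x_ind = c(1, 0, 1, 1), y_ind = c(0, 0, 1, 1),
                 z_ind = c(0, 1, 1, 1))

# Keep only the columns whose name contains "_ind"
ind <- df[grepl("_ind", names(df))]

# Comparing a data frame with 1 gives a logical matrix of the same shape
is_one <- ind == 1

# rowSums() counts the TRUE values per row; a row qualifies when that
# count equals the number of indicator columns
all_one <- rowSums(is_one) == ncol(ind)

# Assumed final step: store the result as 0/1 in a new column
df$keep <- as.integer(all_one)
```

Computing ind once also avoids repeating the grepl() selection on both sides of the comparison.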

If you want to stay within dplyr, you can do the following:

df %>% 
 select(contains("_ind")) %>% 
 mutate(new = rowSums(. == 1) == ncol(.))

#  x_ind y_ind z_ind   new
#1     1     0     0 FALSE
#2     0     0     1 FALSE
#3     1     1     1  TRUE
#4     1     1     1  TRUE

#OR you can filter directly

df %>% 
 select(contains("_ind")) %>% 
 filter(rowSums(. == 1) == ncol(.))

#  x_ind y_ind z_ind
#1     1     1     1
#2     1     1     1

If you want to keep the original columns, you can use:

 df %>% 
  filter_at(vars(ends_with('_ind')), all_vars(. == 1))

#  a b x_ind y_ind z_ind
#1 3 2     1     1     1
#2 4 1     1     1     1
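
Scoped verbs such as filter_at() have been superseded in recent dplyr releases. As a minimal sketch, assuming dplyr 1.0.4 or later is installed, the same row test can be written with if_all(), either to filter the rows or to flag them. The names kept and flagged are illustrative, and keep is the column name used in the question.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 3, 2, 1),
                 x_ind = c(1, 0, 1, 1), y_ind = c(0, 0, 1, 1),
                 z_ind = c(0, 1, 1, 1))

# Keep only the rows where every *_ind column equals 1
kept <- df %>%
  filter(if_all(ends_with("_ind"), ~ .x == 1))

# Or flag those rows with a 0/1 column instead of dropping the others
flagged <- df %>%
  mutate(keep = as.integer(if_all(ends_with("_ind"), ~ .x == 1)))
```

The second form keeps all original columns and adds keep, which matches the output the question asks for.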

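Note: the dot (.) refers to the data being passed in. In the pipelines above it is the data frame returned by select(); inside all_vars() it stands for each of the columns selected in vars(), i.e. those ending in _ind.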
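
To make the role of the dot concrete, the sketch below writes the same computation with and without the pipe. With the magrittr pipe (%>%), the dot stands for whatever is on the left-hand side of the pipe, here the data frame that select() returns. The names ind, piped and unpiped are introduced for illustration only.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 3, 2, 1),
                 x_ind = c(1, 0, 1, 1), y_ind = c(0, 0, 1, 1),
                 z_ind = c(0, 1, 1, 1))

# With the pipe: `.` is the output of select(), i.e. only the *_ind columns
piped <- df %>%
  select(contains("_ind")) %>%
  mutate(new = rowSums(. == 1) == ncol(.))

# Without the pipe: the same data frame is named explicitly
ind <- select(df, contains("_ind"))
unpiped <- mutate(ind, new = rowSums(ind == 1) == ncol(ind))
```

Both versions should produce the same data frame; the dot is just a shorthand that avoids naming the intermediate result.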

Similarly, in base R:

df[rowSums(df[grepl('_ind', names(df))] == 1) == ncol(df[grepl('_ind', names(df))]),]
#  a b x_ind y_ind z_ind
#3 3 2     1     1     1
#4 4 1     1     1     1

- Sotos

Thanks. What does "." mean or do? - user33484

How can I do this while keeping the original columns (i.e. I would like to drop the selection at the end)? - user33484

@user33484 See the edit - Sotos

1

You can use apply together with endsWith to pick the columns ending in _ind and test whether all of them equal 1.

df$keep <- +(apply(df[,endsWith(colnames(df), "_ind")]==1, 1, all))
df
#  a b x_ind y_ind z_ind keep
#1 1 4     1     0     0    0
#2 2 3     0     0     1    0
#3 3 2     1     1     1    1
#4 4 1     1     1     1    1

Or use rowSums:

df$keep <- +(rowSums(df[,endsWith(colnames(df), "_ind")]!=1) == 0)
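
Both variants rely on the unary + to turn the logical result into 0/1. The sketch below shows that coercion and checks that the two approaches agree on the sample data. as.integer() is included as a more explicit spelling of the same conversion, and the names ind, by_apply, by_rowsums and explicit are illustrative only.

```r
# Sample data from the question
df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 3, 2, 1),
                 x_ind = c(1, 0, 1, 1), y_ind = c(0, 0, 1, 1),
                 z_ind = c(0, 1, 1, 1))

# Columns whose name ends in "_ind"
ind <- df[, endsWith(colnames(df), "_ind")]

# Variant 1: all() applied to each row of the logical matrix
by_apply <- +(apply(ind == 1, 1, all))

# Variant 2: a row passes when it has no value different from 1
by_rowsums <- +(rowSums(ind != 1) == 0)

# as.integer() performs the same logical-to-integer coercion (and drops names)
explicit <- as.integer(rowSums(ind != 1) == 0)
```

On larger data the rowSums variant is usually the faster of the two, since apply() loops over the rows in R code.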

- GKi

- Ronak Shah · Accepted Answer

In newer versions of dplyr (1.0.0 and later) you can use rowwise and c_across:

library(dplyr)
df %>% rowwise() %>% mutate(keep = +all(c_across(ends_with('ind')) == 1))


#      a     b x_ind y_ind z_ind  keep
#  <dbl> <dbl> <dbl> <dbl> <dbl> <int>
#1     1     4     1     0     0     0
#2     2     3     0     0     1     0
#3     3     2     1     1     1     1
#4     4     1     1     1     1     1
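
One thing to be aware of: the result of rowwise() remains a row-wise data frame, so later verbs would still operate one row at a time. A minimal sketch, assuming dplyr 1.0.0 or later, that adds ungroup() at the end and matches on the full "_ind" suffix; the name result is illustrative.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 3, 2, 1),
                 x_ind = c(1, 0, 1, 1), y_ind = c(0, 0, 1, 1),
                 z_ind = c(0, 1, 1, 1))

result <- df %>%
  rowwise() %>%                                          # evaluate per row
  mutate(keep = +all(c_across(ends_with("_ind")) == 1)) %>%
  ungroup()                                              # drop the row-wise grouping
```

Matching on "_ind" rather than "ind" avoids accidentally picking up other columns whose names happen to end in "ind".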