Remove rows from a data frame that match multiple conditions

Question

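I want to remove rows from a data frame that contain specific patterns, using tidyverse syntax if possible.

I want to remove rows where the first column contains "cat" and any of col2:col4 contains one of the following words: dog, fox, or cow. For this example, that would remove rows 1 and 4 from the original data.

Here is a sample dataset: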

df <- data.frame(col1 = c("cat", "fox", "dog", "cat", "pig"),
                 col2 = c("lion", "tiger", "elephant", "dog", "cow"),
                 col3 = c("bird", "cow", "sheep", "fox", "dog"),
                 col4 = c("dog", "cat", "cat", "cow", "fox"))

I have tried many different approaches but keep running into problems. Here is my latest attempt:

filtered_df <- df %>%
  filter(!(animal1 == "cat" & !any(cowfoxdog <- across(animal2:animal4, ~ . %in% c("cow", "fox", "dog")))))

This returns the following error:

Error in `filter()`:
! Problem while computing `..1 = !...`.
Caused by error in `FUN()`:
! only defined on a data frame with all numeric variables

- TheGoat

2 Answers

One approach is to use filter() with logical operators, keeping only the rows that do not match your conditions:

library(tidyverse)

pattern1 <- "cat"
pattern2 <- c("dog", "fox", "cow")

# Drop rows where col1 is "cat" AND any of col2..col4 matches pattern2
df %>%
  filter(!(col1 == pattern1 &
             (col2 %in% pattern2 |
                col3 %in% pattern2 |
                col4 %in% pattern2)))


  col1     col2  col3 col4
1  fox    tiger   cow  cat
2  dog elephant sheep  cat
3  pig      cow   dog  fox
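
One thing to watch for if your real data has missing values: `==` returns NA when col1 is NA, and filter() drops rows whose condition evaluates to NA. The sketch below uses a small made-up data frame (`df_na` is hypothetical, not the question's data) to show the difference between `==` and `%in%` for the first column. Treat it as an illustration under that assumption, not as a required change.

```r
library(dplyr)

# Hypothetical data: same columns as the question, plus a row with NA in col1
df_na <- data.frame(
  col1 = c("cat", NA, "pig"),
  col2 = c("lion", "dog", "cow"),
  col3 = c("bird", "sheep", "dog"),
  col4 = c("dog", "cat", "fox")
)

pattern2 <- c("dog", "fox", "cow")

# With `==`, the NA row gives !(NA & TRUE), which is NA, so filter() drops it
with_eq <- df_na %>%
  filter(!(col1 == "cat" &
             (col2 %in% pattern2 | col3 %in% pattern2 | col4 %in% pattern2)))

# `%in%` never returns NA, so the NA row is treated as "not cat" and kept
with_in <- df_na %>%
  filter(!(col1 %in% "cat" &
             (col2 %in% pattern2 | col3 %in% pattern2 | col4 %in% pattern2)))

nrow(with_eq)
nrow(with_in)
```

If col1 never contains NA, both versions behave the same.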

- SALAR

- zephryl · Accepted Answer

You can use if_any(). To make the test more thorough, I first add a row where col1 == "cat" but none of "dog", "fox", or "cow" appears in columns 2 through 4.

library(dplyr)

df <- df %>% 
  add_row(col1 = "cat", col2 = "sheep", col3 = "lion", col4 = "tiger")

df %>% 
  filter(!(col1 == "cat" & if_any(col2:col4, \(x) x %in% c("dog", "fox", "cow"))))

  col1     col2  col3  col4
1  fox    tiger   cow   cat
2  dog elephant sheep   cat
3  pig      cow   dog   fox
4  cat    sheep  lion tiger
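
For comparison, here is a rough base R sketch of the same row-wise "any column matches" logic, in case dplyr is not available. It is not part of the answer above; the names `targets`, `hit_any`, and `result` are just illustrative, and the data frame is rebuilt here with the extra test row so the snippet runs on its own.

```r
# The question's data plus the extra "cat" row that should be kept
df <- data.frame(
  col1 = c("cat", "fox", "dog", "cat", "pig", "cat"),
  col2 = c("lion", "tiger", "elephant", "dog", "cow", "sheep"),
  col3 = c("bird", "cow", "sheep", "fox", "dog", "lion"),
  col4 = c("dog", "cat", "cat", "cow", "fox", "tiger")
)

targets <- c("dog", "fox", "cow")

# One logical vector per column, then combined element-wise with OR,
# giving TRUE for each row where at least one of col2..col4 matches
hit_any <- Reduce(`|`, lapply(df[c("col2", "col3", "col4")],
                              function(x) x %in% targets))

# Keep every row except those where col1 is "cat" and some column matched
result <- df[!(df$col1 == "cat" & hit_any), ]
result
```

Note that base subsetting keeps the original row names, so the row labels differ from the filter() output, which renumbers them.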