Conditional filtering based on grouped data in R using dplyr

Question

4

Consider the following dataset:

library(dplyr)  # also makes tibble() available

data <- tibble(
  group = rep(1:4, 40),
  year = rep(1980:2019, 4),
  col = rnorm(160)
)

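I would like to filter the data to get the subset where the value is greater than zero in groups 1 and 2 and less than zero in groups 3 and 4, where "col" is the column name.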

- Raed Hamed

3

library(dplyr)

data %>%
  filter((group %in% 1:2 & col > 0) | (group %in% 3:4 & col < 0))

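This code uses the filter() function from the dplyr package to subset the data. It keeps rows where group is 1 or 2 and col is greater than 0, or where group is 3 or 4 and col is less than 0. - PaulS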
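Because col is drawn with rnorm(), the rows that survive the filter change from run to run. As a minimal sketch, the same condition applied to a small hand-made tibble makes the effect easy to check by eye. The toy data and the names toy and kept are hypothetical and not part of the original question.

```r
library(dplyr)

# Hypothetical toy data: one positive and one negative value per group
toy <- tibble(
  group = rep(1:4, each = 2),
  col   = rep(c(1.5, -0.5), times = 4)
)

# Keep positive values in groups 1-2 and negative values in groups 3-4
kept <- toy %>%
  filter((group %in% 1:2 & col > 0) | (group %in% 3:4 & col < 0))

kept
```

One row per group should remain: 1.5 for groups 1 and 2, and -0.5 for groups 3 and 4.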

1

In your case, you can use a shorter version of @PaulS's code, such as: data %>% filter(group < 3 & col > 0 | group > 2 & col < 0). - lovalery
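The shorter version drops the parentheses, which works because & binds more tightly than | in R. The sketch below compares the two forms on the question's data. The seed is an arbitrary choice added only for reproducibility, and the names with_parens and without_parens are illustrative.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the comparison is reproducible
data <- tibble(
  group = rep(1:4, 40),
  year  = rep(1980:2019, 4),
  col   = rnorm(160)
)

# Explicit parentheses and %in%
with_parens <- data %>%
  filter((group %in% 1:2 & col > 0) | (group %in% 3:4 & col < 0))

# No parentheses: & is evaluated before |
without_parens <- data %>%
  filter(group < 3 & col > 0 | group > 2 & col < 0)

# Should report that both results contain the same rows
all.equal(with_parens, without_parens)
```

Note that group < 3 and group > 2 match the %in% version only because the groups here are exactly 1 to 4. If other group values could appear, %in% states the intent more safely.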

2 Answers

3

Here is another approach that does the selection directly with arithmetic rather than with %in%:

data %>% filter(col * sign((group < 3) - 0.5) > 0)
#> # A tibble: 76 x 3
#>    group  year    col
#>    <int> <int>  <dbl>
#>  1     2  1985  2.20 
#>  2     3  1986 -0.205
#>  3     4  1991 -2.10 
#>  4     3  1994 -0.113
#>  5     2  1997  1.90 
#>  6     1  2000  1.37 
#>  7     3  2002 -0.805
#>  8     4  2003 -0.535
#>  9     1  2004  0.792
#> 10     3  2006 -1.28 
#> # ... with 66 more rows

- Allan Cameron
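To see why the arithmetic version works, it may help to evaluate the pieces separately. This is a small base R sketch on made-up vectors. The names multiplier and keep are illustrative only.

```r
# (group < 3) is TRUE/FALSE, which R coerces to 1/0 in arithmetic
group <- 1:4
multiplier <- sign((group < 3) - 0.5)
multiplier
# groups 1 and 2 give sign(0.5) = 1; groups 3 and 4 give sign(-0.5) = -1

# Multiplying col by that factor flips the sign for groups 3 and 4,
# so a single "> 0" test covers both cases
col <- c(2, -2, 2, -2)
keep <- col * multiplier > 0
keep
# TRUE for group 1 (positive) and group 4 (negative), FALSE otherwise
```

As with the logical versions, a value of exactly zero is dropped in every group, since 0 times either sign is not greater than 0.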


- edv · Accepted Answer

One way to do this is:

data %>% filter(col > 0 & group %in% c(1,2) | col < 0 & group %in% c(3,4))