Subset / filter rows in a data frame based on a condition in a column

Question

87

Given a data frame "foo", how can I select only those rows that satisfy the condition foo$location == "there"?

foo = data.frame(location = c("here", "there", "here", "there", "where"), x = 1:5, y = 6:10)
foo
#   location x  y
# 1     here 1  6
# 2    there 2  7
# 3     here 3  8
# 4    there 4  9
# 5    where 5 10

The desired result, "bar":

#   location x y
# 2    there 2 7
# 4    there 4 9

- wishihadabettername

3 Answers

6

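In addition to the other answers, you can index the column by position instead of by name, which is useful in some cases. Assuming location is the first column, it looks like this: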

    bar <- foo[foo[ ,1] == "there", ]

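This is useful because it lets you operate on column values programmatically, for example by looping over specific columns (you can do the same with rows by indexing row numbers).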
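As a rough illustration of looping over columns by position, the sketch below counts how many rows match a value in each column. The counting task and the names `matches` and `j` are assumptions made for the example, not part of the original answer.

```r
foo <- data.frame(location = c("here", "there", "here", "there", "where"),
                  x = 1:5, y = 6:10, stringsAsFactors = FALSE)

# For each column position j, count the rows whose value equals "there".
matches <- integer(ncol(foo))
for (j in seq_len(ncol(foo))) {
  matches[j] <- sum(foo[, j] == "there")
}
names(matches) <- names(foo)
matches
```

Only the location column contains "there", so it is the only column with a non-zero count.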

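Positional indexing is also useful if you need to test several columns at once, because you can select a range of columns. The row index still has to be one logical value per row, so the per-column comparisons must be combined, for example with rowSums (N and the value "there" are placeholders here):

    foo[rowSums(foo[ , 1:N, drop = FALSE] == "there") > 0, ]

Or specific columns, as you would expect:

    foo[rowSums(foo[ , c(1,5,9)] == "there") > 0, ]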
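A data frame cannot be used directly as a row index, so a condition over several columns has to be collapsed to one TRUE/FALSE per row first. The following is a minimal sketch of one way to do that with rowSums; the threshold 3 and the names `cols` and `keep` are assumptions chosen for illustration.

```r
foo <- data.frame(location = c("here", "there", "here", "there", "where"),
                  x = 1:5, y = 6:10, stringsAsFactors = FALSE)

cols <- 2:3  # hypothetical choice: the numeric columns x and y

# Comparing a data frame to a scalar gives a logical matrix (rows x columns).
# rowSums() counts the TRUE values per row; requiring the count to equal
# length(cols) keeps only rows where every selected column passes.
keep <- rowSums(foo[, cols, drop = FALSE] > 3) == length(cols)
bar <- foo[keep, ]
bar
```

Replacing `== length(cols)` with `> 0` would keep rows where at least one of the selected columns passes instead of all of them.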

- DryLabRebel

2

Another option is the filter function from dplyr. Here is a reproducible example:

foo = data.frame(location = c("here", "there", "here", "there", "where"), x = 1:5, y = 6:10)
library(dplyr)
filter(foo, location == "there")
#>   location x y
#> 1    there 2 7
#> 2    there 4 9

Created on 2022-09-11 with reprex v2.0.2

- Quinten


- JoFrhwld · Accepted Answer

There are two main ways to do this. I prefer this one because it is more readable:

bar <- subset(foo, location == "there")

Note that you can chain many conditions together with & and | to build more complex subsets.
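To make the chaining concrete, here is a small sketch that combines conditions inside subset. The particular thresholds on x and y are assumptions picked only to show the difference between & and |.

```r
foo <- data.frame(location = c("here", "there", "here", "there", "where"),
                  x = 1:5, y = 6:10, stringsAsFactors = FALSE)

# & keeps rows where both conditions hold.
bar_and <- subset(foo, location == "there" & x > 2)

# | keeps rows where at least one condition holds.
bar_or <- subset(foo, location == "there" | y >= 10)

bar_and
bar_or
```

Both operators are vectorized, so each condition is evaluated row by row. The scalar forms && and || are not suitable here.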

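The second way is indexing. You can index rows in R with either a numeric or a logical vector. foo$location == "there" returns a vector of TRUE and FALSE values with one element per row of foo. You can use that vector to return only the rows where the condition is true.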

foo[foo$location == "there", ]
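The sketch below spells out the intermediate logical vector and the equivalent numeric index from which(). It also shows one practical difference between the approaches when the column contains NA; the modified copy `foo_na` is an assumption added for illustration and is not part of the original question.

```r
foo <- data.frame(location = c("here", "there", "here", "there", "where"),
                  x = 1:5, y = 6:10, stringsAsFactors = FALSE)

keep <- foo$location == "there"   # logical vector, one element per row
idx  <- which(keep)               # positions of the TRUE elements

bar_logical <- foo[keep, ]        # logical indexing
bar_numeric <- foo[idx, ]         # numeric indexing, same rows

# Hypothetical variant with a missing value in the filtered column.
foo_na <- foo
foo_na$location[1] <- NA

n_logical <- nrow(foo_na[foo_na$location == "there", ])         # NA in the index adds an all-NA row
n_which   <- nrow(foo_na[which(foo_na$location == "there"), ])  # which() drops NA
n_subset  <- nrow(subset(foo_na, location == "there"))          # subset() drops NA
c(n_logical, n_which, n_subset)
```

If the column may contain missing values, wrapping the condition in which() or using subset avoids the extra NA row that plain logical indexing produces.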