What is the R equivalent of SQL's 'WHERE' clause?

Question

8

I have a dataset assigned to a variable named "temps", with the columns "date", "temperature", and "country".
I want to do something like this SQL query:

SELECT * FROM temps WHERE country != 'mycountry'

How can I do a similar selection in R?

- Lasitha Yapa

2

library(dplyr) ; temps %>% filter(country != 'mycountry') ... or have a look at sqldf, if you prefer. - alistaire

1

@alistaire I don't think I want an external library for this. - Lasitha Yapa

3 Answers

5

This should do the trick.

temps2 <- temps[!temps$country %in% "mycountry",]
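One practical difference between `%in%` and `!=` shows up when the column contains missing values. The following is a minimal sketch with a small hypothetical data frame (`temps_na` is an illustrative name, not from the question): `!=` yields `NA` for a missing country, and indexing with `NA` produces an all-`NA` row, whereas `%in%` always returns `TRUE` or `FALSE`.

```r
# Hypothetical example data with one missing country (not from the question)
temps_na <- data.frame(country = c("mycountry", "other", NA),
                       value = 1:3,
                       stringsAsFactors = FALSE)

# `!=` returns NA for the missing value, so the result gains an all-NA row
res_neq <- temps_na[temps_na$country != "mycountry", ]

# `%in%` never returns NA, so the row with the missing country is kept intact
res_in <- temps_na[!temps_na$country %in% "mycountry", ]

# which() drops NA, so rows with a missing country are excluded,
# which is closer to how SQL treats NULL in a WHERE clause
res_which <- temps_na[which(temps_na$country != "mycountry"), ]
```

Which of the three behaviours is appropriate depends on how rows with a missing country should be treated.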

- milan

3

Here are sqldf and base R approaches, with source code and sample output, based on the input data shown in the Note below.

1) sqldf

library(sqldf)
sqldf("SELECT * FROM temps WHERE country != 'mycountry'")
##   country value
## 1   other     2

2) Base R

subset(temps, country != "mycountry")
##   country value
## 2   other     2
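Because `subset` evaluates its condition inside the data frame, conditions can be combined with `&` and `|`, much like `AND` and `OR` in a SQL `WHERE` clause. A small sketch, assuming a hypothetical three-row data frame (`temps_demo` is an illustrative name):

```r
# Hypothetical demo data, not from the question
temps_demo <- data.frame(country = c("mycountry", "other", "third"),
                         value = 1:3)

# Roughly: SELECT * FROM temps_demo WHERE country != 'mycountry' AND value > 2
res <- subset(temps_demo, country != "mycountry" & value > 2)
```

Note that `subset` treats a missing (`NA`) condition as false, so such rows are dropped.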

Note: The test data used above is shown here. Next time, please provide reproducible sample data like this in the question.

# test data
temps <- data.frame(country = c("mycountry", "other"), value = 1:2)

- G. Grothendieck

- akrun · Accepted Answer

We can use a similar syntax in base R.

temps[temps$country != "mycountry",]

Benchmarks

set.seed(24)
temps1 <- data.frame(country = sample(LETTERS, 1e7, replace=TRUE),
                  val = rnorm(1e7))
system.time(temps1[!temps1$country %in% "A",])
#  user  system elapsed 
#   0.92    0.11    1.04 
system.time(temps1[temps1$country != "A",])
#   user  system elapsed 
#   0.70    0.17    0.88
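Timings like these are only comparable if the expressions return the same rows. A quick sanity check on a smaller sample might look like the sketch below (the name `small` and the size 1e4 are illustrative choices, not part of the benchmark above):

```r
set.seed(24)
# Smaller hypothetical sample with the same structure as temps1
small <- data.frame(country = sample(LETTERS, 1e4, replace = TRUE),
                    val = rnorm(1e4))

res_in  <- small[!small$country %in% "A", ]
res_neq <- small[small$country != "A", ]

# With no missing values in `country`, both selections keep the same rows
same <- identical(res_in, res_neq)
```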

If we use package-based solutions

library(sqldf)
system.time(sqldf("SELECT * FROM temps1 WHERE country != 'A'"))
#   user  system elapsed 
# 12.78    0.37   13.15 

library(data.table)
system.time(setDT(temps1, key = 'country')[!("A")])
#   user  system elapsed 
#  0.62    0.19    0.37