How to vectorize a subsetting function in R?

Question

r dplyr

6

I have successfully vectorized a number of functions, which is very helpful for writing clean code, avoiding loops, and improving speed.

However, I have not been able to vectorize any function that subsets a data frame based on the function's inputs.

Example

For example, this function works fine when it receives single elements.

library(dplyr)

test_funct <- function(sep_wid, sep_len) {
    iris %>% filter(Sepal.Width > sep_wid & Sepal.Length < sep_len) %>% .$Petal.Width %>% sum
}

test_funct(4, 6)

# [1] 0.7 # This works nicely

But when I try to pass vectors as inputs to this function:

sep_wid_vector <- c(4, 3.5, 3)
sep_len_vector <- c(6, 6, 6.5)


test_funct(sep_wid_vector, sep_len_vector)

[1] 9.1

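But the desired output is a vector of the same length as the input vectors, as if the function were run on the first element of each vector, then the second, then the third, and so on: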

# 0.7    4.2     28.5

For convenience, here is the output as if the calls were run separately:

test_funct(4, 6) # 0.7
test_funct(3.5, 6) # 4.2
test_funct(3, 6.5) # 28.5

How can I vectorize a function that subsets data based on its inputs, so that it can accept vector inputs?

- stevec

3 Answers

5

You can use Vectorize:

tv <- Vectorize(test_funct)

tv(sep_wid_vector, sep_len_vector)
# [1]  0.7  4.2 28.5

This is basically a wrapper around mapply. Note that an *apply function runs under the hood, which is itself a kind of loop.
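To make the relationship with mapply concrete, here is a minimal sketch that calls mapply directly and compares it with the Vectorize version. It assumes a base R rewrite of test_funct (not the dplyr version from the question) so that the example has no package dependencies.

```r
# Base R rewrite of test_funct (an assumption made to avoid loading dplyr)
test_funct <- function(sep_wid, sep_len) {
  keep <- iris$Sepal.Width > sep_wid & iris$Sepal.Length < sep_len
  sum(iris$Petal.Width[keep])
}

sep_wid_vector <- c(4, 3.5, 3)
sep_len_vector <- c(6, 6, 6.5)

# mapply calls test_funct once per pair of elements, in order
res_mapply <- mapply(test_funct, sep_wid_vector, sep_len_vector)

# Vectorize builds a wrapper that does the same thing
tv <- Vectorize(test_funct)
res_vectorize <- tv(sep_wid_vector, sep_len_vector)

# Both approaches should agree element by element
all.equal(res_mapply, res_vectorize)
```

Either form still evaluates the function once per pair, so the gain is in readability rather than speed.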

- thothal

2

Here is one way to do it using sapply.

# function using sapply
test_funct <- function(sep_wid, sep_len) {
  sapply(seq_along(sep_wid), function(x) {
    sum(iris$Petal.Width[iris$Sepal.Width > sep_wid[x] & iris$Sepal.Length < sep_len[x]])
  })
}

# testing with single value
test_funct(4,6)
# [1] 0.7

# testing with vectors
test_funct(sep_wid_vector, sep_len_vector)
# [1]  0.7  4.2 28.5

- cropgen

- Tarquinnn · Accepted Answer

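The problem is that filter takes vector inputs, so the vectors you pass in are recycled along the rows in the Sepal.Width and Sepal.Length comparisons rather than being handled one element at a time.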
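The recycling behaviour can be seen in isolation with a small base R sketch. The vectors x and threshold are made up for illustration and are not part of the question.

```r
# Illustrative vectors (not from the question)
x <- c(1, 2, 3, 4, 5, 6)
threshold <- c(2, 4, 6)

# The shorter vector is recycled to c(2, 4, 6, 2, 4, 6) before comparing
recycled <- x > threshold
recycled
# [1] FALSE FALSE FALSE  TRUE  TRUE FALSE

# Writing the recycling out explicitly gives the same result
explicit <- x > rep(threshold, length.out = length(x))
identical(recycled, explicit)
# [1] TRUE
```

Because iris has 150 rows, a multiple of 3, the length-3 inputs recycle without any warning, which is why the original function silently returns a single number.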

One solution is to use the map2 family of functions from the purrr package:

map2_dbl(sep_wid_vector, sep_len_vector, test_funct)

Of course, you can wrap this in a function. You may also want to consider passing the data frame in as a function argument.
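A minimal sketch of that suggestion might look like the following. The name sum_petal_width and its argument order are hypothetical, not part of the original answer, and the sketch assumes dplyr and purrr are installed.

```r
library(dplyr)
library(purrr)

# Hypothetical wrapper: takes the data frame as an argument and
# iterates over the two threshold vectors pairwise with map2_dbl
sum_petal_width <- function(data, sep_wid, sep_len) {
  map2_dbl(sep_wid, sep_len, function(wid, len) {
    data %>%
      filter(Sepal.Width > wid, Sepal.Length < len) %>%
      pull(Petal.Width) %>%
      sum()
  })
}

res <- sum_petal_width(iris, c(4, 3.5, 3), c(6, 6, 6.5))
res
```

Passing the data frame explicitly means the function no longer depends on the global iris object and can be reused on any data frame with the same column names.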