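I was redirected here from "List All Combinations With Combn", since this question is one of its duplicate targets. This is an old question and the answer provided by @RichScriven is very good, but I wanted to give the community a few more options that are arguably more natural and more efficient (the last two).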
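We first note that the output is very similar to the power set. Calling `powerSet` from the `rje` package, we can see that our output does match every element of the power set, except that the first element of the power set is the empty set: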
x <- c("red", "blue", "black")
rje::powerSet(x)
[[1]]
character(0)
[[2]]
[1] "red"
[[3]]
[1] "blue"
[[4]]
[1] "red" "blue"
[[5]]
[1] "black"
[[6]]
[1] "red" "black"
[[7]]
[1] "blue" "black"
[[8]]
[1] "red" "blue" "black"
If you don't want the first element, you can simply append `[-1]` to the end of the function call, like so: `rje::powerSet(x)[-1]`.
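For readers who would rather not install a package, the power set can also be sketched in base R. This is only an illustrative sketch: `power_set_base` is a hypothetical helper name, and the bit-counting approach is an assumption of this example, not a description of how `rje::powerSet` is implemented.

```r
# Hypothetical helper: enumerate every subset of v by counting from 0 to 2^n - 1
# and treating the binary digits of each counter value as membership flags.
power_set_base <- function(v) {
  n <- length(v)
  place <- 2^(seq_len(n) - 1)                # 1, 2, 4, ... one binary place per element
  lapply(0:(2^n - 1), function(mask) {
    keep <- (mask %/% place) %% 2 == 1       # TRUE where that binary digit is set
    v[keep]
  })
}

x <- c("red", "blue", "black")
ps <- power_set_base(x)
length(ps)   # 2^3 = 8 subsets, the first one being the empty set
ps[-1]       # drop the empty set, as with rje::powerSet(x)[-1]
```

The number of subsets doubles with every extra element, so this sketch is meant for small inputs only.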
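The next two solutions come from the newer packages `arrangements` and `RcppAlgos` (I am the author), which offer the user considerably more efficient solutions. Both packages are capable of generating combinations of multisets.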
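Why is this important?
It can be shown that there is a one-to-one mapping from the power set of a set `A` to all combinations of the multiset `c(rep(emptyElement, length(A)), A)` choose `length(A)`, where `emptyElement` is a representation of the empty set (such as zero or a blank). The calls below use one fewer copy of the blank (`freq`/`freqs` set to `c(2, rep(1, 3))`), which leaves out the empty set itself. With this in mind, observe: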
library(arrangements)
arrangements::combinations(x = c("",x), k = 3, freq = c(2, rep(1, 3)))
[,1] [,2] [,3]
[1,] "" "" "red"
[2,] "" "" "blue"
[3,] "" "" "black"
[4,] "" "red" "blue"
[5,] "" "red" "black"
[6,] "" "blue" "black"
[7,] "red" "blue" "black"
library(RcppAlgos)
comboGeneral(c("",x), 3, freqs = c(2, rep(1, 3)))
[,1] [,2] [,3]
[1,] "" "" "red"
[2,] "" "" "blue"
[3,] "" "" "black"
[4,] "" "red" "blue"
[5,] "" "red" "black"
[6,] "" "blue" "black"
[7,] "red" "blue" "black"
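The mapping described above can be checked in base R alone. The following is a minimal sketch, assuming `""` as the empty element; because `combn` treats the repeated blanks as distinct positions, duplicate rows are removed with `unique`, which stands in for the multiset-aware generators of the two packages and is far less efficient.

```r
x <- c("red", "blue", "black")
n <- length(x)

# Pad the set with n copies of the empty element, then choose n items.
padded <- c(rep("", n), x)

# combn() returns combinations as columns; transpose and drop duplicate rows.
combos <- unique(t(combn(padded, n)))

# Strip the blanks from each row to recover the corresponding subset.
subsets <- lapply(seq_len(nrow(combos)), function(i) {
  row <- combos[i, ]
  row[row != ""]
})

length(subsets) == 2^n   # every subset of x appears exactly once
```

With `n` blanks the all-blank row is present and maps to the empty set; with `n - 1` blanks, as in the calls above, that row cannot occur.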
If you don't like dealing with blank elements and/or matrices, you can also use `lapply` to return a list.
lapply(seq_along(x), comboGeneral, v = x)
[[1]]
[,1]
[1,] "red"
[2,] "blue"
[3,] "black"
[[2]]
[,1] [,2]
[1,] "red" "blue"
[2,] "red" "black"
[3,] "blue" "black"
[[3]]
[,1] [,2] [,3]
[1,] "red" "blue" "black"
lapply(seq_along(x), function(y) arrangements::combinations(x, y))
[[1]]
[,1]
[1,] "red"
[2,] "blue"
[3,] "black"
[[2]]
[,1] [,2]
[1,] "red" "blue"
[2,] "red" "black"
[3,] "blue" "black"
[[3]]
[,1] [,2] [,3]
[1,] "red" "blue" "black"
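Both calls return one matrix per subset size. If a flat list with one vector per subset is preferred, the matrices can be split by row. The sketch below uses base `combn` (transposed) as a stand-in for the package functions so that it runs without extra dependencies; the variable names are illustrative only.

```r
x <- c("red", "blue", "black")

# One matrix per subset size; combn() puts combinations in columns, so transpose.
by_size <- lapply(seq_along(x), function(k) t(combn(x, k)))

# Split every matrix into its rows and concatenate the pieces into a single list.
subsets <- do.call(c, lapply(by_size, function(m) {
  lapply(seq_len(nrow(m)), function(i) m[i, ])
}))

length(subsets)   # 7 non-empty subsets
```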
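Now we show that the last two methods are much more efficient (note: I removed `do.call(c,` and `simplify = FALSE` from the answer provided by @RichScriven so that we compare the generation of similar outputs. I also included `rje::powerSet` for good measure):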
set.seed(8128)
bigX <- sort(sample(10^6, 20))
library(microbenchmark)
microbenchmark(powSetRje = rje::powerSet(bigX),
powSetRich = lapply(seq_along(bigX), combn, x = bigX),
powSetArrange = lapply(seq_along(bigX), function(y) arrangements::combinations(x = bigX, k = y)),
powSetAlgos = lapply(seq_along(bigX), comboGeneral, v = bigX),
unit = "relative")
Unit: relative
expr min lq mean median uq max neval
powSetRje 64.4252454 44.063199 16.678438 18.63110 12.082214 7.317559 100
powSetRich 61.6766640 43.027789 16.009151 17.88944 11.406994 7.222899 100
powSetArrange 0.9508052 1.060309 1.080341 1.02257 1.262713 1.126384 100
powSetAlgos 1.0000000 1.000000 1.000000 1.00000 1.000000 1.000000 100
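To put the benchmark in perspective, it helps to count how many combinations each expression has to produce for a vector of length 20. A quick base R check:

```r
n <- 20                               # length of bigX in the benchmark

# Number of combinations for each subset size k = 1, ..., n
per_size <- choose(n, seq_len(n))

total <- sum(per_size)
total == 2^n - 1                      # all non-empty subsets: 1,048,575
max(per_size)                         # the largest single matrix has choose(20, 10) rows
```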
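Going a step further, `arrangements` comes with an argument called `layout` that lets the user choose a particular output format. One of the options is `layout = "l"` for list. It is similar to setting `simplify = FALSE` in `combn`, and it allows us to obtain output like that of `powerSet`. Observe: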
do.call(c, lapply(seq_along(x), function(y) {
arrangements::combinations(x, y, layout = "l")
}))
[[1]]
[1] "red"
[[2]]
[1] "blue"
[[3]]
[1] "black"
[[4]]
[1] "red" "blue"
[[5]]
[1] "red" "black"
[[6]]
[1] "blue" "black"
[[7]]
[1] "red" "blue" "black"
And the benchmarks:
microbenchmark(powSetRje = rje::powerSet(bigX)[-1],
powSetRich = do.call(c, lapply(seq_along(bigX), combn, x = bigX, simplify = FALSE)),
powSetArrange = do.call(c, lapply(seq_along(bigX), function(y) arrangements::combinations(bigX, y, layout = "l"))),
times = 15, unit = "relative")
Unit: relative
expr min lq mean median uq max neval
powSetRje 5.539967 4.785415 4.277319 4.387410 3.739593 3.543570 15
powSetRich 4.994366 4.306784 3.863612 3.932252 3.334708 3.327467 15
powSetArrange 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 15
With the gtools package, you can use `combinations` instead of `permutations`: `sapply(seq_along(x), combinations, v = x, n = length(x))`. - Davide Passaretti