How can I subset values by a specific difference in R?

3

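I am trying to subset values in a vector by a specific difference: I want to split one vector into several vectors whose consecutive elements differ by exactly 1. For example, given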

a <- c(1, 1.2, 1.6, 2, 2.2, 2.6, 3, 3.2, 3.6, 4, 4.2, 4.6, 5, 5.2, 5.6, 6, 7, 8, 9, 10)

I would like to get:
 b <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
 c <- c(1.2, 2.2, 3.2, 4.2, 5.2)
 d <- c(1.6, 2.6, 3.6, 4.6, 5.6)

I tried writing a for loop, but I don't think it is efficient; there must be a better way to solve this.


You could try split(a, round(a %% 1, 1)), but I don't think it is very reliable. In general, computers are not good at matching numbers exactly. See https://dev59.com/_Gkw5IYBdhLWcg3w_fiJ - Frank
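To make the comment above concrete, here is a minimal sketch of that `split()` idea on the vector from the question. It only illustrates why rounding the fractional part is needed and is not a robust solution: a fractional part that lands just below 1 would round to 1 rather than 0 and end up in its own group.

```r
a <- c(1, 1.2, 1.6, 2, 2.2, 2.6, 3, 3.2, 3.6, 4, 4.2, 4.6, 5, 5.2, 5.6, 6, 7, 8, 9, 10)

# Exact comparison of decimal fractions is fragile:
# 4.6 - 3.6 is stored as a value slightly below 1.
exact_equal <- (4.6 - 3.6) == 1                   # FALSE
near_equal  <- isTRUE(all.equal(4.6 - 3.6, 1))    # TRUE, compares with a tolerance

# Group by the fractional part, rounded to one decimal place so that
# tiny representation errors collapse onto the same key.
groups <- split(a, round(a %% 1, 1))
groups
```

The result is a list whose names are the rounded fractional parts, with one element per group.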
2 Answers

2

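An alternative, recursive solution: each call extracts the values that belong with the current minimum and passes the remaining values on to the next call: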

my_split = function(vec, tol) { 
    if(length(vec) == 0) list() 
    else {
        mod1 <- (vec - min(vec))%%1

        # check both mod1 and mod1 - 1: floating-point error can leave the
        # remainder just below 1 instead of at 0, e.g. (4.6 - 3.6) %% 1
        splits <- split(vec, abs(mod1) < tol | abs(mod1 - 1) < tol)
        c(list(splits$`TRUE`), my_split(splits$`FALSE`, tol))
        }      
    }

my_split(a, 0.001)     # use a tolerance because most decimal fractions
                       # cannot be represented exactly in floating point

# [[1]]
# [1]  1  2  3  4  5  6  7  8  9 10

# [[2]]
# [1] 1.2 2.2 3.2 4.2 5.2

# [[3]]
# [1] 1.6 2.6 3.6 4.6 5.6
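
The same peel-off-the-minimum idea can also be written as a loop instead of a recursion. This is only a sketch of an equivalent formulation; the name `my_split_iter` and the default tolerance are my own choices and not part of the answer above.

```r
a <- c(1, 1.2, 1.6, 2, 2.2, 2.6, 3, 3.2, 3.6, 4, 4.2, 4.6, 5, 5.2, 5.6, 6, 7, 8, 9, 10)

# Hypothetical iterative variant: repeatedly take the values whose distance
# from the current minimum is (within tol) a whole number, then continue
# with whatever is left.
my_split_iter <- function(vec, tol = 1e-3) {
  groups <- list()
  while (length(vec) > 0) {
    mod1 <- (vec - min(vec)) %% 1
    # the remainder may sit just above 0 or just below 1
    in_group <- mod1 < tol | (1 - mod1) < tol
    groups[[length(groups) + 1]] <- vec[in_group]
    vec <- vec[!in_group]
  }
  groups
}

res <- my_split_iter(a)
```

The minimum always belongs to its own group, so every pass removes at least one value and the loop terminates.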

1

This is what you need:

a <- c(1, 1.2, 1.6, 2, 2.2, 2.6, 3, 3.2, 3.6, 4, 4.2, 4.6, 5, 5.2, 5.6, 6, 7, 8, 9, 10)

a_min = a[1]
a_max = a[length(a)]

h = a[a<(a_min+1)]

d = lapply(h, function(x){seq(x,a_max)[seq(x,a_max)%in%a]})

What does this do?
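h stores all the elements of a that are smaller than the first element plus 1, i.e. the starting value of each group (this assumes a is sorted in increasing order).
For each of these, create a sequence from it up to the last element of a in steps of 1, and keep only the values that are also in a.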

The result is a list that contains each sequence:

> d
[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10

[[2]]
[1] 1.2 2.2 3.2 4.2 5.2

[[3]]
[1] 1.6 2.6 3.6 4.6 5.6
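
Note that `%in%` compares doubles exactly, so the approach above depends on `seq()` producing bit-identical values to those stored in a. If that is a concern, the membership test can be swapped for a tolerance-based one. The sketch below is a variation under that assumption; `near_any` and its `tol` default are hypothetical helpers, not base R functions.

```r
a <- c(1, 1.2, 1.6, 2, 2.2, 2.6, 3, 3.2, 3.6, 4, 4.2, 4.6, 5, 5.2, 5.6, 6, 7, 8, 9, 10)

# Hypothetical helper: for each x, is there any element of ref within tol?
near_any <- function(x, ref, tol = 1e-8) {
  vapply(x, function(v) any(abs(ref - v) < tol), logical(1))
}

# Starting value of each group, as in the answer (a is assumed sorted).
starts <- a[a < a[1] + 1]

d_tol <- lapply(starts, function(x) {
  candidates <- seq(x, a[length(a)])   # x, x + 1, x + 2, ...
  a[near_any(a, candidates)]           # keep the original values of a
})
```

This returns the original elements of a rather than the generated sequence values, so the output contains exactly the numbers that were in the input.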

