How to convert a decimal number to a base-3 (ternary) number

4

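I was wondering whether there is a way to convert a decimal number to a ternary number, given that a function for converting to binary, intToBits, already exists.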

What I actually need is to convert a character string like this:

> S0 <- c("Hello Stac")

to base 3. My first thought was to convert it to decimal with

> S01 <- utf8ToInt(S0)
> S01
## [1]  72 101 108 108 111  32  83 116  97  99

and then convert the result to base 3. I would like to obtain something like this:

> S1
## [1]  2200 10202 11000 11000 11010  1012 10002 11022 10121 10200

Yes, sorry about that. I have edited the question and hope it is more informative now. - Jose Gracia Rodriguez
3 Answers

5

As practice, you can try writing your own conversion function, like the one below:

f <- function(x, base = 3) {
  q <- c()
  while (x) {
    q <- c(x %% base, q)
    x <- x %/% base
  }
  # as.numeric(paste0(q, collapse = ""))
  sum(q * 10^(rev(seq_along(q) - 1)))
}

or using recursion

f <- function(x, base = 3) {
  # Pass base on to the recursive call so that non-default bases work
  ifelse(x < base, x, f(x %/% base, base) * 10 + x %% base)
}

Then you can run

> sapply(utf8ToInt(S0),f)
 [1]  2200 10202 11000 11000 11010  1012 10002 11022 10121 10200
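
One way to sanity-check the result is to convert it back. The sketch below assumes the loop-based f from above (repeated here so the snippet runs on its own) and uses base R's strtoi, which parses digit strings in bases 2 to 36. The variable names tern and back are only illustrative.

```r
# Loop-based converter from the answer above, repeated so this runs standalone
f <- function(x, base = 3) {
  q <- c()
  while (x) {
    q <- c(x %% base, q)
    x <- x %/% base
  }
  sum(q * 10^(rev(seq_along(q) - 1)))
}

S0 <- "Hello Stac"
tern <- sapply(utf8ToInt(S0), f)

# Render the digits as plain strings (no scientific notation),
# then parse those strings as base-3 numbers
back <- strtoi(format(tern, scientific = FALSE, trim = TRUE), base = 3L)

# The round trip should reproduce the original code points
stopifnot(identical(back, utf8ToInt(S0)))
intToUtf8(back)
```

If the round trip fails for some input, that usually points to a digit being lost when the result is stored as a number rather than as a string.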

3
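Nice programming exercise. I have vectorised @ThomasIsCoding's answer to avoid expensive loops over strings and over the characters within strings. The idea is to loop over digits instead, since a Unicode code point has at most 21 digits in any base, whereas the total number of characters in a character vector can be orders of magnitude greater.
The function below takes as arguments a character vector x, a base b (from 2 to 10), and a logical flag double. It returns a list res such that res[[i]] is a vector of length nchar(x[i]) giving the base-b representation of x[i]. Depending on the value of double, the list elements are either double vectors or character vectors.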
utf8ToBase <- function(x, b = 10, double = TRUE) {
    ## Do some basic checks
    stopifnot(is.character(x), !anyNA(x), 
              is.numeric(b), length(b) == 1L, 
              b %% 1 == 0, b >= 2, b <= 10)
    
    ## Require UTF-8 encoding
    x <- enc2utf8(x)
    
    ## Operate on concatenation to avoid loop over strings
    xx <- paste(x, collapse = "")
    ixx <- utf8ToInt(xx)
    
    ## Handle trivial case early
    if (length(ixx) == 0L) {
        el <- if (double) base::double(0L) else character(0L)
        res <- rep.int(list(el), length(x))
        names(res) <- names(x)
        return(res)
    }
    
    ## Use common field width determined from greatest integer
    width <- as.integer(floor(1 + log(max(ixx, 1), base = b)))
    res <- rep.int(strrep("0", width), length(ixx))
    
    ## Loop over digits
    pos <- 1L
    pow <- b^(width - 1L)
    while (pos <= width) {
        quo <- ixx %/% pow
        substr(res, pos, pos) <- as.character(quo)
        ixx <- ixx - pow * quo
        pos <- pos + 1L
        pow <- pow %/% b
    }
    
    ## Discard leading zeros
    if (double) {
        res <- as.double(res)
        if (b == 2 && any(res > 0x1p+53)) {
            warning("binary result not guaranteed due to loss of precision")
        }
    } else {
        res <- sub("^0+", "", res)
    }
    
    ## Return list
    res <- split(res, rep.int(gl(length(x), 1L), nchar(x)))
    names(res) <- names(x)
    res
}

x <- c(foo = "Hello Stack Overflow!", bar = "Hello world!")
utf8ToBase(x, 2)

$foo
 [1] 1001000 1100101 1101100 1101100 1101111  100000
 [7] 1010011 1110100 1100001 1100011 1101011  100000
[13] 1001111 1110110 1100101 1110010 1100110 1101100
[19] 1101111 1110111  100001

$bar
 [1] 1001000 1100101 1101100 1101100 1101111  100000
 [7] 1110111 1101111 1110010 1101100 1100100  100001

utf8ToBase(x, 3)

$foo
 [1]  2200 10202 11000 11000 11010  1012 10002 11022 10121 10200
[11] 10222  1012  2221 11101 10202 11020 10210 11000 11010 11102
[21]  1020

$bar
 [1]  2200 10202 11000 11000 11010  1012 11102 11010 11020 11000
[11] 10201  1020

utf8ToBase(x, 10)

$foo
 [1]  72 101 108 108 111  32  83 116  97  99 107  32  79 118 101
[16] 114 102 108 111 119  33

$bar
 [1]  72 101 108 108 111  32 119 111 114 108 100  33

Some notes:

  • For efficiency, the function concatenates the strings in x rather than looping over them. It throws an error if the concatenation would exceed 2^31-1 bytes, which is the maximum string size allowed by R.

    x <- strrep(letters[1:2], 0x1p+30)
    log2(sum(nchar(x))) # 31
    utf8ToBase(x, 3)
    
    Error in paste(x, collapse = "") : result would exceed 2^31-1 bytes
    
  • The largest Unicode code point is 0x10FFFF. The binary representation of this number exceeds 2^53 when interpreted as decimal, so it cannot be stored in a double vector without loss of precision:

    x <- sub("^0+", "", paste(rev(as.integer(intToBits(0x10FFFF))), collapse = ""))
    x
    ## [1] "100001111111111111111"
    sprintf("%.0f", as.double(x))
    ## [1] "100001111111111114752"
    

    As a defensive measure, the function warns if 2^53 is exceeded when b = 2 and double = TRUE.

    utf8ToBase("\U10FFFF", b = 2, double = TRUE)
    
    [[1]]
    [1] 1.000011e+20
    
    Warning message:
    In utf8ToBase("\U{10ffff}", b = 2, double = TRUE) :
      binary result not guaranteed due to loss of precision
    
    utf8ToBase("\U10FFFF", b = 2, double = FALSE)
    
    [[1]]
    [1] "100001111111111111111"
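
If only the digit strings are needed, the precision issue can be sidestepped by never going through a double at all. The following is a minimal scalar sketch, not part of the function above; to_base_chr is a hypothetical helper name, and the support for bases up to 36 (using letters as digits) is an added assumption.

```r
# to_base_chr is a hypothetical helper: it converts one non-negative whole
# number to its base-`base` digit string, so no digits are lost to rounding.
to_base_chr <- function(n, base = 3L) {
  stopifnot(is.numeric(n), length(n) == 1L, n >= 0, n %% 1 == 0,
            base >= 2, base <= 36)
  symbols <- c(0:9, letters)  # "0".."9", "a".."z": digit symbols up to base 36
  if (n == 0) return("0")
  digits <- character(0)
  while (n > 0) {
    # Prepend the current least significant digit
    digits <- c(symbols[n %% base + 1], digits)
    n <- n %/% base
  }
  paste(digits, collapse = "")
}

to_base_chr(0x10FFFF, 2)
vapply(utf8ToInt("Hello Stac"), to_base_chr, character(1), base = 3)
```

Unlike utf8ToBase, this loops over code points one at a time, so it trades speed for simplicity.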
    

2
You can use the function int2B from the cwhmisc package:
library(cwhmisc)
int2B(utf8ToInt(S0), 3)[[1]] |> as.numeric()
# [1]  2200 10202 11000 11000 11010  1012 10002 11022 10121 10200
