Count the number of leading zeros between the decimal point and the first nonzero digit

Question

r regex desctools

28

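Suppose we have the number 1.000633 and I want to count the zeros after the decimal point that come before the first nonzero digit; the answer should be 3. For 0.002 the answer should be 2.

I could not find a function in R that does this. I explored the Ndec function in the DescTools package, but it is not up to the task.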

- Annie

9 Answers

16

Here is another possibility:

zeros_after_period <- function(x) {
  if (isTRUE(all.equal(round(x), x))) return(0) # y would be -Inf for integer values
  y <- log10(abs(x) - floor(abs(x)))
  ifelse(isTRUE(all.equal(round(y), y)), -y - 1, -ceiling(y)) # corrects case ending with ..01
}

Example:

x <- c(1.000633, 0.002, -10.01, 7.00010001, 62.01)
sapply(x,zeros_after_period)
#[1] 3 2 1 3 1

- RHertel

2

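I like this solution, even with the 0.001 issue. - zx8754

I think you forgot to vectorize it, because right now it only handles vectors of length 1... maybe it should be ifelse(round(y) == y, -y-1, -ceiling(y))? - David Arenburg

Not a column, just a few values, e.g. x <- c(0.1, 1.0, 1.001). - David Arenburg

I don't know why there are two comments under my answer saying "it doesn't work". In fact, it does work. - RHertel

y = log10(abs(x) %% 1) seems to work too. To vectorize it, y = -log10(abs(x) %% 1); ceiling(y) - ( (y %% 1) < 10^-options()$digits ), or use some other threshold, I guess. There may still be an edge case or two. - Frank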
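
The vectorization concern raised in these comments could be addressed along the following lines. This is a minimal sketch rather than RHertel's own code: the name zeros_after_period_vec and the tolerance tol = 1e-9 are assumptions made for illustration, and NA or infinite inputs are not handled.

```r
# Hypothetical vectorized variant of the numeric approach (a sketch, not the answer's code)
zeros_after_period_vec <- function(x, tol = 1e-9) {
  frac <- abs(x) %% 1                       # fractional part of each value
  out <- numeric(length(x))                 # stays 0 where there is no fractional part
  has_frac <- frac > tol & frac < 1 - tol   # skip values that are integers within tol
  y <- -log10(frac[has_frac])
  near_int <- abs(y - round(y)) < tol       # fraction is (almost) a power of ten, e.g. 0.01
  out[has_frac] <- ifelse(near_int, round(y) - 1, floor(y))
  out
}

zeros_after_period_vec(c(1.000633, 0.002, -10.01, 0.1, 1.0, 1.001))
# [1] 3 2 1 0 0 2
```

The tolerance does two jobs here: it treats values such as 10.01 - 10 (stored slightly below 0.01) as a power of ten, and it keeps integer inputs from reaching log10(0).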

9

We can use sub.

ifelse(grepl("\\.0", str1), 
    nchar(sub("[^\\.]+\\.(0+)[^0]+.*", "\\1", str1)), NA)
#[1] 3 2 3 3 2

Or use stringi.

library(stringi)
r1 <- stri_extract(str1, regex="(?<=\\.)0+")
ifelse(is.na(r1), NA, nchar(r1))
#[1] 3 2 3 3 2

Just to check that it also works for an odd case:

str2 <- "0.00A-Z"
nchar(sub("[^\\.]+\\.(0+)[^0]+.*", "\\1", str2))
#[1] 2

Data

str1 <- as.character(c(1.000633, 0.002, 0.000633,
                                  10.000633, 3.0069006))

- akrun

2

Thanks again. Try it with str1 <- as.character(10.000633). - Annie

1

You may want to edit your first solution, because it is wrong. - David Arenburg

2

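@akrun There could be any number of digits there, and this should work for all numbers. Almost everyone has a comment under their answer pointing out a possible problem, not just you. See here and here for examples. - David Arenburg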

8

What do you mean by "OK, Jaap is online too"? - Jaap

2

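Just allow digits other than 0 in the number, e.g. "[^\\.]+\\.(0+)[^0]{1}.*", and that fixes it (though I still prefer RHertel's "numeric" approach). This is about solving the problem accurately, not about upvotes. - Cath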

7

Using the rle function:

#test values
x <- c(0.000633,0.003,0.1,0.001,0.00633044,10.25,111.00012,-0.02)

#result
sapply(x, function(i){
  myNum <- unlist(strsplit(as.character(i), ".", fixed = TRUE))[2]
  myNumRle <- rle(unlist(strsplit(myNum, "")))
  if(myNumRle$values[1] == 0) myNumRle$lengths[1] else 0
})

#output
# [1] 3 2 0 2 2 0 3 1

- zx8754
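
For readers unfamiliar with rle, the short sketch below shows what it returns for the digits after the decimal point. The input string "000633" is just an illustrative value (the fractional digits of 0.000633), not part of the answer above.

```r
# Split the fractional digits into single characters and run-length encode them
digits <- strsplit("000633", "")[[1]]
r <- rle(digits)

r$values   # "0" "6" "3"  (the distinct runs, in order)
r$lengths  # 3 1 2        (how long each run is)

# The leading-zero count is the length of the first run, provided that run is "0"
if (r$values[1] == "0") r$lengths[1] else 0
```

Because only the first run is inspected, zeros that appear later in the number (as in 0.00633044) do not affect the count.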

7

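Another way, using str_count from the stringr package.

 library(stringr)
 x <- as.character(1.000633)
 str_count(gsub(".*[.]","",x), "0")
 #[1] 3

Edit: the version below counts only the zeros after the decimal point, up to the first nonzero digit.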

y <- c(1.00215, 1.010001, 50.000809058, 0.1)
str_count(gsub(".*[.]","",gsub("(?:(0+))[1-9].*","\\1",as.character(y))),"0")
#[1] 2 1 3 0

- Sotos

Wow, this escalated quickly! :) I only covered the two cases the OP mentioned. I will revise it as soon as I can. Thanks @DavidArenburg - Sotos

What about y <- 0.00001? - Scott Kaiser

7

floor( -log10( eps + abs(x) - floor( abs( x ) ) ) )

- MatthewPeter
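
The one-liner above does not define eps, so the sketch below picks eps <- 1e-12 purely as an assumption for illustration. It compares the plain floor formula, the eps-shifted version, and a ceiling-minus-one variant on fractional parts that sit at or just below a power of ten, which is where the formulas disagree.

```r
eps <- 1e-12                                  # assumed value; the answer leaves eps undefined
frac <- c(0.002, 0.01, 0.000633, 10.01 - 10)  # last entry is stored slightly below 0.01

naive    <- floor(-log10(frac))         # overcounts when frac is at or just below a power of ten
with_eps <- floor(-log10(eps + frac))   # nudges frac upward before taking the logarithm
ceil_m1  <- ceiling(-log10(frac)) - 1   # copes with an exact 0.01, but not with 10.01 - 10

with_eps
# [1] 2 1 3 1
```

The idea is that a fraction f with k leading zeros satisfies 10^-(k+1) <= f < 10^-k, so -log10(f) lies in (k, k+1]. Adding a small eps moves the boundary case away from the integer, at the cost of miscounting fractions that are within eps below a power of ten.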

4

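Welcome to Stack Overflow, and thank you for answering this question. Because uncommented code tends to be less educational, we would like you to add some explanation of how this answers the question. Thanks! - Toby Speight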

3

Yes, this is the best solution. However, you should account for values whose logarithm is an integer, like this:

count0 <- function(x, tol = .Machine$double.eps ^ 0.5) {
  x <- abs(x)
  y <- -log10(x - floor(x))
  floor(y) - (y %% 1 < tol)
}

- Roland

Very nice answer. I like its more mathematical approach, rather than using grep for everything. Thanks! - Antoni

1

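Just wanted to add that every solution I tried has trouble with numbers such as 0.00001, which get formatted in scientific notation unless you take care to specify otherwise. I ended up with the following solution: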

leading_zero <- function(x)  {
  if (x < 0.001){
    x <- as.character(format(x,scientific=FALSE))
  }
  nlead <- attr(regexpr("(?<=\\.)0+|$", x, perl = TRUE), "match.length") # leading zeros
  nlead
}

- Jared Rieck
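
The scientific-notation issue described above can also be handled for whole vectors. The following is a sketch under stated assumptions, not Jared Rieck's code: the name leading_zeros_chr and the default digits = 15 are choices made for illustration. Each element is formatted on its own, because format() applied to a whole vector pads every element to a common number of decimals (so 2 next to 0.001 would become "2.000").

```r
# Hypothetical vectorized helper based on fixed-notation formatting (a sketch)
leading_zeros_chr <- function(x, digits = 15) {
  # Format element by element to avoid padding with trailing zeros
  s <- vapply(
    x,
    function(v) format(abs(v), scientific = FALSE, digits = digits),
    character(1)
  )
  n <- attr(regexpr("(?<=\\.)0+", s, perl = TRUE), "match.length")
  ifelse(n < 0, 0L, n)   # regexpr reports -1 when nothing matched
}

leading_zeros_chr(c(1.000633, 0.002, 0.00001, 10.2, 5))
# [1] 3 2 4 0 0
```

Without scientific = FALSE, as.character(0.00001) yields "1e-05", which contains no decimal point for the regular expression to anchor on.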

0

You can use sub, since we do not need to jump from one match to the next, so there is no need for gsub.

 nchar(sub(".*\\.(0*).*","\\1",str1))
#[1] 3 2 3 3 2

where

str1 <- as.character(c(1.000633, 0.002, 0.000633,
                   10.000633, 3.0069006))

- Onyambu

0

Similar to @MatthewPeter's solution. If you use ceiling() instead of floor() and then subtract 1, you avoid the problem with numbers of the form 1*10**x, such as (0.1, 0.01, 0.001, ...).

x |>                  # input vector of numeric values
  abs() %%            # take the absolute value (delete sign of numbers)
  1 |>                # do numbers modulo 1 
                      # (delete everything before the decimal point)
  log10() |>          # use log10 to count the numbers after the period
  abs() |>            # flip sign, as we want the positive numbers
  ceiling() -         # take the ceiling of the numbers. 
                      # this will solve the 1*10**x issue
  1                   # subtract 1 since we actually
                      # wanted the floor of the values

Data:

x <- c(0.000633,0.003,0.1,0.001,0.00633044,10.25,111.00012,-0.02)
# [1] 3 2 0 2 2 0 3 1

- DuesserBaest

- David Arenburg · Accepted Answer

25

Use regexpr and its match.length attribute:

attr(regexpr("(?<=\\.)0+", x, perl = TRUE), "match.length")

- David Arenburg

1

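For x <- 10.2 this returns -1 rather than 0. I had to insert an ifelse statement in my solution to catch a case that would fail without it. That may be why you consider my implementation complicated. On the other hand, perhaps you could catch this case too, so that your solution also works for any number. - RHertel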

1

@RHertel It always returns -1 when there is no match. That is how regexpr signals "no match". My solution works for any number. - David Arenburg

2

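OK, I see, just as I see why my original post needed correcting. I am just not sure this matches the output the OP asked for ("...count the number of zeros after the decimal point before the first nonzero digit..."). A negative count of zeros does not make much sense to me. - RHertel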

2

@RHertel If the OP wants that, it can easily be fixed in a vectorized way, but for the no-match case -1 and 0 seem equally good to me. - David Arenburg

2

@RHertel If you want 0 instead of -1, just use (?<=\\.)0+|$ as the regular expression. - maaartinus
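
The difference between the two patterns discussed in these comments can be seen in a short sketch. The input strings are illustrative values chosen here; numeric input would first be coerced to character by regexpr.

```r
x <- c("1.000633", "0.002", "10.2")

# Accepted answer's pattern: no zeros right after the point gives -1 (no match)
no_fallback <- attr(regexpr("(?<=\\.)0+", x, perl = TRUE), "match.length")
no_fallback
# [1]  3  2 -1

# With the "|$" alternative, the empty match at the end of the string has length 0
with_fallback <- attr(regexpr("(?<=\\.)0+|$", x, perl = TRUE), "match.length")
with_fallback
# [1] 3 2 0
```

The lookbehind (?<=\\.) only allows a match that starts directly after the decimal point, so zeros further to the right are never counted.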
