Difference between for loops and while loops in R

Question

25

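While working in R I noticed something interesting. When I implement a simple program that computes the squares of 1 to N, once with a for loop and once with a while loop, the two behave differently. (I don't care about vectorization or the apply functions in this case.)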

fn1 <- function (N) 
{
    for(i in 1:N) {
        y <- i*i
    }
}

and

fn2 <- function (N) 
{
    i=1
    while(i <= N) {
        y <- i*i
        i <- i + 1
    }
}

The results:

system.time(fn1(60000))
   user  system elapsed 
  2.500   0.012   2.493 
There were 50 or more warnings (use warnings() to see the first 50)
Warning messages:
1: In i * i : NAs produced by integer overflow
.
.
.

system.time(fn2(60000))
   user  system elapsed 
  0.138   0.000   0.137

So now we know the for loop is faster; my guess is that this is due to preallocation and optimization. But why does it overflow?

UPDATE: Now trying another approach, using vectors:

fn3 <- function (N) 
{
    i <- 1:N
    y <- i*i
}
system.time(fn3(60000))
   user  system elapsed 
  0.008   0.000   0.009 
Warning message:
In i * i : NAs produced by integer overflow

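Maybe this is some odd memory issue? I'm on Mac OS X with 4 GB of memory and all the default settings in R. It happens in both the 32-bit and 64-bit versions (except that the timings are faster).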

- Alex

6

According to your timings, the while loop is faster. - Marek

2

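When you convert the counter in the for loop to floating point, it is faster than the while loop, but only because the for loop then produces no warnings. - John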

1

R is full of this kind of tedious nonsense. - Alex Brown

Nice question, though. I like performance analysis. - Alex Brown

3 Answers

4

The variable in the for loop runs over an integer sequence, so sooner or later you end up doing this:

> y=as.integer(60000)*as.integer(60000)
Warning message:
In as.integer(60000) * as.integer(60000) : NAs produced by integer overflow

whereas in the while loop you are creating a floating-point number.

It's also the reason these two things are different:

> seq(0,2,1)
[1] 0 1 2
> seq(0,2)
[1] 0 1 2

Don't believe me?

> identical(seq(0,2),seq(0,2,1))
[1] FALSE

Because:

> is.integer(seq(0,2))
[1] TRUE
> is.integer(seq(0,2,1))
[1] FALSE
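
The integer-versus-double distinction above can also be checked inside the loops themselves. This is only a minimal sketch: the names `for_type`, `while_type` and `j` are made up for illustration and do not appear in the question.

```r
# Record the storage type of the loop counter in each style of loop.
for_type <- NULL
for (i in 1:3) {
    for_type <- typeof(i)      # 1:3 is an integer vector, so i is an integer
}

while_type <- NULL
i <- 1                         # the literal 1 is a double
while (i <= 3) {
    while_type <- typeof(i)    # i + 1 keeps i a double
    i <- i + 1
}

print(for_type)                # "integer"
print(while_type)              # "double"

# Writing the literal as 1L would make the while counter an integer too,
# and i * i would then overflow exactly as it does in the for loop.
j <- 1L
print(typeof(j + 1L))          # "integer"
```

So the two loops are not doing the same arithmetic: one multiplies integers, the other multiplies doubles.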

- Spacedman

But why does floating point have a larger range than integer? - Alex

1

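UPDATE: "Note that on almost all implementations of R the range of representable integers is restricted to about +/-2*10^9: doubles can hold much larger integers exactly." That is from the R documentation for integer. - Alex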

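@Alex (not Brown): integers use a 32-bit binary representation, so they can take at most 2^32 distinct values, which gives a range of roughly [-2e9, +2e9]. Floating point uses its bits differently: some of them store the digits (with the point placed after the first digit) and another set stores the exponent. Obviously that makes the range far larger. By the way, apart from a few details, this scheme is general computing knowledge and is the same in nearly every language and application. - Joris Meys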
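
To put numbers on the ranges discussed in these comments, here is a small sketch using only base R. Note that an R double is a 64-bit value with a 53-bit significand, which is why it can hold integers well beyond the 32-bit limit exactly.

```r
# Largest integer R can store: 2^31 - 1, because one bit pattern is
# reserved for NA_integer_.
print(.Machine$integer.max)               # 2147483647
print(.Machine$integer.max == 2^31 - 1)   # TRUE

# The same product as a double and as an integer.
print(60000 * 60000)                      # 3.6e+09
print(suppressWarnings(60000L * 60000L))  # NA (integer overflow)

# Doubles are exact for whole numbers only up to 2^53; past that point
# neighbouring integers start to collapse onto the same value.
print(.Machine$double.digits)             # 53
print((2^53 - 1) == 2^53)                 # FALSE: still distinguishable
print(2^53 == 2^53 + 1)                   # TRUE: 2^53 + 1 is not representable
```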

3

Regarding the timings:

fn1 <- function (N) {
    for(i in as.numeric(1:N)) { y <- i*i }
}
fn2 <- function (N) {
    i=1
    while (i <= N) {
        y <- i*i
        i <- i + 1
    }
}

system.time(fn1(60000))
# user  system elapsed 
# 0.06    0.00    0.07 
system.time(fn2(60000))
# user  system elapsed 
# 0.12    0.00    0.13

So now we know that the for loop is faster than the while loop. You cannot ignore the warnings when timing.
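
To get a feel for how many warnings the integer version actually raises, they can be counted with a calling handler. This is a rough sketch: `fn_int` is simply the question's original `fn1` under a different name, and `n_warn` is a made-up counter. Timings are machine-dependent, so none are shown.

```r
# The question's original for loop, with an integer counter.
fn_int <- function(N) {
    for (i in 1:N) {
        y <- i * i
    }
}

# Count every warning raised while the loop runs, then muffle it so
# nothing is printed.
n_warn <- 0
withCallingHandlers(
    fn_int(60000),
    warning = function(w) {
        n_warn <<- n_warn + 1
        invokeRestart("muffleWarning")
    }
)

# i * i overflows for every i from 46341 to 60000.
print(n_warn)    # 13660
```

Creating and handling that many warning conditions is extra work that the while loop never has to do, which is why the two original timings are not a like-for-like comparison.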

- Marek

1

It's still not entirely fair, because the while loop has a larger body; I know that is necessary to emulate the for loop, but in some problems it isn't. - mbq

2

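@mbq, that is exactly why for loops and while loops can't really be compared. Each has its own purpose. You could add the line i <- i + 1 to fn1 and it would still be faster, because fn2 has to check the condition, which means 60k calls to <=. If you also add a line i <= N to fn1, the timings come out equal. - Marek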


- Alex Brown · Accepted Answer

Because 1 is numeric but not integer (i.e. it is a floating-point number), whereas 1:60000 is both numeric and integer.

> print(class(1))
[1] "numeric"
> print(class(1:60000))
[1] "integer"

60000 squared is 3.6 billion, which cannot be represented as a signed 32-bit integer, so the multiplication overflows:

> as.integer(60000)*as.integer(60000)
[1] NA
Warning message:
In as.integer(60000) * as.integer(60000) : NAs produced by integer overflow

3.6 billion is easily representable in floating point, however:

> as.single(60000)*as.single(60000)
[1] 3.6e+09

To fix your for code, convert the sequence to a floating-point representation:

function (N)
{
    for(i in as.single(1:N)) {
        y <- i*i
    }
}
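
A note on `as.single`: R has no single-precision type, so `as.single` returns an ordinary double that merely carries a `Csingle` attribute (used by the `.C` and `.Fortran` interfaces). The fix works, but `as.numeric` does the same conversion without the extra attribute. Below is a sketch of that variant; the name `fn1_fixed` and the returned value are added here purely for illustration.

```r
fn1_fixed <- function(N) {
    # seq_len(N) is an integer vector; as.numeric() converts it to double
    # once, up front, so i * i is computed in floating point.
    for (i in as.numeric(seq_len(N))) {
        y <- i * i
    }
    y    # return the last square so the result can be inspected
}

print(fn1_fixed(60000))                    # 3.6e+09, no warnings

# as.single() yields a double tagged with a Csingle attribute.
print(typeof(as.single(1)))                # "double"
print(attr(as.single(1), "Csingle"))       # TRUE

# Largest counter whose square still fits in an integer.
print(floor(sqrt(.Machine$integer.max)))   # 46340
```

With the original integer loop, every `i` above 46340 overflows, so any `N` up to that value would have run without warnings.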