我写了这段代码,假设len
是尾递归的,但仍然发生堆栈溢出。问题出在哪里?
myLength :: [a] -> Integer
myLength xs = len xs 0
where len [] l = l
len (x:xs) l = len xs (l+1)
main = print $ myLength [1..10000000]
我写了这段代码,假设len
是尾递归的,但仍然发生堆栈溢出。问题出在哪里?
myLength :: [a] -> Integer
myLength xs = len xs 0
where len [] l = l
len (x:xs) l = len xs (l+1)
main = print $ myLength [1..10000000]
请记住,Haskell是一种惰性语言。只有在绝对必要时才会执行计算(l+1)。
“简单”的解决方案是使用'$!'来强制求值:
myLength :: [a] -> Integer
myLength xs = len xs 0
where len [] l = l
len (x:xs) l = len xs $! (l+1)
main = print $ myLength [1..10000000]
len
构建了thunk:len [1..100000] 0
-> len [2..100000] (0+1)
-> len [3..100000] (0+1+1)
等等等。你必须强制使用len
每次减少l
:
len (x:xs) l = l `seq` len xs (l+1)
欲了解更多信息,请查看http://haskell.org/haskellwiki/Stack_overflow。
foldl存在同样的问题,它会构建一个thunk。您可以使用Data.List中的foldl'来避免这个问题:
import Data.List
myLength = foldl' (const.succ) 0
jmg$ ghc --make tail.hs -fforce-recomp [1 of 1] Compiling Main ( tail.hs, tail.o ) Linking tail ... jmg$ ./tail Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize -RTS' to increase it. girard:haskell jmg$ ghc -O --make tail.hs -fforce-recomp [1 of 1] Compiling Main ( tail.hs, tail.o ) Linking tail ... jmg$ ./tail 10000000 jmg$
@Hynek -Pichi- Vychodil 以上测试是在Mac OS X Snow Leopard 64位操作系统上使用GHC 7和GHC 6.12.1进行的,每个版本都是32位的。在您downvote之后,我在Ubuntu Linux上重复了这个实验,结果如下:
jmg@girard:/tmp$ cat length.hs myLength :: [a] -> Integer myLength xs = len xs 0 where len [] l = l len (x:xs) l = len xs (l+1)
main = print $ myLength [1..10000000]
jmg@girard:/tmp$ ghc --version The Glorious Glasgow Haskell Compilation System, version 6.12.1 jmg@girard:/tmp$ uname -a Linux girard 2.6.35-24-generic #42-Ubuntu SMP Thu Dec 2 02:41:37 UTC 2010 x86_64 GNU/Linux jmg@girard:/tmp$ ghc --make length.hs -fforce-recomp [1 of 1] Compiling Main ( length.hs, length.o ) Linking length ... jmg@girard:/tmp$ time ./length Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize -RTS' to increase it.
real 0m1.359s user 0m1.140s sys 0m0.210s jmg@girard:/tmp$ ghc -O --make length.hs -fforce-recomp [1 of 1] Compiling Main ( length.hs, length.o ) Linking length ... jmg@girard:/tmp$ time ./length 10000000
real 0m0.268s user 0m0.260s sys 0m0.000s jmg@girard:/tmp$
因此,如果您有兴趣,我们可以继续找出为什么这对您失败的原因。我猜,如果这种直接递归列表的方式不能在这种情况下被优化成有效的循环,那么GHC HQ将会把它视为一个bug。
$ ghc -O --make length.hs -fforce-recomp [1 of 1] Compiling Main (length.hs, length.o) Linking length ... hynek@hynek:~/work/haskell$ ./length 栈空间溢出:当前大小为8388608字节。使用
+RTS-Ksize-RTS'来增加它。 - Hynek -Pichi- Vychodil你知道吗,有一种更简单的方法来编写这个函数:
myLength xs = foldl step 0 xs where step acc x = acc + 1
Alex
eelco.lempsink.nl回答了你提出的问题。这是Yann答案的演示:
module Main
where
import Data.List
import System.Environment (getArgs)
main = do
n <- getArgs >>= readIO.head
putStrLn $ "Length of an array from 1 to " ++ show n
++ ": " ++ show (myLength [1..n])
myLength :: [a] -> Int
myLength = foldl' (const . succ) 0
foldl' 从左到右遍历列表,每次将 1 加到从 0 开始的累加器中。
这是运行程序的示例:
C:\haskell>ghc --make Test.hs -O2 -fforce-recomp
[1 of 1] Compiling Main ( Test.hs, Test.o )
Linking Test.exe ...
C:\haskell>Test.exe 10000000 +RTS -sstderr
Test.exe 10000000 +RTS -sstderr
Length of an array from 1 to 10000000: 10000000
401,572,536 bytes allocated in the heap
18,048 bytes copied during GC
2,352 bytes maximum residency (1 sample(s))
13,764 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Generation 0: 765 collections, 0 parallel, 0.00s, 0.00s elapsed
Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.27s ( 0.28s elapsed)
GC time 0.00s ( 0.00s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 0.27s ( 0.28s elapsed)
%GC time 0.0% (0.7% elapsed)
Alloc rate 1,514,219,539 bytes per MUT second
Productivity 100.0% of total user, 93.7% of total elapsed
C:\haskell>