让我们进行一些适当的基准测试。这里有一些代码,其中将您的洗牌重命名为shuffle1
,并且我的个人最爱变体被命名为shuffle2
。
import System.Random
import Control.Monad
import Control.Monad.ST.Strict
import Data.STRef.Strict
import Data.Vector.Mutable
import Prelude as P
import Criterion.Main
shuffle1 :: RandomGen g => [a] -> g -> ([a], g)
shuffle1 [] g0 = ([],g0)
shuffle1 [x] g0 = ([x],g0)
shuffle1 xs g0 = (x:newtail,g2)
where (i,g1) = randomR (0, P.length $ P.tail xs) g0
(xs1,x:xs2) = P.splitAt i xs
(newtail,g2) = shuffle1 (xs1++xs2) g1
shuffle2 :: RandomGen g => [a] -> g -> ([a], g)
shuffle2 xs g0 = runST $ do
let l = P.length xs
v <- new l
sequence_ $ zipWith (unsafeWrite v) [0..] xs
let loop g i | i <= 1 = return g
| otherwise = do
let i' = i - 1
(j, g') = randomR (0, i') g
unsafeSwap v i' j
loop g' i'
gFinal <- loop g0 l
shuffled <- mapM (unsafeRead v) [0 .. l - 1]
return (shuffled, gFinal)
main = do
let s1 x = fst $ shuffle1 x g0
s2 x = fst $ shuffle2 x g0
arr = [0..1000] :: [Int]
g0 = mkStdGen 0
print (g0, arr)
defaultMain [bench "shuffle1" $ nf s1 arr, bench "shuffle2" $ nf s2 arr]
那么,让我们来看一些结果:
carl@ubuntu:~/hask$ ghc -O2 shuffle.hs
[1 of 1] Compiling Main ( shuffle.hs, shuffle.o )
Linking shuffle ...
carl@ubuntu:~/hask$ ./shuffle
(1 1,[0, .. <redacted for brevity>])
warming up
estimating clock resolution...
mean is 5.762060 us (160001 iterations)
found 4887 outliers among 159999 samples (3.1%)
4751 (3.0%) high severe
estimating cost of a clock call...
mean is 42.13314 ns (43 iterations)
benchmarking shuffle1
mean: 10.95922 ms, lb 10.92317 ms, ub 10.99903 ms, ci 0.950
std dev: 193.8795 us, lb 168.6842 us, ub 244.6648 us, ci 0.950
found 1 outliers among 100 samples (1.0%)
variance introduced by outliers: 10.396%
variance is moderately inflated by outliers
benchmarking shuffle2
mean: 256.9394 us, lb 255.5414 us, ub 258.7409 us, ci 0.950
std dev: 8.042766 us, lb 6.460785 us, ub 12.28447 us, ci 0.950
found 1 outliers among 100 samples (1.0%)
1 (1.0%) high severe
variance introduced by outliers: 26.750%
variance is moderately inflated by outliers
好的,我的系统非常嘈杂,不适合用于类似数字的严格基准测试。但这在这里并不重要。shuffle2
在一个包含1001个元素的列表中大约比shuffle1
快40倍。由于O(n)和O(n^2)之间的差异,在更大的列表中只会增加速度。我相信你测试代码所计时的内容不是洗牌算法。
实际上,我有一个猜测。你的版本足够懒惰,可以逐步返回结果。如果在调用生成器后从未触摸它,那么5秒就是获得前几个结果的合理时间。也许这就是你计时的原因。