Setting the random seed in a multithreaded loop in Julia

4
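I want to generate random numbers in Julia using multiple threads, and I am using the Threads.@threads macro to do so. However, I am struggling to fix the seed so that I get the same result every time I run the code. Here is my attempt: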
using Random

Random.seed!(1234)
a = [Float64[] for _ in 1:10]

Threads.@threads for i = 1:10
    push!(a[Threads.threadid()],rand())
end

sum(reduce(vcat, a))

The script above produces a different result on every run. By contrast, if I use a plain for loop, I get the same result each time.
Random.seed!(12445)
b = []

for i = 1:10
    push!(b,rand())
end

sum(b)

I feel the solution must be simple, but I cannot find it. Any help is much appreciated.
Thank you.
3 Answers

3
You need a separate random stream for each thread. The simplest way is to use random number generators with different seeds:
using Random

rngs = [MersenneTwister(i) for i in 1:Threads.nthreads()];

Threads.@threads for i = 1:10
     val = rand(rngs[Threads.threadid()])
     # do something with val
end
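
To check that this approach is repeatable, the per-thread generators can be wrapped in a function and run twice. This is a minimal sketch, not part of the original answer: `threaded_draws` is a hypothetical helper, and the `:static` schedule is requested explicitly on the assumption that a fixed iteration-to-thread mapping is wanted. The result is only expected to repeat when Julia is started with the same number of threads.

```julia
using Random

# Hypothetical helper: draw n numbers using one MersenneTwister per thread.
function threaded_draws(n::Integer)
    # Size the pool by the largest possible thread id when that query exists
    # (newer Julia versions); otherwise fall back to nthreads().
    nslots = isdefined(Threads, :maxthreadid) ? Threads.maxthreadid() : Threads.nthreads()
    rngs = [MersenneTwister(i) for i in 1:nslots]
    out = Vector{Float64}(undef, n)
    # :static pins each chunk of iterations to one thread for the whole loop.
    Threads.@threads :static for i in 1:n
        out[i] = rand(rngs[Threads.threadid()])
    end
    return out
end

first_run = threaded_draws(10)
second_run = threaded_draws(10)
println(first_run == second_run)
```

Writing into `out[i]` instead of pushing to a per-thread vector keeps the output order independent of which thread finishes first.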

If you do not want to risk correlation between streams started from different seeds, you can instead jump ahead within a single generator:

julia> using Random, Future

julia> rngs2 = Future.randjump.(Ref(MersenneTwister(0)), big(10)^20 .* (1:Threads.nthreads()))
4-element Vector{MersenneTwister}:
 MersenneTwister(0, (200000000000000000000, 0))
 MersenneTwister(0, (400000000000000000000, 0))
 MersenneTwister(0, (600000000000000000000, 0))
 MersenneTwister(0, (800000000000000000000, 0))
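
The jumped generators are fully determined by the base seed and the jump sizes, so building them twice should give identical streams. A small sketch of that check, where `make_streams` is a hypothetical helper name and the stream count of 4 is an arbitrary choice for illustration:

```julia
using Random, Future

# Hypothetical helper: n generators, each jumped k * 10^20 steps ahead
# of a MersenneTwister created from the same seed.
make_streams(seed, n) =
    [Future.randjump(MersenneTwister(seed), big(10)^20 * k) for k in 1:n]

streams_a = make_streams(0, 4)
streams_b = make_streams(0, 4)

draws_a = [rand(r) for r in streams_a]
draws_b = [rand(r) for r in streams_b]

println(draws_a == draws_b)   # same seed and same jumps
println(allunique(draws_a))   # each stream starts at a different point
```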

Thanks! Nice. I am picking Bogumil's answer because it is a bit simpler, but your solution is great as well. - Fabrizio Leone

2

Hi Fabrizio. In BetaML I solved this problem as follows:

using Random

"""
    generateParallelRngs(rng::AbstractRNG, n::Integer;reSeed=false)
For multi-threaded models, return n independent random number generators (one per thread) to be used in threaded computations.
Note that each RNG is a _copy_ of the original RNG. This means that code that _uses_ these RNGs will not change the original RNG state.
Use it with `rngs = generateParallelRngs(rng,Threads.nthreads())` to have a separate RNG per thread.
By default the function doesn't re-seed the RNG, as you may want to have a loop-index-based re-seeding strategy rather than a threadid-based one (to guarantee the same result independently of the number of threads).
If you prefer, you can instead re-seed the RNG here (using the parameter `reSeed=true`), such that each thread has a different seed. Be aware, however, that the stream of numbers generated will then depend on the number of threads at run time.
"""
function generateParallelRngs(rng::AbstractRNG, n::Integer;reSeed=false)
    if reSeed
        seeds = [rand(rng,100:18446744073709551615) for i in 1:n] # some RNGs have issues with too small seed
        rngs  = [deepcopy(rng) for i in 1:n]
        return Random.seed!.(rngs,seeds)
    else
        return [deepcopy(rng) for i in 1:n]
    end
end

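Used as shown below, with re-seeding based on the loop index, the function above gives the same results regardless of how many threads Julia is running with: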

using Statistics, Test   # `mean` comes from Statistics

TESTRNG = MersenneTwister(123)

println("** Testing generateParallelRngs()...")
x = rand(copy(TESTRNG),100)

function innerFunction(bootstrappedx; rng=Random.GLOBAL_RNG)
     sum(bootstrappedx .* rand(rng) ./ 0.5)
end
function outerFunction(x;rng = Random.GLOBAL_RNG)
    masterSeed = rand(rng,100:9999999999999) # important: with some RNGs this must be done before generateParallelRngs to guarantee independence from the number of threads
    rngs       = generateParallelRngs(rng,Threads.nthreads()) # make new copy instances
    results    = Array{Float64,1}(undef,30)
    Threads.@threads for i in 1:30
        tsrng = rngs[Threads.threadid()]    # Thread-safe random number generator: one RNG per thread
        Random.seed!(tsrng,masterSeed+i*10) # But the seeding depends on the loop index i, not on the thread: we get the same results independently of the number of threads
        toSample = rand(tsrng, 1:100,100)
        bootstrappedx = x[toSample]
        innerResult = innerFunction(bootstrappedx, rng=tsrng)
        results[i] = innerResult
    end
    overallResult = mean(results)
    return overallResult
end


# Different sequences..
@test outerFunction(x) != outerFunction(x)

# Different values, but same sequence
mainRng = copy(TESTRNG)
a = outerFunction(x, rng=mainRng)
b = outerFunction(x, rng=mainRng)

mainRng = copy(TESTRNG)
A = outerFunction(x, rng=mainRng)
B = outerFunction(x, rng=mainRng)

@test a != b && a == A && b == B


# Same value at each call
a = outerFunction(x,rng=copy(TESTRNG))
b = outerFunction(x,rng=copy(TESTRNG))
@test a == b

Thanks! Nice. I am picking Bogumil's answer because it is a bit simpler, but your solution is great as well. - Fabrizio Leone

1
Assuming you are on Julia 1.6, you can do the following:
julia> using Random

julia> foreach(i -> Random.seed!(Random.default_rng(i), i), 1:Threads.nthreads())

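Julia currently already has a separate random number generator for each thread, so you do not need to create them yourself (you can, of course, as shown in the other answers, but you do not have to).
Also note that in future versions of Julia, the following: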
Threads.@threads for i = 1:10
    push!(a[Threads.threadid()],rand())
end

is not guaranteed to produce reproducible results. In Julia 1.6, Threads.@threads uses static scheduling, but as you can read in its docstring, this may change.
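
One way to avoid depending on the scheduler at all is to tie the generator to the loop index instead of the thread. The following is only a sketch under that assumption: `reproducible_sum` and the `seed + i` scheme are illustrative choices, not something prescribed by this answer, and creating a MersenneTwister per iteration trades some speed for simplicity.

```julia
using Random

# Hypothetical helper: each iteration builds its own generator from the
# loop index, so the value at position i does not depend on which thread
# runs it or on how iterations are scheduled.
function reproducible_sum(seed::Integer, n::Integer)
    results = Vector{Float64}(undef, n)
    Threads.@threads for i in 1:n
        rng = MersenneTwister(seed + i)   # illustrative per-iteration seed
        results[i] = rand(rng)
    end
    return sum(results)                   # summed in index order
end

println(reproducible_sum(1234, 10) == reproducible_sum(1234, 10))
```

Because every result is stored at its own index and summed afterwards, the total should also be the same for any number of threads.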

