我的Julia代码为什么在迭代次数增加时变得更慢了？

Question

我的Julia代码为什么在迭代次数增加时变得更慢了？

5

我编写了一个主函数，使用随机优化算法（粒子群优化）来找到ODE系统的最优解。我会运行50次以确保可以找到最优解。起初，它正常运行，但现在我发现随着迭代次数的增加，计算时间会增加。前十次计算所需的时间不到300秒，但最后一次计算需要500秒。似乎每次计算需要多花费3~5秒钟的时间。我已经遵循高性能技巧来优化我的代码，但它并没有奏效。

对不起，我之前不太清楚如何上传我的代码，在下面是我编写的代码。但在这个代码中，实验数据没有被加载，我可能需要找到一种上传数据的方法。在主函数中，随着的增加，每次计算的时间成本都在增加。
哦，顺便说一句，我发现另一个有趣的现象。我改变了计算次数，计算时间再次改变了。在主循环的前20次计算中，每次计算大约需要300秒，并且内存使用量波动较大。但是发生了我不知道的事情，速度加快了。每次计算需要的时间少了1/4，大约是80秒。而内存使用量变成了这样的一条直线：
我知道Julia会在第一次运行时进行预热，然后加速。但这种情况似乎不同。这种情况看起来像是Julia在前20次计算中运行缓慢，然后它找到了一种优化内存使用和加速的好方法。然后程序就以全速运行。
using CSV, DataFrames using BenchmarkTools using DifferentialEquations using Statistics using Dates using Base.Threads using Suppressor function uniform(dim::Int, lb::Array{Float64, 1}, ub::Array{Float64, 1}) arr = rand(Float64, dim) @inbounds for i in 1:dim; arr[i] = arr[i] * (ub[i] - lb[i]) + lb[i] end return arr end mutable struct Problem cost_func dim::Int lb::Array{Float64,1} ub::Array{Float64,1} end mutable struct Particle position::Array{Float64,1} velocity::Array{Float64,1} cost::Float64 best_position::Array{Float64,1} best_cost::Float64 end mutable struct Gbest position::Array{Float64,1} cost::Float64 end function PSO(problem, data_dict; max_iter=100,population=100,c1=1.4962,c2=1.4962,w=0.7298,wdamp=1.0) dim = problem.dim lb = problem.lb ub = problem.ub cost_func = problem.cost_func gbest, particles = initialize_particles(problem, population, data_dict) # main loop for iter in 1:max_iter @threads for i in 1:population particles[i].velocity .= w .* particles[i].velocity .+ c1 .* rand(dim) .* (particles[i].best_position .- particles[i].position) .+ c2 .* rand(dim) .* (gbest.position .- particles[i].position) particles[i].position .= particles[i].position .+ particles[i].velocity particles[i].position .= max.(particles[i].position, lb) particles[i].position .= min.(particles[i].position, ub) particles[i].cost = cost_func(particles[i].position,data_dict) if particles[i].cost < particles[i].best_cost particles[i].best_position = copy(particles[i].position) particles[i].best_cost = copy(particles[i].cost) if particles[i].best_cost < gbest.cost gbest.position = copy(particles[i].best_position) gbest.cost = copy(particles[i].best_cost) end end end w = w * wdamp if iter % 50 == 1 println("Iteration " * string(iter) * ": Best Cost = " * string(gbest.cost)) println("Best Position = " * string(gbest.position)) println() end end gbest, particles end function initialize_particles(problem, population,data_dict) dim = problem.dim lb = problem.lb ub = problem.ub cost_func = problem.cost_func gbest_position = uniform(dim, lb, ub) gbest = Gbest(gbest_position, cost_func(gbest_position,data_dict)) particles = [] for i in 1:population position = uniform(dim, lb, ub) velocity = zeros(dim) cost = cost_func(position,data_dict) best_position = copy(position) best_cost = copy(cost) push!(particles, Particle(position, velocity, cost, best_position, best_cost)) if best_cost < gbest.cost gbest.position = copy(best_position) gbest.cost = copy(best_cost) end end return gbest, particles end function get_dict_label(beta::Int) beta_str = lpad(beta,2,"0") T_label = "Temperature_" * beta_str M_label = "Mass_" * beta_str MLR_label = "MLR_" * beta_str return T_label, M_label, MLR_label end function get_error(x::Vector{Float64}, y::Vector{Float64}) numerator = sum((x.-y).^2) denominator = var(x) * length(x) numerator/denominator end function central_diff(x::AbstractArray{Float64}, y::AbstractArray{Float64}) # Central difference quotient dydx = Vector{Float64}(undef, length(x)) dydx[2:end] .= diff(y) ./ diff(x) @views dydx[2:end-1] .= (dydx[2:end-1] .+ dydx[3:end])./2 # Forward and Backward difference dydx[1] = (y[2]-y[1])/(x[2]-x[1]) dydx[end] = (y[end]-y[end-1])/(x[end]-x[end-1]) return dydx end function decomposition!(dm,m,p,T) # A-> residue + volitale # B-> residue + volatile beta,A1,E1,n1,k1,A2,E2,n2,k2,m1,m2 = p R = 8.314 rxn1 = -m1 * exp(A1-E1/R/T) * max(m[1]/m1,0)^n1 / beta rxn2 = -m2 * exp(A2-E2/R/T) * max(m[2]/m2,0)^n2 / beta dm[1] = rxn1 dm[2] = rxn2 dm[3] = -k1 * rxn1 - k2 * rxn2 dm[4] = dm[1] + dm[2] + dm[3] end function read_file(file_path) df = CSV.read(file_path, DataFrame) data_dict = Dict{String, Vector{Float64}}() for beta in 5:5:21 T_label, M_label, MLR_label = get_dict_label(beta) T_data = collect(skipmissing(df[:, T_label])) M_data = collect(skipmissing(df[:, M_label])) T = T_data[T_data .< 780] M = M_data[T_data .< 780] data_dict[T_label] = T data_dict[M_label] = M data_dict[MLR_label] = central_diff(T, M) end return data_dict end function initial_condition(beta::Int64, ode_parameters::Array{Float64,1}) m_FR_initial = ode_parameters[end] m_PVC_initial = 1 - m_FR_initial T_span = (300.0, 800.0) # temperature range p = [beta; ode_parameters; m_PVC_initial] m0 = [p[end-1], p[end], 0.0, 1.0] # initial mass return m0, T_span, p end function cost_func(ode_parameters, data_dict) total_error = 0.0 for beta in 5:5:21 T_label, M_label, MLR_label= get_dict_label(beta) T = data_dict[T_label]::Vector{Float64} M = data_dict[M_label]::Vector{Float64} MLR = data_dict[MLR_label]::Vector{Float64} m0, T_span, p = initial_condition(beta,ode_parameters) prob = ODEProblem(decomposition!,m0,T_span,p) sol = solve(prob, AutoVern9(Rodas5(autodiff=false)),saveat=T,abstol=1e-8,reltol=1e-8,maxiters=1e4) if sol.retcode != :Success # println(1) return Inf else M_sol = @view sol[end, :] MLR_sol = central_diff(T, M_sol)::Array{Float64,1} error1 = get_error(MLR, MLR_sol)::Float64 error2 = get_error(M, M_sol)::Float64 total_error += error1 + error2 end end total_error end function main() flush(stdout) total_time = 0 best_costs = [] file_path = raw"F:\17-Fabric\17-Fabric (Smoothed) TG.csv" data_dict = read_file(file_path) dimension = 9 lb = [5, 47450, 0.0, 0.0, 24.36, 148010, 0.0, 0.0, 1e-5] ub = [25.79, 167700, 5, 1, 58.95, 293890, 5, 1, 0.25] problem = Problem(cost_func,dimension,lb,ub) global_best_cost = Inf println("-"^100) println("Running PSO ...") population = 50 max_iter = 1001 println("The population is: ", population) println("Max iteration is:", max_iter) for i in 1:50 # The number of calculation start_time = Dates.now() println("Current iteration is: ", string(i)) gbest, particles = PSO(problem, data_dict, max_iter=max_iter, population=population) if gbest.cost < global_best_cost global_best_cost = gbest.cost global_best_position = gbest.position end end_time = Dates.now() time_duration = round(end_time-start_time, Second) total_time += time_duration.value push!(best_costs, gbest.cost) println() println("The Best is:") println(gbest.cost) println(gbest.position) println("The calculation time is: " * string(time_duration)) println() println("-"^50) end println('-'^100) println("Global best cost is: ", global_best_cost) println("Global best position is: ", global_best_position) println(total_time) best_costs end @suppress_err begin @time global best_costs = main() end

所以，这个问题可能的机制是什么？有没有办法避免这个问题？因为如果我增加粒子的数量和最大迭代次数，时间会变得非常长，这是不可接受的。
那么如何加速上述程序的可能机制是什么？如何触发这种机制？

- Wang

3

没有实际的代码，我们可能无法说出任何内容。 - phipsgabler

先生，这段代码比较长，我可以通过电子邮件或其他方式向您展示我的代码吗？ - Wang

2

王，最好公开一些，这样不仅 phipsgabler 可以评论。也许你可以将事情简化为一个最小的工作示例？ - logankilpatrick

抱歉，我不太熟悉Stackoverflow。我会想办法发布我的代码。 - Wang

你是在使用散热有限的笔记本电脑上运行吗？因此，CPU 频率不能保持太长时间高速运转吗？如果是这样，那么这个问题与为什么我的超轻便笔记本电脑 CPU 无法在 HPC 中保持最高性能相同 - 快速升至高性能，然后由于达到热限制而下降。 - Peter Cordes

哦，这是一台配备了i7-8700和16G RAM以及2666MHz高性能模式的PC。我认为这个问题更像是一个内存分配问题。因为我没有为所有变量预先分配空间，这可能会导致垃圾回收效率低下。它变慢的原因是因为有些内存没有被释放。它突然加速的原因也与内存分配有关。经过几次运行，Julia发现某些变量总是被使用，而Julia只是优化了这个过程。但是，我不知道Julia是否具有这种能力。 - Wang

1个回答

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Chris Rackauckas · Accepted Answer

4

当ODE的参数被优化时，它的特性可以完全改变。你的方程可能会变得更加刚性，并需要不同的ODE求解器。还有许多其他相关的方法，但你可以看到更改参数可能会导致这样的性能问题。在这种估计案例中最好使用像AutoTsit5（Rodas5()）之类的方法，因为很难知道或猜测性能将会如何，因此方法选择中的自适应性可能至关重要。

- Chris Rackauckas

1

哦天啊，Chris Rackauckas！我很激动。这不是关于ODE求解器的问题。更像是我运行了50次我的程序，但它变得越来越慢了。 - Wang

哦，不是ODE求解器本身的迭代，而是程序的重复运行？听起来像是你的内存已经满了。奇怪，很难说为什么它不会触发垃圾回收。你确定引用已经丢失了吗？如果你做 A=nothing，那就可以确保引用被分离并允许GC触发。否则，如果只是 A = f(...)，它必须等到 f 完成后才能进行，因为如果 A 在 f 完成之前被GC，则其行为是错误的（并且可能导致段错误）。 - Chris Rackauckas

谢谢，Chris Rackauckas！我可能需要想出一种解决这个问题的方法。但是你能否提供丢失引用和GC方面的示例或任何参考资料？我对此有点困惑。哦，顺便说一下，我注意到ODE问题的类型是'any'，因此ODEs的解也是'any'。这使我的成本函数类型不稳定。我该如何指定ODEs问题和解的类型？ - Wang

文档提到IIP指定是动态的，所以在这里它需要是ODEProblem{true}(...)。话虽如此，通常不会有区别，因为函数障碍的位置。 - Chris Rackauckas