Hello everyone!
I’m testing a population-based algorithm and have encountered performance issues, particularly when looping with large populations (i.e., using large data matrices).
Everything works fine when using matrix sizes of 15000x1500 and 20000x2000 and smaller. However, upon increasing the matrix size to 25000x2500, I notice a significant performance drop around iteration number 810 (I’m using 1000 max iterations). At this moment, the algorithm starts taking more than 10 times longer per iteration even after reaching the second run of the same algorithm.
At first, I suspected it might be due to suboptimal code on my part, potentially causing issues in the GC. I even tested adding a GC.gc()
before initiating the second run, but it didn’t have any effect on the performance.
Strangely, upon running the algorithm with an even larger population of 30000x3000 and 50000x5000, I observed no performance drop throughout the run.
It is not related to lack of memory, as my system has about 50% RAM free during the computation.
I’m using Julia version 1.9.4 and I’ve observed this behavior on my desktop with a AMD CPU and in my laptop with an Intel CPU.
Bellow, I attach code with a minimal reproducible example and some logs from test done with my AMD system.
MRE code
using Dates
function alg(f::Function, f_min::Number, f_max::Number, n_var::Int, n_pop::Int, n_iter::Int)
start = time()
pop = Array{Float64}( undef, n_pop, n_var )
for i = 1:n_var
pop[:, i] = f_min .+ rand( n_pop ) .* ( f_max - f_min )
end
pop_c = Array{Float64}( undef, n_pop )
for i = 1:n_pop
pop_c[i] = f( view( pop, i, : ) )
end
_time = now()
i = 1
while i <= n_iter
if i % 10 == 0
println( "\t # $(now())\t- ITER: $(i)\t- MAIN LOOP... \t 10x iter: $(now()-_time)" )
_time = now()
end
xb = pop[ argmin( pop_c ), : ]
xw = pop[ argmax( pop_c ), : ]
for p in 1:n_pop
xn = pop[p, :] + rand( n_var ) .* ( xb - abs.( pop[p, :] ) ) - rand( n_var ) .* ( xw - abs.( pop[p, :] ) )
xn = max.( min.( xn, f_max ), f_min )
xn_c = f( xn )
if xn_c < pop_c[p]
pop[p, :] = xn
pop_c[p] = xn_c
end
end
i += 1
end
return minimum( pop_c ), string( time() - start )
end
function fun(x)
n = length(x)
z = zeros(n)
for i in 1:n÷2
z[2*i-1] = 10000 * x[2*i-1] * x[2*i] - 1
z[2*i] = exp( -x[2*i-1] ) + exp( -x[2*i] ) - 1.0001
end
return sum( abs.( z ) )
end
# # # # # # # # # # # # # # # # # # # # # # # # # # # # #
dim = 2500
pop = dim*10
n_iter = 1000
for run in 1:3
println("[$(dim), $(pop), $(n_iter)] - Run: $(run)" )
alg( fun, -10, 10, dim , pop, n_iter )
end
Log - Run with 25000x2500 (with performance issues)
[2000, 25000, 1000] - Run: 1
# 2023-11-30T17:54:42.845 - ITER: 10 - MAIN LOOP... 10x iter: 31029 milliseconds
# 2023-11-30T17:55:13.549 - ITER: 20 - MAIN LOOP... 10x iter: 30639 milliseconds
# 2023-11-30T17:55:42.978 - ITER: 30 - MAIN LOOP... 10x iter: 29000 milliseconds
# 2023-11-30T17:56:10.945 - ITER: 40 - MAIN LOOP... 10x iter: 28013 milliseconds
# 2023-11-30T17:56:37.901 - ITER: 50 - MAIN LOOP... 10x iter: 26171 milliseconds
# 2023-11-30T17:57:03.347 - ITER: 60 - MAIN LOOP... 10x iter: 24827 milliseconds
# 2023-11-30T17:57:27.636 - ITER: 70 - MAIN LOOP... 10x iter: 24071 milliseconds
# 2023-11-30T17:57:51.104 - ITER: 80 - MAIN LOOP... 10x iter: 22444 milliseconds
# 2023-11-30T17:58:13.315 - ITER: 90 - MAIN LOOP... 10x iter: 20687 milliseconds
# 2023-11-30T17:58:33.598 - ITER: 100 - MAIN LOOP... 10x iter: 19461 milliseconds
(...)
# 2023-11-30T18:01:31.810 - ITER: 200 - MAIN LOOP... 10x iter: 16608 milliseconds
(...)
# 2023-11-30T18:04:17.420 - ITER: 300 - MAIN LOOP... 10x iter: 16126 milliseconds
(...)
# 2023-11-30T18:06:57.865 - ITER: 400 - MAIN LOOP... 10x iter: 16575 milliseconds
(...)
# 2023-11-30T18:09:42.826 - ITER: 500 - MAIN LOOP... 10x iter: 16739 milliseconds
(...)
# 2023-11-30T18:12:26.032 - ITER: 600 - MAIN LOOP... 10x iter: 16195 milliseconds
(...)
# 2023-11-30T18:15:05.542 - ITER: 700 - MAIN LOOP... 10x iter: 16136 milliseconds
(...)
# 2023-11-30T18:17:49.975 - ITER: 800 - MAIN LOOP... 10x iter: 16133 milliseconds
(...)
## 2023-11-30T18:17:59.621 - ITER: 806 - MAIN LOOP... 1x iter: 1616 milliseconds
## 2023-11-30T18:18:01.157 - ITER: 807 - MAIN LOOP... 1x iter: 1536 milliseconds
## 2023-11-30T18:18:02.557 - ITER: 808 - MAIN LOOP... 1x iter: 1400 milliseconds
## 2023-11-30T18:18:03.824 - ITER: 809 - MAIN LOOP... 1x iter: 1266 milliseconds
## 2023-11-30T18:18:06.362 - ITER: 810 - MAIN LOOP... 1x iter: 2537 milliseconds
## 2023-11-30T18:18:09.024 - ITER: 811 - MAIN LOOP... 1x iter: 2662 milliseconds
## 2023-11-30T18:18:11.141 - ITER: 812 - MAIN LOOP... 1x iter: 2117 milliseconds
## 2023-11-30T18:18:12.445 - ITER: 813 - MAIN LOOP... 1x iter: 1304 milliseconds
## 2023-11-30T18:18:28.505 - ITER: 814 - MAIN LOOP... 1x iter: 16060 milliseconds
## 2023-11-30T18:18:55.896 - ITER: 815 - MAIN LOOP... 1x iter: 27391 milliseconds
## 2023-11-30T18:19:23.303 - ITER: 816 - MAIN LOOP... 1x iter: 27407 milliseconds
## 2023-11-30T18:19:50.742 - ITER: 817 - MAIN LOOP... 1x iter: 27439 milliseconds
(...)
# 2023-11-30T19:00:30.412 - ITER: 900 - MAIN LOOP... 10x iter: 263298 milliseconds
(...)
# 2023-11-30T19:46:04.056 - ITER: 1000 - MAIN LOOP... 10x iter: 271844 milliseconds
[2500, 25000, 1000] - Run: 2
# 2023-11-30T19:50:37.485 - ITER: 10 - MAIN LOOP... 10x iter: 273319 milliseconds
# 2023-11-30T19:55:13.279 - ITER: 20 - MAIN LOOP... 10x iter: 275794 milliseconds
(...)
Log - Run with 30000x3000 (OK)
[3000, 30000, 1000] - Run: 1
# 2023-12-01T18:51:34.073 - ITER: 10 - MAIN LOOP... 10x iter: 43813 milliseconds
# 2023-12-01T18:52:16.072 - ITER: 20 - MAIN LOOP... 10x iter: 41987 milliseconds
# 2023-12-01T18:52:53.990 - ITER: 30 - MAIN LOOP... 10x iter: 37918 milliseconds
# 2023-12-01T18:53:22.260 - ITER: 40 - MAIN LOOP... 10x iter: 28270 milliseconds
# 2023-12-01T18:53:49.793 - ITER: 50 - MAIN LOOP... 10x iter: 27533 milliseconds
# 2023-12-01T18:54:17.857 - ITER: 60 - MAIN LOOP... 10x iter: 28062 milliseconds
# 2023-12-01T18:54:45.207 - ITER: 70 - MAIN LOOP... 10x iter: 27350 milliseconds
# 2023-12-01T18:55:12.597 - ITER: 80 - MAIN LOOP... 10x iter: 27390 milliseconds
# 2023-12-01T18:55:40.331 - ITER: 90 - MAIN LOOP... 10x iter: 27734 milliseconds
# 2023-12-01T18:56:08.854 - ITER: 100 - MAIN LOOP... 10x iter: 28523 milliseconds
(...)
# 2023-12-01T18:59:46.940 - ITER: 200 - MAIN LOOP... 10x iter: 21731 milliseconds
(...)
# 2023-12-01T19:03:22.487 - ITER: 300 - MAIN LOOP... 10x iter: 21225 milliseconds
(...)
# 2023-12-01T19:06:57.561 - ITER: 400 - MAIN LOOP... 10x iter: 21493 milliseconds
(...)
# 2023-12-01T19:10:30.105 - ITER: 500 - MAIN LOOP... 10x iter: 21296 milliseconds
(...)
# 2023-12-01T19:14:04.754 - ITER: 600 - MAIN LOOP... 10x iter: 21295 milliseconds
(...)
# 2023-12-01T19:17:46.691 - ITER: 700 - MAIN LOOP... 10x iter: 22236 milliseconds
(...)
# 2023-12-01T19:21:27.534 - ITER: 800 - MAIN LOOP... 10x iter: 21383 milliseconds
(...)
# 2023-12-01T19:25:03.962 - ITER: 900 - MAIN LOOP... 10x iter: 21615 milliseconds
(...)
# 2023-12-01T19:28:42.516 - ITER: 1000 - MAIN LOOP... 10x iter: 21844 milliseconds
[3000, 30000, 1000] - Run: 2
# 2023-12-01T19:29:05.372 - ITER: 10 - MAIN LOOP... 10x iter: 19463 milliseconds
# 2023-12-01T19:29:27.061 - ITER: 20 - MAIN LOOP... 10x iter: 21689 milliseconds
# 2023-12-01T19:29:48.748 - ITER: 30 - MAIN LOOP... 10x iter: 21686 milliseconds
(...)
Log - Run with 50000x5000 (OK)
[5000, 50000, 1000] - Run: 1
# 2023-11-30T20:29:34.222 - ITER: 10 - MAIN LOOP... 10x iter: 91541 milliseconds
# 2023-11-30T20:31:02.215 - ITER: 20 - MAIN LOOP... 10x iter: 87992 milliseconds
# 2023-11-30T20:32:23.846 - ITER: 30 - MAIN LOOP... 10x iter: 81628 milliseconds
# 2023-11-30T20:33:36.234 - ITER: 40 - MAIN LOOP... 10x iter: 72385 milliseconds
# 2023-11-30T20:34:49.438 - ITER: 50 - MAIN LOOP... 10x iter: 73202 milliseconds
# 2023-11-30T20:36:02.304 - ITER: 60 - MAIN LOOP... 10x iter: 72862 milliseconds
# 2023-11-30T20:37:16.345 - ITER: 70 - MAIN LOOP... 10x iter: 74035 milliseconds
# 2023-11-30T20:38:31.253 - ITER: 80 - MAIN LOOP... 10x iter: 74902 milliseconds
# 2023-11-30T20:39:45.898 - ITER: 90 - MAIN LOOP... 10x iter: 74642 milliseconds
# 2023-11-30T20:41:03.194 - ITER: 100 - MAIN LOOP... 10x iter: 77291 milliseconds
(...)
# 2023-11-30T20:53:31.399 - ITER: 200 - MAIN LOOP... 10x iter: 77181 milliseconds
(...)
# 2023-11-30T21:05:56.724 - ITER: 300 - MAIN LOOP... 10x iter: 76036 milliseconds
(...)
# 2023-11-30T21:19:39.973 - ITER: 400 - MAIN LOOP... 10x iter: 82673 milliseconds
(...)
# 2023-11-30T21:32:59.258 - ITER: 500 - MAIN LOOP... 10x iter: 78104 milliseconds
(...)
# 2023-11-30T21:45:59.476 - ITER: 600 - MAIN LOOP... 10x iter: 78104 milliseconds
(...)
# 2023-11-30T21:59:11.549 - ITER: 700 - MAIN LOOP... 10x iter: 79619 milliseconds
(...)
# 2023-11-30T22:12:12.002 - ITER: 800 - MAIN LOOP... 10x iter: 77864 milliseconds
(...)
# 2023-11-30T22:25:21.053 - ITER: 900 - MAIN LOOP... 10x iter: 72679 milliseconds
(...)
# 2023-11-30T22:38:42.306 - ITER: 1000 - MAIN LOOP... 10x iter: 73025 milliseconds
[3000, 30000, 1000] - Run: 2
# 2023-11-30T22:40:03.960 - ITER: 10 - MAIN LOOP... 10x iter: 80654 milliseconds
(...)