To start off, here is some necessary information:
julia> versioninfo()
Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin18.6.0)
CPU: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Reproducible Example: Start off each simulation on the same number and generate random integers until reaching an even number. Store that run into an array and repeat 5 times.
Goal: Accomplish this correctly (correctly being defined in the problem below) while minimizing memory allocation.
Problem:
- a) Minimize memory allocation by pre-allocating the array. This ruins the simulation, because the array is “passed by reference” (not quite sure if that’s technically correct) and mutated in place: once the first run reaches an even number, every subsequent run starts from an array that already ends in an even number and immediately breaks from the simulation. Or
- b) Simulate all runs correctly, but hurt run time by allocating a new starting array for each simulation.
Note: This is a reprex of a much larger problem I am working on now. So while the time & allocation difference in this example is minimal, it is having a stronger effect in my code.
To start off, I’ll show you the code from problem part (a):
example01.jl
function gen_array(seq::Array{Int64,1})
    while true
        seq[end] % 2 == 0 && break
        num = rand(Int64)
        push!(seq, num)
    end
    return seq
end

function simulate_chains(n::Int64, seq::Array{Int64,1})
    map(i -> gen_array(seq), 1:n)
end

function main()
    N::Int64 = 5
    START = Array{Int64,1}(undef, 1)
    fill!(START, 1233)
    simulation = simulate_chains(N, START)
end
Here I load the script into the REPL and time it twice, so that the first call absorbs compilation:
julia> include("example01.jl")
main (generic function with 1 method)
julia> @time main();
0.048245 seconds (220.16 k allocations: 11.701 MiB)
julia> @time main()
0.000004 seconds (9 allocations: 480 bytes)
5-element Array{Array{Int64,1},1}:
[1233, -4418181421764378021, -6736001003188972884]
[1233, -4418181421764378021, -6736001003188972884]
[1233, -4418181421764378021, -6736001003188972884]
[1233, -4418181421764378021, -6736001003188972884]
[1233, -4418181421764378021, -6736001003188972884]
Notice how the output is the same for each simulation. While this may be possible, it is certainly not probable. As mentioned in problem part (a), the START array persists across simulations, so every run returns the same array. This is not desired.
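The sharing described above can be checked directly. This is only a minimal sketch that reuses gen_array from example01.jl at the top level; the names `start` and `runs` are illustrative and not part of the original script. Julia passes arrays by sharing, so push! mutates the caller's array, and map ends up collecting the same object five times.

```julia
# Same logic as gen_array in example01.jl.
function gen_array(seq::Array{Int64,1})
    while true
        seq[end] % 2 == 0 && break
        push!(seq, rand(Int64))
    end
    return seq
end

start = [1233]                          # illustrative name for the shared array
runs = map(i -> gen_array(start), 1:5)

# Every element of `runs` is the very same object as `start`, not a copy.
println(all(r -> r === start, runs))

# After the first run the array ends in an even number,
# so runs 2 through 5 break before drawing anything.
println(iseven(start[end]))
```

Both checks print true, which matches the five identical rows in the output above.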
Now for the code from problem part (b):
example02.jl
function gen_array(seq::Array{Int64,1}=[1233])
    while true
        seq[end] % 2 == 0 && break
        num = rand(Int64)
        push!(seq, num)
    end
    return seq
end

function simulate_chains(n::Int64)
    map(i -> gen_array(), 1:n)
end

function main()
    N::Int64 = 5
    simulation = simulate_chains(N)
end
Notice how I’ve changed the code by giving the seq argument of gen_array() a default value and removing my pre-allocated array from main(). Here I am in a new REPL session running the code:
julia> include("example02.jl")
main (generic function with 1 method)
julia> @time main();
0.061380 seconds (256.65 k allocations: 13.881 MiB)
julia> @time main()
0.000004 seconds (15 allocations: 1008 bytes)
5-element Array{Array{Int64,1},1}:
[1233, 1117329081981021236]
[1233, -6637213868465549306]
[1233, 4235110922391529754]
[1233, 6888918319811551361, 8043136562908430093, 3767038082823508588]
[1233, 6019512490854919842]
As you can see, this version allocates more memory (15 allocations and 1008 bytes, versus 9 allocations and 480 bytes before), although the time has barely changed. In my real code, changing only this aspect makes a significant difference in run time.
Is there a way to have my cake and eat it too?
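One possible direction, offered only as a sketch and not benchmarked against the real code: reuse a single scratch buffer for generating each run, and store an exact-size copy of it once the run finishes. The function name `simulate_chains_buffered` and the capacity hint of 16 are assumptions made for illustration. Each stored run still costs one allocation, since the runs are kept as separate arrays, but the repeated growth caused by push! happens on the reused buffer instead of on a fresh array every time.

```julia
# Hypothetical variant: one reusable scratch buffer, one exact-size copy per run.
function simulate_chains_buffered(n::Int, start::Int64)
    buf = Int64[]
    sizehint!(buf, 16)                  # assumed capacity hint; tune for real runs
    chains = Vector{Vector{Int64}}(undef, n)
    for i in 1:n
        empty!(buf)                     # reset the length; the buffer is reused
        push!(buf, start)
        while isodd(buf[end])           # same stopping rule as gen_array
            push!(buf, rand(Int64))
        end
        chains[i] = copy(buf)           # independent array for this run
    end
    return chains
end

chains = simulate_chains_buffered(5, 1233)
```

Because every run is copied out of the buffer, the stored runs are independent of one another, unlike in example01.jl. Whether this actually helps depends on how long the real runs are, so it is worth comparing with @time or @allocated after a warm-up call.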