Variable declaration and initialisation

I am trying to optimise a piece of my code. I currently use mutable structs to declare the variable types, after which I assign values to them:

mutable struct Par
    N::Int64
    n::Int64
end
mutable struct Mat
    A::Array{Int64,2}
    θₜ::Array{Float64,2}
    Atemp::Array{Int64,2}
    Asep::Array{Float64,2}
    allag::Array{Float64,2}
end
p = Par(1000,5000) ;
m = Mat(zeros(Int64,p.N,p.N),zeros(Float64,p.n,p.N),zeros(Int64,p.N,p.N),zeros(Float64,p.N,p.N),zeros(Float64,p.N,p.N)) ;

I want to increase the values of N and n in subsequent simulations, but I notice that just initialising the matrices already takes time on the order of milliseconds:

using BenchmarkTools
@btime m = Mat(zeros(Int64,p.N,p.N),zeros(Float64,p.n,p.N),zeros(Int64,p.N,p.N),zeros(Float64,p.N,p.N),zeros(Float64,p.N,p.N)) ;
  13.360 ms (21 allocations: 68.67 MiB)

Is there a faster/optimal way of doing this?

Do you need to initialise a different matrix of this size millions of times? (Potentially you can use undef.) Looking at the allocation size, I think whatever you do you will be memory bound.
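
In case it helps, here is a minimal sketch of what the undef suggestion would look like for your Mat struct: the arrays are allocated without being zero-filled, so their entries contain garbage until you overwrite them; only do this if every entry gets written before it is read.

# Same constructor call as before, but with uninitialised arrays
# instead of zero-filled ones.
m = Mat(Array{Int64,2}(undef, p.N, p.N),
        Array{Float64,2}(undef, p.n, p.N),
        Array{Int64,2}(undef, p.N, p.N),
        Array{Float64,2}(undef, p.N, p.N),
        Array{Float64,2}(undef, p.N, p.N))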

I am carrying out Monte Carlo simulations, and I would like to increase N from 1000 to 100,000 or so. I was trying to see if I could save some time here to improve the execution time.

I would try to design the algorithm to use a pre-allocated buffer that is reused over and over again.


Can you please suggest how to do so?

How about running the MC in parallel, say with threads? Each thread would have a buffer of the type you specified above; when it starts, it overwrites the values in the buffer with the correct initial values and takes off. This way you only need to allocate T thread buffers. Because each thread only accesses its own buffer, there is no memory race, and you don't run out of memory by allocating lots of matrices that get used only once :)


Thank you. I am relatively new to Julia and not an ace programmer. Is there any documentation on the use of threads, or some examples to look at?

using Base.Threads

# Each task gets its own buffer holding the chunk of parameters it owns,
# so no two tasks ever touch the same data.
struct ThreadBuffer
    t::Int64
    parameters::Vector{Float64}
end

@show parameters = rand(10)
nth = 3                                         # number of tasks/buffers
chunk = Int64(round(length(parameters) / nth))  # parameters per task

# Split the parameter vector into one buffer per task; the last buffer
# takes whatever is left over.
buffers = ThreadBuffer[]
for t in 1:nth-1
    push!(buffers, ThreadBuffer(t, parameters[chunk*(t-1)+1:chunk*t]))
end
push!(buffers, ThreadBuffer(nth, parameters[chunk*(nth-1)+1:length(parameters)]))
@show buffers

# The work done by task t: it only reads from its own buffer.
threadwork(t) = begin
    for p in 1:length(buffers[t].parameters)
        println("Thread $(t), Working with parameter = $(buffers[t].parameters[p])")
        sleep((nth - t + 1) * rand())
    end
end

# Spawn one task per buffer (Threads.@spawn needs Julia 1.3 or later)
# and wait for all of them to finish.
tasks = Task[]
for t in 1:nth
    push!(tasks, Threads.@spawn threadwork(t))
end
for t in 1:length(tasks)
    wait(tasks[t])
end
println("Done")

I’d strongly suggest getting the serial version of your code working before worrying about multi-threading. The only gotcha when moving from serial to parallel is making sure that any functions using random numbers take an rng as an argument rather than just using the global rng.
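
To illustrate the rng point, here is a minimal sketch (the function name simulate_once is a placeholder, not something from your code): each worker gets its own explicitly seeded generator instead of sharing the global one.

using Random

# Hypothetical simulation step: it takes the rng explicitly rather than
# calling rand() on the global generator.
function simulate_once(rng::AbstractRNG, N::Int)
    return sum(rand(rng, N))    # placeholder computation
end

# One independently seeded rng per thread, so parallel runs are
# reproducible and never contend for the same generator.
rngs = [MersenneTwister(1000 + t) for t in 1:Threads.nthreads()]
simulate_once(rngs[1], 10)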

What we mean when we say use a pre-allocated buffer is just to do something like the following for memory of size n*n with m replications:

# Pre-allocate your matrix/vector/chunk of memory once, outside the loop
mem = Matrix{Float64}(undef, n, n)
results = Vector{Float64}(undef, m)
for i in eachindex(results)        # one iteration per replication
    for j in eachindex(mem)
        mem[j] = rand()            # do the calculation for this entry (placeholder)
    end
    results[i] = f(mem)            # f is whatever summary you need from the buffer
end

Basically, just re-use the arrays you allocate as much as possible. If n varies between runs of your simulation, you can still re-use the arrays: just change them to the appropriate size with resize! (under the hood, Julia may have to reallocate the array and copy its values to a different location in memory when resize!-ing, but user code doesn't have to worry about that).
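
To make the resize! idea concrete, here is a minimal sketch (the run sizes are made up; note also that resize! is defined for Vectors, so a matrix-shaped workspace would need a slightly different approach, such as a flat vector that you reshape):

# One buffer, reused across runs whose size n varies.
results = Vector{Float64}(undef, 0)

for n in (1_000, 5_000, 2_000)    # hypothetical run sizes
    resize!(results, n)           # adjust the length; may reallocate internally
    fill!(results, 0.0)           # re-initialise the reused memory
    # ... run the simulation for this n, writing into `results` ...
end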


Thank you. I already have the serial version of my code working. Since it takes really long, I am trying to find ways to optimise it and improve the execution time. Any suggestions for doing so would be a great help.

I was trying to understand what each part of the syntax in this code does. When I tried running,

import Base.Threads.@spawn
for t in 1:nth
	push!(tasks, Threads.@spawn threadwork(t))
end

I get the following errors,

WARNING: could not import Threads.@spawn into Main
UndefVarError: @spawn not defined

Is there something else that needs to be done?

Please remember to run this with Julia 1.3 or later; Threads.@spawn was only introduced in 1.3.
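
For anyone following along, a quick way to check the environment before running the threaded code (just a small sketch, nothing specific to the code above):

# Threads.@spawn needs Julia 1.3 or later, and more than one thread is
# only available if Julia was started with JULIA_NUM_THREADS set
# (or the -t flag on newer versions).
@show VERSION               # should be at least v"1.3"
@show Threads.nthreads()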

That worked. Thank you.

If the post by @PetrKryslUCSD answered your question, it is a good idea to mark it as “Solution”.
