I’m trying to maximize a likelihood function that involves a large number of simulations of a reasonably complex model, and serialization time is a bottleneck. Does anything jump out as a problem?
Here is an example that captures the relevant part of my code, although I can’t be certain it reproduces the problem:
using Distributed

struct Observation
    X::Array{Float64,2}
    y::Array{Float64,2}
    d::Dict{Int,Float64}
    t::Array{Int64}
end
function logl(obs::Array{Observation,1}, β::Array{Float64,1})
    out = @distributed (+) for ob in obs
        logl_i(ob, β)  # perform simulation and return contribution to the likelihood
    end
    return out
end
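For context, the workers and shared definitions are set up in the usual way, roughly like this (the worker count and the logl_i stub are placeholders, not my real code):

using Distributed
addprocs(4)  # placeholder worker count

@everywhere begin
    struct Observation
        X::Array{Float64,2}
        y::Array{Float64,2}
        d::Dict{Int,Float64}
        t::Array{Int64}
    end

    # stub standing in for the real simulation
    logl_i(ob::Observation, β::Array{Float64,1}) = sum(β) + sum(ob.X)
end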
The code spends about two-thirds of its time serializing, and it does so on every call to logl.
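For a rough sense of the payload itself, serializing the whole dataset to a buffer gives an approximate lower bound on the per-call transfer cost (Serialization is in the standard library):

using Serialization

io = IOBuffer()
@time serialize(io, obs)         # approximate one-way cost of shipping all observations
println(position(io), " bytes")  # size of the serialized payload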
Is it the use of a custom struct to store the data? Could variation in the size of the objects in the struct be an issue?
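If the problem is that the full obs array gets reserialized on every call, the fix I’d try is to ship each worker its share of the data once and send only β per call. A rough sketch of that pattern (distribute_obs, set_local_obs, LOCAL_OBS, and logl_cached are hypothetical names, not from my real code):

using Distributed

# runs on each worker: stash that worker's share of the data in a global
@everywhere function set_local_obs(chunk)
    global LOCAL_OBS = chunk
    return nothing
end

# send each worker its share of the observations once, up front
function distribute_obs(obs::Array{Observation,1})
    chunks = [obs[i:nworkers():end] for i in 1:nworkers()]
    for (chunk, w) in zip(chunks, workers())
        remotecall_wait(set_local_obs, w, chunk)
    end
end

# per call, only β crosses the wire; each worker reduces over its own chunk
function logl_cached(β::Array{Float64,1})
    futures = [remotecall(b -> sum(ob -> logl_i(ob, b), LOCAL_OBS), w, β) for w in workers()]
    return sum(fetch, futures)
end

After one call to distribute_obs(obs), repeated calls to logl_cached(β) should only serialize the parameter vector.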