Repeated serialization for distributed simulation

I’m trying to maximize a likelihood function that involves a large number of simulations of a reasonably complex model, and serialization time is a bottleneck. Does anything jump out as a problem?

Here is an example that captures the important part of my code, although I can’t be certain it reproduces the problem:

using Distributed

struct Observation
        X::Array{Float64,2}
        y::Array{Float64,2}
        d::Dict{Int,Float64}
        t::Array{Int64,1}
end

function logl(obs::Array{Observation,1},β::Array{Float64,1})
        out = @distributed (+) for ob in obs
                logl_i(ob,β) # run the simulation and return this observation's contribution to the log likelihood
        end
        return out
end
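
For reference, here is a dummy stand-in for logl_i so the snippet is self-contained; my real logl_i runs the simulation, so treat this as a placeholder only:

@everywhere logl_i(ob, β) =
        # placeholder only; the real logl_i performs the simulation
        # (assumes size(ob.X, 2) == length(β))
        -0.5 * sum(abs2, ob.y .- ob.X * β)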

The code spends about two-thirds of its time serializing, and it does so on every call to logl.
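
As a rough check of the payload, I serialized obs to an in-memory buffer with the stdlib Serialization module; this is just my own sanity check, not output from the real run:

using Serialization

# Rough gauge: how many bytes does serializing obs once produce?
io = IOBuffer()
serialize(io, obs)
println("one copy of obs ≈ ", position(io) / 1e6, " MB")

If I’m reading the @distributed machinery right, the loop body closes over obs, so each worker receives a full copy on every call, i.e. roughly nworkers() times that figure per evaluation of logl.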

Is the problem my use of a custom struct to store the data? Could variation in the sizes of the arrays across observations be an issue?
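
In case it helps, this is the workaround I’ve been sketching: ship each worker its share of obs once, then serialize only β on each call. OBS_CHUNK, set_chunk, and logl_chunk are names I made up for the sketch, and it assumes the Observation struct and logl_i are defined with @everywhere on all workers:

using Distributed

# Send each worker its chunk of the data exactly once.
@everywhere function set_chunk(c)
        global OBS_CHUNK = c        # stored in Main on the worker
end

# (assumes length(obs) >= nworkers(), so every worker gets a chunk)
chunks = [collect(c) for c in Iterators.partition(obs, cld(length(obs), nworkers()))]
for (w, c) in zip(workers(), chunks)
        remotecall_fetch(set_chunk, w, c)
end

# Each worker reduces over its local chunk; only β crosses the wire.
@everywhere logl_chunk(β) = sum(ob -> logl_i(ob, β), OBS_CHUNK)

logl(β::Array{Float64,1}) = sum(fetch, [remotecall(logl_chunk, w, β) for w in workers()])

With this layout each call should serialize only β per worker instead of a copy of obs, if I’ve understood where the cost is coming from.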