To investigate how sample size affects uncertainty in parameter inference, I do something like:
```julia
Θ_example = something  # Θ is a vector of the parameters I'm trying to infer
Θvec = zeros(numSamples, length(Θ_example))  # one row per sample
for i in 1:numSamples
    s = generateSample()
    Θ = inferParameters(s)
    Θvec[i, :] .= Θ  # store this sample's parameter estimates in row i
end
```
where `inferParameters` calls `Optim.optimize` and currently uses `NelderMead()`.
When I run this on my MacBook Pro (M1 Pro chip) with numThreads=8, I can see that only one core is in use and the process has ~22 threads. I see this both in the terminal using top and in the Mac's Activity Monitor.
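For context, the ~22 OS threads that top reports are not the same thing as Julia's worker threads; the extras include pools like GC and BLAS threads. A quick diagnostic sketch (nothing specific to my code) to check what this Julia session was actually started with:

```julia
using LinearAlgebra

# Threads.nthreads() reflects the -t / JULIA_NUM_THREADS setting at startup;
# BLAS keeps its own separate thread pool, reported by BLAS.get_num_threads().
println("Julia threads: ", Threads.nthreads())
println("BLAS threads:  ", BLAS.get_num_threads())
```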
Naively, I'd like to do something like this instead:
```julia
Θ_example = something  # Θ is a vector of the parameters I'm trying to infer
Θvec = zeros(numSamples, length(Θ_example))
Threads.@threads for i in 1:numSamples
    s = generateSample()
    Θ = inferParameters(s)
    Θvec[i, :] .= Θ  # each iteration writes only its own row, so no data race
end
```
When I parallelize in this way I do see all my cores in use, but the process still has only ~22 threads spread over the 8 cores, and the computation is significantly slower than the single-threaded version.
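One guess for the slowdown (an assumption on my part, since I haven't profiled `inferParameters`): if the inner optimization does any linear algebra, each Julia thread can also use BLAS's own thread pool, and the resulting oversubscription can make the threaded loop slower than the serial one. Here is a minimal, runnable sketch of the pattern I have in mind, with `generateSample` and `inferParameters` replaced by trivial stand-ins just so it executes:

```julia
using LinearAlgebra

BLAS.set_num_threads(1)  # assumption: avoid BLAS-threads × Julia-threads oversubscription

# Trivial stand-ins for my real functions, only so the sketch runs:
generateSample() = randn(1000) .+ 3.0       # fake data with true mean 3
inferParameters(s) = [sum(s) / length(s)]   # "inference" here is just the sample mean

numSamples = 32
Θ_example  = [0.0]
Θvec = zeros(numSamples, length(Θ_example))

Threads.@threads for i in 1:numSamples
    s = generateSample()
    Θvec[i, :] .= inferParameters(s)  # each iteration owns row i, so this is race-free
end
```

For `Threads.@threads` to use more than one thread, Julia must be started with `julia -t 8` (or with JULIA_NUM_THREADS=8 set); it cannot be changed after startup.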
Finally, my question:
Is this behavior expected, and is there a better way to utilize the multiple cores? I would have expected (or hoped) to see 20-odd threads running on each of the 8 cores, rather than the same total number of threads as in the single-core case.