I’m hoping there’s a more performant way to do this, but I’m missing something.

I’m migrating some MATLAB code that uses reversible-jump MCMC to produce optimal sets of point sources to explain observed events (2d coordinates + precision).

The way a model is evaluated for a move to be accepted in MATLAB is to generate a matrix of all combinations of sources and coordinates, with each value being the normpdf described by the modeled source coordinate and event precision evaluated at the event coordinate. This is done separately for each axis and multiplied, and then the values for each localization are summed across sources, then the log of these are summed to produce the score for the model.

This is what the code looks like in Julia, partially optimized:

```
xworking = zeros(μxs)
yworking = zeros(μys)
ws = 1 / nemitters
p_alloc = 0.0
for i ∈ eachindex(xs)
xworking .= normpdf.(μxs .+ αxs .* frames[i], σxs[i], xs[i])
yworking .= normpdf.(μys .+ αys .* frames[i], σys[i], ys[i])
xworking .*= yworking
p_emitter = sum(xworking)
p_alloc += log(p_emitter * ws)
end
return p_alloc
end
```

(the alpha values allow for linear drift)

Profiling shows `normpdf`

, within the broadcast `copy`

call, is the bottleneck.

I feel like there must be a more efficient approach to this. Are there particular packages I should look into? Can KernelDensity.jl be used?

The original MATLAB code is here for reference:

```
function LogL = p_Alloc(SMD,Mu_X,Mu_Y,Alpha_X,Alpha_Y,Ws)
%This function calculated the probability of a given allocation set.
X=SMD.X;
Y=SMD.Y;
T = repmat(SMD.FrameNum,[1,length(Mu_X)]);
SigmaX=SMD.X_SE;
SigmaY=SMD.Y_SE;
Lx = length(X);
Lmu = length(Mu_X);
LogL = log(sum(repmat(Ws,[Lx,1]).*normpdf(repmat(X,[1,Lmu]),...
repmat(Mu_X,[Lx,1])+repmat(Alpha_X,[Lx,1]).*T,...
repmat(SigmaX,[1,Lmu])).*normpdf(repmat(Y,[1,Lmu]),...
repmat(Mu_Y,[Lx,1])+repmat(Alpha_Y,[Lx,1]).*T,...
repmat(SigmaY,[1,Lmu])),2));
LogL = sum(LogL);
end
```