# Improving performance of mapping algorithm

Hi all,
I ported a code from MATLAB to Julia, because it takes too long right now:

``````using Random
using Printf
start = time()
Lk = 4
Lp = 1500
# create test data: coordinates of relative points and points to be mapped
irs = Array{Any}(undef,Lk)
ics = Array{Any}(undef,Lk)
irs[1] = 1:Lp
ics[1] = 1:Lp
for k in 2:Lk
@printf "in for loop, k = %.0f\n" k
A = zeros(Float64,Li,Lj)
# create distance matrix
for kk in 1:3
end
# pick pairs of smallest distance & clear these pairs from A
Ar = size(A,1)
As = size(A)
Ind = zeros(Int64, Ar,2)
for i in 1:Ar
ir, ic = Tuple(argmin(A))
A[ir,:] .= Inf
A[:,ic] .= Inf
Ind[i,:] = [ir,ic]
end
# sort for relative / mapped points
rs = sortperm(Ind[:,1]);
irs[k] = Ind[rs,2]
cs = sortperm(Ind[:,2]);
ics[k] = Ind[cs,1]
end
elapsed = time() - start
``````

Finding the minimum takes almost all the time and I was wondering if Julia with argmin(A) is maybe faster than MATLABâ€™s min(A). I also tried findmin(A), but the performance is the same. In reality Lp is greater than 1500, which is a real problem for me. Any suggestions how I can speed up the code?

The biggest problem with this code by far is that itâ€™s all in the gobal scope. Reading this Performance Tips Â· The Julia Language will give you a good start towards optimizing the code.

2 Likes

Thank you for this advice. I created a function for the expensive task of finding the minimum and this speeds the code up by 10%. There might be more to gain, but MATLAB is still 5x faster. Therefore, I will probably try to improve the MATLAB code or try to implement another mapping algorithm.

You should put everything in a function.

1 Like

And donâ€™t use `Array{Any}` â€“ create your the array using e.g. `ics = [1:Lp]`.
That way it will be concretely typed, which is what Julia needs for speed.

Thx for the suggestions. I think I am doing something wrong. Just enclosing everything in a function and calling it didnâ€™t change speed.
Also using `pload = Array{Float64}(undef,Lk,Lp,3)` and likewise for other usage of â€śAnyâ€ť did not increase speed.

If you want suggestions you are going to need to show us what you did.

Just a side note, just writing anything in Julia does not guarantee a speed increase. Optimizing code in any language first takes knowledge of that language and then lots of profiling and testing. I suspect that was done with your MATLAB implementation, Iâ€™m not so sure it was done with your Julia implementation.

2 Likes

There are a lot of array operations inside the loop. Each one of these allocates memory. Replace them by explicit loops.

It seems like you are mixing together the core code, that calculates the thing you want to find, and the test code, which creates test data, loops several times over stuff, etc.

Could you instead define the core functionality (for example, finding the elements with the smallest pairwise distance, if that is what you want) into a function with a clear interface, and separate it from the setup/test code?

This would make it easier for others to understand, it would make it faster and more readable.

BTW, you should do the same with your Matlab code. Putting units of functionality into functions is recommended in most languages.

2 Likes

As you say, all the time is in `argmin(A)`. This takes 7.4ms, times 1500 times 3 gives about 30s, which is how long (a slightly improved) version takes to run. Canâ€™t you avoid this entirely, perhaps call `sortperm(vec(A))` which takes 240ms, but only 3 times? A sketch:

``````julia> A0 = rand(Int8, 3,3);

julia> A = copy(A0)
3Ă—3 Array{Int8,2}:
117   -66  -26
41   -42   -1
48  -122  -17

julia> i, j = argmin(A).I
(3, 2)

julia> A[i,:] .= 127; A[:,j] .= 127; # typemax(Int8)

julia> i, j = argmin(A).I
(1, 3)

julia> A[i,:] .= 127; A[:,j] .= 127;

julia> i, j = argmin(A).I
(2, 1)

julia> A
3Ă—3 Array{Int8,2}:
127  127  127
41  127  127
127  127  127

julia> CartesianIndices(A0)[sortperm(vec(A0))]
9-element Array{CartesianIndex{2},1}:
CartesianIndex(3, 2) # first, eliminate 3,2
CartesianIndex(1, 2)
CartesianIndex(2, 2)
CartesianIndex(1, 3) # second
CartesianIndex(3, 3)
CartesianIndex(2, 3)
CartesianIndex(2, 1) # third
CartesianIndex(3, 1)
CartesianIndex(1, 1)
``````

Now you just need to filter out which rows/cols have already been seen, perhaps something like

``````julia> is, js = BitSet(), BitSet();
julia> push!(is, 3); push!(js, 2); # first
julia> 1 in is || 2 in js
true
julia> 2 in is || 2 in js
true
julia> 1 in is || 3 in js # second
false
``````
2 Likes

Thanks for confirming that the calculation of the minimum takes the most time. I had some problems understanding the output of the profiling.
This is an interesting shortcut for calculating the shortest distance between points. I will check if it gives the desired result.

Thanks for this idea. It works and is 10x faster. I tried to do the same in MATLAB, but maybe in this case I have done it less efficiently. Now Julia outperforms MATLAB by almost an order of magnitude.

Just in case anybody is interested, the solution looks like this:

``````using Random
using Printf

function evalMat(A)
# pick pairs of smallest distance and return corresponding points (or their order)
Ar = size(A,1)
Ar2 = size(A,2)
Ind = CartesianIndices(A)[sortperm(vec(A))]
irf, icf = Int64[], Int64[]
for i in 1:Ar*Ar2
if !(Ind[i][1] in irf || Ind[i][2] in icf)
push!(irf, Ind[i][1])
push!(icf, Ind[i][2])
end
end
return irf, icf
end

## get distance matrix A[i,j]; i relative points, j points to be mapped
Lk = 4     # 1 relative set, 3 working data sets
Lp = 1500    # number of points per set
irs = Array{Int64}(undef,Lk,Lp)
ics = Array{Int64}(undef,Lk,Lp)
irs[1,:] = 1:Lp
ics[1,:] = 1:Lp
for k in 2:Lk
@printf "in for loop, k = %.0f\n" k
A = zeros(Float64,Li,Lj)
for kk in 1:3
end
irs[k,:], ics[k,:] = evalMat(A)
end
``````

@mcabbott: I could not use BitSet() or Set(), since I need to know the order of the points, which I use later for mapping some fields. Sets do not preserve the order.

1 Like

Great! If Iâ€™m thinking correctly it should also be N log(N), instead of N^3. (At least if `sortperm` is where the time is, not the filtering of its result.)

Your code looks nicer too. There are a few things you can tidy up, e.g. itâ€™s not necessary for each term of a broadcast to make up the final shape, so I think you can do this (untested, but yesterday something similar worked):

``````pload = rand(Float64, 4,Lp,3)

@views for kk in 1:3
end
``````
1 Like

You donâ€™t need to multiply with `ones` in Matlab either. Compatible shapes will broadcast correctly.

Finally, you can also use `.+=` instead of repeating the `A`.

1 Like

I donâ€™t fully understand what you are trying to achieveâ€¦ letâ€™s say you have 4 lists of points, of same length (say 1500 black, 1500 red, 1500 green and 1500 blue). Do you want to:

• for each of the four sets,
• list all distinct pairs (black,coloured) formed by points in the first and points in the other set, sorted by their distance?

Anyway, regarding the structure of the code, you are repeating the exact same procedure Lk times, without any dependency between each run. You should probably define a function to do it for each pair of sets â€“ it would be much clearer:

``````  for k in 2:Lk
end
``````

(just out of curiosity, could you share the Matlab code and timing?)

2 Likes

Almost. I want to know the pairings, which for my data match more or less uniquely. It does not have to be sorted according to distance.

Some data sets have points matching perfectly, others have some distance between matching points. Otherwise I could use the distance matrix and just look for all entries below a tolerance, e.g., 1E-10.

Here is the MATLAB version:

``````Lk = 4;
Lp = 1500;
irs = cell(Lk,1);
ics = cell(Lk,1);
irs{1} = 1:Lp;
ics{1} = 1:Lp;
for k = 2:Lk
disp('In for loop, k = '); disp(k)
end

function [irf,icf] = sortpermByPairwiseDistance(pl1,pl2)
Li = size(pl1,2);
Lj = size(pl2,2);
% get distance matrix A[i,j]; i relative points, j points to be mapped
A = zeros(Li,Lj);
for kk = 1:3
A = A + (pl1(:,kk) - pl2(:,kk)').^2;
end
[irf, icf] = evalMat(A);
end

function [irf,icf] = evalMat(A)
Ar = size(A);
%% find pairs of smallest distance
[~,I] = sort(A(:));
[ir, ic] = ind2sub(Ar,I);
irf = zeros(Ar(1),1); icf = zeros(Ar(1),1);
j = 1;
for i=1:Ar(1)*Ar(2)
%if ~(ismember(ir(i),irf) || ismember(ic(i),icf))
if ~(any(irf(:)==ir(i)) || any(icf(:)==ic(i)))
irf(j) = ir(i);
icf(j) = ic(i);
j = j + 1;
end
end
end
``````

I assume these iterations with â€śany(â€¦)â€ť are not efficient with MATLAB, while matrix operations would be better.

Well, this should be about julia, butâ€¦ I send you my implementation in Matlab. Itâ€™s about 150 times faster
This only works well if there is a guarantee that the distance between corresponding points in different sets is smaller than any distance intra-setâ€¦ So, it fails if deviation introduced in second set is larger than the average (or expected) separation of points in original set.
If this is the case, we should either mark those points as undetermined, or do a minimization of total deviationsâ€¦

``````function [irs,ics,pload,trueperm]=mytests(data)
if isempty(data) % Generate test data
% Original data sets:
npt=1500
nsets=4
noiseAmpl=0.1
mixpoints=true %false
checkresult=true
dref=rand(npt,3)*100;
trueperm=[];
for k = 1:nsets-1
if mixpoints
trueperm(end+1,:)=randperm(npt);
else
trueperm(end+1,:)=1:npt;
end
dnoisy=dref+noiseAmpl*(2*rand(npt,3)-1);
end
else % convert to cell if needed
if ~iscell(data)
for i=1:size(data,1)
end
else
end
checkresult=false
end

tic
irs{1}=1:npt;
ics{1}=1:npt;
for k = 2:nsets
disp('In for loop, k = '); disp(k)
end
toc

if checkresult
disp('checking for wrong matches: should all be empty.')
for k=2:4
[~,p]=sort(irs{k});
find(ics{k}(p)~=trueperm(k-1,:))
end
end

function [ix1,ix2]=sortpermByPairwiseDistance(pl1,pl2)
% process one pair of data sets:
% We want to find/build a permutation of do2 that minimizes the distances
% between the two data sets.

t1=tic;
d1=pl1;
d2=pl2;
d2_=d2;
ix1=1:npt;
ix2=ix1;
d12=zeros(1,npt);
for io=1:npt
p1=pl1(io,:);
[d,ix]=min(sum((d2_-repmat(p1,npt,1)).^2,2));
d2(io,:)=d2_(ix,:);
ix2(ix)=io;
% or:
%ix2(io)=ix;
d12(io)=sqrt(d);
d2_(ix,:)=NaN;
end
toc(t1)
end

end
``````