Alternative to computation in a loop for distances for fast performance

nagda95 · December 6, 2021, 10:52am

Hello Everyone,
I want to compute the distance (SqEuclidean) of a new point (newP) from the existing neighboring points in a specific order. I have the following code:

newP=[x y z]
for i in eachindex(order)
	Dist[i]=sum((Points[order[i], :]' - newP) .^ 2)
end

However, the above piece of code comparatively takes too much time and I would like to speed up this computation. Is there some way I can optimize this loop?
Thank you.

nilshg · December 6, 2021, 12:06pm

Please make an MWE and clarify what you mean by too much time.

Some potential low-hanging fruit could be removing allocations, using StaticArrays for your point vectors, and avoiding access to variables in global scope, but all of this is hard to tell without a working example.

jlapeyre · December 6, 2021, 1:20pm

nagda95 · December 6, 2021, 1:32pm

Here is an MWE

Points=rand(Float64, (6700, 3))
newP=[0.2 0.3 0.4]
order=collect(1:6700) #order can be different but for simple example I use regular sequence here
Dist=zeros(size(order,1),1)
for i in eachindex(order)
	Dist[i]=sum((Points[order[i], :]' - newP) .^ 2)
end

I want to find Dist (distances) for a lot of newP so I want to speed up this computation in for loop (or rather not use for loop for fast performance)

jlapeyre · December 6, 2021, 1:49pm

The page on performance tips is very important.

I put the code you’d like to time in a function, so that it can be timed.

Making the identifiers (variables) const, increases performance in this case by a factor of ten.

Using @inbounds improves performance further by a few percent

const Points = rand(Float64, (6700, 3))

const newP = [0.2 0.3 0.4]

const order = collect(1:6700) #order can be different but for simple example I use regular sequence here

const Dist = zeros(size(order,1),1)

function fillDist()
    @inbounds for i in eachindex(order)
	Dist[i]=sum((Points[order[i], :]' - newP) .^ 2)
    end
end

You can get another factor of two or more this way:

const newP2 = [0.2, 0.3, 0.4]

function fillDist2()
    for i in eachindex(order)
	Dist[i]=sum((Points[order[i], :] .- newP2) .^ 2)
    end
end

Here, I have avoided doing a transpose. Also the .- and .^ are “fused” into a single loop, resulting in fewer allocations.

You get another factor of two by replacing the line in the loop by

 Dist[i]=sum(( @view(Points[order[i], :]) .- newP2) .^ 2)

roflmaostc · December 6, 2021, 2:17pm

And instead of using excessive global variables, maybe pass them explicitly to the function.

mcabbott · December 6, 2021, 3:03pm

Some ways to do this are:

pairwise2(x::AbstractMatrix, y::AbstractMatrix) = sum(abs2, x; dims=1)' .+ sum(abs2, y; dims=1) .- 2 .* (x'*y)  # written to use columns
Dist ≈ pairwise2(Points[order, :]', newP')

using Distances
Dist ≈ pairwise(SqEuclidean(), Points[order, :], newP; dims=1)  # dims=1 means rows here

using Tullio
Dist ≈ @tullio dist_t[i] := (Points[order[i], k] - newP[k])^2

Times:

julia> using BenchmarkTools

julia> @btime for i in eachindex($order)
               $Dist[i] = sum(($Points[$order[i], :]' - $newP) .^ 2)
       end
  min 514.084 μs, mean 577.836 μs (20100 allocations, 1.53 MiB)

julia> @btime pairwise2($Points[$order, :]', $newP');
  min 31.833 μs, mean 64.153 μs (11 allocations, 471.55 KiB)

julia> @btime pairwise(SqEuclidean(), $Points[$order, :], $newP; dims=1)
  min 27.459 μs, mean 45.095 μs (7 allocations, 262.02 KiB)

julia> @btime @tullio dist_t[i] := ($Points[$order[i], k] - $newP[k])^2;
  min 20.000 μs, mean 24.760 μs (2 allocations, 52.42 KiB)

julia> using LoopVectorization

julia> @btime @tullio dist_t[i] := ($Points[$order[i], k] - $newP[k])^2;
  min 13.375 μs, mean 17.343 μs (2 allocations, 52.42 KiB)

nagda95 · December 6, 2021, 11:13pm

@mcabbott Hello Michael,
Thank you for your response.
I am not able to implement the script using Tullio. When I run this code, I get the output as false.

Points=rand(Float64, (6700, 3))
newP=[0.2 0.3 0.4]
order=collect(1:6700) 
Dist=zeros(size(order,1),1)
using Tullio
Dist ≈ @tullio dist_t[i] := (Points[order[i], k] - newP[k])^2

I am sorry but I don’t understand. Could you please help?

DNF · December 7, 2021, 12:31am

A few points:

Julia Arrays are column major, so make sure to work along columns, not rows:

Points=rand(Float64, (3, 6700))
Points[:, order[i]]

Wrap your code in functions(!)

A matter of style: it is good to distinguish between matrices and vectors, so instead of

write

Dist = zeros(size(order, 1))

mcabbott · December 7, 2021, 2:17am

I think you’re testing here whether zero (from two lines up) is approximately equal to a result. My code above ought to return true if you have already run yours – it check that each method is getting the same result as your loop wrote into Dist.

Topic		Replies	Views
Faster squared euclidean distance calculation Performance	11	1818	October 2, 2021
Compute pairwise distances (Distances.jl) General Usage performance , distances , tullio , loopvectorization	9	3729	February 1, 2024
Getting the benchmarked speedup from Distances.jl General Usage performance	8	560	September 7, 2020
How to make LoopVectorization work with computing Euclidean distance for every column? New to Julia tullio , loopvectorization	13	1120	August 21, 2020
Euclidean distance from one point Performance question	6	1209	June 17, 2020

Alternative to computation in a loop for distances for fast performance

Related topics