Euclidean distance from one point

Geoffrey · June 17, 2020, 2:05pm

Hi, I’m looking for a way to compute an array representing the radius of the polar or spherical coordinates given the array sizes and a center point. I have this code for now but it’s incredibly slow and I’m looking for ways to improve it and tips to understand why it is slow:

using BenchmarkTools
function euclideandst(sz, p)
    # sz = size of the Array
    # p coordinates of the center point
    N = length(sz) # number of dimension
    return [sqrt(sum(x -> x .* x, [i[j] - c[j] for j = 1:N])) for i = CartesianIndices(sz)]
end
sz = (1024, 1024, 25)
c = (sz .+ 1) ./ 2
@btime dst = euclideandst($sz, $c)
39.293 s (314624005 allocations: 8.40 GiB)

Oscar_Smith · June 17, 2020, 2:21pm

I would highly reccomend Distances.jl for doing this type of work. It is incredibly well optimized.

cstich · June 17, 2020, 2:40pm

Also if you use several threads, using unsafe views (UnsafeArrays.jl) can make quite a bit of difference as well (even if you use the Distances.jl package).

Something like this should work; taken from ParalllelKMeans.

"""
distance(metric, X1, X2, i1, i2)
Allocationless calculation of distance between vectors X1[:, i1] and X2[:, i2] defined by the supplied distance metric.
"""
@inline distance(metric, X1, X2, i1, i2) = evaluate(metric, uview(X1, :, i1), uview(X2, :, i2))

Edit: Update with link to UnsafeArrays

Oscar_Smith · June 17, 2020, 2:41pm

Do uviews help at all once 1.5 is out?

cstich · June 17, 2020, 2:44pm

I don’t know for sure but it looks like in most cases uviews should indeed by unnecessary with 1.5

https://github.com/JuliaArrays/UnsafeArrays.jl/issues/8

stevengj · June 17, 2020, 3:08pm

First, realize that in your innermost loop you are allocating a little 3-component array on every iteration (every i) just to sum it, which is incredibly wasteful and inefficient. Second, you have a bug: although you are passing c as the parameter p in your function, your function actually refers to the global variable c rather than to the parameter p, and global variables are slow in Julia.

Simply fixing the bug (changing c to p in the function) speeds it up by a factor of 20 for me. Getting rid of the inner-loop allocations by changing the sum call to sum(j -> (i[j] - p[j])^2, 1:N)) speeds it up by another factor of 6:

function euclideandst2(sz, p)
    # sz = size of the Array
    # p coordinates of the center point
    N = length(sz) # number of dimension
    return [sqrt(sum(j -> (i[j] - p[j])^2, 1:N)) for i = CartesianIndices(sz)]
end

I get another factor of 2 (down to 129ms) by exploiting the fact that the length N of the tuples is known at compile-time, so the loop can be unrolled by the compiler if we use some tricks:

function euclideandst3(sz::NTuple{N}, p::NTuple{N}) where {N}
    # sz = size of the Array
    # p coordinates of the center point
    return [sqrt(sum(ntuple(j -> (i[j] - p[j])^2, Val{N}()))) for i = CartesianIndices(sz)]
end

Automating this sort of trick, using a compile-time array length, is one of the reasons why the StaticArrays package is the method of choice for computations with small coordinate vectors.

Of course, you can get further speedups by multiple threads, etcetera, and if there is an optimized package that does what you need then by all means use it. But hopefully the euclideandst2 function convinces you that you can obtain quite decent in performance in Julia just by following a few simple guidelines without advanced wizardry.

Geoffrey · June 17, 2020, 3:59pm

Thanks a lot for your time in providing me such a detailed answer ! I didn’t know about the Distances.jl package and as I’m in the learning phase of the julia language I think it’s good practice to try to implement such basic function. Thanks to you I also understand a bit more how julia works and how to get the maximum of it !

Topic		Replies	Views
Faster squared euclidean distance calculation Performance	11	1755	October 2, 2021
Understanding why one function is faster than the other New to Julia	16	1169	November 1, 2020
Getting the benchmarked speedup from Distances.jl General Usage performance	8	544	September 7, 2020
Computing L1 norm between vectors: julian way Performance	14	1808	April 6, 2021
Pairwise distances from a single column or vector New to Julia	7	3993	October 2, 2019

Euclidean distance from one point

Related topics