I am facing a really unexpected result, that is that the pairwise distance computation using a matrix is actually faster than a vector of static vectors, for small dimensions. here is my simple setup:
using Distances, StaticArrays
D = 10
data = [rand(SVector{D}) for i in 1:1000]
metrix = Chebyshev()
datam = zeros(D, 1000)
for i in 1:1000
datam[:, i] .= data[i]
end
let’s now use the pairwise
implementation of Distances
. For reference, the generic method has the amazingly small source code:
function pairwise!(r::AbstractMatrix, metric::PreMetric, a::AbstractMatrix, b::AbstractMatrix)
na = size(a, 2)
nb = size(b, 2)
size(r) == (na, nb) || throw(DimensionMismatch("Incorrect size of r."))
@inbounds for j = 1:size(b, 2)
bj = view(b, :, j)
for i = 1:size(a, 2)
r[i, j] = evaluate(metric, view(a, :, i), bj)
end
end
r
end
and compare it with the just as simple:
const VV = Vector{<:SVector}
function pairwiseDS(x::VV, y::VV, metric::Metric)
d = zeros(eltype(eltype(x)), length(x), length(y))
for i in 1:length(x)
for j in 1:length(y)
@inbounds d[i,j] = evaluate(metric, x[i], y[j])
end
end
return d
end
Benchmarks:
julia> @btime pairwiseDS($data, $data, $metric)
47.305 ms (2 allocations: 7.63 MiB
# method from Distances:
julia> @btime pairwise($metric, $datam, $datam)
36.199 ms (2 allocations: 7.63 MiB)
Even though the actual distance evaluation is faster:
v = rand(D)
sv = SVector{D}(v)
@btime evaluate($metric, $v, $v)
30.293 ns (0 allocations: 0 bytes)
@btime evaluate($metric, $sv, $sv)
25.173 ns (0 allocations: 0 bytes)
Was it just an unreasonable expectation from my side that the method with Vector{SVector} should be faster? But I mean, the actual distance evaluations are faster. Why isn’t the entire result faster?