Findnearest function?

cormullion · June 7, 2017, 6:07pm

Is there a simple findnearest() function somewhere in Julia-land?

A = [1, 5, 12, 23];
findnearest(A, 14)  
# ->  3
findnearest(A, 18)
# ->  4

ohsonice · June 7, 2017, 6:36pm

I’ll take a shot at it:

findnearest(A::AbstractArray,t) = findmin(abs(A-t))[2]

Remove the [2] to return nearest value as well. Not sure how to accommodate ties though?

Dan · June 7, 2017, 11:30pm

If the array is sorted, like in the post, then the following will return a range with the nearest values:

function findnearest(a,x)
       n = length(a)
       n > 0 || return 0:-1
       i1 = searchsortedlast(a,x)
       i0 = i1
       if i1>0
           while i0>1 && a[i0-1]==a[i0]
               i0 -= 1
           end
           d = x-a[i1]
       else
           i0 = i1+1
           d = a[i1+1]-x
       end
       i2 = i1
       if i2<n && a[i2+1]-x<d
           i0 = i2+1
           d = a[i2+1]-x
           i2 += 1
       end
       while i2<n && a[i2+1]-x==d
           i2 += 1
       end
       return i0:i2
end

Note the details, such as returning a zero-length range when array is empty, and returning a range with values on both sides of x when they are of equal distance.

The OP’s example reproduces as:

julia> A = [1, 5, 12, 23];
julia> findnearest(A, 14)
3:3
julia> # ->  3
       findnearest(A, 18)
4:4
julia> # ->  4

Dan · June 7, 2017, 11:57pm

Just to clean things a bit, the last version works, but can take linear time on some inputs (such as a constant long vector).
The following is cleaner and uses just the sorted search functions which are sublinear.

function findnearest(a,x)
       length(a) > 0 || return 0:-1
       r = searchsorted(a,x)
       length(r) > 0 && return r
       last(r) < 1 && return searchsorted(a,a[first(r)])
       first(r) > length(a) && return searchsorted(a,a[last(r)])
       x-a[last(r)] < a[first(r)]-x && return searchsorted(a,a[last(r)])
       x-a[last(r)] > a[first(r)]-x && return searchsorted(a,a[first(r)])
       return first(searchsorted(a,a[last(r)])):last(searchsorted(a,a[first(r)]))
end

Again, we have:

julia> A = [1, 5, 12, 23];
julia> findnearest(A, 14)
3:3
julia> findnearest(A, 18)
4:4

traktofon · June 8, 2017, 2:08am

Like @cormullion, I’ve had need for such a function many times, and here is my personal solution for it. Unlike @Dan’s version, this doesn’t care about ties and returns just a single index. Also, as it works on sorted arrays, the function name should be searchsortednearest to be consistent with Julia’s function names (I would expect findnearest to work on any array).

function searchsortednearest(a,x)
   idx = searchsortedfirst(a,x)
   if (idx==1); return idx; end
   if (idx>length(a)); return length(a); end
   if (a[idx]==x); return idx; end
   if (abs(a[idx]-x) < abs(a[idx-1]-x))
      return idx
   else
      return idx-1
   end
end

cormullion · June 8, 2017, 7:03am

Neat! By the way, currently Julia 0.6 wants me to:

WARNING: abs{T <: Number}(x::AbstractArray{T}) is deprecated, use abs.(x) instead.

cormullion · June 8, 2017, 7:30am

Some good solutions - the simpler offering (@traktofon) benchmarks a bit faster, but both are fine for my needs.

(How about adding a version to Julia, since it appears to be useful? :))

Tamas_Papp · June 8, 2017, 7:58am

Since it assumes a sorted container, the speed advantage is not surprising.

AFAIK the core developers want to make Julia smaller, and put such functionality in packages. So perhaps you could find a package to submit this to.

joshday · November 2, 2021, 3:17pm

@traktofon Thanks for your solution!

I’ve copied it enough times that I’m putting it into a package (and adding the keyword args supported by searchsortedfirst/searchsortedlast). If anyone has improvements, please let me know!

https://github.com/joshday/SearchSortedNearest.jl

rafael.guerra · November 3, 2021, 7:17am

@joshday, as you have studied this topic, could you please highlight the advantage(s) of the the proposed function compared to the fast and more general solution (similar to one given above):

findnearest(A,x) = argmin(abs.(A .- x))

Thank you.

joshday · November 3, 2021, 1:36pm

Fast is relative, and since you’re essentially competing with searchsortedfirst in Base, you probably won’t be faster.

If you try some benchmarks, I’d expect your findnearest to have a lot more allocations.
Since you’re not assuming the collection is sorted, you must visit every element.

rafael.guerra · November 3, 2021, 2:27pm

Thanks for the feedback.
Wondering what results you get for the following example where the input array is sorted in reverse order?

A = [23, 12, 5, 1]
x = 14
using BenchmarkTools
@btime findnearest($A, $x)  # 25 ns (1 allocation: 96 bytes)

halleysfifthinc · November 3, 2021, 2:47pm

Would you be willing to open a PR to add this function to base? Especially in light of how complementary it is to the existing searchsorted functions, I shouldn’t expect much/any objection to its addition.

mbauman · November 3, 2021, 3:16pm

searchsortednearest obviously won’t work for an unsorted array. These are two very different functions with different use-cases. If you’re interested in optimizing the general findnearest, you can take a look at this workbook: https://github.com/JuliaComputing/Training/blob/master/Parallel/020%20Serial%20Performance.jl

joshday · November 3, 2021, 3:29pm

I think it will take some battle-testing/more thought before adding to Base.

The difference from the other searchsorted functions is that they require only that the type is ordered.

searchsortednearest also requires that a difference/distance exists. There probably needs to be a distance keyword that falls back to -.

At a more abstract level, what we’re doing is finding the argmin of a convex function over a finite domain. Using upcoming 1.7 syntax, we’re doing

julia> A = 1:100; x=5.1;

julia> argmin(ai -> abs(ai - x), A) 
5

But because we know the function is convex we want to stop searching through A once the function stops decreasing.

All that to say I’m not sure whether this functionality should belong with the searchsorted family or if there should be a way to tell argmin my function is convex (or tell argmax my function is concave) and my collection is sorted.

EDIT: I’ve mistaken the behavior of argmin(f, collection) (returns value) for argmin(collection) (returns index).

rafael.guerra · November 3, 2021, 3:31pm

Matt, the array in the example is sorted in reverse order and in the package linked by Josh, the syntax seems to allow this?

searchsortednearest(a, x; by=<transform>, lt=<comparison>, rev=false)

mbauman · November 3, 2021, 3:40pm

Oh that’s great — then you can just test it yourself. For what it’s worth, I’d expect the simple linear search to be faster than a sorted search for very tiny arrays like this, especially when optimized to remove allocations as is done in those training materials.

aplavin · November 4, 2021, 10:08am

joshday:

At a more abstract level, what we’re doing is finding the argmin of a convex function over a finite domain. Using upcoming 1.7 syntax, we’re doing
julia> A = 1:100; x=5.1;

julia> argmin(ai -> abs(ai - x), A)
5

This returns an element of A and not its index - unlike other solutions.

joshday · November 4, 2021, 11:17am

Oh, you’re right! Hopefully my line of thought was clear enough.

zsz00 · November 20, 2021, 4:00am

about

joshday/ SearchSortedNearest.jl

Is this a subset of ANN((approximate nearest neighbor) or KNN or similarity search ??
Can a / x be extended to higher dimensions, not only a number ??

@joshday

Topic		Replies	Views
How to search an array for the index coresponding to the nearest value? General Usage question	4	1730	June 22, 2022
How to use the `by` argument to `searchsortedlast()` General Usage	3	380	April 15, 2022
What exactly is the meaning of this one line code written in Julia? General Usage	3	837	May 13, 2019
Finding Position of Element in an Array General Usage question	46	104712	October 9, 2019
Determine whether an element is in an array New to Julia array	4	1179	February 3, 2024

Findnearest function?

joshday/ SearchSortedNearest.jl

Related topics