How to compute neighbour network using NearestNeighbors.jl

pablosanjose · January 22, 2017, 7:46pm

Using NearestNeighbors.jl I can find which points within a collection col are closest to a (different) given point or set of points. What I need is slightly different: for each point p_j in col, I would like to find the points p_i, also in col, that lie within a certain distance of it. I suspect this is an “easier” problem, hence the question below.

I can currently get what I want by (1) building a KDTree of col, tree=KDTree(col), and then (2) running e.g. knn(tree, col, 3) over all points in col.
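For reference, a minimal sketch of these two steps might look like the following. The toy data, the value of k, and the names idxs, dists and neighbours are illustrative assumptions, not taken from the original post. It also assumes there are no duplicate points in col, so that each point's own index can simply be filtered out of its result.

```julia
using NearestNeighbors

# Toy data (assumed for illustration): one point per column, 2 dimensions x 4 points.
col = [0.0 1.0 2.0 10.0;
       0.0 0.0 0.0  0.0]

# Step (1): build the tree once.
tree = KDTree(col)

# Step (2): query the tree with the same points it was built from.
# Ask for k + 1 results, because every query point is itself stored in the
# tree and comes back as its own nearest neighbour (distance 0).
k = 2
idxs, dists = knn(tree, col, k + 1, true)   # `true` sorts results by distance

# Drop the self-match so that k genuine neighbours remain per point.
neighbours = [filter(j -> j != i, idxs[i]) for i in 1:size(col, 2)]
```

Here neighbours[i] holds the indices (columns of col) of the k points closest to point i.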

It turns out that step (1) is typically about six times faster than step (2) for me, and I suspect all the information I need is already contained in tree. So the question is: is it possible to obtain the n nearest neighbors for each point in a collection without invoking knn, just from the data inside the KDTree of that collection?

[Please complain if the question is not clear. Also let me know if it should be posted elsewhere]

I am taking the liberty of pinging @kristoffer.carlsson, the author of NearestNeighbors.jl. I hope that’s ok!

kristoffer.carlsson · January 22, 2017, 8:45pm

If you are interested in a certain distance, you presumably want to use inrange and not knn. Also, to make sure: is what you want to do equivalent to this function in scipy?

pablosanjose · January 22, 2017, 9:46pm

Hi Kristoffer, yes, most of the time inrange is what I need, and that’s almost twice as fast as knn for me. In those situations I indeed need all pairs within a given distance, so exactly what that scipy function does. Is that functionality already somewhere inside NearestNeighbors.jl?

Amazing work, by the way. I’m blown away by the performance of your KDTree constructor.

kristoffer.carlsson · January 23, 2017, 8:12am

I had some code in a previous package that did this: https://github.com/JuliaGeometry/KDTrees.jl/blob/master/src/kd_tree.jl#L617.

I could probably get it working in NearestNeighbors but it will be a while because I don’t have time to spend on it now. If I remember correctly, it wasn’t that much faster than just checking point by point.

pablosanjose · January 23, 2017, 8:17am

Ok, good to know. I will just keep using inrange for the moment then. Many thanks.
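As a rough sketch of the point-by-point inrange approach settled on here, the following collects every unordered pair of points closer than a radius r. The toy data, the radius and the name close_pairs are assumptions made for illustration. This is not the dedicated pair-search routine discussed above, only the plain "query each point" fallback.

```julia
using NearestNeighbors

# Toy data (assumed for illustration): one point per column, 2 dimensions x 4 points.
col = [0.0 1.0 2.0 10.0;
       0.0 0.0 0.0  0.0]

tree = KDTree(col)
r = 1.5   # search radius, chosen arbitrarily for this example

# For every point in col, the indices of all points within distance r.
# Each point is included in its own result, since it lies in the tree.
idxs = inrange(tree, col, r)

# Keep every unordered pair once by requiring i < j.
# The same condition also discards the self-matches (i == j).
close_pairs = [(i, j) for i in 1:size(col, 2) for j in idxs[i] if i < j]
```

Each tuple in close_pairs is an edge of the neighbour network. The order of the tuples is not guaranteed, because inrange returns unsorted indices by default.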
