How to compute neighbour network using NearestNeighbors.jl

Using NearestNeighbor.jl I can find which points within a collection col are closest to a (different) given point or set of points. What I need is slightly different. I would like to compute the points p_i in col that are within a certain distance of each point p_j, also in col. I suspect this is an “easier” problem, hence the question below.

I can currently get what I want by (1) building a KDTree of col, tree=KDTree(col), and then (2) running e.g. knn(tree, col, 3) over all points in col.

It turns out that step (1) is typically ~ six times faster than (2) for me, and I suspect all the information I need is already contained in tree. So the question: is it possible to obtain the n nearest neighbors for each point in a collection without invoking knn, just from the data inside the KDTree of that collection?

[Please complain if the question is not clear. Also let me know if it should be posted elsewhere]

I take the liberty of pinging @kristoffer.carlsson, the author of NearestNeighbors.jl, I hope that’s ok!

If you are interested in a certain distance you presumably want to use inrange and not knn. Also, to make sure, is what you want to do equal to this function in scipy?

Hi Kristoffer, yes, most of the time inrange is what I need, and that’s almost twice as fast as knn for me. In those situations I indeed need all pairs within a given distance, so exactly what that scipy function does. Is that functionality already somewhere inside NearestNeighbors.jl?

Amazing work, by the way. I’m blown away by the performance of your KDTree constructor.

I had some code in a previous package that did this:

I could probably get it working in NearestNeighbors but it will be a while because I don’t have time to spend on it now. If I remember correctly, it wasn’t that much faster than just checking point by point.

Ok, good to know. I will just keep using inrange for the moment then. Many thanks.