K nearest neighbor utility in Julia

Hi. Is there a working K-nearest neighbors package or sub-package available in Julia (apart from Scikit Learn)? There used to be kNN but that is not currently working. A search within the past year yields essentially nothing (except a proposed interface). Maybe it exists as a part of another package? Thanks for any help.

GitHub - KristofferC/NearestNeighbors.jl: High performance nearest neighbor data structures and algorithms for Julia. might suit your needs.

5 Likes

Thanks very much for that. I am looking for something for elementary students in applied Data Science to use, with a simple interface. This package certainly looks very powerful, and potentially useful for academics and serious researchers! I was hoping for something like the (defunct) kNN.jl package.

Perhaps I can write them a wrapper using this that doesn’t require them to understand trees, etc.

Yeah, it doesn’t look like rocket science (imho it’s never bad to understand the parts that make a method work), but you could easily set up an API similar to scikit-learn’s, where fit builds a tree and predict does the search over the input points.
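Something along these lines, perhaps (a rough sketch, not an existing API; the `fit`/`predict` names and the struct are just illustrative, and NearestNeighbors.jl expects the data with one point per column):

```julia
using NearestNeighbors

# Hypothetical wrapper type: holds the tree and the number of neighbours.
struct KNNSearcher
    tree::KDTree
    k::Int
end

# "fit" just builds the tree over the training points.
fit(X::AbstractMatrix, k::Integer) = KNNSearcher(KDTree(X), k)

# "predict" runs the search: for each query point (column of Xnew) it returns
# the indices of, and distances to, its k nearest training points.
predict(m::KNNSearcher, Xnew::AbstractMatrix) = knn(m.tree, Xnew, m.k, true)

# Usage:
model = fit(rand(3, 100), 5)
idxs, dists = predict(model, rand(3, 10))
```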

Thanks, yes, in this class we are reviewing and comparing the performance of lots of methods, and don’t have time to go into depth, unfortunately. I probably can write something, but I wondered if something already existed (as I mentioned, the old kNN.jl would have been perfect).

I agree it’s not rocket science, and am glad of that, because it means that even I might be able to understand it well enough to write the wrapper!

They don’t have to “understand” trees. They just call one constructor and then the knn function. I think they can handle it :slight_smile:
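For reference, those two calls look roughly like this (a minimal sketch; data and queries are matrices with one point per column):

```julia
using NearestNeighbors

data = rand(3, 100)                           # 100 training points in 3 dimensions
tree = KDTree(data)                           # the one constructor
idxs, dists = knn(tree, rand(3, 5), 2, true)  # 2 nearest neighbours of 5 query points
```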

8 Likes

I still think there are quite a few non-obvious steps involved if one wanted to use this for kNN classification/regression (it would take me a day to figure it out and debug it).
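For what it’s worth, one way to go from the raw search to a classifier is a majority vote over the labels of the returned neighbours; a toy sketch (random data just to make it runnable, with `mode` from StatsBase picking the most common label):

```julia
using NearestNeighbors, StatsBase

X      = rand(2, 200)                       # training points, one per column
labels = rand(["a", "b", "c"], 200)         # one label per training point
Xnew   = rand(2, 10)                        # query points
k      = 5

tree    = KDTree(X)
idxs, _ = knn(tree, Xnew, k, true)          # indices of the k neighbours of each query
preds   = [mode(labels[i]) for i in idxs]   # most common label among those neighbours
```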

MLJ has kNN classifiers and regressors based on NearestNeighbors.jl

In general it’s a go-to for “classical machine learning” (and more)

1 Like

Okay, it seems that you are not really interested in just doing a kNN search (which was my initial assumption) but want a more full-fledged classifier framework. Then yes, you probably have to look “higher up” the stack.

MLJ has kNN classifiers and regressors based on NearestNeighbors.jl

In general it’s a go-to for “classical machine learning” (and more)

Thanks for that. I thought MLJ was only Neural Networks, etc. (I quite like Flux), but I’ll definitely take a look! This might be what I was hoping to find.

Hi. Thank you for your input (and also for the package, which seems to drive the suggestion I will probably go with). I’m sorry I didn’t make my ‘Data Science’ requirement clearer above.

Some nearest neighbor implementations for various data can be found here: https://github.com/JuliaNeighbors

MLJ is actually mostly a tool to use and compose “general ML” models, though it does in particular interface Flux via MLJFlux. Otherwise it exposes most models from ScikitLearn as well as a bunch of Julia models like NearestNeighbors, DecisionTrees, GLM, LightGBM, MLJLinearModels, …
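To give a flavour of the interface (a hedged sketch: the kNN models are loaded through MLJ’s model registry, and the providing package name can differ between MLJ versions, NearestNeighborModels in recent ones):

```julia
using MLJ

# Load the kNN classifier type from the registry (package name may vary by MLJ version).
KNN = @load KNNClassifier pkg=NearestNeighborModels

X, y = @load_iris                 # small built-in dataset, just for illustration
mach = machine(KNN(K=5), X, y)    # K is the number of neighbours
fit!(mach)
yhat = predict_mode(mach, X)      # point predictions (plain `predict` returns distributions)
```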

4 Likes

Edit: I read your message too quickly; I haven’t implemented the KNN algorithm (not yet…), although it is discussed in the notes cited below (here)… If you want to collaborate on writing it… :wink:

I believe you would find the cluster module of BetaML interesting… I wrote it with exactly your use case in mind: something simple for learning the algorithms, with easy-to-read code, although not as performant as other packages…

Note that the algorithms in BetaML have a companion repository of notes from an MITx course in machine learning, where the algorithms are explained in greater detail…

1 Like