Creating graphs from data

A common procedure in data science and mathematics is building graphs/networks from a large collection of data points. I am interested in constructing the adjacency matrix of the k-nearest neighbor graph of a collection of data points. Is there a library for this?

I couldn’t find this on Graphs.jl or on NearestNeighbors.jl.

Any help would be much appreciated.

Please check LightGraphs.jl, it is the de facto standard package for graphs in Julia. Then, combine NearestNeighbors.jl with LightGraphs.jl in the way you wish to achieve the solution.


Thank you very much for the quick reply. I went over the introduction and descriptions, but did not find any method that takes as input a data file or a 2D matrix of data points. Could you direct me to the correct place?

You won’t find methods that take as input data files. This is outside the scope of any graph package. What matrix are you referring to? The coordinates of the points? The adjacency matrix?

Please take the time to formulate your problem. After that, take the time to read through the docs of the above-mentioned packages. If you have more specific questions, preferably with code, please ask, and we can try to help.

Okay, here is what I am looking for:

Input: An N×d Float64 array containing N data points x_1, …, x_N in R^d, and a number k << N.

Create a graph with N vertices, one per point, and an edge from i to j if x_j is among the k closest points to x_i.

Output: The adjacency matrix of this graph.

Did you actually read the documentation for NearestNeighbors.jl? It seems to do exactly the calculation that you need.

As Julio suggested, just take the information that it provides and build the graph using LightGraphs.jl if you need some kind of graph algorithm on the resulting graph.
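To make the second step concrete, here is a minimal sketch of turning neighbor lists into a directed graph with LightGraphs.jl. The `nbrs` vector is made-up example data standing in for whatever a nearest-neighbor search returns (here k = 2), not output from a real data set.

```julia
using LightGraphs

# Hypothetical neighbor lists: nbrs[i] holds the indices of the
# k = 2 nearest neighbors of point i (illustrative values only).
nbrs = [[2, 3], [1, 3], [2, 4], [3, 2]]

# One vertex per data point, one directed edge i -> j per neighbor j of i.
g = SimpleDiGraph(length(nbrs))
for (i, js) in enumerate(nbrs), j in js
    add_edge!(g, i, j)
end

# Sparse N×N matrix with A[i, j] == 1 exactly when the edge i -> j exists.
A = adjacency_matrix(g)
```

The graph is directed because "j is among the k nearest neighbors of i" is not a symmetric relation. If only the matrix is needed, the graph object can be skipped and the matrix built directly from the index lists.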

The function
NNTree(data, metric; leafsize, reorder)
does not do what I am looking for. It wouldn’t work, for example, for huge data sets.

Did you see the section just below that called “k Nearest Neighbor (kNN) searches”? It’s in the README of NearestNeighbors.jl. It sounds like exactly what you are asking about? cc @iamsuddhasattwa
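For reference, a minimal sketch of how the kNN search from that README section could be combined with SparseArrays to get the adjacency matrix described above. The function name `knn_adjacency` is made up for this example. It assumes the N×d layout from the question (NearestNeighbors.jl expects points as columns, hence the transpose) and asks for k + 1 neighbors because each point is returned as its own nearest neighbor when the query set is the same as the tree data.

```julia
using NearestNeighbors, SparseArrays

# knn_adjacency is a hypothetical helper name, not part of any package.
# X is N×d (one point per row); returns a sparse N×N 0/1 matrix A with
# A[i, j] == 1 when x_j is among the k nearest neighbors of x_i.
function knn_adjacency(X::AbstractMatrix{<:AbstractFloat}, k::Integer)
    N = size(X, 1)
    1 <= k < N || throw(ArgumentError("need 1 <= k < N"))

    data = permutedims(X)            # d×N, one point per column
    tree = KDTree(data)
    # Query k + 1 neighbors, sorted by distance, since each point finds itself.
    idxs, _ = knn(tree, data, k + 1, true)

    rows = Int[]
    cols = Int[]
    for i in 1:N
        others = filter(!=(i), idxs[i])          # drop the point itself
        for j in others[1:min(k, length(others))]
            push!(rows, i)
            push!(cols, j)
        end
    end
    return sparse(rows, cols, ones(Int, length(rows)), N, N)
end

# Small usage example with four points on a line.
X = [0.0 0.0; 1.0 0.0; 3.0 0.0; 7.0 0.0]
A = knn_adjacency(X, 1)
```

Only N·k entries are stored, so the matrix stays sparse for k << N. Ties between equidistant points are broken however the tree search happens to order them.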