Good Julia function / algorithm idea to find the nearest node to a given point location based on some given road network (for example, open street map data)?

Hello. In python, we have the nice function to get the nearest node to a given point location using the osmnx package as the following.

import osmnx as ox
ox.config(log_console=True, use_cache=True)
G = ox.graph_from_place(‘Buffalo, New York, USA’, network_type=‘walk’)
point1 = (42.891310, -78.871355)
ox.get_nearest_node(G, point1, return_dist=True)

I have not dig deeply of the function


yet. I was just wondering, under the hood, what is the algorithm / idea to get the nearest node to a given point location. I believe this is also implemented in Julia package,


@pszufe. It should be easy with a not so huge data set. But with a huge data set, like several millions of potential nodes in the road network which could be chosen as the closest potentials, is there an efficient idea to attack it? Or do you know some references / links which might be helpful.

OpenStreetMapX provides this functionality via point_to_nodes(::LLA, ::MapData)

Hence you can simply provide coordinates as the first paramater, the map as the second and get the node.

However under the hood it is using the nearest_node function that does a full scan. For a network of new millions nodes it will be still fast.
However if you need to scan millions of times you should built a spatial index. Consider using LibSpatialIndex.jl. The drawback is that building an index of course takes some time. If you decide to do it you are welcome to propose point_to_nodes(::LLA, ::MapData, ::SpatialIndex) function for OpenStreetMapX.


@pszufe, that’s very helpful. Two questions remain:

1: In the nearest_node function, which distance measure did you use? Is it great circle distance or the Euclidean distance. For several millions of nodes, how did you ensure that it is still fast? Is it because of the “natural property” :smile: of Julia?

2: I hope it’s not a naive question. But what is a “spatial index”? What’s the idea that using it can make scan millions of times possible?

Many thanks indeed.

Ad. 1.
It is Euclidean distance that include also altitude. The point is that each road segment is so small that we do not care about great circle.
“how did you ensure that it is still fast?” - a full scan over 1 million floats is a cheap operation (if you need to do it once)

Ad. 2.
Spatial index is a data structure that makes to possible to efficiently search data aligned in more than one dimension.
An excellent explanation with all examples and self explaining pictures can be found here: DM-66 - Spatial Indexing | GIS&T Body of Knowledge

Hi @pszufe, I have around 8M coordinates I’d like to associate with nearest node on an OpenStreetMapX with around 800,000 nodes, but I’m finding it hard to compute even 10,000 points. Here is a graph of benchmarks

computed like this:
b4 = @benchmark [ point_to_nodes(pt, mad) for pt in ptsOrig ] setup=(ptsOrig = zip($(df.dLat[1:10_000]), $(df.dlng[1:10_000])))

Is there a quick way around this? if spatial indexing is the only solution, there’s a native Julia option here GitHub - alyst/SpatialIndexing.jl: Spatial data indexing in pure Julia (R*-trees etc) and others listed here

Do you recommend any one in particular nowadays and is there a quick tutorial somewhere as to how to build the index given an OpenStreetMap and any of these libraries?

Thanks in advance.

Dear @ctrebbau,
the spatial indexing that you have mentioned is exactly the way to go.
You should get the list of coordinates for all nodes on the map, build the spatial index and then query the index.
Since you have only 10k points perhaps it does not matter which spatial indexing library you use. I have not used the one you mentioned but it looks fine. Please write what performance you got!

All the best!