Hello. In python, we have the nice function to get the nearest node to a given point location using the osmnx package as the following.

import osmnx as ox
ox.config(log_console=True, use_cache=True)
G = ox.graph_from_place(‘Buffalo, New York, USA’, network_type=‘walk’)
point1 = (42.891310, -78.871355)
ox.get_nearest_node(G, point1, return_dist=True)

I have not dig deeply of the function

ox.get_nearest_node()

yet. I was just wondering, under the hood, what is the algorithm / idea to get the nearest node to a given point location. I believe this is also implemented in Julia package,

OpenStreetMapX.jl,

@pszufe. It should be easy with a not so huge data set. But with a huge data set, like several millions of potential nodes in the road network which could be chosen as the closest potentials, is there an efficient idea to attack it? Or do you know some references / links which might be helpful.

OpenStreetMapX provides this functionality via point_to_nodes(::LLA, ::MapData)

Hence you can simply provide coordinates as the first paramater, the map as the second and get the node.

However under the hood it is using the nearest_node function that does a full scan. For a network of new millions nodes it will be still fast.
However if you need to scan millions of times you should built a spatial index. Consider using LibSpatialIndex.jl. The drawback is that building an index of course takes some time. If you decide to do it you are welcome to propose point_to_nodes(::LLA, ::MapData, ::SpatialIndex) function for OpenStreetMapX.

@pszufe, that’s very helpful. Two questions remain:

1: In the nearest_node function, which distance measure did you use? Is it great circle distance or the Euclidean distance. For several millions of nodes, how did you ensure that it is still fast? Is it because of the “natural property” of Julia?

2: I hope it’s not a naive question. But what is a “spatial index”? What’s the idea that using it can make scan millions of times possible?

Ad. 1.
It is Euclidean distance that include also altitude. The point is that each road segment is so small that we do not care about great circle.
“how did you ensure that it is still fast?” - a full scan over 1 million floats is a cheap operation (if you need to do it once)

Ad. 2.
Spatial index is a data structure that makes to possible to efficiently search data aligned in more than one dimension.
An excellent explanation with all examples and self explaining pictures can be found here: DM-66 - Spatial Indexing | GIS&T Body of Knowledge

Hi @pszufe, I have around 8M coordinates I’d like to associate with nearest node on an OpenStreetMapX with around 800,000 nodes, but I’m finding it hard to compute even 10,000 points. Here is a graph of benchmarks

Do you recommend any one in particular nowadays and is there a quick tutorial somewhere as to how to build the index given an OpenStreetMap and any of these libraries?

Dear @ctrebbau,
the spatial indexing that you have mentioned is exactly the way to go.
You should get the list of coordinates for all nodes on the map, build the spatial index and then query the index.
Since you have only 10k points perhaps it does not matter which spatial indexing library you use. I have not used the one you mentioned but it looks fine. Please write what performance you got!

When I use the functions point_to_nodes and nearest_node, the outputs are some node ids. for instance 56684. I can call any traversal function on them and works perfectly fine.

However, If I want to check whether two node id form a links or not, how to check it? This (i,j) in map.e (where iand j are node ids), doesn’t seem to be correct and always return Tuple{Int64, Any}[]

How to find out which of the node id s that are coming from nearest_node function form an edge in the map?

I have one other issue that wanted to solve with nearest_node command but it’s quite computationally expensive. Do you know any altenative?

Suppose I have a map with a boundy of min_lat= x1, min_lon= y1, max_lat= x2 max_lon= y2 and have a csv file contanning 6M rows of latitude longitude coordinated points across USA. I’ curious to know if OSMx or julia in general can automaticlly find all rows that are inside the map? I think from that 6M only 2M are within the map boundary I have.