You have three options:
- Cartesian Euclidean distance, which is an easy calculation, but its accuracy degrades with distance, and you must first convert lat/lon to meters measured from the Prime Meridian (E +, W -) and the Equator (N +, S -). You don’t want to do that.
- Haversine distance, using the Haversine.jl package, which traces the distance along a spherical representation of the Earth, so it follows a longer path than the Cartesian straight line, and its accuracy relative to Cartesian improves as distance grows. It also works directly with lat/lon.
- Geodesic distance, using the SimpleFeatures.jl package, which accounts for the Earth being an oblate spheroid rather than a perfect sphere. It lets you specify the standard US NAD83 CRS (coordinate reference system) and has a function to calculate the distance between two points.
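To make the second option concrete, here is a minimal sketch of the haversine formula in plain Julia with no packages. It is not the Haversine.jl API; the function name `haversine_m`, the mean Earth radius of 6,371 km, and the sample coordinates are all assumptions for illustration.

```julia
# Mean Earth radius in meters (a common convention; an assumption of this sketch)
const EARTH_RADIUS_M = 6_371_000.0

# Great-circle distance in meters between two points given as
# latitude/longitude in decimal degrees (N and E positive, S and W negative).
function haversine_m(lat1, lon1, lat2, lon2; radius = EARTH_RADIUS_M)
    phi1 = deg2rad(lat1)
    phi2 = deg2rad(lat2)
    dphi = phi2 - phi1
    dlambda = deg2rad(lon2 - lon1)
    h = sin(dphi / 2)^2 + cos(phi1) * cos(phi2) * sin(dlambda / 2)^2
    # Clamp to 1.0 to guard against floating-point overshoot before asin
    return 2 * radius * asin(min(1.0, sqrt(h)))
end

# Hypothetical centroids (lat, lon), one degree of latitude apart
p = (40.0, -75.0)
q = (41.0, -75.0)
d = haversine_m(p..., q...)
println("Distance: ", round(d / 1000; digits = 1), " km")
```

Because the inputs are plain lat/lon in degrees, no projection to meters is needed first, which is the main convenience over the Cartesian approach.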
The big advantage of SimpleFeatures.jl is that it is just a DataFrame object that accommodates a Geometry object to represent points, lines, polygons or multipolygons. That means you can have a single DataFrame with your GEOID, County Name, State Name if you want, lat/lon from the TIGER files, the Geometry object and whatever attributes you are collecting, all in the same object.
Depending on what you’re planning to do, there are a couple of other considerations.
If for some reason you want pairwise data for all counties, even though many pairs will be zero valued, consider using a sparse array (Julia’s SparseArrays standard library).
For choropleth mapping, I haven’t found a tool I really like. There’s an implementation of Plotly for roadmap-type work, and GeoStats.jl is expert-level. So I usually resort to ggplot2 in R, either through RCall or natively. I haven’t tried TidierPlots.jl yet, but it doesn’t have the geom_sf to work with SimpleFeatures, nor does Gadfly.jl or Makie.jl (even with GeoMakie.jl).
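For the pairwise case mentioned above, a minimal sketch using the SparseArrays standard library might look like the following. The county count, index pairs, and values are hypothetical placeholders; in practice you would map each GEOID to an integer index first.

```julia
using SparseArrays  # ships with Julia, no installation needed

n_counties = 5  # hypothetical; use your real county count

# Hypothetical nonzero pairs: row index, column index, and value
rows = [1, 2, 4]
cols = [2, 3, 5]
vals = [12.5, 3.0, 7.25]

# Build an n x n sparse matrix; pairs not listed are implicit zeros
flows = sparse(rows, cols, vals, n_counties, n_counties)

println(flows[1, 2])   # a stored value
println(flows[3, 1])   # an implicit zero, which takes no storage
println(nnz(flows))    # number of stored entries
```

Only the listed pairs are stored, so memory grows with the number of nonzero pairs rather than with the square of the county count.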