Moving to GeometryBasics

Hi all! I am Arsh Sharma, an electronics undergraduate, and I am working on the geospatial ecosystem in this year’s JSoC under the mentorship of Martijn Visser @visr, Bogumił Kamiński @bkamins and Maarten Pronk @evetion.

I’ll be talking about my work and future possibilities, and I invite feedback.

Packages like Shapefile/GeoJSON/ArchGDAL have been the main parsers for getting geospatial data into Julia. But there has been a lot of discussion about having a tabular representation for geospatial data. R has sf and Python has GeoPandas. In Julia there has been interest in similar functionality, but so far no general solution. Thanks to the Tables interface for Shapefile and GeoJSONTables, with ArchGDAL as a work in progress, we are getting closer.

One of the main features of GeoPandas and sf is their special treatment of geometry columns.

This is quite helpful for performing spatial operations like joins on the geometry, and it has been under discussion for quite some time now. In this JSoC we’d like to work towards an implementation.
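To make the idea of a geometry column concrete, here is a minimal language-neutral sketch in plain Python. It is not how sf, GeoPandas or any Julia package implements it: `Point`, `Box` and `spatial_join` are hypothetical names made up for this illustration, and an axis-aligned box stands in for a real polygon type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle standing in for a real polygon type."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, p: Point) -> bool:
        return self.xmin <= p.x <= self.xmax and self.ymin <= p.y <= self.ymax


# Two small tables: every row has ordinary attributes plus one geometry column.
cities = [
    {"name": "A", "geometry": Point(1.0, 1.0)},
    {"name": "B", "geometry": Point(5.0, 5.0)},
    {"name": "C", "geometry": Point(9.0, 1.0)},
]
regions = [
    {"region": "west", "geometry": Box(0.0, 0.0, 3.0, 10.0)},
    {"region": "east", "geometry": Box(4.0, 0.0, 10.0, 10.0)},
]


def spatial_join(points, areas):
    """Pair each point row with every area row whose geometry contains it."""
    return [
        {"name": p["name"], "region": a["region"]}
        for p in points
        for a in areas
        if a["geometry"].contains(p["geometry"])
    ]


joined = spatial_join(cities, regions)
```

The join condition is a geometric predicate instead of key equality, which is why the table has to know which column holds the geometry.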

@visr initially planned to have metadata support specific to the geospatial ecosystem, treating geometry as a special case, along with a GeoDataFrames package borrowing the concept from GeoPandas in Python.

Currently many packages define their own geometry types, and rely on GeoInterface to exchange between different representations. Two downsides to this approach are that conversions are often needed, and that these go through an inefficient GeoJSON-based nested array representation that lives in GeoInterface. So we are working on renewing this approach in GeoInterfaceRFC, which removes the central nested array representation. At the same time we want to reduce the number of conversions that are needed, by encouraging packages to adopt GeometryBasics when it is a good fit, and saving the conversions for when there is a good reason for an alternative representation, for example because geometries are defined in a C++ library like GDAL.
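As a rough picture of the exchange pattern described above, the sketch below converts a polygon from one package-specific type to another by way of a GeoJSON-style nested list. It is written in plain Python purely for illustration; `PkgAPolygon`, `PkgBPolygon`, `to_nested` and `from_nested` are hypothetical names and are not part of GeoInterface or any other package.

```python
from dataclasses import dataclass
from typing import List, Tuple

Ring = List[Tuple[float, float]]


@dataclass
class PkgAPolygon:
    """Hypothetical polygon type owned by 'package A'."""
    rings: List[Ring]


@dataclass
class PkgBPolygon:
    """Hypothetical polygon type owned by 'package B'."""
    exterior: Ring
    holes: List[Ring]


def to_nested(poly: PkgAPolygon) -> list:
    """First copy: every coordinate goes into a nested list of lists."""
    return [[[x, y] for (x, y) in ring] for ring in poly.rings]


def from_nested(coords: list) -> PkgBPolygon:
    """Second copy: every coordinate comes back out of the nested list."""
    rings = [[(x, y) for x, y in ring] for ring in coords]
    return PkgBPolygon(exterior=rings[0], holes=rings[1:])


square = PkgAPolygon(
    rings=[[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]]
)
nested = to_nested(square)       # intermediate nested-array representation
converted = from_nested(nested)  # package B's own type
```

Every coordinate is copied twice and the intermediate structure carries no type information. If both packages shared one geometry type, neither copy would be needed.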

GeometryBasics has been designed by @sdanisch from the beginning to work well for geospatial applications. It has well-defined standard geometry types along with good metadata support. Currently my plan is to migrate Shapefile from using its own geometry types to GeometryBasics types and do the same for GeoJSONTables.

In addition to that, GeometryBasics supports attaching metadata (attributes/properties) to geometries, and supports the Tables interface through StructArrays. So geospatial data based on GeometryBasics types can be converted to a DataFrame through the Tables interface.
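The general idea of going from geometries with attached metadata to a table can be sketched as follows. This is plain Python for illustration only and does not use the GeometryBasics, StructArrays or Tables APIs; `Feature` and `to_columns` are hypothetical names, and the values are made up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Feature:
    """Hypothetical stand-in: a geometry plus its attached metadata."""
    geometry: Tuple[float, float]
    properties: Dict[str, Any] = field(default_factory=dict)


def to_columns(features: List[Feature]) -> Dict[str, list]:
    """Turn a list of features into a column-oriented table.

    The geometry becomes one column; every metadata key becomes another.
    A key missing from a feature is filled with None.
    """
    names = sorted({key for f in features for key in f.properties})
    columns: Dict[str, list] = {"geometry": [f.geometry for f in features]}
    for name in names:
        columns[name] = [f.properties.get(name) for f in features]
    return columns


features = [
    Feature((0.0, 0.0), {"name": "a", "value": 1}),
    Feature((1.0, 2.0), {"name": "b"}),
]
table = to_columns(features)
```

Once the data is in this column-oriented shape, any consumer that understands tables can work with it without knowing about the geometry type.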

Currently Makie supports GeometryBasics, so much of the plotting can be done with it. I still want to explore the possibility of having GeoMakie support GeometryBasics types, since that would be an additional perk!

Taken together, the above features support going against the Python/R convention of having a separate GeoDataFrames package, since that functionality can now be made available in individual packages via GeometryBasics.

Other interesting plans include GeometryBasics support for ArchGDAL, which would be possible with GeoInterfaceRFC and Turf.

There’s definitely a lot to do. The plans can certainly be improved, and we welcome any suggestions, comments and PRs. :slight_smile:

Sounds like something @juliohm might be interested in

This is a really exciting development, thanks for sharing!

Indeed, this is very exciting! I am a GeoPandas maintainer who explored early Julia for graph-theoretic spatial statistics common in geography, econometrics, and statistics. I would be very interested in seeing a similar stable set of geographic geometric primitives + tabular interface land in Julia, and wish you luck! I will be following this thread, too.

Hey all, the Tables interface for ArchGDAL is nearing completion: https://github.com/yeesian/ArchGDAL.jl/pull/118. We invite everyone to try it out. Any feedback on it would be appreciated. :smile:

I wanted to ask whether there is a description of the GeometryBasics.AbstractMesh interface somewhere. At least the list of functions that should be implemented.

And what is the relation between Meshes.jl and GeometryBasics? How is the interoperability between them accomplished? It doesn’t look like Meshes is built on top of GeometryBasics.AbstractMesh. Is that correct?

@ljwolf @phlavenk please watch my JuliaCon2021 talk for examples of tables with geometry columns:

@juliohm, where can I find more details on how to plug into the Meshes / GeoStats ecosystem? I see GeometryBasics.jl is not used by Meshes.jl; however, MeshIO only depends on GeometryBasics. With the GeometryBasics.jl documentation being pretty sparse, and my limited knowledge of the Meshes.jl history and architecture, I’m pretty confused now.

My aim is to prepare code that reads several binary files produced by CST studio (a big commercial FEM package) containing a 3D volume mesh (tetrahedrons), surface meshes (triangles), field values, and particle trajectories (1D paths), with various metadata at each node and also values inherent to whole bodies/trajectories. The end goal is to do the analysis the way you do it in your talk above.

Could you recommend a package or a part of code that I might follow to implement the required interfaces for the CST studio files?

And I forgot to say how amazed I was by your approach of treating 2D/3D data as a DataFrame, basically having all the heavy group/combine machinery right at hand when doing spatially structured data analysis. A truly brilliant step. Thank you.

We are continuously improving the documentation, but I suggest reading the extensive test suite of Meshes.jl for examples of usage. We are quite pedantic with tests so you can learn a lot from there.

Can you clarify the file format used by the software? Currently I am using MeshLab to convert any file format to PLY and then using PlyIO.jl to read these meshes from disk.

You mean a package to read the file format exported by CST studio? What is the format?

Thank you, but I think this tabular approach is not that new :slight_smile: I think the novelty is in the unifying interface for all kinds of domains (2D/3D meshes, point sets, geometry collections, trajectories, …). It is pretty important to be able to discuss geostatistics without multiple complicated APIs that differ as a function of the domain type. :+1:t4:
