What is the state of Rasterization and Rasterstats in Julia

Maybe some of it?

I dont test Rasters on GPU, but I write everything using broadcast as much as possible, so some things might work on GPU as-is. But others will need some care to keep all arrays on GPU. Currently mask and rasterize uses a BitArray as the template from the polygon mask, and broadcasts that over the main array afterwards. So that might break. If its actually something useful for people it’s not much work to have most of Rasters.jl working on GPUs.

If you want to try it, using CUDA; modify(CuArray, raster) will move a raster to the gpu, or using Adapt; Adapt.adapt(CuArray, raster) works to.

Because we are just using Base reducing methods for zonal stats, the CUDA.jl methods should work exactly the same. I mean, I havent tried it, but they should. I use GPUs for simulations, not really for data handling

Edit: reductions like mean dont seem that fast on CUDA on my laptop. You may do better with a real GPU. mask doesn’t currently work (you will get a “scalar indexing” warning and it will take forever. But it theory it could work fairly easily.

Another edit: it was trivial to make mask work on GPU, and it seems pretty fast. I’ll look at how many of the other methods will work on GPU as well, probably most (but not e.g. mosaic)

I just wanted to report my findings:

for the Geometries I imported with GeoDataFrames I tried the equivalent Python calls (as I could not find a documentation, what works and how it its called), As it is stated in GeoInterface, tha tit is oriented on Python?

Some(most) of the binary predicates worked. buffer and centroid worked from the construction methods. Fom the set-theroetic methods difference and intersection worked, but not union and not symmetric_difference.
What I didn’t manage is to construct a bbox from a raster. bbox is included in GeoInterface, and the coordinates are of the raster are in the attribute .dim. The bboix needs Features and I don’t know what these are and where to find the reference. I want to use it to Create a Polygon of the extend of the raster to Filter the Vector based on the Raster boundary.

Probably @evetion is the person to talk to about these things.

And yes, bbox standardisation needs to be more of a thing. But GeoInterface.jl is kind of stalled midway in a major change so I’m not sure what to do there. I’m keen to introduce an Extent object at some stage. See https://github.com/JuliaGeo/GeoInterface.jl/pull/33

In Rasters.jl you can get bounds(raster) Which is not a traditional bbox, it is a Tuple of Tuple, one for each dimension, each holding the lower and upper bounds for that dimension. You can make any bbox you need from this. It’s like this because it makes much more sense in the arbitrary dimensionality of DimensionalData.jl (that Rasters.jl is built on) to just list the bounds for each axis, rather than mix them together. We can probably define GeoInterface.bbox(::Raster) in Rasters.jl for now.

GeoDataFrames.jl only makes the ArchGDAL geometries available in a Table. Manipulation and use of a geometry is described in Geometric Operations · ArchGDAL.jl.

Hi, I am finally back home and did the timing on my Windows PC (compared to my Linux Laptop). It seams that, the loading of the geojson compares better in the new set up compared to the Linux one.

Compare Execution time Python / Julia for Raster processing / zonal stat

using btime from BenchmarkTools for Julia and time in Python using the medium time of three runs.
I only see the speed-up while loading the vector on Windows, on Linux it seams to be slower.
In Rasters Python outperforms Julia in up to 4 orders. As shown of the Zonal stats before.

Python Julia Speed-up
unit s s
Loading vector (geojson) 18.15 11.07 1.55
Loading raster (tif) 0.0330 0.0915 0.36
Merging 2 raster 0.176 18.71 0.009
Zonal stat 0.1070 44.22 0.0024

Although speed is normally defined for multi threading/-processing. :wink: Are there other things, I shall try the timing for? :slight_smile:

Thanks at @evetion for the link for the documentation. I tried it out on Linux, Julia 1.7.0 packages newly updated. Some functioned referenced in the source, other did not. Don’t know why. I did not try every function, but most of them.
By the way, how can I get the union of all geometries in a GeoDataFrame? Is it the unary union?

Functions on Polygons

using ArchGDAL as reference.

Unary

Why union here?

attributes predicates Immutable Operations mutable Operations
found geomlength, geomarea issimple, isring boundary, convexhull , buffer, centroid
not found geomdim, getcoorddim, envelope isempty, isvalid simplify, union closerings!

Binary

predictors Immutable Operations
found intersects, equals, disjoint, touches, crosses, within, contains, overlaps intersection, difference , symdifference
not found union

I don’t know, why some could not be found. Btw. are there some anual (virtuell) meetings/ discussions in the Julia gis community?

Its not really worth doing thes Rasters.jl benchmarks on master, the branch that will merge soon is very different and much faster…

By the way, how can I get the union of all geometries in a GeoDataFrame? Is it the unary union?

Well yes, you need to use union, but you do it yourself, either you need to loop over the geometries 2 at a time, building up the union. Or have a strategy that resembles a reversed binary tree. I’ve made an issue for it to make this easier.

Why union here?

Not sure if I follow you, but union can also take a multigeom and collapse it? This is why it’s also a unary operation.

I don’t know, why some could not be found

Well I don’t export them all, see exports.jl. I’ve made another issue for it, thanks for reporting. Note you can just use ArchGDAL.yourmissingfunction on these geometries in the meantime.

Btw. are there some anual (virtuell) meetings/ discussions in the Julia gis community?

Not for now, but there is the geo channel on the Julia Slack for short questions, the best place is right here on the forum.

1 Like