How to use Rasterize in GeoStats.jl

(note: I am splitting this thread from this one in order to avoid misunderstanding)

I can confirm that the latest version of GeoIO can now successfully import a geopackage where the geometries are provided in a Shape column. This results in a GeoTable.

But then how do I use GeoStats.Rasterize in order to create pixels with the value of the largest area of land use in the pixel and save as a tiff ? Land use in my case is an integer code in column “Code_18”.

Thank you for the update on GeoIO, I confirm it works!

1 Like

Thanks for splitting the thread, and for trying the packages. I am assuming you did read the docstring of the transform:

  Rasterize(grid, var₁ => agg₁, ..., varₙ => aggₙ)

  Rasterize geometries within specified grid.

  Rasterize(nx, ny)
  Rasterize(nx, ny, var₁ => agg₁, ..., varₙ => aggₙ)

  Alternatively, use the grid with size nx by ny obtained with discretization of the bounding box.

  Duplicates of a variable varᵢ are aggregated with aggregation function aggᵢ. If an aggregation function is not defined for variable varᵢ, the default aggregation function will be used.
  Default aggregation function is mean for continuous variables and first otherwise.


  grid = CartesianGrid(10, 10)
  Rasterize(10, 10)
  Rasterize(grid, 1 => last, 2 => maximum)
  Rasterize(10, 10, 1 => last, 2 => maximum)
  Rasterize(grid, :a => first, :b => minimum)
  Rasterize(10, 10, :a => first, :b => minimum)
  Rasterize(grid, "a" => last, "b" => maximum)
  Rasterize(10, 10, "a" => last, "b" => maximum)

You can specify the grid (e.g. CartesianGrid) on which the rasterization will occur, and then pass aggregation functions per variable as in the example, to decide what to do with the matches.

Do I understand correctly that you want to take the mode of the land types within a pixel?

I would try something like:

geotable |> Rasterize(grid, "Code_18" => mode)

That will return a new geotable over the grid with the most frequent land type in each pixel.

1 Like

Thank you, this indeed works:

table = GeoIO.load(clc2018_path) 
grid = GeoStats.CartesianGrid(100, 100)
table |> GeoStats.Rasterize(grid, "Code_18" => GeoStats.mode)

(I did read the docstring, but I didn’t understand it)

Is it slow? Suggestions to improve the docstring?

There are various issues related to real data. I remember seeing NaN coordinates in this file, is it affecting the algorithms?

I would also use a Select before Rasterize to avoid aggregating variables you don’t care:

geotable |> Select("Code_18") |>
Rasterize(grid, "Code_18" => mode)

We didn’t have a chance to optimize these functions yet. We are almost there in terms of public API, and design.

No, it took ab 30 minutes. A bit slower but still feasable.
I have tried only using the toy CartesianGrid, would it work also when grid is a raster made with a certain {xtop,ytop,xres,yres,crs} ? I mean, would it reproject the geometries on the fly to make the spatial match ?

It would be nice to identify bottlenecks. We will try to fix all of them after we propagate the new CRS infrastructure throughout the stack. Can you share a MWE reproducing the 30min wait? That would be awesome to speed things up later.

We are working on the new Proj transform, which will take any geotable and convert into another CRS of interest. It should be available soon.

using Pkg

import Downloads, GeoIO, GeoStats

clc_url = "" 
clc2018_path = "clc2018.gpkg",clc2018_path)
table = GeoIO.load(clc2018_path) 
grid = GeoStats.CartesianGrid(100, 100)
table |> GeoStats.Rasterize(grid, "Code_18" => GeoStats.mode)

The bounding box of the polygons is somewhere else:

julia> boundingbox(table.geometry)
├─ min: Point(x: 918863.6500000004 m, y: 916357.4100000001 m)
└─ max: Point(x: 7.316593380000001e6 m, y: 5.440568300000001e6 m)

Notice that the grid starts at origin (0,0) by default:

julia> CartesianGrid(100,100)
100×100 CartesianGrid
├─ minimum: Point(x: 0.0 m, y: 0.0 m)
├─ maximum: Point(x: 100.0 m, y: 100.0 m)
└─ spacing: (1.0 m, 1.0 m)

Please make sure that you defined the correct grid, or simply use the option Rasterize(nx, ny) to discretize the bounding box of the polygons into a grid.