How to use Rasterize in GeoStats.jl

sylvaticus · June 7, 2024, 2:12pm

(note: I am splitting this thread from this one in order to avoid misunderstanding)

I can confirm that the latest version of GeoIO can now successfully import a geopackage where the geometries are provided in a Shape column. This results in a GeoTable.

But then how do I use GeoStats.Rasterize in order to create pixels with the value of the largest area of land use in the pixel and save as a tiff ? Land use in my case is an integer code in column “Code_18”.

Thank you for the update on GeoIO, I confirm it works!

juliohm · June 7, 2024, 2:46pm

Thanks for splitting the thread, and for trying the packages. I am assuming you did read the docstring of the transform:

  Rasterize(grid)
  Rasterize(grid, var₁ => agg₁, ..., varₙ => aggₙ)


  Rasterize geometries within specified grid.

  Rasterize(nx, ny)
  Rasterize(nx, ny, var₁ => agg₁, ..., varₙ => aggₙ)


  Alternatively, use the grid with size nx by ny obtained with discretization of the bounding box.

  Duplicates of a variable varᵢ are aggregated with aggregation function aggᵢ. If an aggregation function is not defined for variable varᵢ, the default aggregation function will be used.
  Default aggregation function is mean for continuous variables and first otherwise.

  Examples
  ≡≡≡≡≡≡≡≡

  grid = CartesianGrid(10, 10)
  Rasterize(grid)
  Rasterize(10, 10)
  Rasterize(grid, 1 => last, 2 => maximum)
  Rasterize(10, 10, 1 => last, 2 => maximum)
  Rasterize(grid, :a => first, :b => minimum)
  Rasterize(10, 10, :a => first, :b => minimum)
  Rasterize(grid, "a" => last, "b" => maximum)
  Rasterize(10, 10, "a" => last, "b" => maximum)

You can specify the grid (e.g. CartesianGrid) on which the rasterization will occur, and then pass aggregation functions per variable as in the example, to decide what to do with the matches.

Do I understand correctly that you want to take the mode of the land types within a pixel?

I would try something like:

geotable |> Rasterize(grid, "Code_18" => mode)

That will return a new geotable over the grid with the most frequent land type in each pixel.

sylvaticus · June 7, 2024, 3:57pm

Thank you, this indeed works:

table = GeoIO.load(clc2018_path) 
grid = GeoStats.CartesianGrid(100, 100)
table |> GeoStats.Rasterize(grid, "Code_18" => GeoStats.mode)

(I did read the docstring, but I didn’t understand it)

juliohm · June 7, 2024, 4:11pm

Is it slow? Suggestions to improve the docstring?

There are various issues related to real data. I remember seeing NaN coordinates in this file, is it affecting the algorithms?

I would also use a Select before Rasterize to avoid aggregating variables you don’t care:

geotable |> Select("Code_18") |>
Rasterize(grid, "Code_18" => mode)

We didn’t have a chance to optimize these functions yet. We are almost there in terms of public API, and design.

sylvaticus · June 7, 2024, 8:04pm

No, it took ab 30 minutes. A bit slower but still feasable.
I have tried only using the toy CartesianGrid, would it work also when grid is a raster made with a certain {xtop,ytop,xres,yres,crs} ? I mean, would it reproject the geometries on the fly to make the spatial match ?

juliohm · June 7, 2024, 8:16pm

It would be nice to identify bottlenecks. We will try to fix all of them after we propagate the new CRS infrastructure throughout the stack. Can you share a MWE reproducing the 30min wait? That would be awesome to speed things up later.

We are working on the new Proj transform, which will take any geotable and convert into another CRS of interest. It should be available soon.

sylvaticus · June 7, 2024, 8:37pm

using Pkg
Pkg.activate(@__DIR__)
cd(@__DIR__)

import Downloads, GeoIO, GeoStats

clc_url = "https://nc.beta-lorraine.fr/s/9NrqQBmR5ib2wmd/download" 
clc2018_path = "clc2018.gpkg"
Downloads.download(clc_url,clc2018_path)
table = GeoIO.load(clc2018_path) 
grid = GeoStats.CartesianGrid(100, 100)
table |> GeoStats.Rasterize(grid, "Code_18" => GeoStats.mode)

juliohm · June 7, 2024, 11:47pm

The bounding box of the polygons is somewhere else:

julia> boundingbox(table.geometry)
Box
├─ min: Point(x: 918863.6500000004 m, y: 916357.4100000001 m)
└─ max: Point(x: 7.316593380000001e6 m, y: 5.440568300000001e6 m)

Notice that the grid starts at origin (0,0) by default:

julia> CartesianGrid(100,100)
100×100 CartesianGrid
├─ minimum: Point(x: 0.0 m, y: 0.0 m)
├─ maximum: Point(x: 100.0 m, y: 100.0 m)
└─ spacing: (1.0 m, 1.0 m)

Please make sure that you defined the correct grid, or simply use the option Rasterize(nx, ny) to discretize the bounding box of the polygons into a grid.

Topic		Replies	Views
How to rasterize a geopackage (Corine Land Cover)? Geo	24	423	June 8, 2024
What is the state of Rasterization and Rasterstats in Julia Geo question	26	3295	April 8, 2022
Polygonize a Raster Geo	11	458	June 18, 2024
"Raster" syntax in GeoStats.jl Geo indexing , array , geo , rasters	0	114	September 9, 2024
[ANN] GeoStats.jl - Geospatial Data Science and Geostatistical Modeling in Julia Package Announcements package , announcement , statistics , geo , geostatistics	39	4486	August 8, 2025

How to use Rasterize in GeoStats.jl

Related topics