How to rasterize a geopackage (Corine Land Cover)?

Hello, being only marginally on the domain, I am a bit lost with the various GIS packagesโ€ฆ
I need to rasterize a geopackage (gpkg) vector file (Corine Land Cover).
Visualizing the gpkg file in QGis I can see there are various layers in a single file (in this case the main one is the European land use cover, the one I am interested in, and the others refer to overseas French territories).
My final objective is to take the rasterization at high resolution and count the various pixels of a given class within a low resolution pixel, so I have as final product a set of rasters, each one indicating the percentual of land use for a specific class within each pixel.

I think I can do the second step, but I need first to do the first rasterization. I have tried in qgis to trsnform the geopackage layer as Shapefile, for which there is an example of Rasterization in Rasters.jl, but QGis complain that the file is too big for the Shapefile format.

An alternative approach that needs more testing is to load the geopackage directly with GeoIO.jl and then use the Rasterize transform from GeoStats.jl

If you encounter issues or performance problems, please report.

You should be able to open the geopackage using ArchGDAL and then it will be the same as the shapefile example for rasterize. There is no need to convert the data beforehand, because both packages adhere to the GeoInterface and therefore these geo types are interchangeable.

Yes ArchGDAL or GeoDataframes.jl will open that fine, or also GeoIO.jl.

Then Rasters.jl should just rasterize any of those objects directly in rasterize without modification, because GeoInterface.jl handles the conversions for you.

There are no comprehensive benchmarks at this stage but it looks like Rasters.rasterize its one of the fastest implementations that exists. If you find anything else as fast, make an issue :wink:

(Also we started on a native julia GitHub - JuliaGeo/GeoPackage.jl: A Julia reader and writer for `.gpkg` files but didnโ€™t get so far yet)

Thank you both @Fliks and @Raf for the quick answers.

So, trying to work on your answers. The file I am trying to read is the Corine Land Cover 2018, in geopackage format.

using Downloads, Rasters,  GeoIO, ArcGDAL
url = "https://nc.beta-lorraine.fr/s/9NrqQBmR5ib2wmd/download"
clc2018_path = "clc2018.gpkg"
Downloads.download(url,clc2018_path)
table = GeoIO.load(clc2018_path) # ERROR: geometry column not found
table = ArchGDAL.read(clc2018_path)

The ArchGDAL call produces the following object (superquickly, so it is a lazy object):

julia> table
GDAL Dataset (Driver: GPKG/GeoPackage)
File(s): 
  /home/lobianco/CloudFiles/beta-lorraine-sync/Documents/DatiGeografici/EU/corine/clc2018/u2018_clc2018_v2020_20u1_geoPackage/DATA/U2018_CLC2018_V2020_20u1.gpkg

Number of feature layers: 6
  Layer 0: U2018_CLC2018_V2020_20u1 (wkbMultiPolygon)
  Layer 1: U2018_CLC2018_V2020_20u1_FR_REU (wkbMultiPolygon)
  Layer 2: U2018_CLC2018_V2020_20u1_FR_GLP (wkbMultiPolygon)
  Layer 3: U2018_CLC2018_V2020_20u1_FR_GUF (wkbMultiPolygon)
  Layer 4: U2018_CLC2018_V2020_20u1_FR_MTQ (wkbMultiPolygon)
  Remaining layers:
    U2018_CLC2018_V2020_20u1_FR_MYT, 

However, I then have an error in Rasters.raster():

clc_raster = Rasters.rasterize(last, table,fill=1) # ERROR: MethodError: no method matching iterate(::ArchGDAL.IDataset)
clc_raster = Rasters.rasterize(last, table[1],fill=1) # ERROR: MethodError: no method matching getindex(::ArchGDAL.IDataset, ::Int64)

Any clue ?

I managed to get the layer, but still it seems the ArchGDAL object is not compatible:

layer1 = ArchGDAL.getlayer(table, 0)
clc_raster = Rasters.rasterize(last, layer1,res=10000, missingval=0, fill=1, progress=true) 

Results in :

ERROR: ArgumentError: Object is not a GeoInterface.jl compatible geometry: ()

Hm seems these objects are not identifying as a GeoInterface.jl geometry/feature/feature collection or with a Tables.jl table.

Rasters.jl doesnโ€™t actually know about ArchGDAL objects, it just follows the Tables.jl and GeoInterface.jl interfaces.

Try using GeoDataFrames.jl instead? That should handle it for you. ArchGDAL is pretty low level.

Thanks @juliohm but GeoIO.load(filepath) results in an ERROR: geometry column not found :frowning:

Thank you for reporting the error @sylvaticus , it is probably in the file itself. We will investigate and report our findings.

Notice that, at present, GeoIO.jl simply relies on ArchGDAL.jl for .gpkg:

thanksโ€ฆ to note, I can open it with qgis without issuesโ€ฆ

Iโ€™ve updated my comment above with more info so that you can inspect the issue in more depth, in parallel.

My guess is that the file itself (still downloading here) lacks the expected structure for the backend packages to load it.

Ok, apparently the geometry column is named Shape in this file, which is not covered yet:

Let me try to work on a hot fix for this.

1 Like

yes, it is a multi-vector layer.
The read call of ArchGDAL just get the whole package, but then one has to use, as I discovered, use ArchGDAL.getlayer() to get the specific layer.

I will try 2 more attempts now:

  1. use GeoDataframes.jl to load the layer
  2. trying to use the Rasterize transform from GeoStats.jl on the ArchGDAL.getlayer outputโ€ฆ

Edit:
First attempt succeeded in creating a df (got a warning " Warning: This file has multiple layers, you only get the first layer by default now." but thatโ€™s fine for me).
I obtained a df that looks like:

julia> table
2375406ร—5 DataFrame
     Row โ”‚ Shape                      Code_18  Remark   Area_Ha   ID         
         โ”‚ IGeometrโ€ฆ                  String   String?  Float64   String     
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
       1 โ”‚ Geometry: wkbMultiPolygon  111      missing  130.864   EU_1
       2 โ”‚ Geometry: wkbMultiPolygon  111      missing   53.7445  EU_2
       3 โ”‚ Geometry: wkbMultiPolygon  111      missing   30.7191  EU_3
       4 โ”‚ Geometry: wkbMultiPolygon  111      missing   50.2018  EU_4
       5 โ”‚ Geometry: wkbMultiPolygon  111      missing  481.849   EU_5
       6 โ”‚ Geometry: wkbMultiPolygon  111      missing   42.6509  EU_6
    โ‹ฎ    โ”‚             โ‹ฎ                 โ‹ฎ        โ‹ฎ        โ‹ฎ          โ‹ฎ
 2375402 โ”‚ Geometry: wkbMultiPolygon  512      missing  807.041   EU_2375402
 2375403 โ”‚ Geometry: wkbMultiPolygon  512      missing  141.363   EU_2375403
 2375404 โ”‚ Geometry: wkbMultiPolygon  512      missing  246.634   EU_2375404
 2375405 โ”‚ Geometry: wkbMultiPolygon  512      missing   41.8458  EU_2375405
 2375406 โ”‚ Geometry: wkbMultiPolygon  512      missing   43.3464  EU_2375406
                                                         2375395 rows omitted

However passing this df to Rasters.rasterize I got the same error as the gdal object (ERROR: ArgumentError: Object is not a GeoInterface.jl compatible geometry: ())

Ok, prepared a hotfix to handle this unexpected geometry column name:

# helper function to find the
# geometry column of a table
function geomcolumn(names)
  snames = string.(names)
  gnames = ["geometry","geom","shape"]
  gnames = [gnames; uppercasefirst.(gnames)]
  gnames = [gnames; uppercase.(gnames)]
  gnames = [gnames; [""]]
  select = findfirst(โˆˆ(snames), gnames)
  if isnothing(select)
    throw(ErrorException("geometry column not found"))
  else
    Symbol(gnames[select])
  end
end

Managed to load the file without problems with the master branch of GeoIO.jl:

julia> GeoIO.load("/home/juliohm/clc2018.gpkg")
2375406ร—5 GeoTable over 2375406 GeometrySet
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   Code_18   โ”‚   Remark    โ”‚  Area_Ha   โ”‚     ID      โ”‚     geometry      โ”‚
โ”‚ Categorical โ”‚ Categorical โ”‚ Continuous โ”‚ Categorical โ”‚   MultiPolygon    โ”‚
โ”‚  [NoUnits]  โ”‚  [NoUnits]  โ”‚ [NoUnits]  โ”‚  [NoUnits]  โ”‚                   โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚     111     โ”‚   missing   โ”‚  130.864   โ”‚    EU_1     โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  53.7445   โ”‚    EU_2     โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  30.7191   โ”‚    EU_3     โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  50.2018   โ”‚    EU_4     โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  481.849   โ”‚    EU_5     โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  42.6509   โ”‚    EU_6     โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  139.015   โ”‚    EU_7     โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  76.7515   โ”‚    EU_8     โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  308.071   โ”‚    EU_9     โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  127.451   โ”‚    EU_10    โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚     111     โ”‚   missing   โ”‚  84.7652   โ”‚    EU_11    โ”‚ Multi(1ร—PolyArea) โ”‚
โ”‚      โ‹ฎ      โ”‚      โ‹ฎ      โ”‚     โ‹ฎ      โ”‚      โ‹ฎ      โ”‚         โ‹ฎ         โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                                        2375395 rows omitted

You can forward the layer option if necessary with GeoIO.load(file, layer=1).

Triggered a patch release so that you can update your environment:

Ok one problem is the geometry column name is weird. You can manually pass that to Rasters.jl using the keyword geometrycolumn=:Shape

Or just get that vector out of the data frame manually and rasterize it.

@sylvaticus if you decide to try the Rasterize transform on the loaded geotable with GeoIO.jl, please feel free to report further issues. These reports really help improve robustness of the packages.

If people want to use another package thatโ€™s fine, but honestly I find these threads with two competing methods from different packages are pretty distracting.

The question was for Rasters.jl and the solution is very simple. Letโ€™s not do this all the time. We can wait to see if there is no easy solution to the direct question.

In this case there is a keyword for exactly this problem.

1 Like

I wouldnโ€™t join the thread with an alternative solution if the title was requesting Rasters.jl specifically. I thought that @sylvaticus just wanted to accomplish a generic task as a first-time user of GIS software.

If this is a Rasters.jl question, sorry for the confusion.

I am very sorry, I didnโ€™t want to spark an issue between GIS packages.
Just GIS is not my main domain, but I needed to do โ€œsome GISโ€ within a larger model.
I have manually renamed the df outputed by GeoDataFrames and this then indeed worked with Rasters.Rasterize(), I will also try the GeoIO โ†’ GeoStats.Rasterize method and report if it worksโ€ฆ

I have a related issue but I am now afraid to askโ€ฆ :-/ How do I rasterize by setting the pixel value the one with the majority land use in the pixel (land use is in the vector format in the Code_18 column).

2 Likes

Op mentioned Rasters.jl in main comment and then shows a broken Rasters.rasterize example in the next commentโ€ฆ Iโ€™m just here trying to make that work.

@sylvaticus no worries at all

What do you mean the majority land use? Do you mean here multiple geometries cover the same pixel?