New package to map GRIB files to Unidata's Common Data Model v4 following the CF Conventions

Hi all!

At ECMWF we are starting a new project/Julia package to map GRIB files (typically used in meteorology and climate science) to Unidata’s Common Data Model v4 following the CF Conventions.

We already have a Python interface on GitHub, called cfgrib, which is based on the ECMWF ecCodes library. Is there any existing Julia package in this realm?

I’d like this project to align as closely as possible with the existing efforts in JuliaGeo. Could you please point me to any resources on the common interface between Julia geospatial packages?

Many thanks,
Claudia

It is great to hear that ECMWF is interested in this, and I would love to see direct GRIB integration in Julia. Unfortunately I cannot point you to the geospatial data type, because there is none yet. There has been a lot of discussion recently about connecting different grid type implementations through a common interface. You can follow the discussion at https://github.com/JuliaGeo/meta/issues/6, which ended in an initial implementation, https://github.com/JuliaGeo/DimensionalArrayTraits.jl, that tries to define a common interface for talking to dimensional data.

However, none of this is complete yet, and it is not yet used across the ecosystem. Others will probably chime in here ( @Alexander-Barth @Raf @visr ), but I would personally suggest first defining a data type that fits the GRIB data model best, and afterwards implementing the functions from DimensionalArrayTraits once they are more consolidated and more widely used in other packages.

Alternatively, you could subtype some data type from https://github.com/rafaqz/GeoData.jl, which implements the traits from DimensionalArrayTraits as well.

A side question: Are you planning to write this package in pure Julia or as a wrapper around the eccodes C library?

Hi, welcome, great to see ECMWF interest!

I agree with Fabian’s tips above. For vector formats, too, it is often easiest to define format-specific vector types that match the format well, and to provide interoperability with the rest of the ecosystem through an interface package. A good example of such an effort is https://github.com/JuliaData/Tables.jl for different tabular types. For vector types we are attempting a similar approach in https://github.com/yeesian/GeoInterfaceRFC.jl.

Just now I discovered https://github.com/weech/GRIB.jl, which is an interface to the ECMWF ecCodes library. So pinging @weech as well.

Some more related libraries that are good to be aware of:
https://github.com/Alexander-Barth/NCDatasets.jl,
https://github.com/JuliaGeo/CFTime.jl,
https://github.com/JuliaGeo/NetCDF.jl,
https://github.com/meggart/Zarr.jl/

Of course, GRIB I/O is already possible through the GDAL wrappers, as noted in the thread “Is there interest in having a Julian API for the NCDC's Climate Data Online (CDO)” (post #19 by visr), but that is probably not what you are after.

https://github.com/weech/GRIB.jl could be useful. Weech apparently intends to publish it, but I haven’t gotten a response from them about it yet.

I would like to include GRIB as a data source in GeoData.jl. That would lead to interop with NetCDF, GDAL, and grd formats, and eventually more. GeoData.jl is an abstraction layer over raster data types that presents a common interface to users and other packages - for single files or large multi-file datasets. It’s just not quite published yet, as it’s mostly a side project at the moment. Hopefully in the next month or two I can make a release.

But I hope we can work collaboratively. There are a bunch of efforts at unifying the geospatial ecosystem happening at the moment, but things are a while from being settled.

Hi, I would definitely be interested in some sort of collaboration for better GRIB support in Julia, though I’m no GRIB expert. I just wrote my package so I wouldn’t have to PyCall the pygrib package anymore. It still lacks important things like Windows support. I did find the cfgrib package interesting, though I’m not very familiar with how it works.

You folks are awesome! Thanks for all the links and pointers to related projects.
I’ll look at all of them and post any progress on the ECMWF side here.

@fabiangans for the time being the plan is to create a Julia wrapper around the eccodes library.

@visr thanks for the link to GRIB.jl!

@weech @Raf there seems to be scope for collaboration. I now need to get my head around all this new info. Will get back to you on this soon!

Hi all,
I am a colleague of @cvitolo and am also digesting your input ;) To give a bit more info, what we achieved with ‘cfgrib’ in Python was to enable users who do not (need to) know what a GRIB file is to read and work easily with the data. In the Python world, the ‘xarray’ data cube abstraction made it very easy for users to understand and manipulate data. In the end, you should not need to know the data structures in a GRIB file to create a temperature time series for your home town. We would love to achieve something similar for Julia.
Over the next weeks we will experiment with the various array structures in Julia. I see there have already been some efforts to emulate xarray functionality in Julia. Of course, the geospatial reference is key to us.
We will let you know how we progress.

That’s pretty much the reason for writing GeoData.jl: to provide abstractions so that users, other devs and my other packages don’t have to know GDAL/NetCDF/HDF5-specific functions. It should end up something like Xarray (not that I really use it), but as an extensible interface rather than a set of fixed data types, although it has those too. You can do dimension-based indexing and selection by coordinates, datetime, etc. on arbitrary combinations of dimensions. It could probably do with a more memorable name, like Xarray has, before I publish it…
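
To make the idea of selecting by coordinates concrete, here is a minimal sketch in Python (the language of cfgrib). It is not the API of GeoData.jl or xarray: the `LabelledGrid` class and its `sel` method are hypothetical names invented for illustration, and the sketch assumes a 2-D grid whose coordinate vectors are sorted in ascending order.

```python
from bisect import bisect_left


class LabelledGrid:
    """Hypothetical 2-D array with named dimensions and coordinate vectors."""

    def __init__(self, data, dims, coords):
        self.data = data      # nested lists, indexed as data[i][j]
        self.dims = dims      # dimension names in storage order, e.g. ("lat", "lon")
        self.coords = coords  # dict: dimension name -> ascending coordinate values

    def nearest_index(self, dim, value):
        """Index of the coordinate along `dim` that is closest to `value`."""
        axis = self.coords[dim]
        pos = bisect_left(axis, value)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(axis)]
        return min(candidates, key=lambda i: abs(axis[i] - value))

    def sel(self, **query):
        """Select one cell by coordinate values, given in any order."""
        i, j = (self.nearest_index(dim, query[dim]) for dim in self.dims)
        return self.data[i][j]


grid = LabelledGrid(
    data=[[1, 2, 3], [4, 5, 6]],
    dims=("lat", "lon"),
    coords={"lat": [40.0, 50.0], "lon": [0.0, 10.0, 20.0]},
)
value = grid.sel(lon=11.0, lat=48.0)  # the caller never touches integer indices
```

The point of the design is that the query names dimensions rather than positions, so the same call works regardless of how the underlying file orders its axes.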

One question: how does chunking work in GRIB files? Can you subset them directly from disk instead of loading the whole file? That would help with running simulations over large datasets.
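
As a rough illustration of why subsetting from disk is plausible: a GRIB file is a sequence of self-contained messages, and each message starts with the marker `GRIB` and records its own total length in its first section (3 bytes in edition 1, 8 bytes in edition 2). The Python sketch below indexes the messages by reading only those headers, then seeks to a single message. The function names are hypothetical, the sketch assumes messages are stored back to back with no padding between them, and it ignores other edge cases. Whether a reader can go further and subset within a message depends on the packing, which this sketch does not address.

```python
def index_grib_messages(stream):
    """Return a list of (offset, length, edition), reading only message headers."""
    index = []
    offset = 0
    while True:
        stream.seek(offset)
        header = stream.read(16)
        if len(header) < 8 or header[:4] != b"GRIB":
            break  # end of file, or something this sketch does not understand
        edition = header[7]
        if edition == 1:
            # Edition 1: total message length in octets 5-7.
            length = int.from_bytes(header[4:7], "big")
        elif edition == 2:
            if len(header) < 16:
                break
            # Edition 2: total message length in octets 9-16.
            length = int.from_bytes(header[8:16], "big")
        else:
            raise ValueError(f"unsupported GRIB edition {edition}")
        if length < 8:
            raise ValueError("implausible message length")
        index.append((offset, length, edition))
        offset += length  # jump straight to the next message without decoding
    return index


def read_message(stream, entry):
    """Read the raw bytes of one indexed message, leaving the rest on disk."""
    offset, length, _edition = entry
    stream.seek(offset)
    return stream.read(length)
```

With such an index, a caller would open the file in binary mode, build the index once, and then call `read_message` only for the fields it needs.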

I opened this issue with weech the other day. Presenting a standard Julia array interface is the easiest way to select data, but I’m not sure it will apply to GRIB.
https://github.com/weech/GRIB.jl/issues/1

Would this be a good project for the ECMWF Summer of Weather Code 2020?

https://www.ecmwf.int/en/learning/workshops/ecmwf-summer-weather-code-2019

Hi @Balinus, glad you know about the ECMWF Summer of Weather Code!
We have already hired a contractor to work on this in 2020. But it’s definitely a good idea and we might propose further developments in the context of the Summer of Weather Code in 2021, if there is another edition of the programme next year.

Hi!

How is it going with your new eccodes-wrapper?

Just posting here that GRIB.jl has just been merged into the General repo.

Hi!

I’m working with the GRIB package and I think I have noticed a bug. While reading hourly data from ERA5, the date key has a maximum temporal resolution of days instead of hours. So the 24 messages for the hours of a given day all have exactly the same date value (the same YYYYmmdd value).

However, I don’t know why this is happening.

Thank you for reading. I hope I have explained myself properly :s

It is probably better to create an issue on GitHub in the GRIB.jl repo.
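
One possible explanation for the identical date values, offered as an assumption rather than a confirmed diagnosis of GRIB.jl: in ecCodes the `date` key (an alias of `dataDate`) holds only YYYYMMDD, while the hour and minute live in a separate `time` key (`dataTime`, encoded as HHMM). If that is what is happening here, the two integers have to be combined to get an hourly timestamp. A minimal sketch in Python, with a hypothetical helper name:

```python
from datetime import datetime


def message_datetime(data_date, data_time):
    """Combine dataDate (YYYYMMDD) and dataTime (HHMM) integers into a datetime.

    Hypothetical helper: assumes both values were read as integers from a message.
    """
    year, rest = divmod(data_date, 10000)
    month, day = divmod(rest, 100)
    hour, minute = divmod(data_time, 100)
    return datetime(year, month, day, hour, minute)


# Two messages from the same day share data_date but differ in data_time.
midnight = message_datetime(20200115, 0)
afternoon = message_datetime(20200115, 1300)
```

For forecast fields the valid time may additionally depend on the step, which this sketch does not handle.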

Hi,

As a meteorologist and model developer who is learning the Julia language, I am very pleased to hear about support for reading GRIB files.
Unfortunately, weather observations are increasingly stored in the BUFR format, which is hard for humans to read (formerly, there were the SYNOP and METAR formats).
Are there any plans to develop a Julia BUFR reader?

Hi @Dieter and thanks for the feedback! I’m not aware of a plan to develop a Julia BUFR reader but I’ll let the developer team know this would be helpful.

I don’t know about any plans for a BUFR reader, but it looks like ecCodes supports reading BUFR files, and there is already a Julia binary package, https://github.com/JuliaBinaryWrappers/eccodes_jll.jl, which is also used by GRIB.jl.

You could either use this binary package directly by hand-coding the respective ccalls, which can be tedious, or use a package like Clang.jl to wrap the eccodes header file and get Julia versions of the functions and constants defined in the header. There are quite a few people in JuliaGeo who have experience wrapping geo-related C libraries like NetCDF, GDAL, PROJ4, etc., so if you need advice, feel free to ask here.

Thanks for your reply. I have seen the comment in GRIB.jl (https://github.com/weech/GRIB.jl) where “Add support for BUFR files” is mentioned as a future plan.

Please note that GRIBDatasets.jl (JuliaGeo/GRIBDatasets.jl on GitHub), a high-level interface to GRIB-encoded files, now exists. It is being integrated into Rasters.jl, so the source abstraction works for GRIB files as well (only reading is supported for now).
