Error in resampling with Rasters.jl

I’m trying to resample a data of lambert_azimuthal_equal_area projection to normal Lat/Lon coordinates by using Rasters.jl as following:

using NCDatasets
using ArchGDAL
using DimendionalData
using Rasters

ds = Dataset(“”)
x = ds[“x”][:]
y = ds[“y”][:]
swe = ds[“swe”][:,:]
msv = ds[“swe”].attrib["_FillValue"]
p = ds[“crs”].attrib[“spatial_ref”]
gt = ds[“crs”].attrib[“GeoTransform”]

source = AG.importWKT(p)
@show source

target = AG.importEPSG(4326,order=:trad)
@show target

dims = (X(x,crs=source), Y(y,crs=source))

swe_ra = Raster(swe, dims)

#B = X(-180:1:180,crs=EPSG(4326)),Y(-90:1:90,crs=EPSG(4326))

swe_x = resample(swe_ra,1; crs=EPSG(4326))

The last line prompts: ERROR: LoadError: TypeError: in keyword argument dtype, expected DataType, got Type{Union{Missing, Int32}}

Any clue how to fix it? Thanks.

The sample data is available at:

Yeah, this is a hard thing that could be easier.

First NetCDF projections are a mess. NCDatasets.jl does not load the native CF standards projection, probably because its arcane and pretty hard to implement. The “crs / spatial_ref” fields you are retrieving is not a CF standard so I haven’t written anything specific to get it automatically (but we easily could if this is common enough, please make a github issue). The dimension variables are often projected to EPSG(4326) already even if the underlying projection is different, making things even harder to get right when you cant automatically detect the crs.

Second, NCDatasets.jl uses missing as the missing value by default (See Is it possible to not use `missing`? · Issue #132 · Alexander-Barth/NCDatasets.jl · GitHub), and GDAL (that does the resampling in this case) cannot handle missing. One day I will automate replacing the missing values, but for now you need to use replace_missing manually on your Raster.

Last, because netcdf is by default a Mapped projection that can handle the variable having a different projection to the underlying data, you need to specify both crs and mappedcrs are your well known text values. (I should make that optional seeing your dimension variables here are actually the raw geotransforn).

The good news is Rasters.jl automates more than you think, so you can shorten that code a bit. This works for me:

using Rasters, NCDatasets
fn = download("", "")
# Get the crs with NCDatasets, and wrap it in `WellKnownText`:
wkt = WellKnownText(Dataset(fn)["crs"].attrib["spatial_ref"])
# Choose the :swe layer from the netcdf file
swe = Raster(fn; name=:swe, crs=wkt, mappedcrs=wkt)
# Use typemin(Int32) as the missing value.
swe = replace_missing(swe, typemin(Int32))
# And resample to EPSG(4326)
resampled = resample(swe, 1; crs=EPSG(4326))

Thanks. It works smartly.

I’m not sure the “crs spatial/_ref” information is common enough, it’s my first time running into this dataset.

Btw, if plotting the resampled variable with GMT, how could I extract the longitude/latitude information?

dims(x) gives you the dimensions of the raster or stack - the X/Y dims hold the lon/lat values. They indexable but not <: AbstractAray - they wrap an Abstractarray lookup. So you can use parent to get them. So something like:

lons, lats = map(parent, dims(raster, (X, Y)))

Thanks! I can use dims(x) to get the lon/lat info now.

Btw, when I use GDAL’s gdalwarp command to transform the dataset, it prompts the following information, while it doesn’t happen when using Rasters.resample. I think the Rasters.jl call GDAL’s gdalwarp as well.

gdalwarp -s_srs ‘+proj=laea +lat_0=90 +lon_0=0 +x_0=0 +y_0=0 +R=6371228 +units=m +no_defs’ -t_srs ‘+proj=longlat +datum=WGS84’ -tr 0.5 0.5

Warning 1: Several drivers matching nc extension. Using NETCDF
ERROR 1: Point outside of projection domain
ERROR 1: Point outside of projection domain
ERROR 1: Point outside of projection domain
ERROR 1: Point outside of projection domain
Creating output file that is 720P x 360L.
Processing [1/1] : 0Using internal nodata values (e.g. -100000) for image
Copying nodata values from source to destination
ERROR 1: PROJ: laea: Invalid latitude
ERROR 1: PROJ: laea: Invalid latitude
ERROR 1: PROJ: laea: Invalid latitude
ERROR 1: PROJ: laea: Invalid latitude

Rasters.jl doesn’t use GDAL on the netcdf directly, because GDAL isn’t great for handling complex netcdf files. Instead it iteratively loads each lat/lon slice of the (potentially multi-dimensional) netcdf variable using NCDatasets.jl, creates an in-memory GDAL dataset, resamples it with gdalwarp, and cats them back together. So the X/Y dims are resampled, and the others remain unchanged.

This is actually generalised for any kind of input - resample doesn’t know it’s working with a netcdf file, its just a generic operation.

With the latest changes soon to be merged this will be possible to do direct from/to disk (besdes the slices being resampled) whenever NCDatasets gets a DiskArrays.jl interface. But it’s currently a memory-backed operation.

I’m not sure what GDAL is trying to do with a raw netcdf file in your example, I don’t use it for netcdf.

Thanks for your detailed information.
Rasters.jl is really amazing in doing this job.

1 Like