I have some NetCDF files containing temperature and precipitation on a 360 x 180 grid. The files have no time dimension; the time is encoded only in each file name (2012.01.nc, 2012.02.nc, … ). I want to open all the files and read the variables along a time dimension.
Is it possible for you to edit the files? Then you could write a function that opens a netCDF called “2012.01.nc”, and use that name to create a new variable called “time” with the value 2012-01-01, and save the file. If you run such a function on all files then you should be able to use aggdim="time" afterwards.
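As a rough sketch of the first half of that idea (getting a date out of the file name), something like the following might work. The helper name date_from_filename is made up for illustration, and it assumes the names always end in yyyy.mm.nc as in the question; writing the value back into the file is left out here.

```julia
using Dates

# Hypothetical helper: pull year and month out of names such as "2012.01.nc"
# and return the first day of that month.
function date_from_filename(path::AbstractString)
    name = basename(path)
    m = match(r"(\d{4})\.(\d{2})\.nc$", name)
    m === nothing && throw(ArgumentError("no yyyy.mm stamp in $name"))
    return Date(parse(Int, m.captures[1]), parse(Int, m.captures[2]), 1)
end

# Example file names taken from the question; sorting keeps them in time order.
files = ["2012.02.nc", "2012.01.nc"]
dates = date_from_filename.(sort(files))
```

Because the pattern is only anchored at the end of the name, a prefixed name such as tes.2018.06.nc should be handled the same way.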
Hi, thanks.
Editing is possible, but temperature and precipitation have only Lon and Lat as dimensions. When I define the time variable and set aggdim="time", I cannot merge my two variables in time.
However, if I first define a new variable “aod” with dimensions Lon x Lat x Time, then I can merge. So the new task is: how can I append a new dimension to an existing variable whose dimensions are already defined? (I want to change the dimensions of temperature to Lon x Lat x Time without rewriting the whole NetCDF file.)
Ah, OK. I assumed that it could use scalar-valued variables called “time”. I updated the NCDatasets issue, because that would already be quite helpful.
So the new task is: how can I append a new dimension to an existing variable whose dimensions are already defined?
I don’t think you can add a dimension in place. You could create a variable based on what you have, but with the extra dimension, and then just copy over the data.
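To illustrate only the data side of that suggestion, here is an in-memory sketch with plain Julia arrays. The array tmp2d is a made-up stand-in for a Lon x Lat field read from one file; defining the new variable in the NetCDF file itself is not shown.

```julia
# Stand-in for a lon x lat field read from one file (values are made up).
tmp2d = rand(Float32, 360, 180)

# The same values arranged as lon x lat x time with a single time step.
# reshape does not copy; the result shares memory with tmp2d.
tmp3d = reshape(tmp2d, size(tmp2d)..., 1)
```

The 3D array could then be written into a newly defined variable that has the extra time dimension.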
You can use YAXArrays.jl. If all your files have the same dimensions and sizes, you can do the following:
using YAXArrays, NetCDF, Dates
"""
getdate(x,reg = r"[0-9]{8}T[0-9]{6}", df = dateformat"yyyymmddTHHMMSS")
Return a DateTime object from a string where the time stamp is found by `reg`
and it is parsed according to the `df` dateformat.
"""
function getdate(x, reg = r"[0-9]{8}T[0-9]{6}", df = dateformat"yyyymmddTHHMMSS")
    m = match(reg, x).match
    return DateTime(m, df)
end
filelist = readdir(pwd()) # This gets you a list of the files in the current directory.
timestamps = getdate.(filelist)
cubelist = Cube.(filelist)
timeaxis = RangeAxis("Time", timestamps)
cube = concatenatecubes(cubelist, timeaxis)
With this you would get a YAXArray with which you can then do your data analysis. Alternatively, you can use savecube(cube, "pathtosave.nc") to store the data in a single NetCDF file on disk.
Hi, this is an interesting approach with YAXArrays.jl, but when I run Cube.(filelist), I get the error:
ERROR: NetCDF file /home/inna/Desktop/tes.2018.06.nc does not have a variable named lon
I found a way:
using NCDatasets, Glob
paths = sort(glob("home/inna/Desktop/tes*.nc","/"))
a=[Dataset(i) for i in paths] # an array with many datasets
a[1]["tmp"][:,:,:] # the first dataset with one specific variable
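Once each file's variable has been read into an ordinary array, the pieces can be stacked along a new time dimension. This is a minimal sketch with made-up arrays standing in for the data read from each dataset, assuming every file yields an array of the same Lon x Lat size.

```julia
# Stand-ins for the lon x lat arrays read from each monthly file.
monthly = [fill(Float32(k), 360, 180) for k in 1:3]

# Concatenate along a new third (time) dimension.
tmp = cat(monthly...; dims=3)
```

The order of the slices follows the order of the input arrays, so sorting the paths first (as above) keeps the time steps in sequence.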
Is that solution supposed to work with an arbitrary dimension? Let’s say I have 2 files member0.nc and member1.nc that I want to combine along the mydim = Dim{:member}([0, 1]). I’m doing:
series = RasterSeries(files, mydim)
combined_raster = Rasters.combine(series, Dim{:member})
When doing so, I’m getting a strange
LoadError: BoundsError: attempt to access NTuple{7, Int64} at index [8]
Is it supposed to work and if so, would you have any idea what I’m doing wrong? Thanks!
Hi @tcarion, I’m not sure what is happening there, and I don’t have enough information to reproduce it. Please open a GitHub issue for this on Rasters.jl, including a link to the files and the full stack trace of the error, and I’ll help you resolve it.
If your MWE includes all of this as a working script (including downloading the file) I’ll be able to help more easily.