Reducing monthly data to yearly

Hello all. I have geospatial data in netCDF format at monthly frequency. The time variable is an array of DateTime objects. I want to reduce a selected variable var[LAT, LON, DEPTH, MONTH] to var[LAT, LON, DEPTH, YEAR], where the value for each YEAR is the average over all months of that year. I don’t need to write the data back to netCDF. What is the most efficient way to do this in Julia?


Have you tried just writing a loop? e.g.

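# Accumulate per-year sums and per-year sample counts, then divide.
newvar = zeros(eltype(var), size(var,1), size(var,2), size(var,3), nyears)
counts = zeros(Int, size(newvar))  # named `counts` to avoid shadowing Base.count
for i in CartesianIndices(var)
    # Same lat/lon/depth position, with the month index mapped to its year index
    inew = CartesianIndex(i[1], i[2], i[3], month2year(i[4]))
    newvar[inew] += var[i]
    counts[inew] += 1
end
newvar ./= counts # mean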

where nyears and month2year are defined appropriately.
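
One possible way to define them, as a minimal sketch: it assumes the time axis is available as a vector of DateTime values. The name times and the dates below are made up for illustration and are not from the original post.

```julia
using Dates

# Hypothetical time axis: 36 monthly timestamps starting in January 2000.
times = collect(DateTime(2000, 1, 1):Month(1):DateTime(2002, 12, 1))

years = year.(times)                      # calendar year of each time step
firstyear = minimum(years)
nyears = maximum(years) - firstyear + 1   # number of year slots in the output

# Map a time index (4th dimension of var) to a 1-based year index.
month2year(t) = years[t] - firstyear + 1
```

Because the mapping is read from the timestamps rather than computed as a block of 12, it should also cope with a first or last year that is only partly covered.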


Yes, I can do it the straightforward way with a loop. I was wondering whether any package already has a function like coarsen in xarray, or whether a one-liner is possible.

Convert to a DataFrame, and use DataFramesMeta to group by year and average over the months with @combine.

A solution without DataFramesMeta could look like

using DataFrames, Dates, Statistics
d = DataFrame(month = now()-Year(1):Month(1):now(), lat = rand(13), lon = rand(13), depth = rand(13))
vars = [:lat, :lon, :depth]
d[!,:year] = floor.(d.month, Year)
r = combine(groupby(d, :year), vars .=> mean)

My data is a huge 4d array. I don’t think a data frame is an appropriate data structure for it.

Maybe you should explain the structure of your data precisely.
In the meantime you will get various attempts at interpreting it. Here is mine, which assumes a matrix whose rows are sorted by date (the last column) and cover complete years of 12 months:

[mean(m[1+(i-1)*12:12+(i-1)*12,:],dims=1) for i in 1:Int(size(m,1)/12)]
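
The comprehension above slices a matrix in blocks of 12 rows. For a 4-D array laid out as var[LAT, LON, DEPTH, MONTH], as in the original question, the same idea might be sketched as follows. The array sizes, the random data, and the names times, yrs, and yearly are assumptions for illustration only. The first variant groups by calendar year taken from the timestamps; the second is a reshape shortcut that is only valid when the series starts in January and covers whole years.

```julia
using Dates, Statistics

# Hypothetical stand-in data: a 2x3x2 grid with 24 monthly time steps.
times = collect(DateTime(2000, 1, 1):Month(1):DateTime(2001, 12, 1))
var = rand(2, 3, 2, length(times))

yrs = year.(times)    # calendar year of each time step
uyrs = unique(yrs)    # distinct years, in order of appearance

# Variant 1: average the time steps belonging to each year, then stack along dim 4.
yearly = cat((mean(view(var, :, :, :, yrs .== y); dims=4) for y in uyrs)...; dims=4)

# Variant 2 (assumes complete, January-aligned years): split the time axis into
# (12, nyears), average over the month axis, and drop that singleton dimension.
yearly2 = dropdims(mean(reshape(var, size(var)[1:3]..., 12, :); dims=4); dims=4)
```

Variant 2 avoids the boolean masks and the concatenation, so it is likely the cheaper option for a large array, but it will silently mix years if the time axis does not start in January.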

You could use the YAXArrays package for that. See Estimating statistics per month · Issue #217 · JuliaDataCubes/YAXArrays.jl · GitHub for an example of doing time aggregation. Beware that this example will only work on version 0.4. We recently switched the package to use DimensionalData as the array type and need to polish some edges.

This does exactly what I need, thanks! (How could I forget about comprehensions??)

I’ll look at YAXArrays, thanks. I am an active Python xarray user who is trying to port some of its functionality to Julia in a domain-specific way.