Hello all. I have geospatial data in netCDF format with monthly frequency. The time variable is an array of DateTime objects. I want a selected variable var[LAT, LON, DEPTH, MONTH] to be reduced to var[LAT, LON, DEPTH, YEAR], where the value for each YEAR is the average of all months in that year. I don't need to write the data back to netCDF. What is the most efficient way to do this in Julia?
Have you tried just writing a loop? e.g.
newvar = zeros(eltype(var), size(var,1), size(var,2), size(var,3), nyears)
count = zeros(Int, size(newvar))
for i in CartesianIndices(var)
    inew = CartesianIndex(i[1], i[2], i[3], month2year(i[4]))
    newvar[inew] += var[i]
    count[inew] += 1
end
newvar ./= count # mean
where nyears and month2year are defined appropriately.
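For what it's worth, here is one possible way to fill in nyears and month2year. It is only a sketch: it assumes the time axis has already been read into a vector of DateTime values (called `times` here, a made-up name), that the array holds plain numbers without missing values, and it wraps the loop in a hypothetical function `yearly_mean` so it does not run on globals. Random data stands in for the real variable.

```julia
using Dates

# Hypothetical helper: yearly means along the 4th (time) dimension.
# Assumes `times[m]` is the timestamp of var[:, :, :, m].
function yearly_mean(var::AbstractArray{<:Real,4}, times::AbstractVector{<:TimeType})
    size(var, 4) == length(times) || throw(DimensionMismatch("time axis length mismatch"))

    years = unique(year.(times))                    # calendar years, in order of appearance
    nyears = length(years)
    lookup = Dict(y => k for (k, y) in enumerate(years))
    month2year = [lookup[year(t)] for t in times]   # month index -> year index

    newvar = zeros(float(eltype(var)), size(var, 1), size(var, 2), size(var, 3), nyears)
    for i in CartesianIndices(var)
        newvar[i[1], i[2], i[3], month2year[i[4]]] += var[i]
    end

    # Number of months that fell into each year (handles incomplete years).
    counts = zeros(Int, nyears)
    for k in month2year
        counts[k] += 1
    end
    newvar ./= reshape(counts, 1, 1, 1, :)          # sums -> means
    return newvar, years
end

# Stand-in data: 24 monthly steps covering 2000 and 2001.
times = collect(DateTime(2000, 1, 1):Month(1):DateTime(2001, 12, 1))
var = rand(4, 5, 3, length(times))
yearly, years = yearly_mean(var, times)
```

Because the year index comes from the timestamps rather than from a fixed block of 12, this variant also copes with a first or last year that is only partly covered.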
Yes, I can do it the straightforward way with a loop. I was wondering whether any package already has a function like coarsen in xarray, or whether some one-liner is possible.
Convert to a DataFrame and use DataFramesMeta to group by year and average over the months with @combine.
A solution without DataFramesMeta could look like
using DataFrames, Dates, Statistics
d = DataFrame(month = now()-Year(1):Month(1):now(), lat = rand(13), lon = rand(13), depth = rand(13))
vars = [:lat, :lon, :depth]
d[!,:year] = floor.(d.month, Year)
r = combine(groupby(d, :year), vars .=> mean)
My data is a huge 4d array. I don’t think a data frame is an appropriate data structure for it.
Maybe you should explain the structure of your data precisely.
In the meantime you will get some attempts at interpreting it, to which I add mine, imagining a matrix m sorted by its last column, the date:
[mean(m[1+(i-1)*12:12+(i-1)*12,:],dims=1) for i in 1:Int(size(m,1)/12)]
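The same block-of-12 idea can be carried over to a 4D array. The following is a sketch, not a tested recipe for the real dataset: it assumes the time axis is the last dimension, that it starts in January, and that it contains only complete years. The array `var` below is random stand-in data.

```julia
using Statistics

# Stand-in for var[LAT, LON, DEPTH, MONTH]: 2 full years of monthly data.
var = rand(4, 5, 3, 24)

nlat, nlon, ndepth, nmonths = size(var)
nmonths % 12 == 0 || error("expected a whole number of years")

# Split the time axis into (month-of-year, year). Julia arrays are
# column-major, so consecutive months land in the same column.
# For an Array, reshape shares memory with var and does not copy the data.
v5 = reshape(var, nlat, nlon, ndepth, 12, :)

# Average over the month-of-year axis and drop it:
# the result has dimensions [LAT, LON, DEPTH, YEAR].
yearly = dropdims(mean(v5; dims=4); dims=4)
```

If the record does not start in January or has a partial year at either end, the reshape trick no longer lines up with calendar years and the months would have to be grouped by their timestamps instead.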
You could use the YAXArrays package for that. See Estimating statistics per month · Issue #217 · JuliaDataCubes/YAXArrays.jl · GitHub for an example of doing time aggregation. Beware that this example will only work on version 0.4. We recently switched the package to use DimensionalData as the array type and still need to polish some edges.
This does exactly what I need, thanks! (How could I forget about comprehensions??)
I'll look at YAXArrays, thanks. I am an active Python xarray user who is trying to port some of its functionality to Julia in a domain-specific way.