I am trying to optimize a correlation calculation on a 16x16 geo-grid:
using NetCDF
using Statistics   # for cor and mean

function corr1vall(s, lon1, lat1) # scenario, longitude 1, latitude 1
    # myPath, ssps, areaLat and rmcp are defined elsewhere
    pathPv = myPath * "pv/" * ssps[s] * "/day/EU/regr/"
    rmcp(pathPv)
    corr = zeros(16, 16, 28)
    for m in 1:28 # model
        fnam = pathPv * readdir(pathPv)[m]
        pv = ncread(fnam, "pv") # size (16, 16, 7300)
        @views pv1 = pv[lon1, lat1, :] # 20-yr time series of grid 1
        for lat in 1:16
            for lon in 1:16
                @views pv2 = pv[lon, lat, :] # 20-yr time series of grid 2
                corr[lon, lat, m] = cor(pv1 / areaLat[1], pv2 / areaLat[lat]) # normalize
            end
        end
    end
    return mean(corr, dims=3)[:, :, 1] # multi-model mean
end
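For reference, here is a minimal, self-contained version of the inner double loop on random data of the same shape, so the hot part can be reproduced and timed without my NetCDF files (the random pv and areaLat below are just placeholders for the real data):

using Statistics

areaLat = rand(16) .+ 1.0        # placeholder for the real latitude weights
pv = rand(16, 16, 7300)          # stands in for ncread(fnam, "pv")
corr = zeros(16, 16)

@views pv1 = pv[1, 1, :]         # time series of grid (1, 1)
for lat in 1:16, lon in 1:16
    @views pv2 = pv[lon, lat, :] # time series of grid (lon, lat)
    corr[lon, lat] = cor(pv1 / areaLat[1], pv2 / areaLat[lat])
end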
One grid currently takes 6.6 s:
@time corr1vall(1,1,1)
> 6.625234 seconds (89.95 k allocations: 1.171 GiB, 0.90% gc time)
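(The number above is from a single @time call, so it may include some compilation; if BenchmarkTools is installed, @btime gives a more stable measurement:)

using BenchmarkTools   # assuming the package is available
@btime corr1vall(1, 1, 1)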
Furthermore, I have 256 grids and 3 scenarios:
corrAllGrids = zeros(16,16,16*16)
g = 1
for lat in 1:16
    for lon in 1:16
        corrAllGrids[:,:,g] = corr1vall(1, lon, lat)
        g += 1
    end
end
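(Side note on the snippet above: the g += 1 pattern works in the REPL but would need global g when run from a script; an equivalent form computes the slice index directly from lon and lat:)

corrAllGrids = zeros(16, 16, 16 * 16)
for lat in 1:16, lon in 1:16
    # lon varies fastest, matching the g counter above
    corrAllGrids[:, :, (lat - 1) * 16 + lon] = corr1vall(1, lon, lat)
end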
At roughly 6.6 s per grid, the full sweep over 256 grids and 3 scenarios would take on the order of 256 × 3 × 6.6 s ≈ 85 minutes, so I hope to reduce the time of the first function before trying to parallelize. Are there any obvious modifications I could make?