Hello,
I was refactoring my code by re-ordering the data dimensions, and I ran some benchmarks to make sure I wasn’t hurting performance too much. The results are puzzling. In short, I want to comply with a standard used in my field (climate science) by putting the three dimensions in the relative order “longitude”, “latitude”, “time”. Previously, my code used “time”, “longitude”, “latitude”.
Here is the code for the two versions, each followed by its benchmark. Strangely, the version that follows the standard is faster, but it reports an order of magnitude more allocations (“allocs estimate”). I ran the benchmarks two or three times and the results are consistent.
I guess my question is: how can the updated code be faster with so many more allocations?
Initial version (non-standard way) – relative order of “time”, “longitude” and “latitude”
Using a 51134 x 50 x 50 Array.
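function annualmin(data::Array{Float64, 3}, timeV::StepRange{Date, Base.Dates.Day})
    years = Dates.year(timeV)   # calendar year of every time step
    numYears = unique(years)    # distinct years, in order of appearance
    # One output slice per year; time is the first dimension.
    FD = zeros(Float64, (length(numYears), size(data, 2), size(data, 3)))
    Threads.@threads for i in 1:length(numYears)
        # Contiguous range of time steps belonging to year i (years is sorted).
        idx = searchsortedfirst(years, numYears[i]):searchsortedlast(years, numYears[i])
        # Reduce over the time dimension and store the minimum in FD[i, :, :].
        Base.minimum!(view(FD, i:i, :, :), view(data, idx, :, :))
    end
    return FD
end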
julia> @benchmark annualmin(data3, d)
BenchmarkTools.Trial:
  memory estimate:  3.09 mb
  allocs estimate:  422
  --------------
  minimum time:     54.215 ms (0.00% GC)
  median time:      58.214 ms (0.00% GC)
  mean time:        59.934 ms (0.20% GC)
  maximum time:     76.660 ms (0.00% GC)
  --------------
  samples:          84
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
Updated version (standard way) – relative order of “longitude”, “latitude” and “time”
Using a 50 x 50 x 51134 Array.
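function annualmin(data::Array{Float64, 3}, timeV::StepRange{Date, Base.Dates.Day})
    years = Dates.year(timeV)   # calendar year of every time step
    numYears = unique(years)    # distinct years, in order of appearance
    # One output slice per year; time is the last dimension.
    FD = zeros(Float64, (size(data, 1), size(data, 2), length(numYears)))
    Threads.@threads for i in 1:length(numYears)
        # Contiguous range of time steps belonging to year i (years is sorted).
        idx = searchsortedfirst(years, numYears[i]):searchsortedlast(years, numYears[i])
        # Reduce over the time dimension and store the minimum in FD[:, :, i].
        Base.minimum!(view(FD, :, :, i:i), view(data, :, :, idx))
    end
    return FD
end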
julia> @benchmark annualmin(data3, d)
BenchmarkTools.Trial:
  memory estimate:  3.21 mb
  allocs estimate:  4401
  --------------
  minimum time:     30.439 ms (0.00% GC)
  median time:      40.055 ms (0.00% GC)
  mean time:        40.987 ms (0.00% GC)
  maximum time:     75.771 ms (0.00% GC)
  --------------
  samples:          123
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%