Understanding benchmark results

Balinus · January 9, 2017, 8:26pm

Hello,

I was “refactoring” my code by re-ordering the data and did some benchmarking to make sure I wasn’t doing too much harm to the speed of the code. I have some “conflicting” results. Short story for the refactoring, I want to comply to a standard used by my field (Climate sciences) by putting the 3- dimensions in the relative order “longitude”, “latitude” and “time”. Previously, my code was going for “time”, “longitude” and “latitude”.

Here’s the code for the 2 versions, with their respective benchmarking. Strangely, the version following the standards is faster, but has an order of magnitude more “allocs estimate”. Note that I ran the benchmarking 2 or 3 times and the results are consistent.

I guess my question is: How could the updated code be faster with so many more allocations?

Initial version (non-standard way) – relative order of “time”, “latitude” and “longitude”

using a 51134 x 50 x 50 Array.

function annualmin(data::Array{Float64, 3}, timeV::StepRange{Date, Base.Dates.Day})
  years    = Dates.year(timeV)
  numYears = unique(years)
  FD       = zeros(Float64, (length(numYears), size(data, 2), size(data, 3)))

  Threads.@threads for i in 1:length(numYears)
    idx = searchsortedfirst(years, numYears[i]):searchsortedlast(years, numYears[i])
    Base.minimum!(view(FD,i:i,:,:), view(data,idx,:,:))
  end
  return FD
end

julia> @benchmark annualmin(data3, d)
BenchmarkTools.Trial: 
  memory estimate:  3.09 mb
  allocs estimate:  422
  --------------
  minimum time:     54.215 ms (0.00% GC)
  median time:      58.214 ms (0.00% GC)
  mean time:        59.934 ms (0.20% GC)
  maximum time:     76.660 ms (0.00% GC)
  --------------
  samples:          84
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

Updated version (standard way) – relative order of “longitude”, “latitude” and “time”
Using a 50 x 50 x 51134 Array.

function annualmin(data::Array{Float64, 3}, timeV::StepRange{Date, Base.Dates.Day})
  years    = Dates.year(timeV)
  numYears = unique(years)
  FD       = zeros(Float64, (size(data, 1), size(data, 2), length(numYears)))

  Threads.@threads for i in 1:length(numYears)
    idx = searchsortedfirst(years, numYears[i]):searchsortedlast(years, numYears[i])
    Base.minimum!(view(FD, :, :, i:i), view(data, :, :, idx))
  end
  return FD
end

julia> @benchmark annualmin(data3, d)
BenchmarkTools.Trial: 
  memory estimate:  3.21 mb
  allocs estimate:  4401
  --------------
  minimum time:     30.439 ms (0.00% GC)
  median time:      40.055 ms (0.00% GC)
  mean time:        40.987 ms (0.00% GC)
  maximum time:     75.771 ms (0.00% GC)
  --------------
  samples:          123
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

ChrisRackauckas · January 9, 2017, 9:50pm

Allocations aren’t the final answer. For example, if the new indexing makes it so that way you are now much more likely to grab the next element (columnwise) from the array, this locality can greatly benefit speed.

I would suggest not using high dimensional arrays for this kind of stuff though, that same reason. Julia is column-ordered, so in memory things which are local go down columns. This means that the pointer has to move 51134 (number of rows) to go over to the next column, and even more to move in 3D. That is an insane lack of locality, so using : isn’t that great here.

Instead, since these data values are grouped, you should think about using Vectors of specific types for storing your data. If you always associate time with a latitude and longitude and you do not change these values, you can make an immutable:

immutable ClimateEvent{T}
  time::T
  latitude::T
  longitude::T
end

If you do sometimes change the values, for a small decrease in speed make it a type instead. Now you can make a Vector{ClimateEvent{Float64} or whatever you need, and Julia will make a memory layout which efficient for storing these kinds of values.

You can make this even easier by setting up some indexing on these. For example:

function getindex(c::ClimateEvent,i::Int)
  if i == 1
    return c.time
  elseif i == 2
    return c.latitude
  elseif i == 3
    return c.longtitude
  else 
    throw(error("You can't index like that, dude!"))
end

and now c[1] is time, c[2] is latitude, and c[3] is longtitude. Then you can make a

immutable ClimateVector{T} <: AbstractVector{T}
  arr::Vector{ClimateEvent{T}}
end

which just holds a vector. So now instead of constructing a Vector, you make ClimateVector (while it’s immutable, its array is immutable, so you can still push! into cvec.arr, etc.). What you can then do is setup an array interface on here:

getindex(c::ClimateVector,i1,i2) = c.arr[i1][i2]

which will bootstrap the ClimateEvent indexing to the ClimateVector to allow for indexing like c[i,j]. But since this indexing is done by functions, you can make sure it’s done in the most efficient manner, unlike what is going on with arrays.

Then, all of your functions can be written to have special dispatches on ClimateVector so that way you can replace some search functions with something that’s faster (or just something with easier syntax). Speed and readability result.

I implore you to try developing a type-structure with array interfaces instead of directly using arrays of numbers. You can really gain a lot here from dropping the old idea of just using arrays of numbers.

Edit: I didn’t read the whole change, but yes, it looks like the difference came down to the fact that 51134 x 50 x 50 Array has lots of rows, while 50 x 50 x 51134 Array has more locality in a column-wise ordering.

kristoffer.carlsson · January 9, 2017, 10:01pm

Arrays in julia are stored contiguously in the first index (column major). Order of accessing data matters for performance. Accessing data not in cache is much more expensive.

Balinus · January 10, 2017, 2:01pm

Thanks Chris!

These ideas are quite interesting. It’s still over my head for the time being but I’m willing to dig into a better programming paradigm.

The only thing I’m not sure is where do I put the actual data? I think that I could put it inside immutable ClimateEvent ? Where data is a m x n x p Array. The data array is directly extracted from netCDF files which stores the information on a 3D grid/Array (usually).

immutable ClimateEvent{T}
  data::T
  time::T
  latitude::T
  longitude::T
end

Or is there something I don’t understand?

Right now, I was using a custom type ClimGrid

type ClimGrid
  lat::Array{Float64} # vector of length n
  lon::Array{Float64} # vector of length m
  data::Array{Float64} # array of size m x n x p
  timeV::StepRange{Date,Base.Dates.Day} # length p
  model::String 
  experiment::String # e.g. "historical"
  run::String
  filename::String # netCDF filename
  dataunits::String # e.g. Kelvin
  latunits::String # e.g. degrees north
  lonunits::String # e.g. degree east
end

Well, lots of homework to do! Thanks for the ideas, I’ll try to implement them and see where it leads me.

Balinus · January 10, 2017, 8:59pm

So, I’ve succeeded in constructing the ClimateEvent immutable. However, I don’t seem to be able to construct the ClimateVector from the ClimateEvent type.

For example, I build a ClimateEvent with 4 entry.

julia> t2 = ClimateEvent(randn(1), randn(1), randn(1), randn(1))
ClimateTools.ClimateEvent{Array{Float64,1}}([-0.929815],[-0.130163],[0.24705],[0.325987])

I can access them with the function getindex as t2[i]:

julia> t2[1]
1-element Array{Float64,1}:
 -0.929815

Now, I thought that t build the ClimateVector, it was simply a matter of calling ClimateVector with t2:

julia> ClimateVector(t2)
------ MethodError --------------------- Stacktrace (most recent call last)

 [1] — ClimateTools.ClimateVector{T}(::ClimateTools.ClimateEvent{Array{Float64,1}}) at sysimg.jl:53

MethodError: Cannot `convert` an object of type ClimateTools.ClimateEvent{Array{Float64,1}} to an object of type ClimateTools.ClimateVector{T}
This may have arisen from a call to the constructor ClimateTools.ClimateVector{T}(...),
since type constructors fall back to convert methods.

But it’s not the case. I think I’m missing something here, but I don’t know what. The immutable types are the one suggested by Chris in his post. But I added a data entry in ClimateEvent type.

Any help is greatly appreciated!

ChrisRackauckas · January 10, 2017, 9:14pm

Are you sure you want those as arrays instead of numbers? randn()?

The way I had it written you’d have a Vector{ClimateEvent}, to ClimateVector([t2]) would have to be how it’s written. It would probably be smart to add a constructor ClimateVector(t2) which automatically puts it into a Vector, and then a push!(cv,t) that will push a ClimateEvent into cv.arr. That will make it much more natural to build such ClimateVectors.

Balinus · January 11, 2017, 2:39pm

You’re right, I thought that randn was giving Float64. Using arbitrary numbers, representing a data value, time, latitude and longitude:

julia> t2 = ClimateEvent(23.,34.,45.,212.)
ClimateTools.ClimateEvent{Float64}(23.0,34.0,45.0,212.0)

I can build a ClimateVector using the bracket (but not the parentheses, see edit below):

julia> ClimateVector{t2}
ClimateTools.ClimateVector{ClimateTools.ClimateEvent{Float64}(23.0,34.0,45.0,212.0)}

Is that the intended way to construct it? I mean using your initial suggestion?

I can see the benefit of your latest suggestion on defining a constructor for ClimateVector(t2) which would push a ClimateEvent into cv.arr. This would “bypass” the use of ClimateVector{t2}.

Again, thanks for your help! It’s really appreciated.

edit - Using the parentheses, I can’t build a ClimateVector:

julia> ClimateVector([t2])
Error showing value of type ClimateTools.ClimateVector{Float64}:
------ MethodError --------------------- Stacktrace (most recent call last)

 [1] — _start() at client.jl:360

 [2] — run_repl(::Base.REPL.LineEditREPL, ::Base.##930#931) at REPL.jl:188

 [3] — run_frontend(::Base.REPL.LineEditREPL, ::Base.REPL.REPLBackendRef) at REPL.jl:903

 [4] — run_interface(::Base.Terminals.TTYTerminal, ::Base.LineEdit.ModalInterface) at LineEdit.jl:1579

 [5] — (::Base.REPL.##22#23{Bool,Base.REPL.##33#42{Base.REPL.LineEditREPL,Base.REPL.REPLHistoryProvider},Base.REPL.LineEditREPL,Base.LineEdit.Prompt})(::Base.LineEdit.MIState, ::Base.AbstractIOBuffer{Array{UInt8,1}}, ::Bool) at REPL.jl:652

 [6] — print_response(::Base.REPL.LineEditREPL, ::Any, ::Void, ::Bool, ::Bool) at REPL.jl:139

 [7] — print_response(::Base.Terminals.TTYTerminal, ::Any, ::Void, ::Bool, ::Bool, ::Void) at REPL.jl:154

 [8] — display(::ClimateTools.ClimateVector{Float64}) at multimedia.jl:143

 [9] — display(::Base.REPL.REPLDisplay{Base.REPL.LineEditREPL}, ::ClimateTools.ClimateVector{Float64}) at REPL.jl:135

 [10] — display(::Base.REPL.REPLDisplay{Base.REPL.LineEditREPL}, ::MIME{Symbol("text/plain")}, ::ClimateTools.ClimateVector{Float64}) at REPL.jl:132

 [11] — #showarray#330(::Bool, ::Function, ::IOContext{Base.Terminals.TTYTerminal}, ::ClimateTools.ClimateVector{Float64}, ::Bool) at show.jl:1599

MethodError: no method matching size(::ClimateTools.ClimateVector{Float64})
Closest candidates are:
  size{T,N}(::AbstractArray{T,N}, ::Any) at abstractarray.jl:47
  size{N}(::Any, ::Integer, ::Integer, ::Integer...) at abstractarray.jl:48
  size(::BitArray{1}) at bitarray.jl:39
  ...

ChrisRackauckas · January 11, 2017, 3:56pm

Oh, it’s looking for a size function. It’s working, but it can’t display since it think it’s a vector so it’s trying to use the full vector interface.

http://docs.julialang.org/en/stable/manual/interfaces/#abstract-arrays

That page explains the full interface. You just define a few methods (size, etc.), and the functions which work on AbstractArrays will now work on your ClimateVector, like giving it a nice show method.

Balinus · January 11, 2017, 4:12pm

ok!

Lots of work ahead, thanks again!

This is a great exercise to learn Julia.

ChrisRackauckas · January 11, 2017, 4:20pm

You only need to implement a few methods (should take a few minutes), and the rest “will be for free” by bootstrapping from those.

Yes, learning to use Julia’s array/iterator tools to organize data is both a productivity and performance win in many cases (performance for the “locality” reason. If you’re doing a bunch of BLAS/Matrix multiplication on these, then having the right vectors of numbers is better for performance, but that sounds like it’s a different story). It’s probably one of the most essential things to learn.

Balinus · January 12, 2017, 2:41pm

I have another question regarding the Base.getindex(c::ClimateVector,i1,i2) = c.arr[i1][i2] extension. With this extension I can access every data contained in ClimateVector with cv[i,j], but I can’t access say all data from the whole timeseries. For example, let’s imagine we have only 1 grid point (hence, latitude and longitude stays constant) for 100 years daily values (i.e. we have one ClimateEvent for each daily value). I’d like to be able to use a syntax cv[:,1] which would give me the whole timeseries (in other words, accessing every ClimateEvent.data in the ClimateVector, assuming that the ClimateVector has been built for a single grid point.

Unless there is another clever way to do it?

ChrisRackauckas · January 12, 2017, 4:48pm

You define the method for ::Colon. I think these lines of code show how to do it:

https://github.com/JuliaDiffEq/DiffEqBase.jl/blob/master/src/solutions/solution_interface.jl#L7

And just to complete this reference, the end values are determined by size. With that, the full array interface should be complete.

Balinus · January 12, 2017, 5:10pm

Thanks! it gives me good material to work on. There’s still some bug here and there but I’m getting there!

Topic		Replies	Views
Benchmarking questions General Usage benchmark	3	327	September 18, 2023
Comparing performance of 2 simple averaging functions - why is one faster? Performance	5	502	August 31, 2020
3 different ways to create an array have different performance Performance	1	967	March 10, 2018
Confusing benchmark time results and memory allocation depending on number of calls for function with zip Performance question	12	1126	March 12, 2019
Strange behavior with array calls General Usage question , performance	8	493	January 2, 2019

Understanding benchmark results

Related topics