What causes the performance of this function to tank?

I am doing some work to optimize a slow function call.

I have reduced this function down to a simple demonstration example.

Here is the “original function”.

# alloc: 668.59k, 393.649 MiB
function demo(
    dict::Dict{Symbol, Vector},
    datetime_first::DateTime,
    datetime_last::DateTime,
)
    # 1: unneccessary copy
    # 2: no type hint to destination variable, despite type hint on constructor
    time = Vector{DateTime}(dict[:time])
    data = Vector{Float32}(dict[:data])

    # println("typeof(time)=$(typeof(time))")
    # println("typeof(data)=$(typeof(data))")

    sum_value = 0.0f0
    sum_count = 0

    for ii in 1:length(time)
        if time[ii] >= datetime_first
            if time[ii] < datetime_last
                sum_value += data[ii]
                sum_count += 1
            end
        end
    end

    average_value = sum_value / sum_count

    return average_value
end

And here is the “fixed” version

# alloc: 615.09 k, 9.386 MiB
function demo_fixed(
    dict::Dict{Symbol, Vector},
    datetime_first::DateTime,
    datetime_last::DateTime,
)
    # 1: unneccessary copy
    # 2: no type hint to destination variable, despite type hint on constructor
    time = dict[:time]
    data = dict[:data]

    # println("typeof(time)=$(typeof(time))")
    # println("typeof(data)=$(typeof(data))")

    sum_value = 0.0f0
    sum_count = 0

    for ii in 1:length(time)
        if time[ii] >= datetime_first
            if time[ii] < datetime_last
                sum_value += data[ii]
                sum_count += 1
            end
        end
    end

    average_value = sum_value / sum_count

    return average_value
end

Here’s a main function which calls both.

function (@main)(args)

    time = collect(DateTime("2024-01-01T00:00:00"):Dates.Second(1):DateTime("2024-01-01T23:59:59"))

    data = rand(Float32, 100000000)

    println(typeof(data)) # Vector{DateTime}
    println(typeof(time)) # Vector{Float32}

    dict = Dict(:time => time, :data => data)

    datetime_first = DateTime("2024-01-01T07:00:00")
    datetime_last = DateTime("2024-01-01T23:00:00")

    println("demo")
    @time demo(dict, datetime_first, datetime_last)

    println("demo_fixed")
    @time demo_fixed(dict, datetime_first, datetime_last)

    return nothing

In terms of time, the difference is about 2.3 seconds vs 45 milliseconds.

I thought most of this time might be used by memory allocations, and some latency in asking the OS for memory. However the number of allocations is pretty much the same, so this does not make sense.

I added two further @time statements to the original demo function.

function demo(
    dict::Dict{Symbol, Vector},
    datetime_first::DateTime,
    datetime_last::DateTime,
)
    @time time = Vector{DateTime}(dict[:time])
    @time data = Vector{Float32}(dict[:data])

The first of these takes almost zero time to run. It performs 3 allocations, a total of less than 1 MB.

The second of these is very slow. It also performs 3 allocations, but a total of 380 MB is copied.

This seems slightly strange. Both elements of the Dict are already in the correct memory layout. (Vector{DateTime} and Vector{Float32}.) However copying one is slow while the other is fast.

If you don’t need to pass time and data by copy, the idiomatic way to achieve type stability would be

time = convert(Vector{DateTime}, dict[:time])
data = convert(Vector{Float32}, dict[:data])

Both actually copy the data. It’s just that dict[:time] is small (86400 elements, 691 kB) and dict[:data] is 100M elements.

Interestingly, it appears that constructor Vector{DateTime}(::Vector) is type-unstable, hence demo has so many allocations (time and data’s types are derived as Any).

As far as I understand, both are nearly equivalent. A type hint makes any later assignment to the lhs variable implicitly convert to the target type.

Yes, the idea was to improve type deduction.
If you check @code_warntype demo(dict, datetime_first, datetime_last), you’ll see that both time and data are deduced as ::Any, so that the subsequent iteration over them is slow. Another way to mitigate that would be to add a so-called function barrier (Performance Tips · The Julia Language):

function demo_kernel(
    time::AbstractVector{DateTime},
    data::AbstractVector{Float32},
    datetime_first::DateTime,
    datetime_last::DateTime,
)
    sum_value = 0.0f0
    sum_count = 0

    for ii in 1:length(time)
        if time[ii] >= datetime_first
            if time[ii] < datetime_last
                sum_value += data[ii]
                sum_count += 1
            end
        end
    end

    average_value = sum_value / sum_count

    return average_value
end

function demo_fixed(
    dict::Dict{Symbol, Vector},
    datetime_first::DateTime,
    datetime_last::DateTime,
)
    time = dict[:time]
    data = dict[:data]

    return demo_kernel(time, data, datetime_first, datetime_last)
end

By doing that, you move iteration to a scope where the types of time and data are fixed.

4 Likes

If you already know that the data is of the expected type and no conversion is needed, the idiom for relaying this knowledge to the compiler is a right-hand-side type assert:

time = dict[:time]::Vector{DateTime}
data = dict[:data]::Vector{Float32}
1 Like

I have not seen anyone mention this yet, but Dict{Symbol,Vector} refers to an abstract Vector of unknown element type. This can cause performance issues because we do not know what specific type of Vector that get or getindex will return when indexing into the dict. A concrete Vector such as Vector{Int} has a known element type allowing us to use that knowledge during compilation to machine code.

julia> abstract_dict = Dict{Symbol, Vector}()
Dict{Symbol, Vector}()

julia> concrete_dict = Dict{Symbol, Vector{Int}}()
Dict{Symbol, Vector{Int64}}()

julia> @code_warntype abstract_dict[:foo]
MethodInstance for getindex(::Dict{Symbol, Vector}, ::Symbol)
  from getindex(h::Dict{K, V}, key) where {K, V} @ Base dict.jl:496
Static Parameters
  K = Symbol
  V = Vector
Arguments
  #self#::Core.Const(getindex)
  h::Dict{Symbol, Vector}
  key::Symbol
Locals
  val::Union{}
  index::Int64
Body::Vector
...

julia> @code_warntype concrete_dict[:foo]
MethodInstance for getindex(::Dict{Symbol, Vector{Int64}}, ::Symbol)
  from getindex(h::Dict{K, V}, key) where {K, V} @ Base dict.jl:496
Static Parameters
  K = Symbol
  V = Vector{Int64}
Arguments
  #self#::Core.Const(getindex)
  h::Dict{Symbol, Vector{Int64}}
  key::Symbol
Locals
  val::Union{}
  index::Int64
Body::Vector{Int64}
...

If you use Dict{Symbol,Vector} as an argument type you also force the caller to pass a dict with an abstract value type. For an argument type you probably want Dict{Symbol, <: Vector}. The difference is that Dict{Symbol, Vector} has values of abstract type Vector requiring dynamic dispatch. Where as Dict{Symbol, <: Vector} can accept a Dict with a concrete value type such as Vector{Int}.

julia> Dict{Symbol, Vector{Int}} <: Dict{Symbol, Vector}
false

julia> Dict{Symbol, Vector{Int}} <: Dict{Symbol, <: Vector}
true

julia> f(d::Dict{Symbol, Vector}) = nothing
f (generic function with 1 method)

julia> f(concrete_dict)
ERROR: MethodError: no method matching f(::Dict{Symbol, Vector{Int64}})

Closest candidates are:
  f(::Dict{Symbol, Vector})
   @ Main REPL[21]:1

Stacktrace:
 [1] top-level scope
   @ REPL[22]:1

julia> g(d::Dict{Symbol, <: Vector}) = nothing
g (generic function with 1 method)

julia> g(concrete_dict)
# no MethodError

In your case, you want values of two distinct types, Vector{DateTime} and Vector{Float64}. Because the keys :time and :data refer to values of distinct types, I suggest using a NamedTuple or perhaps declare a struct instead.

julia> nt = (; time = DateTime[], data = Float32[])
(time = DateTime[], data = Float32[])

julia> typeof(nt)
@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}}

julia> f(nt::@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}}) = nt[:time]
f (generic function with 2 methods)

julia> @code_warntype f(nt)                                           MethodInstance for f(::@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}})
  from f(nt::@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}}) @ Main REPL[40]:1
Arguments
  #self#::Core.Const(f)
  nt::@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}}
Body::Vector{DateTime}
1 ─ %1 = Base.getindex(nt, :time)::Vector{DateTime}
└──      return %1


julia> g(nt::@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}}) = nt[:data]
g (generic function with 2 methods)

julia> @code_warntype g(nt)                                           MethodInstance for g(::@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}})
  from g(nt::@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}}) @ Main REPL[42]:1
Arguments
  #self#::Core.Const(g)
  nt::@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}}
Body::Vector{Float32}
1 ─ %1 = Base.getindex(nt, :data)::Vector{Float32}
└──      return %1

Here the type of the NamedTuple encodes the concrete type of each field. Because of constant propagation of the keys :time and :data we can resolve their corresponding value types.

The macro @NamedTuple may look strange. This is shortcut notation for

julia> NamedTuple{(:time, :data), Tuple{Vector{DateTime}, Vector{Float32}}}
@NamedTuple{time::Vector{DateTime}, data::Vector{Float32}}
1 Like