Code seems slow

I have a nested loop that takes its loop bounds from a Dict created in a previous function. The code seems slow compared to a version where the loop bounds are given explicitly. I have tried to create an MWE:

function v1()
    iMax = 1000
    jMax = 1000
    for i = 1:iMax, j = 1:jMax
    end
end

function v2()
    loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
    for i = 1:loopVals["iMax"], j = 1:loopVals["jMax"]
    end
end

function v3()
    loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
    iMax = loopVals["iMax"]
    jMax = loopVals["jMax"]
    for i = 1:iMax, j = 1:jMax
    end
end

using BenchmarkTools

@btime v1()
@btime v2()
@btime v3()

v1 seems fine, whereas v2 and v3 are slow and allocate far more memory. What am I doing wrong?

This is a classic case of “type instability”:

https://docs.julialang.org/en/v1/manual/performance-tips/#Write-“type-stable”-functions-1

Julia is fastest when it knows what types things are going to be ahead of time. Here, you’re pulling the loop bounds out of a dictionary whose values are typed as Any, so Julia has to generate generic code that works for any type of bound, not just for a range from one integer to another. That rules out many of the optimizations Julia could otherwise make. In fact, I’d bet that in your first example the loop is removed entirely.
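To make that a bit more concrete, here is a minimal sketch of what the compiler can and cannot know about a lookup. The variable names (anyDict, intDict, lookupTypeAny, lookupTypeInt) are made up for illustration, and Base.return_types is an introspection helper rather than something you would use in real code:

```julia
# Two Dicts with identical contents but different declared value types.
anyDict = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
intDict = Dict{String, Int64}("iMax" => 1000, "jMax" => 1000)

# Return type the compiler infers for a lookup, given only the argument types.
lookupTypeAny = first(Base.return_types(getindex, (typeof(anyDict), String)))
lookupTypeInt = first(Base.return_types(getindex, (typeof(intDict), String)))

println(valtype(anyDict), " -> ", lookupTypeAny)
println(valtype(intDict), " -> ", lookupTypeInt)

# At run time the stored value is a plain integer in both cases;
# the compiler just cannot rely on that for the Any-valued Dict.
println(typeof(anyDict["iMax"]))
```

Running @code_warntype v3() in the REPL should show the same thing from the other side: iMax and jMax are flagged as Any, even though the values they hold at run time are integers.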

Thanks for the response. I somehow thought the type of iMax, jMax in v3 is known before the loop is run. And even if I explicitly specify the type of iMax, jMax after taking these values from the Dict in v3 before the loop, it is slow. Back to the manual for me!

Have a look at how the values stored in the dictionaries are typed. Declaring the value type as Int64 instead of Any (v4 and v5 below) makes the slowdown go away:

julia> include("junk.jl")
  2.000 ns (0 allocations: 0 bytes)
  52.190 ms (1980987 allocations: 45.52 MiB)
  52.297 ms (1980987 allocations: 45.52 MiB)
  14.000 μs (6 allocations: 672 bytes)
  176.305 ns (6 allocations: 672 bytes)

where junk.jl is


function v1()
    iMax = 1000
    jMax = 1000
    for i = 1:iMax, j = 1:jMax
    end
end

function v2()
    loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
    for i = 1:loopVals["iMax"], j = 1:loopVals["jMax"]
    end
end

function v3()
    loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
    iMax = loopVals["iMax"]
    jMax = loopVals["jMax"]
    for i = 1:iMax, j = 1:jMax
    end
end

using BenchmarkTools

@btime v1()
@btime v2()
@btime v3()


function v4()
    loopVals = Dict{String, Int64}("iMax" => 1000, "jMax" => 1000)
    for i = 1:loopVals["iMax"], j = 1:loopVals["jMax"]
    end
end

function v5()
    loopVals = Dict{String, Int64}("iMax" => 1000, "jMax" => 1000)
    iMax = loopVals["iMax"]
    jMax = loopVals["jMax"]
    for i = 1:iMax, j = 1:jMax
    end
end

@btime v4()
@btime v5()

By the way, you can also assert the type of the loop bounds when you read them from the Dict:

function v3()
    loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
    iMax = loopVals["iMax"]::Int64
    jMax = loopVals["jMax"]::Int64
    for i = 1:iMax, j = 1:jMax
    end
end

This runs in

198.020 ns (8 allocations: 704 bytes)
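One caveat with this approach: a type assertion only checks, it does not convert. The sketch below assumes a hypothetical Dict in which one bound was stored as a Float64 by mistake; the helper names assertedBound and convertedBound are invented for the example.

```julia
# Hypothetical Dict where jMax was accidentally stored as a Float64.
loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000.0)

# Plain assertion: returns the value if it is an Int64, throws otherwise.
assertedBound(d, key) = d[key]::Int64

# Convert first, then assert, so the result type is still known to the compiler.
convertedBound(d, key) = convert(Int64, d[key])::Int64

println(assertedBound(loopVals, "iMax"))

try
    assertedBound(loopVals, "jMax")
catch err
    println("assertion failed with ", typeof(err))
end

println(convertedBound(loopVals, "jMax"))
```

So the assertion is a good fit when you are sure what is stored under that key. If the stored type can vary, converting before asserting may be the safer choice, with the caveat that convert itself throws for values such as 1000.5 that have no exact Int64 representation.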

Good point about the loops getting removed altogether by the compiler. Try this version, where the loop body does some actual work:

function v1()
    iMax = 1000
    jMax = 1000
    s = 0.0
    for i = 1:iMax, j = 1:jMax
        s += i * j
    end
    s
end

function v2()
    loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
    s = 0.0
    for i = 1:loopVals["iMax"], j = 1:loopVals["jMax"]
        s += i * j
    end
    s
end

function v3()
    loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
    iMax = loopVals["iMax"]
    jMax = loopVals["jMax"]
    s = 0.0
    for i = 1:iMax, j = 1:jMax
        s += i * j
    end
    s
end

using BenchmarkTools

@btime v1()
@btime v2()
@btime v3()


function v4()
    loopVals = Dict{String, Int64}("iMax" => 1000, "jMax" => 1000)
    s = 0.0
    for i = 1:loopVals["iMax"], j = 1:loopVals["jMax"]
        s += i * j
    end
    s
end

function v5()
    loopVals = Dict{String, Int64}("iMax" => 1000, "jMax" => 1000)
    iMax = loopVals["iMax"]
    jMax = loopVals["jMax"]
    s = 0.0
    for i = 1:iMax, j = 1:jMax
        s += i * j
    end
    s
end

function v6()
    loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
    iMax = loopVals["iMax"]::Int64
    jMax = loopVals["jMax"]::Int64
    s = 0.0
    for i = 1:iMax, j = 1:jMax
        s += i * j
    end
    s
end

@btime v4()
@btime v5()
@btime v6()

The timings are:

julia> include("junk.jl")
  1.217 ms (0 allocations: 0 bytes)
  98.831 ms (3977717 allocations: 75.99 MiB)
  95.796 ms (3977717 allocations: 75.99 MiB)
  1.219 ms (6 allocations: 672 bytes)
  1.218 ms (6 allocations: 672 bytes)
  1.218 ms (8 allocations: 704 bytes)

Excellent. I think this is what I need. My real Dict structure needs to be of type Any so the use of ::Int64 to assert type fixes the issue. Many thanks!
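If the Dict has to stay Dict{String, Any}, another option that may be worth trying is what the performance tips call a function barrier: pull the values out of the Dict in one function and do the looping in a second one that receives them as arguments. This is a sketch only. The names kernel and v7 are made up here and I have not benchmarked it against v6.

```julia
# Inner function: compiled for the concrete argument types it is called with,
# so the loop sees known Int bounds.
function kernel(iMax::Int, jMax::Int)
    s = 0.0
    for i = 1:iMax, j = 1:jMax
        s += i * j
    end
    return s
end

# Outer function: the lookups still return Any, but the cost is one dynamic
# dispatch at the call to kernel rather than one per loop iteration.
function v7()
    loopVals = Dict{String, Any}("iMax" => 1000, "jMax" => 1000)
    return kernel(loopVals["iMax"], loopVals["jMax"])
end

println(v7())
```

Compared with the ::Int64 assertion, this keeps the hot loop in its own function, which can be handy when many values come out of the same Dict. Note that with the ::Int annotations on kernel, a non-integer value in the Dict leads to a MethodError at the call rather than a TypeError.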