Optimizing or replacing IF-ELSEIF-ELSE statements

I am developing a simulator but I dont have a lot experience and tricks about Julia. I have a case that seems like it has some kind of other way around that is much more efficient and optimized.

I will give you and example that is similar to my case.

struct Struct1
    value::Float64
    type::Symbol
end

obj1 = Struct1(1.0, :A)
obj2 = Struct1(1.0, :B)
obj3 = Struct1(1.0, :C)

objs = [obj1, obj2, obj3]
results = zeros(length(objs)*10)


for i in eachindex(objs)
        obj = objs[i]
        type = obj.type
    for n in 1:10
        position = (i-1)*10+n
        ############Here we have 20-40 lines of code block############
        if type == :A
            val = obj.value
            results[position] = val # or much more complex operations in a few lines
        elseif type == :B
            val = obj.value
            results[position] = val*2 # or much more complex operations in a few lines
        elseif type == :C
            val = obj.value
            results[position] = val*3 # or much more complex operations in a few lines
        end

    end
end

Explanation:

Vector β€œobjs” always stays constant. So types always will be same in that order. In that way, we always need to check if type is A or B or C for each iteration of loop 2 even we know that type will be always A through β€œloop 2” since we choose β€œobj1” at first iteration of β€œloop 1”.

one way to reduce meaningless condition checks 10 times in β€œloop2” is writing conditional statements before the β€œloop 2”:

for i in eachindex(objs)
    obj = objs[i]
    type = obj.type
    if type == :A
        val = obj.value
        for n in 1:10
            position = (i-1)*10+n
            ############Here we have 20-40 lines of code block############
            results[position] = val # or much more complex operations in a few lines
        end

    elseif type == :B
        val = obj.value
        for n in 1:10
            position = (i-1)*10+n
            ############Here we have 20-40 lines of code block############
            results[position] = val # or much more complex operations in a few lines
        end

    elseif type == :C
        val = obj.value
        for n in 1:10
            position = (i-1)*10+n
            ############Here we have 20-40 lines of code block############
            results[position] = val # or much more complex operations in a few lines
        end

    end
end

But in this way we need to write a loop for each condition so we need to write " 20-40 lines of code block" for each which is not efficient. is there a way around?

Thank You

DRY ? make β€œ20-40 lines of code block” into a function
since Julia can’t dispatch on value ?
I don’t have a lot experience and tricks about Julia either :joy_cat:

1 Like

Before worrying about this sort of micro-optimization, you’ve got some bigger fish to fry. Namely, you should put this computation into a function instead of running it at top-level:

objs = [obj1, obj2, obj3]
function do_computation(objs)
    results = zeros(length(objs)*10)
    for i in eachindex(objs)
        #...
    end
    return results
end
do_computation(objs)

A chain if if-elseif-elseif-elseif often performs better than you’d expect. The compiler may even re-implement it into a jump table if it thinks that’d be helpful. But the difference here will almost surely be peanuts compared to optimizing the β€œreal work” of your algorithm. Be sure to check out the performance tips β€” which stresses the importance of functions like I suggested above, as well as many other tips:

4 Likes

You can use multiple dispatch if you make Struct1 a parametric type.

struct Struct1{type}
    value::Float64
end

f(obj::Struct1{:A}) = obj.value
f(obj::Struct1{:B}) = 2*obj.value
f(obj::Struct1{:C}) = 3*obj.value

obj1 = Struct1{:A}(1.0)
obj2 = Struct1{:B}(1.0)
obj3 = Struct1{:C}(1.0)

objs = [obj1, obj2, obj3]
results = zeros(length(objs)*10)

for (i,obj) in enumerate(objs)
  for n in 1:10
    position = (i-1)*10+n
    results[position] = f(obj)
  end
end

Depending on the bigger context, this may not be the best solution, but it does shorten your loop.

That will certainly perform worse :slight_smile:

3 Likes

Even in short you can also dispatch on value

f(::Val{:a}) = 1
f(::Val{:b}) = 2
f(s::Symbol) = f(Val(s))
[f(:a), f(:b)] # a long dway to wite [1,2]

but if you are lucky that improves readability somewhere, but not necessarily performance.

For exactly the case you have you could still store a factor = Dict(:A=>1, :B=>2, :C=>3)
and do a

results[position] = factor[type] * val

thing? Depends of course bit what the concrete example needs, you could also store that factor in the struct for example.

1 Like

I checked because it wasn’t obvious to me. At least in this simple example, I’m finding there’s no difference. To measure, I wrapped the loop in function, changed 10 to a much bigger number, and preallocated results to focus on the difference in the conditional. I get identical benchmarks.

julia> @benchmark dispatch!(results,objs,multiple)

BenchmarkTools.Trial: 1942 samples with 1 evaluation per sample.
 Range (min … max):  2.477 ms …   3.203 ms  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.513 ms               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.557 ms Β± 108.804 ΞΌs  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–†β–ˆβ–ˆβ–†β–…β–†β–…β–…β–ƒβ–ƒβ–‚    ▁▁▂▂▁ ▂▁                                     ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–†β–ˆβ–‡β–‡β–‡β–…β–…β–†β–‡β–‡β–ˆβ–†β–‡β–‡β–‡β–‡β–‡β–†β–†β–„β–…β–β–„β–†β–†β–…β–„β–„β–„β–„ β–ˆ
  2.48 ms      Histogram: log(frequency) by time      2.99 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark conditional!(results,objs2,multiple)
BenchmarkTools.Trial: 1923 samples with 1 evaluation per sample.
 Range (min … max):  2.477 ms …   3.328 ms  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.519 ms               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.582 ms Β± 145.734 ΞΌs  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–‡β–ˆβ–‡β–†β–†β–…β–ƒβ–‚ ▁▂▂▁▂▂▂▂▁▁▁▂▁    ▁▁ ▁                              ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–ˆβ–‡β–‡β–‡β–†β–‡β–†β–†β–…β–‡β–†β–‡β–‡β–…β–„β–…β–„β–β–„β–…β–…β–†β–„β–„β–…β–„ β–ˆ
  2.48 ms      Histogram: log(frequency) by time      3.16 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

I can imagine that (ab)using multiple dispatch in this way might cause performance problems in more complex situations, but I’m not sure where that line is.

Full code below.

struct Struct1{type}
    value::Float64
end

f(obj::Struct1{:A}) = obj.value
f(obj::Struct1{:B}) = 2*obj.value
f(obj::Struct1{:C}) = 3*obj.value

obj1 = Struct1{:A}(1.0)
obj2 = Struct1{:B}(1.0)
obj3 = Struct1{:C}(1.0)

objs = [obj1, obj2, obj3]

function dispatch!(results,objs,multiple)
  for (i,obj) in enumerate(objs)
    for n in 1:multiple
      position = (i-1)*multiple+n
      results[position] = f(obj)
    end
  end
  return(results)
end


struct Struct2
    value::Float64
    type::Symbol
end

obj1 = Struct2(1.0, :A)
obj2 = Struct2(1.0, :B)
obj3 = Struct2(1.0, :C)

objs2 = [obj1, obj2, obj3]

function conditional!(results,objs,multiple)
  for i in eachindex(objs)
    obj = objs[i]
    type = obj.type
    for n in 1:multiple
      position = (i-1)*multiple+n
      ############Here we have 20-40 lines of code block############
      if type == :A
        val = obj.value
        results[position] = val # or much more complex operations in a few lines
      elseif type == :B
        val = obj.value
        results[position] = val*2 # or much more complex operations in a few lines
      elseif type == :C
        val = obj.value
        results[position] = val*3 # or much more complex operations in a few lines
      end
    end
  end
  return(results)
end

using BenchmarkTools, Test

results = zeros(length(objs)*100)
dispatch!(results,objs,100)
results2 = zeros(length(objs)*100)
conditional!(results2,objs2,100)
@test results β‰ˆ results2

multiple=1_000_000
results = zeros(length(objs)*multiple)
@benchmark dispatch!(results,objs,multiple)
@benchmark conditional!(results,objs2,multiple)

You’re dancing right at the edge of a 1000x tall cliff there:

julia> @benchmark dispatch!($results,$objs,$multiple)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  370.125 ΞΌs …   1.523 ms  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     411.916 ΞΌs               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   450.258 ΞΌs Β± 105.652 ΞΌs  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–…β–ˆβ–‡β–†β–„β–ƒβ–ƒβ–‚β–‚β–…β–‡β–‡β–…β–„β–ƒβ–‚β–‚β–β–                                        ▁  β–‚
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–‡β–†β–‡β–†β–…β–†β–…β–„β–…β–…β–…β–…β–…β–ƒβ–…β–…β–„β–…β–…β–„β–ƒβ–ƒβ–…β–ƒβ–ƒβ–β–β–ƒβ–ƒβ–ƒβ–…β–‡β–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ β–ˆ
  370 ΞΌs        Histogram: log(frequency) by time        964 ΞΌs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> f(obj::Struct1{:D}) = 4*obj.value
f (generic function with 4 methods)

julia> @benchmark dispatch!($results,$objs,$multiple)
BenchmarkTools.Trial: 22 samples with 1 evaluation per sample.
 Range (min … max):  227.317 ms … 240.387 ms  β”Š GC (min … max): 0.66% … 4.48%
 Time  (median):     229.196 ms               β”Š GC (median):    1.54%
 Time  (mean Β± Οƒ):   230.290 ms Β±   3.112 ms  β”Š GC (mean Β± Οƒ):  1.57% Β± 0.79%

        β–ˆ
  β–…β–β–…β–…β–β–ˆβ–ˆβ–…β–β–…β–ˆβ–β–β–…β–ˆβ–…β–β–β–β–β–β–β–…β–β–β–β–β–β–β–…β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–…β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–… ▁
  227 ms           Histogram: frequency by time          240 ms <

 Memory estimate: 91.54 MiB, allocs estimate: 5999489.

The more important point, though, is that parametric structs are more advanced techniques. Focus on the simple stuff first.

3 Likes

It is quite interesting to see the degradation of performance when you add a new type. The solution of @Paul_Schrimpf looks to me quite idiomatic. I don’t see why this could be an abuse of dispatch. I think one can also use union types, to keep the same performance:

f(obj::Struct1{:D}) = 4*obj.value

TU = Union{Struct1{:A},Struct1{:B},Struct1{:C},Struct1{:D}}
objs_union = TU[obj1, obj2, obj3]
@benchmark dispatch!(results,objs_union,multiple)
# same as conditional!(results,objs2,multiple)

The OP here came to us with code that didn’t even apply the very first and most important performance tip: use a function. I don’t think it’s very useful to dive straight into the deep end with parametric structs.

To be clear, I didn’t add a type. I added a method. When there are only a few methods that Julia might possibly call, it’ll convert dispatch into a bunch of ifelses for you. But you’re dancing on the edge of a cliff. Similarly, if an array only has a few types it could possibly contain, it’ll convert dispatch to a bunch of ifelses for you. But you’re once again dancing on the edge of a cliff. If you don’t know that the cliffs are there, you’re gonna fall.

12 Likes