Optimizing or replacing IF-ELSEIF-ELSE statements

I am developing a simulator but I dont have a lot experience and tricks about Julia. I have a case that seems like it has some kind of other way around that is much more efficient and optimized.

I will give you and example that is similar to my case.

struct Struct1
    value::Float64
    type::Symbol
end

obj1 = Struct1(1.0, :A)
obj2 = Struct1(1.0, :B)
obj3 = Struct1(1.0, :C)

objs = [obj1, obj2, obj3]
results = zeros(length(objs)*10)


for i in eachindex(objs)
        obj = objs[i]
        type = obj.type
    for n in 1:10
        position = (i-1)*10+n
        ############Here we have 20-40 lines of code block############
        if type == :A
            val = obj.value
            results[position] = val # or much more complex operations in a few lines
        elseif type == :B
            val = obj.value
            results[position] = val*2 # or much more complex operations in a few lines
        elseif type == :C
            val = obj.value
            results[position] = val*3 # or much more complex operations in a few lines
        end

    end
end

Explanation:

Vector β€œobjs” always stays constant. So types always will be same in that order. In that way, we always need to check if type is A or B or C for each iteration of loop 2 even we know that type will be always A through β€œloop 2” since we choose β€œobj1” at first iteration of β€œloop 1”.

one way to reduce meaningless condition checks 10 times in β€œloop2” is writing conditional statements before the β€œloop 2”:

for i in eachindex(objs)
    obj = objs[i]
    type = obj.type
    if type == :A
        val = obj.value
        for n in 1:10
            position = (i-1)*10+n
            ############Here we have 20-40 lines of code block############
            results[position] = val # or much more complex operations in a few lines
        end

    elseif type == :B
        val = obj.value
        for n in 1:10
            position = (i-1)*10+n
            ############Here we have 20-40 lines of code block############
            results[position] = val # or much more complex operations in a few lines
        end

    elseif type == :C
        val = obj.value
        for n in 1:10
            position = (i-1)*10+n
            ############Here we have 20-40 lines of code block############
            results[position] = val # or much more complex operations in a few lines
        end

    end
end

But in this way we need to write a loop for each condition so we need to write " 20-40 lines of code block" for each which is not efficient. is there a way around?

Thank You

DRY ? make β€œ20-40 lines of code block” into a function
since Julia can’t dispatch on value ?
I don’t have a lot experience and tricks about Julia either :joy_cat:

Before worrying about this sort of micro-optimization, you’ve got some bigger fish to fry. Namely, you should put this computation into a function instead of running it at top-level:

objs = [obj1, obj2, obj3]
function do_computation(objs)
    results = zeros(length(objs)*10)
    for i in eachindex(objs)
        #...
    end
    return results
end
do_computation(objs)

A chain if if-elseif-elseif-elseif often performs better than you’d expect. The compiler may even re-implement it into a jump table if it thinks that’d be helpful. But the difference here will almost surely be peanuts compared to optimizing the β€œreal work” of your algorithm. Be sure to check out the performance tips β€” which stresses the importance of functions like I suggested above, as well as many other tips:

You can use multiple dispatch if you make Struct1 a parametric type.

struct Struct1{type}
    value::Float64
end

f(obj::Struct1{:A}) = obj.value
f(obj::Struct1{:B}) = 2*obj.value
f(obj::Struct1{:C}) = 3*obj.value

obj1 = Struct1{:A}(1.0)
obj2 = Struct1{:B}(1.0)
obj3 = Struct1{:C}(1.0)

objs = [obj1, obj2, obj3]
results = zeros(length(objs)*10)

for (i,obj) in enumerate(objs)
  for n in 1:10
    position = (i-1)*10+n
    results[position] = f(obj)
  end
end

Depending on the bigger context, this may not be the best solution, but it does shorten your loop.

That will certainly perform worse :slight_smile:

Even in short you can also dispatch on value

f(::Val{:a}) = 1
f(::Val{:b}) = 2
f(s::Symbol) = f(Val(s))
[f(:a), f(:b)] # a long dway to wite [1,2]

but if you are lucky that improves readability somewhere, but not necessarily performance.

For exactly the case you have you could still store a factor = Dict(:A=>1, :B=>2, :C=>3)
and do a

results[position] = factor[type] * val

thing? Depends of course bit what the concrete example needs, you could also store that factor in the struct for example.

I checked because it wasn’t obvious to me. At least in this simple example, I’m finding there’s no difference. To measure, I wrapped the loop in function, changed 10 to a much bigger number, and preallocated results to focus on the difference in the conditional. I get identical benchmarks.

julia> @benchmark dispatch!(results,objs,multiple)

BenchmarkTools.Trial: 1942 samples with 1 evaluation per sample.
 Range (min … max):  2.477 ms …   3.203 ms  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.513 ms               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.557 ms Β± 108.804 ΞΌs  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–†β–ˆβ–ˆβ–†β–…β–†β–…β–…β–ƒβ–ƒβ–‚    ▁▁▂▂▁ ▂▁                                     ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–†β–ˆβ–‡β–‡β–‡β–…β–…β–†β–‡β–‡β–ˆβ–†β–‡β–‡β–‡β–‡β–‡β–†β–†β–„β–…β–β–„β–†β–†β–…β–„β–„β–„β–„ β–ˆ
  2.48 ms      Histogram: log(frequency) by time      2.99 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark conditional!(results,objs2,multiple)
BenchmarkTools.Trial: 1923 samples with 1 evaluation per sample.
 Range (min … max):  2.477 ms …   3.328 ms  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.519 ms               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.582 ms Β± 145.734 ΞΌs  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–‡β–ˆβ–‡β–†β–†β–…β–ƒβ–‚ ▁▂▂▁▂▂▂▂▁▁▁▂▁    ▁▁ ▁                              ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–ˆβ–‡β–‡β–‡β–†β–‡β–†β–†β–…β–‡β–†β–‡β–‡β–…β–„β–…β–„β–β–„β–…β–…β–†β–„β–„β–…β–„ β–ˆ
  2.48 ms      Histogram: log(frequency) by time      3.16 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

I can imagine that (ab)using multiple dispatch in this way might cause performance problems in more complex situations, but I’m not sure where that line is.

Full code below.

struct Struct1{type}
    value::Float64
end

f(obj::Struct1{:A}) = obj.value
f(obj::Struct1{:B}) = 2*obj.value
f(obj::Struct1{:C}) = 3*obj.value

obj1 = Struct1{:A}(1.0)
obj2 = Struct1{:B}(1.0)
obj3 = Struct1{:C}(1.0)

objs = [obj1, obj2, obj3]

function dispatch!(results,objs,multiple)
  for (i,obj) in enumerate(objs)
    for n in 1:multiple
      position = (i-1)*multiple+n
      results[position] = f(obj)
    end
  end
  return(results)
end


struct Struct2
    value::Float64
    type::Symbol
end

obj1 = Struct2(1.0, :A)
obj2 = Struct2(1.0, :B)
obj3 = Struct2(1.0, :C)

objs2 = [obj1, obj2, obj3]

function conditional!(results,objs,multiple)
  for i in eachindex(objs)
    obj = objs[i]
    type = obj.type
    for n in 1:multiple
      position = (i-1)*multiple+n
      ############Here we have 20-40 lines of code block############
      if type == :A
        val = obj.value
        results[position] = val # or much more complex operations in a few lines
      elseif type == :B
        val = obj.value
        results[position] = val*2 # or much more complex operations in a few lines
      elseif type == :C
        val = obj.value
        results[position] = val*3 # or much more complex operations in a few lines
      end
    end
  end
  return(results)
end

using BenchmarkTools, Test

results = zeros(length(objs)*100)
dispatch!(results,objs,100)
results2 = zeros(length(objs)*100)
conditional!(results2,objs2,100)
@test results β‰ˆ results2

multiple=1_000_000
results = zeros(length(objs)*multiple)
@benchmark dispatch!(results,objs,multiple)
@benchmark conditional!(results,objs2,multiple)

You’re dancing right at the edge of a 1000x tall cliff there:

julia> @benchmark dispatch!($results,$objs,$multiple)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  370.125 ΞΌs …   1.523 ms  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     411.916 ΞΌs               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   450.258 ΞΌs Β± 105.652 ΞΌs  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–…β–ˆβ–‡β–†β–„β–ƒβ–ƒβ–‚β–‚β–…β–‡β–‡β–…β–„β–ƒβ–‚β–‚β–β–                                        ▁  β–‚
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–‡β–†β–‡β–†β–…β–†β–…β–„β–…β–…β–…β–…β–…β–ƒβ–…β–…β–„β–…β–…β–„β–ƒβ–ƒβ–…β–ƒβ–ƒβ–β–β–ƒβ–ƒβ–ƒβ–…β–‡β–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ β–ˆ
  370 ΞΌs        Histogram: log(frequency) by time        964 ΞΌs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> f(obj::Struct1{:D}) = 4*obj.value
f (generic function with 4 methods)

julia> @benchmark dispatch!($results,$objs,$multiple)
BenchmarkTools.Trial: 22 samples with 1 evaluation per sample.
 Range (min … max):  227.317 ms … 240.387 ms  β”Š GC (min … max): 0.66% … 4.48%
 Time  (median):     229.196 ms               β”Š GC (median):    1.54%
 Time  (mean Β± Οƒ):   230.290 ms Β±   3.112 ms  β”Š GC (mean Β± Οƒ):  1.57% Β± 0.79%

        β–ˆ
  β–…β–β–…β–…β–β–ˆβ–ˆβ–…β–β–…β–ˆβ–β–β–…β–ˆβ–…β–β–β–β–β–β–β–…β–β–β–β–β–β–β–…β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–…β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–β–… ▁
  227 ms           Histogram: frequency by time          240 ms <

 Memory estimate: 91.54 MiB, allocs estimate: 5999489.

The more important point, though, is that parametric structs are more advanced techniques. Focus on the simple stuff first.

It is quite interesting to see the degradation of performance when you add a new type. The solution of @Paul_Schrimpf looks to me quite idiomatic. I don’t see why this could be an abuse of dispatch. I think one can also use union types, to keep the same performance:

f(obj::Struct1{:D}) = 4*obj.value

TU = Union{Struct1{:A},Struct1{:B},Struct1{:C},Struct1{:D}}
objs_union = TU[obj1, obj2, obj3]
@benchmark dispatch!(results,objs_union,multiple)
# same as conditional!(results,objs2,multiple)

The OP here came to us with code that didn’t even apply the very first and most important performance tip: use a function. I don’t think it’s very useful to dive straight into the deep end with parametric structs.

To be clear, I didn’t add a type. I added a method. When there are only a few methods that Julia might possibly call, it’ll convert dispatch into a bunch of ifelses for you. But you’re dancing on the edge of a cliff. Similarly, if an array only has a few types it could possibly contain, it’ll convert dispatch to a bunch of ifelses for you. But you’re once again dancing on the edge of a cliff. If you don’t know that the cliffs are there, you’re gonna fall.