Do C++ performance optimisations apply to Julia?

I felt I was making that exact point :slight_smile: If x were passed as a reference in Julia, it would behave just like the C++ program. Edit Bad example. If I want that behavior in Julia, I explicitly need to pass a Ref to modx and dereference it,
while in C++ I can write a function that accepts a reference unbeknownst to the caller with all the hideous side effects.

Again, no, that’s because x[] = is not pure assignment but a syntactic sugar for setindex! which is an in-place operation. The difference you’re showing is in the semantic of assignment between C/C++ and Julia.

In a sense I may agree with you that there isn’t much difference with passing by reference for mutable objects. My point was that there aren’t different evaluation strategies, it’s always pass-by-sharing, strategy which is fairly popular in dynamic languages, not even specific to Julia or other niche languages. I tend to summaries it with “mutable objects can be mutated and immutable ones can’t” (and in fact the use of Ref suggested above meant to wrap an immutable object inside a mutable container to simulate the ability to mutate it), but for example this is very different from fortran where you can modify pretty much everything (unless limited by in/out scoping probably?). In Julia it’s fairly easy to know whether a type is mutable or not, for example I find that more opaque in python (probably all classes are mutable?)

8 Likes

Hmm, I’ve been unable to confirm this. Looking at the code below, you find that the only case where anything other than a chain of conditions is evaluated is in the boundary case, where the value is determined before the function is compiled and so it can just return a single value every time. I’m not seeing any switch-like behaviour.

function switch(n)
    m = n ≤ 0.1 ? 0.0 :
        n ≤ 0.6 ? 1.0 :
        n ≤ 0.8 ? 1.1 :
        n ≤ 1.0 ? 1.2 :
        n ≤ 1.3 ? 1.35 :
        2.2        #=otherwise=#
    m
end

let
    n = 1.1
    @code_llvm switch(n) # Long
end

begin
    const a = 1.1
    @code_llvm switch(a) # Long
end

function boundary()
    a = 1.1
    switch(a)
end

@code_llvm boundary()

Output:

;  @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\switch_statement.jl:2 within `switch`
; Function Attrs: uwtable
define double @julia_switch_1237(double %0) #0 {
top:
;  @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\switch_statement.jl:3 within `switch`
; ┌ @ float.jl:447 within `<=`
   %1 = fcmp ugt double %0, 1.000000e-01
; └
  br i1 %1, label %L4, label %L21

L4:                                               ; preds = %top
; ┌ @ float.jl:447 within `<=`
   %2 = fcmp ugt double %0, 6.000000e-01
; └
  br i1 %2, label %L7, label %L21

L7:                                               ; preds = %L4
; ┌ @ float.jl:447 within `<=`
   %3 = fcmp ugt double %0, 8.000000e-01
; └
  br i1 %3, label %L10, label %L21

L10:                                              ; preds = %L7
; ┌ @ float.jl:447 within `<=`
   %4 = fcmp ugt double %0, 1.000000e+00
; └
  %5 = fcmp ugt double %0, 1.300000e+00
  %. = select i1 %5, double 2.200000e+00, double 1.350000e+00
  %value_phi3 = select i1 %4, double %., double 1.200000e+00
  br label %L21

L21:                                              ; preds = %L10, %L7, %L4, %top
  %value_phi = phi double [ 0.000000e+00, %top ], [ 1.000000e+00, %L4 ], [ 1.100000e+00, %L7 ], [ %value_phi3, %L10 ]
;  @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\switch_statement.jl:9 within `switch`
  ret double %value_phi
}
;  @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\switch_statement.jl:2 within `switch`
; Function Attrs: uwtable
define double @julia_switch_1239(double %0) #0 {
top:
;  @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\switch_statement.jl:3 within `switch`
; ┌ @ float.jl:447 within `<=`
   %1 = fcmp ugt double %0, 1.000000e-01
; └
  br i1 %1, label %L4, label %L21

L4:                                               ; preds = %top
; ┌ @ float.jl:447 within `<=`
   %2 = fcmp ugt double %0, 6.000000e-01
; └
  br i1 %2, label %L7, label %L21

L7:                                               ; preds = %L4
; ┌ @ float.jl:447 within `<=`
   %3 = fcmp ugt double %0, 8.000000e-01
; └
  br i1 %3, label %L10, label %L21

L10:                                              ; preds = %L7
; ┌ @ float.jl:447 within `<=`
   %4 = fcmp ugt double %0, 1.000000e+00
; └
  %5 = fcmp ugt double %0, 1.300000e+00
  %. = select i1 %5, double 2.200000e+00, double 1.350000e+00
  %value_phi3 = select i1 %4, double %., double 1.200000e+00
  br label %L21

L21:                                              ; preds = %L10, %L7, %L4, %top
  %value_phi = phi double [ 0.000000e+00, %top ], [ 1.000000e+00, %L4 ], [ 1.100000e+00, %L7 ], [ %value_phi3, %L10 ]
;  @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\switch_statement.jl:9 within `switch`
  ret double %value_phi
}
;  @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\switch_statement.jl:22 within `boundary`
; Function Attrs: uwtable
define double @julia_boundary_1241() #0 {
top:
;  @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\switch_statement.jl:24 within `boundary`
  ret double 1.350000e+00
}
3 Likes

I thought it existed because of this comment from Jeff in a issue about adding or not a switch keyword for Julia. However, in a duplicate Jeff admits he is not sure if LLVM had really done it at some point.

1 Like

On 1.8 I tested and it emits a jump table sometimes.

2 Likes

Edit: Marking this as the solution as it best summarises or “answers” the original question. Feel kinda weird doing it but I think for discoverability it might help.

While writing the article I mentioned in the OP, I ended up not being satisfied with the end result of many of the conclusions and I think a much more interesting article can be written about a subset of the tips. I’m going to focus on that subset for the final article.

With that said, I thought it might be useful or interesting to those who stumble upon this thread in the future to read a nicely formatted document with code examples so I’ll leave the link here to an unlisted draft article on Forem:

Also, there is a GitHub repository containing all of the code snippets.

Thanks to everyone who contributed in this thread!

3 Likes

Can you post an example of the code?

I saw this example from this old thread which I have been able to recreate:

const CODES = (2, 3, 5, 7, 11)
const RCODES = reverse(CODES)

f(x) = 0
f(x, y, ys...) = x == y ? 1 + length(ys) : f(x, ys...)
findcode3(x) = f(x, RCODES...)

@code_llvm findcode3(3)

Additionally, I can edit the code to get close to what I wanted originally as we can see that there’s a table lookup happening in the LLVM:

const RETURNVALUES = (0, 1, 5, 10, 11, 13)
const RRETURNVALUES = reverse(RETURNVALUES)

h(x) = RETURNVALUES[0]
h(x, y, ys...) = x == y ? RRETURNVALUES[length(ys)+1] : h(x, ys...)
hindcode3(x) = h(x, RCODES...)

hindcode3(3)

@code_llvm hindcode3(3)
; Function Attrs: uwtable
define i64 @julia_hindcode3_1356(i64 signext %0) #0 {
top:
; ┌ @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\ss_try2.jl:23 within `h`
   %switch.tableidx = add i64 %0, -2
   %1 = icmp ult i64 %switch.tableidx, 10
   br i1 %1, label %switch.hole_check, label %L16

L16:                                              ; preds = %switch.hole_check, %top
; │ @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\ss_try2.jl:23 within `h` @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\ss_try2.jl:23 @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\ss_try2.jl:23 @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\ss_try2.jl:23 @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\ss_try2.jl:23
   %2 = call nonnull {}* @j_h_1358(i64 signext %0) #0
   call void @llvm.trap()
   unreachable

switch.hole_check:                                ; preds = %top
; │ @ c:\Users\jmmsm\Documents\GitHub\speedy_julia\examples\ss_try2.jl:23 within `h`
   %switch.maskindex = trunc i64 %switch.tableidx to i16
   %switch.shifted = lshr i16 555, %switch.maskindex
   %3 = and i16 %switch.shifted, 1
   %switch.lobit.not = icmp eq i16 %3, 0
   br i1 %switch.lobit.not, label %L16, label %switch.lookup

switch.lookup:                                    ; preds = %switch.hole_check
   %switch.gep = getelementptr inbounds [10 x i64], [10 x i64]* @switch.table.julia_hindcode3_1356, i64 0, i64 %switch.tableidx
   %switch.load = load i64, i64* %switch.gep, align 8
   ret i64 %switch.load
; └
}

I have a couple of examples of optimizations.


#simple LUT

julia> function foo(x)
           if x == 1
               7
           elseif x == 2
               8
           elseif x == 3
               9
           elseif x == 4
               10
           elseif x == 5
               11
           elseif x == 6
               12
           elseif x == 7
               13
           elseif x == 8
               14
           elseif x == 9
               15
           elseif x == 10
               16
           elseif x == 11
               17
           else
               0
           end
       end
foo (generic function with 1 method)

julia> @code_llvm foo(4)
;  @ REPL[1]:1 within `foo`
define i64 @julia_foo_102(i64 signext %0) #0 {
top:
;  @ REPL[1]:2 within `foo`
  %switch.tableidx = add i64 %0, -1
  %1 = icmp ult i64 %switch.tableidx, 10
  %.not10 = icmp eq i64 %0, 11
  %. = select i1 %.not10, i64 17, i64 0
  %switch.offset = add i64 %0, 6
  %common.ret.op = select i1 %1, i64 %switch.offset, i64 %.
;  @ REPL[1] within `foo`
  ret i64 %common.ret.op
}

#jump table julia> function foo(x)
           if x == 1
               x * 7
           elseif x == 2
               x * 8
           elseif x == 3
               x * 9
           elseif x == 4
               x * 10
           elseif x == 5
               x * 11
           elseif x == 16
               x * 12
           elseif x == 71
               x * 13
           elseif x == 8
               x * 14
           elseif x == 9
               x * 15
           elseif x == 10
               x * 16
           elseif x == 11
               x * 17
           else
               x * 0
           end
       end
foo (generic function with 1 method)

julia> @code_llvm foo(4)
;  @ REPL[5]:1 within `foo`
define i64 @julia_foo_119(i64 signext %0) #0 {
top:
;  @ REPL[5]:2 within `foo`
  switch i64 %0, label %common.ret [
    i64 1, label %L3
    i64 2, label %L7
    i64 3, label %L11
    i64 4, label %L15
    i64 5, label %L19
    i64 16, label %L23
    i64 71, label %L27
    i64 8, label %L31
    i64 9, label %L35
    i64 10, label %L39
    i64 11, label %L43
  ]

common.ret:                                       ; preds = %L43, %L39, %L35, %L31, %L27, %L23, %L19, %L15, %L11, %L7, %L3, %top
  %common.ret.op = phi i64 [ 7, %L3 ], [ 16, %L7 ], [ 27, %L11 ], [ 40, %L15 ], [ 55, %L19 ], [ 192, %L23 ], [ 923, %L27 ], [ 112, %L31 ], [ 135, %L35 ], [ 160, %L39 ], [ 187, %L43 ], [ 0, %top ]
;  @ REPL[5] within `foo`
  ret i64 %common.ret.op

L3:                                               ; preds = %top
  br label %common.ret

L7:                                               ; preds = %top
  br label %common.ret

L11:                                              ; preds = %top
  br label %common.ret

L15:                                              ; preds = %top
  br label %common.ret

L19:                                              ; preds = %top
  br label %common.ret

L23:                                              ; preds = %top
  br label %common.ret

L27:                                              ; preds = %top
  br label %common.ret

L31:                                              ; preds = %top
  br label %common.ret

L35:                                              ; preds = %top
  br label %common.ret

L39:                                              ; preds = %top
  br label %common.ret

L43:                                              ; preds = %top
  br label %common.ret
}

#more sophisticated LUT

julia> @code_llvm foo(a)
;  @ REPL[9]:1 within `foo`
define i64 @julia_foo_150(i64 signext %0) #0 {
top:
;  @ REPL[9]:2 within `foo`
  %switch.tableidx = add i64 %0, -1
  %1 = icmp ult i64 %switch.tableidx, 10
  br i1 %1, label %switch.lookup, label %L31

switch.lookup:                                    ; preds = %top
  %switch.gep = getelementptr inbounds [10 x i64], [10 x i64]* @switch.table.julia_foo_150, i64 0, i64 %switch.tableidx
  %switch.load = load i64, i64* %switch.gep, align 8
  br label %common.ret

common.ret:                                       ; preds = %L31, %switch.lookup
  %common.ret.op = phi i64 [ %., %L31 ], [ %switch.load, %switch.lookup ]
;  @ REPL[9] within `foo`
  ret i64 %common.ret.op

L31:                                              ; preds = %top
;  @ REPL[9]:22 within `foo`
; ┌ @ promotion.jl:477 within `==`
   %.not10 = icmp eq i64 %0, 11
   %. = select i1 %.not10, i64 17, i64 0
   br label %common.ret
; └
}
2 Likes