Optimising function with const instead of struct?


#1

Hi,

I have a function that I use for modelling purposes (roughly, the right-hand side of an ODE), so I defined a mutable struct to encapsulate all the parameters. I thought the compiler would optimise out the values of the constants inside the function, but it seems it does not. Is there a way to improve the performance of f(p1) in the following code?

I think this is related to this as well.

using BenchmarkTools

mutable struct Type1
       t_end::Float64         
end

struct Type2
       t_end::Float64         
end

p1 = Type1(100)
p2 = Type2(100)
const p3 = 100.

function f(p)
	return 2*p.t_end
end
	
function f_const()
	return 2*p3
end

@btime f(p1)
@btime f(p2)
@btime f_const()

The results are:

julia> @btime f(p1)
  33.832 ns (2 allocations: 32 bytes)
200.0

julia> @btime f(p2)
  34.211 ns (2 allocations: 32 bytes)
200.0

julia> @btime f_const()
  1.706 ns (0 allocations: 0 bytes)
200.0

#2

You need to interpolate your arguments into the benchmark expression (see https://github.com/JuliaCI/BenchmarkTools.jl#quick-start). Otherwise, the difference between the two results just shows the overhead of accessing a non-constant global variable. With interpolation, all the results are identical:

julia> @btime f_const()
  1.754 ns (0 allocations: 0 bytes)
200.0

julia> @btime f($p1)
  1.754 ns (0 allocations: 0 bytes)
200.0

Can you clarify what constant optimizations you’re expecting to see?


#3

Thank you! What I meant is that the function actually looks like f(x,p), where p does not change between calls. Maybe wrapping it like f_wrap = x -> f(x,p) would do the trick. I will try.
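
A minimal sketch of that wrapping idea, assuming a two-argument f(x, p). The names Params and make_wrapper are hypothetical and not from the original code. Building the closure inside a function means p is captured with a concrete type, so a call to the wrapper does not have to look up a non-constant global:

```julia
# Hypothetical parameter container, standing in for Type1/Type2 above.
struct Params
    t_end::Float64
end

# Hypothetical two-argument model function of the form f(x, p).
f(x, p) = x * p.t_end

# Build the closure inside a function so that `p` is captured as a
# concretely typed field of the closure rather than read from a global.
make_wrapper(p) = x -> f(x, p)

# `const` keeps the binding itself from being a non-constant global.
const f_wrap = make_wrapper(Params(100.0))

result = f_wrap(2.0)
```

If f_wrap is not declared const, it is itself a non-constant global, so it would still need to be interpolated (`$f_wrap`) when benchmarking.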


#4

The const one isn’t doing any work at run time. The compiler realizes that the result is always 200.0 and compiles a function that just returns that value instead of doing a computation (when a global is declared const, the compiler assumes not just a constant type but also a constant value). That can’t be done with the others since they aren’t declared constant.
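
One way to see the difference, as a rough sketch: ask the compiler what return type it infers for a function that reads a const global versus one that reads a non-constant global. The names C, g, f_c and f_g are made up for this illustration.

```julia
const C = 100.0   # const global: type and value are known to the compiler
g = 100.0         # non-constant global: could be rebound to anything later

f_c() = 2 * C     # can be inferred (and typically folded to a constant)
f_g() = 2 * g     # must look up `g` at run time

# Inferred return types for the zero-argument methods.
rt_const    = Base.return_types(f_c, Tuple{})
rt_nonconst = Base.return_types(f_g, Tuple{})

println(rt_const)
println(rt_nonconst)
```

With an untyped, non-constant global the inferred type is Any, which is where the allocations and the extra ~30 ns in the first benchmark come from.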


#5

I think there is one typo, and then two non-constant globals.
Maybe this is what you want?

using BenchmarkTools

mutable struct Type1
       t_end::Float64         
end

struct Type2
       t_end::Float64         
end

const p1 = Type1(100)
const p2 = Type2(100)
const p3 = 100.

function f(p)
	return 2*p.t_end
end
	
function f_const()
	return 2*p3
end

@btime f(p1)
@btime f(p2)
@btime f_const()

Results are

  1.831 ns (0 allocations: 0 bytes)
  0.014 ns (0 allocations: 0 bytes)
  0.014 ns (0 allocations: 0 bytes)

#7

Your code gives the same results on my machine (v0.6.3):

julia> @btime f(p1)
  1.506 ns (0 allocations: 0 bytes)
200.0

julia> @btime f(p2)
  1.551 ns (0 allocations: 0 bytes)
200.0

julia> @btime f_const()
  1.312 ns (0 allocations: 0 bytes)
200.0

#8

There is a typo, so I repost the results:

using BenchmarkTools

mutable struct Type1
       t_end::Float64         
end

struct Type2
       t_end::Float64         
end

p1 = Type1(100)
p2 = Type2(100)
const p3 = 100.

function f(p)
	return 2*p.t_end
end
	
function f_const()
	return 2*p3
end

@btime f($p1)
@btime f($p2)
@btime f_const()

I get

julia> @btime f($p1)
  1.313 ns (0 allocations: 0 bytes)
200.0

julia> @btime f($p2)
  1.248 ns (0 allocations: 0 bytes)
200.0

julia> @btime f_const()
  1.488 ns (0 allocations: 0 bytes)
200.0

#9

My results on v0.6.3 agree with yours.
I am surprised that f(p2) and especially f_const() are not optimized the way they are on v0.7.

With this

function f_const2()
    return 200.0
end

@btime f_const2() returns

  0.014 ns (0 allocations: 0 bytes)

on both v0.6.3 and v0.7, as expected.
[EDIT: commented on f_const()]
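
A side note on reading these numbers: a timing like 0.014 ns is far below one clock cycle, so it usually means the whole call was folded to a constant and nothing was actually measured. The BenchmarkTools manual describes a Ref idiom to prevent this. A sketch, reusing Type2 and f from above:

```julia
using BenchmarkTools

struct Type2
    t_end::Float64
end

f(p) = 2 * p.t_end

p2 = Type2(100.0)

# Plain interpolation: the value is spliced into the benchmark, so the
# compiler may be able to fold the whole call into a constant.
@btime f($p2)

# Ref idiom: interpolate a Ref and dereference it inside the benchmarked
# expression, so the value is loaded at run time and the call is measured.
@btime f($(Ref(p2))[])
```

Both calls return 200.0; only the second timing is meant to reflect the cost of the load and multiplication.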


#10

I guess I shouldn’t be surprised. But I thought that this compiler optimization had been implemented earlier. For this

const x = 1
f() = x

I get the same non-optimized result for v0.5 and v0.6.x. But, it is optimized for v0.7.


#11

I tried on two computers, with builds of master dating 7/5 and 7/6.
The older one gives me:

julia> @code_warntype f_const()
Body::Float64
2 1 ─     return 200.0

while both give

julia> @code_llvm f_const()

; Function f_const
; Location: REPL[4]:2
define double @julia_f_const_32945() {
top:
  ret double 2.000000e+02
}

but only the older one has:

julia> @btime f_const()
  0.030 ns (0 allocations: 0 bytes)
200.0

The other is a little over 1ns. It seems like it ought to be unnecessary, but @generated f_const() = 2p3 works as expected.

EDIT:
I rebuilt Julia master on the computer that had the older build, and now I’m getting the same >1ns runtime, with @code_typed saying

2 1 ─ %1 = Base.mul_float(2.0, 100.0)::Float64    │╻╷ *
  └──      return %1

So, there seems to have been a change between

Version 0.7.0-beta.157 (2018-07-05 00:54 UTC)
master/aba7068* (fork: 5 commits, 2 days)

and

Version 0.7.0-beta.188 (2018-07-06 20:54 UTC)
Commit a60119b57f (1 days old master)

(the fork and commits should be totally unrelated)