This might be a silly question, but there is something I do not understand. I was playing around with different implementations of a function to check the performance changes, and sometimes I left an unnecessary `where {T}` in the function line. Then I was wondering about weird results in the `@btime` output and could not reproduce the fastest run, until I figured out that `where {T}` has an impact, even if no `T` is defined or used anywhere:

```
function f1(x) where {T}
    return x*2
end
function f2(x)
    return x*2
end
function f3(::Type{T}, x) where {T}
    return x*2
end
```

I thought all three functions would compile to the same machine code, but `f1` is not playing the game:

```
a = rand(1000);
@btime f1.($a);
14.736 μs (2001 allocations: 39.19 KiB)
@btime f2.($a);
471.755 ns (1 allocation: 7.94 KiB)
@btime f3.(Bool, $a);
445.223 ns (1 allocation: 7.94 KiB)
```

What are those allocations and what’s happening here?

oheil
November 14, 2019, 10:34am
#2
```
julia> @code_llvm f1(1.0)
; @ REPL[1]:2 within `f1'
; Function Attrs: uwtable
define nonnull %jl_value_t addrspace(10)* @japi3_f1_16234(%jl_value_t addrspace(10)**, %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)**, i32) #0 {
top:
%4 = alloca %jl_value_t addrspace(10)**, align 8
store volatile %jl_value_t addrspace(10)** %2, %jl_value_t addrspace(10)*** %4, align 8
%5 = call %jl_value_t*** inttoptr (i64 1801343456 to %jl_value_t*** ()*)() #3
%6 = bitcast %jl_value_t addrspace(10)** %2 to double addrspace(10)**
%7 = load double addrspace(10)*, double addrspace(10)** %6, align 8
; ┌ @ promotion.jl:314 within `*' @ float.jl:399
%8 = load double, double addrspace(10)* %7, align 8
%9 = fmul double %8, 2.000000e+00
; └
%10 = bitcast %jl_value_t*** %5 to i8*
%11 = call noalias nonnull %jl_value_t addrspace(10)* @jl_gc_pool_alloc(i8* %10, i32 1744, i32 16) #1
%12 = bitcast %jl_value_t addrspace(10)* %11 to %jl_value_t addrspace(10)* addrspace(10)*
%13 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)* addrspace(10)* %12, i64 -1
store %jl_value_t addrspace(10)* addrspacecast (%jl_value_t* inttoptr (i64 114258656 to %jl_value_t*) to %jl_value_t addrspace(10)*), %jl_value_t addrspace(10)* addrspace(10)* %13
%14 = bitcast %jl_value_t addrspace(10)* %11 to double addrspace(10)*
store double %9, double addrspace(10)* %14, align 8
ret %jl_value_t addrspace(10)* %11
}
julia> @code_llvm f2(1.0)
; @ REPL[2]:2 within `f2'
; Function Attrs: uwtable
define double @julia_f2_16235(double) #0 {
top:
; ┌ @ promotion.jl:314 within `*' @ float.jl:399
%1 = fmul double %0, 2.000000e+00
; └
ret double %1
}
```

To me this looks like LLVM is creating a Float64 from whatever comes in, but that is more guessing than understanding.

I don’t understand why the LLVM code differs at all.
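One way to compare the two outputs without reading them by eye is to capture the IR as text and search it. The following is only a sketch: `mentions_gc_alloc` is a hypothetical helper, and the pattern it searches for is an assumption, since the name of the allocation routine (`jl_gc_pool_alloc` in the dump above) differs between Julia versions. Whether `f1` still shows an allocation depends on the Julia version you run.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Same definitions as in the original post; T is deliberately unused in f1.
function f1(x) where {T}
    return x * 2
end
function f2(x)
    return x * 2
end

# sprint supplies an IOBuffer as the first argument of code_llvm
# and returns everything written to it as a String.
ir_f1 = sprint(code_llvm, f1, (Float64,))
ir_f2 = sprint(code_llvm, f2, (Float64,))

# Hypothetical helper: true if the IR text mentions a GC allocation call.
# The regex is an assumption meant to cover names such as jl_gc_pool_alloc.
mentions_gc_alloc(ir::AbstractString) = occursin(r"gc_\w*alloc", ir)

println("f1 IR mentions a GC allocation: ", mentions_gc_alloc(ir_f1))
println("f2 IR mentions a GC allocation: ", mentions_gc_alloc(ir_f2))
```

A `true` for `f1` and `false` for `f2` would match the dumps above, where only `f1` boxes its result.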

I also find it strange, since

```
julia> @code_warntype f1(1.0)
Variables
#self#::Core.Compiler.Const(f1, false)
x::Float64
Body::Float64
1 ─ %1 = (x * 2)::Float64
└── return %1
julia> @code_warntype f2(1.0)
Variables
#self#::Core.Compiler.Const(f2, false)
x::Float64
Body::Float64
1 ─ %1 = (x * 2)::Float64
└── return %1
```

I can replicate this problem on `v"1.3.0-rc4.1"`. Also note that broadcasting is not needed for an MWE, as

```
julia> @btime f1(1.0)
15.680 ns (1 allocation: 16 bytes)
2.0
julia> @btime f2(1.0)
0.027 ns (0 allocations: 0 bytes)
2.0
```

Please check if there is an existing issue about this, and if not, open one.
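For anyone who wants to check their own Julia version without installing BenchmarkTools, the single-call comparison can be sketched with `@allocated` from Base. `bytes_per_call` is a hypothetical helper introduced only for this illustration, and the numbers it reports may differ from the 16 bytes shown above depending on the Julia version.

```julia
# Same definitions as in the original post; T is deliberately unused in f1.
function f1(x) where {T}
    return x * 2
end
function f2(x)
    return x * 2
end

# Hypothetical helper: bytes allocated by a single call f(x).
function bytes_per_call(f, x)
    f(x)                     # warm-up call so compilation is not measured
    return @allocated f(x)   # bytes allocated by the second call
end

println("f1: ", bytes_per_call(f1, 1.0), " bytes per call")
println("f2: ", bytes_per_call(f2, 1.0), " bytes per call")
```

A nonzero value for `f1` next to zero for `f2` would correspond to the one allocation per call reported by `@btime`.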

Thanks for the feedback, I was using 1.3rc2.

Yes, the broadcasting was a copy-paste leftover.

I’ll check older Julia versions too and look through the issues then.

I suspect it is because `@code_warntype` is lying to you.

Linked pull request: `JuliaLang:master` ← `iamed2:ed/doc-specialization-tip`, opened 03:24PM - 07 Aug 19 UTC

Alright, I could not find anything related so I quickly opened an issue.

opened 02:10PM - 14 Nov 19 UTC

First discussed here: https://discourse.julialang.org/t/unnecessary-where-t-causes-huge-performance-drop/31078
Given the following functions in a Julia 1.3rc4 session:
function f1(x) where {T}
return x*2
end
function f2(x)
return x*2
end
function f3(::Type{T},...

Update: I made a silly copy-and-paste mistake, which made me believe that it was “OK” in Julia 0.7. It is not: Julia 0.7 shows the same output.
