Inconsistent LLVM code


#1

Hi, I am using Julia 0.5.0.

If I do:

import Distributions
@code_llvm Distributions.quantile(Distributions.Normal(), 0.5)

I get:

; Function Attrs: uwtable
define double @julia_quantile_71921(%Normal*, double) #0 {
top:
  %2 = fmul double %1, 2.000000e+00
  %3 = call double @julia_erfcinv_71923(double %2) #1
  %4 = getelementptr inbounds %Normal, %Normal* %0, i64 0, i32 0
  %5 = load double, double* %4, align 8
  %6 = getelementptr inbounds %Normal, %Normal* %0, i64 0, i32 1
  %7 = load double, double* %6, align 8
  %8 = fmul double %3, 0xBFF6A09E667F3BCD
  %9 = fmul double %8, %7
  %10 = fadd double %5, %9
  ret double %10
}

However, if I do:

import Optim
import Distributions
@code_llvm Distributions.quantile(Distributions.Normal(), 0.5)

I get:

; Function Attrs: uwtable
define %jl_value_t* @julia_quantile_71921(%Normal*, double) #0 {
top:
  %2 = call %jl_value_t*** @jl_get_ptls_states() #2
  %3 = alloca [14 x %jl_value_t*], align 8
  %.sub = getelementptr inbounds [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 0
  %4 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 2
  %5 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 9
  %6 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 12
  %7 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 6
  %8 = bitcast %jl_value_t** %4 to i8*
  call void @llvm.memset.p0i8.i32(i8* %8, i8 0, i32 96, i32 8, i1 false)
  %9 = bitcast [14 x %jl_value_t*]* %3 to i64*
  store i64 24, i64* %9, align 8
  %10 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 1
  %11 = bitcast %jl_value_t*** %2 to i64*
  %12 = load i64, i64* %11, align 8
  %13 = bitcast %jl_value_t** %10 to i64*
  store i64 %12, i64* %13, align 8
  store %jl_value_t** %.sub, %jl_value_t*** %2, align 8
  %14 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 11
  %15 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 10
  %16 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 5
  %17 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 4
  %18 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 3
  %19 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 8
  %20 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 7
  %21 = getelementptr [14 x %jl_value_t*], [14 x %jl_value_t*]* %3, i64 0, i64 13
  %22 = getelementptr inbounds %Normal, %Normal* %0, i64 0, i32 1
  %23 = fmul double %1, 2.000000e+00
  %24 = call %jl_value_t* @julia_erfcinv_71923(double %23) #1
  store %jl_value_t* %24, %jl_value_t** %21, align 8
  store %jl_value_t* inttoptr (i64 2150928800 to %jl_value_t*), %jl_value_t** %6, align 8
  %25 = call %jl_value_t* @jl_apply_generic(%jl_value_t** %6, i32 2)
  store %jl_value_t* %25, %jl_value_t** %20, align 8
  store %jl_value_t* inttoptr (i64 2150931984 to %jl_value_t*), %jl_value_t** %7, align 8
  store %jl_value_t* inttoptr (i64 2152744728 to %jl_value_t*), %jl_value_t** %19, align 8
  %26 = call %jl_value_t* @jl_apply_generic(%jl_value_t** %7, i32 3)
  store %jl_value_t* %26, %jl_value_t** %16, align 8
  store %jl_value_t* inttoptr (i64 2152744696 to %jl_value_t*), %jl_value_t** %4, align 8
  %27 = bitcast %jl_value_t*** %2 to i8*
  %28 = call %jl_value_t* @jl_gc_pool_alloc(i8* %27, i32 1488, i32 16)
  %29 = getelementptr inbounds %jl_value_t, %jl_value_t* %28, i64 -1, i32 0
  store %jl_value_t* inttoptr (i64 2148805808 to %jl_value_t*), %jl_value_t** %29, align 8
  %30 = bitcast %Normal* %0 to i64*
  %31 = load i64, i64* %30, align 8
  %32 = bitcast %jl_value_t* %28 to i64*
  store i64 %31, i64* %32, align 8
  store %jl_value_t* %28, %jl_value_t** %18, align 8
  %33 = call %jl_value_t* @jl_gc_pool_alloc(i8* %27, i32 1488, i32 16)
  %34 = getelementptr inbounds %jl_value_t, %jl_value_t* %33, i64 -1, i32 0
  store %jl_value_t* inttoptr (i64 2148805808 to %jl_value_t*), %jl_value_t** %34, align 8
  %35 = bitcast double* %22 to i64*
  %36 = load i64, i64* %35, align 8
  %37 = bitcast %jl_value_t* %33 to i64*
  store i64 %36, i64* %37, align 8
  store %jl_value_t* %33, %jl_value_t** %17, align 8
  %38 = call %jl_value_t* @jl_apply_generic(%jl_value_t** %4, i32 4)
  store %jl_value_t* %38, %jl_value_t** %14, align 8
  store %jl_value_t* inttoptr (i64 2148851360 to %jl_value_t*), %jl_value_t** %5, align 8
  store %jl_value_t* inttoptr (i64 2147425520 to %jl_value_t*), %jl_value_t** %15, align 8
  %39 = call %jl_value_t* @jl_apply_generic(%jl_value_t** %5, i32 3)
  %40 = load i64, i64* %13, align 8
  store i64 %40, i64* %11, align 8
  ret %jl_value_t* %39
}

Performance takes a big hit. Any thoughts?

Thanks, Gus


#2

The LLVM IR is a bit too low-level to easily see the problem. Take a look at @code_warntype to check the results of type inference.

%3 = call double @julia_erfcinv_71923(double %2) #1

vs

%24 = call %jl_value_t* @julia_erfcinv_71923(double %23) #1

Looks like the result of erfcinv became type-unstable after importing Optim. Please format your posts and put code pieces in backticks (“`”) so that they get formatted properly.
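
To make "type-unstable" concrete, here is a minimal sketch of checking inferred return types directly. The functions `clamp_unstable` and `clamp_stable` are hypothetical stand-ins, not code from Distributions, StatsFuns or Optim, and the snippet assumes a recent Julia version (0.7 or later) rather than the 0.5.0 used in this thread.

```julia
# Hypothetical functions for illustration only.
# The literal 0 is an Int, so this method can return either a Float64 or an Int.
clamp_unstable(x::Float64) = x > 0 ? x : 0

# Using 0.0 keeps every branch a Float64.
clamp_stable(x::Float64) = x > 0 ? x : 0.0

# Base.return_types returns one inferred return type per matching method.
unstable_rt = Base.return_types(clamp_unstable, (Float64,))[1]
stable_rt   = Base.return_types(clamp_stable, (Float64,))[1]

println("clamp_unstable infers: ", unstable_rt)  # a Union of Float64 and Int
println("clamp_stable infers:   ", stable_rt)    # Float64
```

When the inferred type is not a single concrete type, the caller may have to box the result and dispatch at run time, which is what the jl_apply_generic calls in the second IR listing are doing.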


#3

This is likely a bug that has already been fixed: https://github.com/JuliaLang/julia/issues/18465 You should also check @code_warntype first and quote your code.


#4

Here are the results of @code_warntype:

Without Optim:

Variables:
  #self#::Base.#quantile
  d::Distributions.Normal{Float64}
  q::Float64

Body:
  begin 
      # meta: location c:\program files\resolver_julia\packages\v0.5\StatsFuns\src\distrs\norm.jl norminvcdf 35
      # meta: location c:\program files\resolver_julia\packages\v0.5\StatsFuns\src\distrs\norm.jl norminvcdf 34
      SSAValue(0) = $(Expr(:invoke, LambdaInfo for erfcinv(::Float64), :(StatsFuns.erfcinv), :((Base.box)(Base.Float64,(Base.mul_float)((Base.box)(Float64,(Base.sitofp)(Float64,2)),q)))))
      # meta: pop location
      # meta: pop location
      return (Base.box)(Base.Float64,(Base.add_float)((Core.getfield)(d::Distributions.Normal{Float64},:μ)::Float64,(Base.box)(Base.Float64,(Base.mul_float)((Core.getfield)(d::Distributions.Normal{Float64},:σ)::Float64,(Base.box)(Base.Float64,(Base.mul_float)((Base.box)(Base.Float64,(Base.neg_float)(SSAValue(0))),$(QuoteNode(1.4142135623730951))))))))
  end::Float64

With Optim:

Variables:
  #self#::Base.#quantile
  d::Distributions.Normal{Float64}
  q::Float64

Body:
  begin 
      return (Distributions.convert)(Distributions.Real,(StatsFuns.xval)((Core.getfield)(d::Distributions.Normal{Float64},:μ)::Float64,(Core.getfield)(d::Distributions.Normal{Float64},:σ)::Float64,(-($(Expr(:invoke, LambdaInfo for erfcinv(::Float64), :(StatsFuns.erfcinv), :((Base.box)(Base.Float64,(Base.mul_float)((Base.box)(Float64,(Base.sitofp)(Float64,2)),q))))))::ANY * StatsFuns.sqrt2)::ANY)::ANY)::ANY
  end::ANY

Thanks for your help


#5

Just for future reference: as you can see with @code_warntype, Julia inferred the second version to return ANY instead of Float64, which indicates a problem.
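
If you want a check like this to fail loudly instead of reading the @code_warntype output by eye, one option is the `@inferred` macro from the Test standard library. This is only a sketch under assumptions: `halve` and `pick` are hypothetical functions, not part of any package mentioned here, and the snippet targets recent Julia versions (on 0.5 the macro lived in Base.Test).

```julia
using Test  # on Julia 0.5 this was Base.Test

# Hypothetical functions for illustration only.
halve(x::Float64) = x / 2            # always returns Float64
pick(x::Float64)  = x > 0 ? x : 0    # returns Float64 or Int

# @inferred returns the call's value when the actual return type
# matches the inferred one.
halved = @inferred halve(1.0)

# It throws when they differ, so a regression shows up as an error.
pick_failed = try
    @inferred pick(1.0)
    false
catch
    true
end

println("halve(1.0) = ", halved)
println("pick failed the inference check: ", pick_failed)
```

The same macro could be wrapped around the quantile call from the first post to confirm whether a given Julia build still infers Float64 after importing Optim.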

Could you try to build Julia from the release-0.5 branch?


#6

This is known


#7

I am not set up to do a build, unfortunately. But it looks like this has been or will be resolved in the next release. Thanks again.