I am porting a few approximate math functions that I've used before and that vectorized well, but I noticed that the direct translation to Julia fails to vectorize. However, a trivial change (substituting `f*f` for the temporary `s = f * f` at its two use sites) makes Julia produce nicely vectorized LLVM IR.

Is Julia's optimizer just that fragile, or is there a reason why `fastlog1` is harder to optimize? As a comparison, Rust is able to optimize both versions. I've also tried explicit `muladd()`s (sketched below, after the two functions), but the results are the same.
```julia
@fastmath function fastlog1(x::Float32)::Float32
    xi = reinterpret(Int32, x)
    # Range reduction: split x into m * 2^i with m in [2/3, 4/3).
    # 1059760811 is the bit pattern of 2f0/3f0; -8388608 (0xFF800000)
    # keeps only the sign and exponent bits.
    e = (xi - Int32(1059760811)) & Int32(-8388608)
    m = reinterpret(Float32, xi - e)
    i = e * 1.19209290f-7  # e / 2^23: the exponent as a Float32
    f = m - 1f0            # f in [-1/3, 1/3)
    s = f * f
    # Degree-5 polynomial approximation of log(1 + f)
    r = 0.230836749f0 * f + -0.279208571f0
    t = 0.331826031f0 * f + -0.498910338f0
    r = r * s + t
    r = r * s + f
    i * 0.693147182f0 + r  # i * log(2) + log(m)
end
```
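For context, the constants implement a standard range reduction: `1059760811` is the bit pattern of `Float32(2/3)`, so `m` lands in `[2/3, 4/3)`, the polynomial in `f = m - 1f0` approximates `log(1 + f)`, and the result is assembled as `i * log(2) + log(m)`. A quick REPL check of the magic numbers (my own verification, not part of the port):

```julia
julia> reinterpret(Int32, 2f0/3f0)  # the exponent bias
1059760811

julia> 1.19209290f-7 == Float32(2.0^-23)  # scales exponent bits to an integer-valued Float32
true

julia> 0.693147182f0 == Float32(log(2))  # ln(2) for the final scaling
true
```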
```julia
@fastmath function fastlog2(x::Float32)::Float32
    xi = reinterpret(Int32, x)
    e = (xi - Int32(1059760811)) & Int32(-8388608)
    m = reinterpret(Float32, xi - e)
    i = e * 1.19209290f-7
    f = m - 1f0
    #s = f * f
    r = 0.230836749f0 * f + -0.279208571f0
    t = 0.331826031f0 * f + -0.498910338f0
    r = r * (f*f) + t  # replaced s here
    r = r * (f*f) + f  # and here
    i * 0.693147182f0 + r
end
```
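For completeness, this is the kind of `muladd` rewrite I tried (the name `fastlog1_muladd` is mine; for me it generates the same scalar code as `fastlog1`):

```julia
@fastmath function fastlog1_muladd(x::Float32)::Float32
    xi = reinterpret(Int32, x)
    e = (xi - Int32(1059760811)) & Int32(-8388608)
    m = reinterpret(Float32, xi - e)
    i = e * 1.19209290f-7
    f = m - 1f0
    s = f * f
    # muladd(a, b, c) == a*b + c, free to contract to an fma
    r = muladd(0.230836749f0, f, -0.279208571f0)
    t = muladd(0.331826031f0, f, -0.498910338f0)
    r = muladd(r, s, t)
    r = muladd(r, s, f)
    muladd(i, 0.693147182f0, r)
end
```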
```julia
@fastmath function test(f)
    s = 0.0
    for i in Int32(1):Int32(1_000_000_000)
        s += f(Float32(i))
    end
    s
end
```
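The difference shows up directly in the generated IR: with `fastlog2` the loop body contains vector operations (e.g. `<8 x float>` on an AVX2 machine), while with `fastlog1` it stays scalar. To reproduce the comparison (`@code_llvm` is in InteractiveUtils, which the REPL loads by default; `@btime` below is from BenchmarkTools):

```julia
using InteractiveUtils  # only needed outside the REPL

# Dump the optimized LLVM IR of both loops and look for
# <N x float> vector types in the loop body.
@code_llvm debuginfo=:none test(fastlog1)
@code_llvm debuginfo=:none test(fastlog2)
```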
```julia
julia> @btime test(fastlog1)
  2.366 s (0 allocations: 0 bytes)
1.9723269760895107e10

julia> @btime test(fastlog2)
  524.180 ms (0 allocations: 0 bytes)
1.9723269761215004e10
```