# Trivial code change causes vectorization failure

I am porting a few approximate math functions that I’ve used before and that vectorized well, but noticed that the direct translation to Julia fails to vectorize. However, a trivial change (inlining `s = f * f` at both of its use sites) makes Julia produce nicely vectorized LLVM IR.

Is Julia’s optimizer just that fragile? Or is there a reason why `fastlog1` is more difficult to optimize? As a comparison, Rust is able to optimize both versions. I’ve tried using explicit `muladd()`s, but the results are the same.

``````
@fastmath function fastlog1(x::Float32)::Float32
    xi = reinterpret(Int32, x)
    e = (xi - Int32(1059760811)) & Int32(-8388608)
    m = reinterpret(Float32, xi - e)
    i = e * 1.19209290f-7
    f = m - 1f0
    s = f * f
    r = 0.230836749f0 * f + -0.279208571f0
    t = 0.331826031f0 * f + -0.498910338f0
    r = r * s + t
    r = r * s + f
    i * 0.693147182f0 + r
end

@fastmath function fastlog2(x::Float32)::Float32
    xi = reinterpret(Int32, x)
    e = (xi - Int32(1059760811)) & Int32(-8388608)
    m = reinterpret(Float32, xi - e)
    i = e * 1.19209290f-7
    f = m - 1f0
    #s = f * f
    r = 0.230836749f0 * f + -0.279208571f0
    t = 0.331826031f0 * f + -0.498910338f0
    r = r * (f*f) + t  # replaced s here
    r = r * (f*f) + f  # and here
    i * 0.693147182f0 + r
end

@fastmath function test(f)
    s = 0.0
    for i in Int32(1):Int32(1_000_000_000)
        s += f(Float32(i))
    end
    s
end
``````
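The `muladd()` variant mentioned in the question is not shown in the post. As a rough sketch only, it might look something like the following; the name `fastlog_muladd` and the exact placement of each `muladd` call are assumptions, not the original author's code.

``````julia
# Hypothetical muladd-based variant of fastlog1 (name and form are assumptions).
@fastmath function fastlog_muladd(x::Float32)::Float32
    xi = reinterpret(Int32, x)
    e = (xi - Int32(1059760811)) & Int32(-8388608)   # exponent bits, rounded around 2/3
    m = reinterpret(Float32, xi - e)                 # mantissa scaled into roughly [2/3, 4/3)
    i = e * 1.19209290f-7                            # exponent as a float (e * 2^-23)
    f = m - 1f0
    s = f * f
    # Same polynomial as fastlog1, with each multiply-add written explicitly.
    r = muladd(0.230836749f0, f, -0.279208571f0)
    t = muladd(0.331826031f0, f, -0.498910338f0)
    r = muladd(r, s, t)
    r = muladd(r, s, f)
    muladd(i, 0.693147182f0, r)
end
``````

`muladd` lets the compiler choose between a fused multiply-add and a separate multiply and add, so it only changes how the arithmetic may be lowered, not the shape of the expression.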
``````
julia> @btime test(fastlog1)
2.366 s (0 allocations: 0 bytes)
1.9723269760895107e10
julia> @btime test(fastlog2)
524.180 ms (0 allocations: 0 bytes)
1.9723269761215004e10
``````
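Timings alone only hint at vectorization. One way to check it more directly is to capture the optimized LLVM IR and look for SIMD vector types. The sketch below is illustrative: `uses_fp_simd` and `sumsq` are hypothetical names, and `sumsq` is only a toy stand-in so that the block runs on its own.

``````julia
using InteractiveUtils  # provides code_llvm

# Hypothetical helper: capture the optimized LLVM IR for `f` called with
# argument types `types` and report whether it mentions a floating-point
# vector type such as `<8 x float>` or `<4 x double>`.
function uses_fp_simd(f, types)
    io = IOBuffer()
    code_llvm(io, f, types; debuginfo=:none)
    ir = String(take!(io))
    return occursin(r"<\d+ x (float|double)>", ir)
end

# Toy stand-in for the `test` loop in the post.
function sumsq(v::Vector{Float32})
    s = 0f0
    @inbounds @simd for i in eachindex(v)
        s += v[i] * v[i]
    end
    return s
end

# Whether this prints `true` depends on the CPU and the Julia/LLVM version.
println(uses_fp_simd(sumsq, (Vector{Float32},)))
``````

With the definitions from the post loaded, the same helper could be called as `uses_fp_simd(test, (typeof(fastlog1),))` and `uses_fp_simd(test, (typeof(fastlog2),))` to compare the two versions.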

Good question. If nothing else, it seems like dead code elimination may be run too early. Edit: I misunderstood the difference, thanks.

Absent any more comments, I’d file an issue about this.

Filed: