Union splitting not working in Julia 1.10?

I’m getting the following results under 1.10:

julia> using BenchmarkTools

julia> x = rand(10_000);

julia> function badsum(x)
           s = 0
           for t in x
               s += t
           end
           return s
       end
badsum (generic function with 1 method)

julia> function goodsum(x)
           s = zero(eltype(x))
           for t in x
               s += t
           end
           return s
       end
goodsum (generic function with 1 method)

julia> @btime goodsum($x)
  8.667 μs (0 allocations: 0 bytes)
5011.233699365655

julia> @btime badsum($x)
  17.400 μs (0 allocations: 0 bytes)
5011.233699365655

julia> versioninfo()
Julia Version 1.10.0
Commit 3120989f39 (2023-12-25 18:01 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
  Threads: 1 on 8 virtual cores
Environment:
  JULIA_EDITOR = runemacs.exe

badsum takes twice as long, even though in Julia 1.9.4 on the same machine the two functions are equally fast (8.7 μs), as expected due to union splitting.
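A minimal sketch of where the union comes from, for readers who have not seen the term before. It assumes a 64-bit machine, where `0` is an `Int64`; `Base.return_types` is an internal reflection helper, used here only for illustration. In `badsum` the accumulator starts as an integer and becomes a `Float64` after the first addition, so inference should report a small `Union`, which the compiler can normally split into two fast branches:

```julia
# Same definitions as above, repeated so the snippet is self-contained.
function badsum(x)
    s = 0                 # Int accumulator: type changes inside the loop
    for t in x
        s += t
    end
    return s
end

function goodsum(x)
    s = zero(eltype(x))   # accumulator already has the element type
    for t in x
        s += t
    end
    return s
end

# Ask inference for the return type given a Vector{Float64} argument.
bad_type  = only(Base.return_types(badsum,  (Vector{Float64},)))
good_type = only(Base.return_types(goodsum, (Vector{Float64},)))

println("badsum  returns: ", bad_type)
println("goodsum returns: ", good_type)
```

`@code_warntype badsum(x)` at the REPL shows the same information per variable, if a more detailed view is needed.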

This seems to be fixed again on master, so it is worth looking into what fixed it.

I took this opportunity to learn how git bisect works :slight_smile: It points me to this commit:

I don’t think this can be backported easily as it sounds like a rather complex commit :sweat_smile: But I really have no idea. How should we proceed with this information?

Maybe we can see what broke it and fix it that way? I’ll bisect again and search for the commit that introduced the regression.

Test script
#!/bin/bash
#=
make -s clean >> ~/buildlogs_julia.txt
make -s -C deps uninstall >> ~/buildlogs_julia.txt
if make -s -j 8 >> ~/buildlogs_julia.txt ; then
        echo "build success"
else
        echo "build failure"
        exit 125
fi
exec ~/julia/bisect/julia/julia --startup=no $0
=#
using BenchmarkTools, Statistics  # `mean` is defined in Statistics
x = rand(10_000);
function badsum(x)
        s = 0
        for t in x
                s += t
        end
        return s
end

function goodsum(x)
        s = zero(eltype(x))
        for t in x
                s += t
        end
        return s
end

bbad = @benchmark badsum($x)
bgood = @benchmark goodsum($x)
badmean = mean(bbad.times)
goodmean = mean(bgood.times)
@info "" goodmean badmean
# we search for the commit that fixes the performance
# so in git bisect lingo:
# performance difference -> normal -> good
# no performance difference -> "bug" -> bad
if 1.25*goodmean < badmean
        @info "GOOD COMMIT (performance difference)"
        exit(0) # good commit
else
        @info "BAD COMMIT (no performance difference)"
        exit(1) # bad commit
end
git bisect command
$ git bisect start HEAD v1.10.0
$ git bisect run ../script.jl
...
8e4221f676cc0cc242b93389b2f65084931e581c is the first bad commit
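
As a rough sketch of how the script talks to `git bisect run`: git treats exit code 0 as "good", 125 as "skip this commit" (used above for build failures), and any other code from 1 to 127 as "bad". Because this bisection searches for the fixing commit, the labels are inverted relative to the usual meaning. The helper name `bisect_exit_code` and its `ratio` keyword are hypothetical and only restate the comparison from the script:

```julia
# Hypothetical helper: map two mean timings to a `git bisect run` exit code.
# 0 => "good" (badsum is still noticeably slower, i.e. the regression is present)
# 1 => "bad"  (no noticeable difference, i.e. the fix is present)
function bisect_exit_code(goodmean, badmean; ratio = 1.25)
    return ratio * goodmean < badmean ? 0 : 1
end

# Timings in microseconds taken from the first post (1.10.0 and 1.9.4).
println(bisect_exit_code(8.667, 17.4))  # regression present
println(bisect_exit_code(8.7, 8.7))     # equally fast
```

To search for the commit that introduced the regression instead, the two return values would be swapped.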

This appears to be the commit that fixes the issue. If you have the time, could you also check which was the first “bad” commit that introduced the regression?

An LLVM upgrade will never be backported.

Bisection points to

That makes sense, I guess, but it also looks like a complicated commit.

That’s what I thought. So we need a different fix for 1.10.

@PeterSimon: is there an issue for this? If not, would you please open one?

Issue created: Union splitting regression in 1.10 · Issue #52875 · JuliaLang/julia · GitHub
