Does Julia do common subexpression elimination?

This recent GitHub issue reads like Julia doesn’t do common subexpression elimination at all — is this a correct assumption? Here are the relevant bits from the issue:

using BenchmarkTools

f(x) = exp(x)/(1+exp(x))

function g(x)
  e = exp(x)
  e/(1 + e)
end

h!(f::F, v) where {F} = map!(f, v, v)

@btime h!(f, v) setup=(v = rand(1 << 10);)
@btime h!(g, v) setup=(v = rand(1 << 10);)
  9.648 μs (0 allocations: 0 bytes)
  5.009 μs (0 allocations: 0 bytes)

julia> Base.infer_effects(exp, (Float64,))

julia> @code_lowered f(0.3)
1 ─ %1 = Main.exp(x)
│   %2 = Main.exp(x)
│   %3 = 1 + %2
│   %4 = %1 / %3
└──      return %4

julia> @code_llvm f(0.3)
; Function Signature: f(Float64)
;  @ /tmp/exp.jl:3 within `f`
define double @julia_f_3383(double %"x::Float64") #0 {
  %0 = call double @j_exp_3387(double %"x::Float64")
  %1 = call double @j_exp_3387(double %"x::Float64")
; ┌ @ promotion.jl:422 within `+` @ float.jl:404
   %2 = fadd double %1, 1.000000e+00
; └
; ┌ @ float.jl:407 within `/`
   %3 = fdiv double %0, %2
; └
  ret double %3
}

Yes, as you have very clearly demonstrated, and as is somewhat clearly stated in the issue you linked, Julia does not typically do common subexpression elimination. There are some cases where it can do subexpression elimination, but it’s probably best to just assume it never happens.


The reason I was asking is that I really wasn’t expecting it not to happen at all. It’s such a basic optimization that I assumed it generally happens and that this example was somehow “pathological”.

It’s not that Julia never does CSE. But it’s not very good at it right now.

julia> code_llvm(x -> exp(abs(x)) / (1+exp(abs(x))), Tuple{Float64}; debuginfo=:none)
define double @"julia_#36_2009"(double %0) #0 {
  %1 = call double @llvm.fabs.f64(double %0)
  %2 = call double @j_exp_2011(double %1) #0
  %3 = call double @j_exp_2011(double %1) #0
  %4 = fadd double %3, 1.000000e+00
  %5 = fdiv double %2, %4
  ret double %5

Notice that the call to abs was CSE’d, but the call to exp was not. I believe that the compiler is not yet able to recognize what exp can and can’t do to the program state. Absent that understanding, it must call it twice to ensure any side effects occur. This might improve with ongoing work to the effects system, but really this is all a bit beyond me to fully understand (much less explain). In particular, abs and exp have the same effects so I can’t tell you why one was CSE’d and the other was not.

julia> Base.infer_effects(abs, Tuple{Float64})

julia> Base.infer_effects(exp, Tuple{Float64})
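One visible difference in the IR above, independent of the inferred effects: `abs(::Float64)` compiles directly to the LLVM intrinsic `@llvm.fabs.f64`, which LLVM already knows is side-effect free, whereas `exp` becomes an opaque call to a separately compiled Julia function (`@j_exp_...`) whose effects LLVM cannot see. A quick way to check this (a sketch, not from the thread):

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Inspect the LLVM IR for abs alone: the body is a single call to the
# @llvm.fabs.f64 intrinsic, which LLVM can freely CSE.
io = IOBuffer()
code_llvm(io, abs, Tuple{Float64})
ir = String(take!(io))
occursin("fabs", ir)  # true
```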

If we inspect the un-optimized LLVM:

julia> code_llvm(x -> exp(abs(x)) / (1+exp(abs(x))), Tuple{Float64}; debuginfo=:none, optimize=false)
define double @"julia_#58_2531"(double %0) #0 {
  %1 = call {}*** @julia.get_pgcstack()
  %2 = bitcast {}*** %1 to {}**
  %current_task = getelementptr inbounds {}*, {}** %2, i64 -13
  %3 = bitcast {}** %current_task to i64*
  %world_age = getelementptr inbounds i64, i64* %3, i64 14
  %4 = call double @llvm.fabs.f64(double %0)
  %5 = call double @j_exp_2533(double %4) #0
  %6 = call double @llvm.fabs.f64(double %0)
  %7 = call double @j_exp_2533(double %6) #0
  %8 = fadd double 1.000000e+00, %7
  %9 = fdiv double %5, %8
  ret double %9

we see that the second abs call is still there. My wild speculation is that CSE is handled by LLVM and that Julia’s effects analysis is not yet (as of v1.9) integrated with LLVM, so LLVM does not currently recognize the CSE opportunity with exp.

EDIT: I now see that the issue linked in the original post has a bit more discussion of this topic.

As for code_lowered, it is just a reorganized version of the source. For example, the following calls a function nonexistent that doesn’t even exist. How could it know to eliminate the second call without even knowing whether nonexistent is a real function? We could define nonexistent to mutate the global RNG, print, change processor flags, spawn tasks, or have all sorts of other side effects.

julia> code_lowered(x -> nonexistent(abs(x)) / (1+nonexistent(abs(x))), Tuple{Float64})
1-element Vector{Core.CodeInfo}:
1 ─ %1 = Main.abs(x)
│   %2 = Main.nonexistent(%1)
│   %3 = Main.abs(x)
│   %4 = Main.nonexistent(%3)
│   %5 = 1 + %4
│   %6 = %2 / %5
└──      return %6
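To make that concrete, here is a hypothetical example (not from the thread) where eliminating the second call would change observable behavior, because the function mutates global state:

```julia
# A function with an observable side effect: each call bumps a counter.
const ncalls = Ref(0)

function impure_abs(x)
    ncalls[] += 1   # side effect the compiler must preserve
    abs(x)
end

y = impure_abs(-2.0) / (1 + impure_abs(-2.0))
ncalls[]  # 2 — CSE'ing the second call would silently change this to 1
```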

Oh yeah I just included the code_lowered bits from the original because I did a lazy cut’n’paste :sweat_smile:

It’s interesting that exp doesn’t get CSE’d but abs does despite the inferred effects being the same. I wonder what the difference is?

But in any case the takeaway here is, as @Mason said, to assume CSE doesn’t happen as the cases where it does happen are fairly obscure.

Julia has two optimization stages. The first one is implemented in Julia and does things like type inference and some optimizations such as scalar replacement of aggregates (SROA). This is also the level at which effect analysis happens.

The second level is LLVM-based and does a lot more optimization, like CSE or LICM. The problem is that LLVM doesn’t know how to model the Julia exp function (because we haven’t told it…)

So LLVM doesn’t know that exp has no side effects and that it could CSE it. And the part of the pipeline that knows about the effects doesn’t do CSE yet, and thus funky things like this pop up.
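The split is easy to see by comparing the two stages directly (a sketch; the duplicate exp call survives both):

```julia
using InteractiveUtils  # provides @code_typed / @code_llvm outside the REPL

f(x) = exp(x) / (1 + exp(x))

@code_typed f(0.3)   # Julia-level IR after the Julia optimizer: exp appears twice
@code_llvm f(0.3)    # LLVM-optimized IR: both calls to @j_exp_... remain
```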

There is ongoing work to inform LLVM of the Julia effects, but that has turned out to be a rather tricky project.

The alternative would be to implement CSE at a higher level :slight_smile: That sounds like a fun project if someone is looking for one.
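In the meantime, the reliable workaround is the one from the top of the thread: do the CSE by hand by hoisting the repeated subexpression into a local binding (a sketch restating the original `f` and `g`):

```julia
logistic_naive(x) = exp(x) / (1 + exp(x))   # exp evaluated twice

function logistic(x)
    e = exp(x)   # evaluated once; no compiler analysis required
    e / (1 + e)
end

logistic(0.3) == logistic_naive(0.3)  # true — same result, one exp call
```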


2 posts were split to a new topic: Towards implementing CSE in the Julia compiler