How does the compiler know not to constant-fold methods like rand?

It seems like for the most part, the compiler tries very hard to compute constant expressions at compile-time, though I’ve seen it give up on loops. But there are situations where it sensibly doesn’t even try, like rand(). How does it know to always compute rand at runtime? Maybe an LLVM thing?

Here’s an example where I call rand() in a custom + method. The B(0)+B(1) part seems absent in the LLVM bitcode, so + is computed at compile-time, but the compiler knows to keep the rand() part for runtime.

struct A end
struct B x::Int end
B() = B(1) # just returns a constant

Base.:+(::A, ::A) = B(1)
Base.:+(::B, ::B) = rand()

f() = A() + A() # @code_llvm same as B()
g() = B(0) + B(1) # @code_llvm same as rand()

I’m not a compiler guy, so take what I say here with a grain of salt.

It’s my understanding that compilers keep track of the side effects associated with code. A side effect could be something like mutating an observable object like an array or a method table, or writing to stdout or a file. That is, something that can be observed from the outside. Code with side effects is often called “impure”.

The important thing about impure functions, and ONLY impure functions, you can tell the difference between it having been evaluated at compile time vs runtime. For pure functions, there is no way to tell semantically when it was evaluated.

That means the compiler can be certain that the program’s logic will not be changed by evaluating pure code at compile time. In contrast, if it were to evaluate impure code at compile time, the semantics (i.e. behaviour) of the code would change.

In your example, calling rand is impure, because it mutates the global RNG object. This mutation is visible from the outside, both because the RNG object can be directly observed, but also because mutating the global RNG object changes any subsequent calls to rand. This means the behaviour of code where rand was evaluated at compile time vs runtime is different, and so the compiler will not constant fold it.

In contrast, if we create a new RNG inside the function such that it is not observable from the outside, then the compiler can freely compute it at compile time. And indeed it does:

julia> function foo()
           rng = Random.Xoshiro(
           return rand(rng)
foo (generic function with 1 method)

julia> @code_llvm debuginfo=:none dump_module=false foo()
define double @julia_foo_316() #0 {
  ret double 0x3FB2C8232D1285F0

Note that there is no guarantee that the compiler will compute all side pure functions at compile time. The reason is that the compiler is not all-knowing - it only constant folds code if it knows it’s pure, and its knowledge is limited to whatever reasoning mechanisms are built into the compiler.

Edit: In Julia 1.8, the compiler’s understanding of side effects has gotten a major upgrade (link Take purity modeling seriously by Keno · Pull Request #43852 · JuliaLang/julia · GitHub), and there is now a dedicated compiler-internal system to talk about which assumptions the compiler can make about the code. On the master branch, you can read more about this by looking at the docstring for Base.@assume_effects.

Here is a snippet of the very long docstring, that explains the macro setting Base.@assume_effects :foldable. It explains that Julia will constant fold expressions if they are:

  • Consistent, meaning that for identical inputs, it will always return identical outputs
  • Side-effect free, and
  • Will either return or throw an exception in a reasonable amount of time, i.e. no infinite looping.

  This setting is a convenient shortcut for the set of effects that the compiler requires to be
  guaranteed to constant fold a call at compile time. It is currently equivalent to the following

    •  :consistent

    •  :effect_free

    •  :terminates_globally

  │ Note
  │  This list in particular does not include :nothrow. The compiler will still attempt
  │  constant propagation and note any thrown error at compile time. Note however, that by
  │  the :consistent-cy requirements, any such annotated call must consistently throw given
  │  the same argument values.

The compiler will not perform any optimization that it cannot prove is legal, so the question is not how does it know how to not constant-fold, but rather how does it know how to constant-fold at all. As far as I know, julia’s compiler used to only rely on inlining and constant propagation (ie, on your example, inlining moves everything to a single block of code with explicit constants, which is then optimized), but is starting to model the purity of functions starting in 1.8, see the last item in julia/ at master · JuliaLang/julia · GitHub