How to interpret lowered code?

question

#1

When trying to understand what happens under Julia’s hood, lowered code is a first step (the “entrance to the rabbit hole”, as @mbauman put it). But while CodeInfo objects print relatively nicely in the REPL, there seem to be some subtleties in the notation.

For example:

julia> Meta.@lower for i in 1:N
           println(i)
           i%2 == 0 && continue
       end
:($(Expr(:thunk, CodeInfo(
  1 ─ %1  = 1:N                             │
  │         #s2 = (Base.iterate)(%1)        │
  │   %3  = #s2 === nothing                 │
  │   %4  = (Base.not_int)(%3)              │
  └──       goto #7 if not %4               │
  2 ┄ %6  = #s2                             │
  │         i = (Core.getfield)(%6, 1)      │
  │   %8  = (Core.getfield)(%6, 2)          │
2 │         println(i)                      │
3 │   %10 = i % 2                           │
  │   %11 = %10 == 0                        │
  └──       goto #5 if not %11              │
  3 ─       goto #5                         │
  4 ─       goto #5                         │
  5 ┄       #s2 = (Base.iterate)(%1, %8)    │
  │   %16 = #s2 === nothing                 │
  │   %17 = (Base.not_int)(%16)             │
  └──       goto #7 if not %17              │
  6 ─       goto #2                         │
  7 ─       return                          │
))))

Looking at this output raises a lot of questions:

  • What is the difference between assignments such as %1 = 1:N (which are indented in a specific way), and #s2 = (Base.iterate)(%1), which are indented like the rest of the code?

  • What are the delimited blocks (numbered 1 to 7 in this example)? They look more or less like basic blocks (with jumps appearing only at the end of blocks), but I would think that it is a bit early to define basic blocks at this stage, so there might be something more to understand here…

  • Some of the “blocks” are introduced by a solid line (─, like block #1 above) while others are introduced by a dashed line (┄, like block #2 above). What does this mean?

  • What are the numbers on the far left? Line numbers?

  • And what about the vertical bar on the far right? What does it show?

I’ve found some information about lowered forms in the expansion and lowering paragraph of the documentation, but not much. Are there other places where I might find this kind of detail? (To be clear: I’m not saying the general documentation is lacking and should be more precise in this area. These details are probably not very interesting to many documentation readers; I’m just curious to learn more about them.)


#2

The IR data structure is in SSA form (https://en.wikipedia.org/wiki/Static_single_assignment_form). Each line corresponds to one SSA expression, but the %N = prefix is only printed for values that get used later on. So e.g. internally there’s no difference between %1 = 1:N and %9 = println(i), but %9 never gets used, so the assignment doesn’t get printed. The other assignments, such as #s2 = (Base.iterate)(%1), are assignments to slots (you can think of them as memory locations in the traditional formulation). They get eliminated by SSA conversion early in the optimizer but are present in lowered code.
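To make the SSA/slot distinction concrete, here is a minimal sketch that walks the statements of a lowered loop and reports which ones assign to a slot. It assumes a reasonably recent Julia 1.x; the field names used (code, slotnames, id) are compiler internals rather than a stable API, so they may differ between versions.

```julia
# Lower a small loop; at top level the result is an Expr(:thunk, ...) wrapping a CodeInfo.
ex = Meta.lower(Main, :(for i in 1:3
    println(i)
end))
ci = ex.args[1]::Core.CodeInfo

# Names of the slots that get assigned somewhere in the lowered code.
assigned = Symbol[]

# Statement k implicitly defines the SSA value %k.
# Slot assignments show up as explicit Expr(:(=), SlotNumber, rhs) statements.
for (k, stmt) in enumerate(ci.code)
    if stmt isa Expr && stmt.head === :(=) && stmt.args[1] isa Core.SlotNumber
        name = ci.slotnames[stmt.args[1].id]
        push!(assigned, name)
        println("statement ", k, ": assignment to slot ", name)
    else
        println("statement ", k, ": value available as %", k)
    end
end
```

The loop variable i and the hidden iteration state should both show up as slots, while everything else is only reachable through its statement number.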

Yes, they’re basic blocks. Note though that basic block numbering is computed by the printing code. The internal representation uses statement numbers.

The blocks introduced by a dashed line have multiple predecessors.
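As a rough illustration of the last two points, the sketch below recovers block boundaries and incoming-edge counts from the statement-level jumps. It is a simplified reconstruction, not the printing code itself: it assumes Julia 1.6 or later (where jumps are Core.GotoNode, Core.GotoIfNot and Core.ReturnNode objects) and ignores exception-handling constructs.

```julia
ex = Meta.lower(Main, :(for i in 1:3
    println(i)
end))
ci = ex.args[1]::Core.CodeInfo
code = ci.code
n = length(code)

# Jump targets of a statement, expressed as statement indices (not block numbers).
targets(stmt) = stmt isa Core.GotoNode ? [stmt.label] :
                stmt isa Core.GotoIfNot ? [stmt.dest] : Int[]
isterminator(stmt) = stmt isa Union{Core.GotoNode, Core.GotoIfNot, Core.ReturnNode}

# A block starts at statement 1, at every jump target, and right after every terminator.
starts = Set([1])
for (k, stmt) in enumerate(code)
    union!(starts, targets(stmt))
    isterminator(stmt) && k < n && push!(starts, k + 1)
end
starts = sort!(collect(starts))

# Count incoming edges per block start: explicit jumps plus fall-through.
npreds = Dict(s => 0 for s in starts)
for (k, stmt) in enumerate(code)
    for t in targets(stmt)
        npreds[t] += 1
    end
    falls_through = !(stmt isa Union{Core.GotoNode, Core.ReturnNode})
    if falls_through && k < n && (k + 1) in starts
        npreds[k + 1] += 1
    end
end

for (b, s) in enumerate(starts)
    println("block #", b, " starts at statement ", s, " with ", npreds[s], " incoming edge(s)")
end
```

Blocks reported with more than one incoming edge (the loop body and the exit block here) are the ones that would be drawn with a dashed line.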

Yes, the numbers on the far left are line numbers.

The vertical bar on the far right is a compact representation of inlining information. Since inlining hasn’t run at this stage, there’s not much there (see the comment at https://github.com/JuliaLang/julia/blob/master/base/compiler/ssair/show.jl#L171 for an overview of the representation).


#3

Thanks a lot!