How to get Julia SSA-form IR (with Phi nodes)?

Given a function call, how can I get the corresponding SSA-form IR described here?

Alternatively, what is the easiest way - a function, package or algorithm - to convert the normal lowered representation to SSA form? I'm specifically interested in phi nodes.


For context, I'm trying to detect loops and conditions in lowered code. I have a prototype for loop detection using heuristics over goto statements, but the complexity grows quickly, so control-flow analysis using well-known algorithms looks like a more robust approach.

4 Likes

Funnily enough I’m working on a similar thing. Adding a LoopInfo pass.
code_typed gives you the SSA IR; also check out https://github.com/JuliaLang/julia/pull/36832, which is what I'm currently rebasing.
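To make this concrete, here is a minimal sketch (the function `f` is just an illustration, not from the thread): a branch in the source produces a `Core.PhiNode` in the typed SSA IR that `code_typed` returns.

```julia
# A tiny function with a branch, so its typed SSA IR merges two values.
f(x) = x > 0 ? x + 1 : x - 1

# code_typed returns pairs of (CodeInfo, return type); the CodeInfo is in SSA form.
ci, rettype = only(code_typed(f, (Int,)))

# Phi nodes show up as Core.PhiNode statements at the merge point of the branches.
println(any(stmt -> stmt isa Core.PhiNode, ci.code))
```

Printing the statements of `ci` directly also shows the `φ` nodes in the familiar textual form.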

2 Likes

Well, I think SSA-form IR is produced by Julia's type inferencer, so you can look through the inferencer's code to get what you want. I believe it's located in the compiler/ssair folder, where you can find the top-level entry point of the conversion algorithm. But this only produces untyped SSA; you still need the subsequent abstract interpretation and optimization to get the final typed IR. If you need typed IR, you'd better invoke code_typed directly.

It looks really interesting! Do you plan to publish it as a separate package or merge it into Julia itself? I think we discussed loop detection in a compiler pass here and there, so it could be pretty valuable work for the community.

Regarding @code_typed: indeed, I see the phi nodes, but unfortunately it's terribly slow compared to @code_lowered and performs many more transformations than I'd like to see in the IR.

The untyped IR is exactly what I need. I will go through the code you linked, thanks!


I also found out that IRTools - now unmaintained, but still a powerful package - re-implements many of the relevant transformations in a more readable way by introducing an abstraction layer on top of the compiler primitives. I will now try to combine all these sources of knowledge to better understand how to handle loops and conditions.

1 Like

I hope to make it a compiler pass for Julia. You could just stop the compiler pipeline in the middle, similar to how it's done here: Inference · The Julia Language. Revise is nice for this because you can just comment out compiler passes.

2 Likes

FYI, I wrapped some tools during the development of YaoCompiler so that you don't have to touch a lot of internals. Personally, I think IRCode is much, much easier to work with than typed code if you want to write code transformations in Julia: CompilerPluginTools.jl/ircode.jl at master · JuliaCompilerPlugins/CompilerPluginTools.jl · GitHub

2 Likes

I should have mentioned that I'm not writing a compiler pass :slight_smile: I found out that surprisingly many things can be done in "user space" by manually traversing CodeInfo (example), while debugging is way simpler. Even more surprisingly, the performance of this approach is on par with the normal compilation-execution pipeline, at least with the current generation of the Julia compiler. But of course that's just my current use case, so I'm really thankful for all the answers!
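For readers curious what such a traversal looks like, here is a minimal sketch (not the poster's actual tracer): walk a lowered CodeInfo and pick out the control-flow statements a hand-rolled interpreter would follow.

```julia
# A small function with a loop, lowered to CodeInfo.
function sumto(n)
    s = 0
    for i in 1:n
        s += i
    end
    return s
end

ci = only(code_lowered(sumto, (Int,)))

# Lowered code has no phi nodes, only gotos; an interpreter walks ci.code
# sequentially and jumps on GotoNode / GotoIfNot statements.
for (i, stmt) in enumerate(ci.code)
    if stmt isa Core.GotoNode
        println("$i: goto $(stmt.label)")
    elseif stmt isa Core.GotoIfNot
        println("$i: goto $(stmt.dest) unless $(stmt.cond)")
    end
end
```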

@Roger-luo Regarding IRCode, I played around with it a bit and can confirm it has many more useful details such as phi nodes, the CFG, etc. But code_ircode() also optimizes away many details that I want to control. For instance, in this simple case:

foo(x) = 2x + 1
bar(x) = foo(x ^ 2)
baz(x, y) = bar(x) * foo(y)

code_lowered(baz, (Float64, Int)) returns:

1-element Vector{CodeInfo}:
 CodeInfo(
1 ─ %1 = Main.bar(x)
│   %2 = Main.foo(y)
│   %3 = %1 * %2
└──      return %3
)

while code_ircode(baz, (Float64, Int)) produces:

1-element Vector{Pair{IRCode, DataType}}:
1 1 ─ %1 = Base.mul_float(_2, _2)::Float64                    │╻╷╷   bar
  │   %2 = Base.mul_float(2.0, %1)::Float64                   ││╻╷    foo
  │   %3 = Base.add_float(%2, 1.0)::Float64                   │││╻     +
  │   %4 = Base.mul_int(2, _3)::Int64                         ││╻     *
  │   %5 = Base.add_int(%4, 1)::Int64                         ││╻     +
  │   %6 = Base.sitofp(Float64, %5)::Float64                  ││╻╷╷╷  promote
  │   %7 = Base.mul_float(%3, %6)::Float64                    ││╻     *
  └──      return %7                                          │     
   => Float64

At the very least, bar() and foo() have been inlined, preventing me from treating them in some special way, e.g. replacing them with a different call. Assuming I don't want to dive into custom interpreters yet, is there a way to turn off optimization as we can do in code_typed()?

1 Like

Yes, I think you can replace the optimization pass with an empty one.

The default Julia pass does the inlining etc. CompilerPluginTools.jl/passes.jl at master · JuliaCompilerPlugins/CompilerPluginTools.jl · GitHub

but you can also just replace it with an empty function CompilerPluginTools.jl/passes.jl at master · JuliaCompilerPlugins/CompilerPluginTools.jl · GitHub

the code_ircode function just uses the default Julia pass by default: CompilerPluginTools.jl/ircode.jl at 0dce551c0c1786776f6df1923d8d9da666d5a90d · JuliaCompilerPlugins/CompilerPluginTools.jl · GitHub

you can replace it with no_pass for sure; this is quite useful when writing IR transforms, since you need to check whether the transform happens as expected

PS. I haven’t checked if things are broken on 1.8 and 1.9 btw, so just in case if you hit anything on 1.8 & 1.9 there might be things broken due to API changes

2 Likes

It looks like you are implementing some kind of abstract interpretation though? This code looks very similar to the IR interpreter code to me…

Yes! The difference, though, is that I never delegate execution to the Julia compiler, but instead keep full control. The task is to trace a function call's execution, recording so-called primitives onto a "tape" and recursing into non-primitives, and this is the n-th incarnation of the tracer. I used to do it by adding special instructions to the IR using Cassette, IRTools and, more recently, Mixtape. Yet it has always felt like playing volleyball on a carousel: you throw the ball in one place and hope to catch it in another, without really controlling it while it's in the air. In the current approach, I just extract the CodeInfo and iterate over it, invoking function calls and following gotos. And this simple approach works surprisingly well!

Thanks for the detailed answers and links! I think my first approach will be to look into IRTools once again, since I'm already familiar with its code base and it seems to implement most of the things I need (not just phi nodes). If that doesn't work, I will dive into IRCode using CompilerPluginTools, which is pretty readable too!

2 Likes

As nice as it would be to leave IRTools behind altogether, I've recently been looking into some optimizations and testing for it. This is mostly out of necessity, since IRTools constitutes a large chunk of Zygote's TTFG overhead, and hedging against us being unable to move off Zygote in the short-to-medium term is probably wise…

Yeah, I think it's gonna be harder and harder to keep IRTools updated with Julia internals; we should just move on to IRCode, which already has a lot of tooling around it inside Julia internals. I believe the design is just not stable yet, and the lack of docstrings is preventing people from using it, which is why I tried to create some higher-level wrappers around it in CompilerPluginTools.

But how stable is IRCode itself? I mean, CodeInfo is at least exposed via the official @code_lowered; I thought IRCode was a very, very deep internal thing that could be changed significantly or dropped altogether at any time, no?

1 Like

I think the plan is to make IRCode stable eventually and build a toolchain around it; at least, that is what was discussed last time about compiler plugins as far as I know, since the core developers have now decided to support this officially. I'm not sure when that will happen, though, since I haven't been working on this for a while. @aviatesk is the best person to ask for sure.

3 Likes

In case this is helpful, I implemented some loop-detection utilities on top of IRTools while working on Genify.jl. The relevant code is here: https://github.com/probcomp/Genify.jl/blob/main/src/utils.jl

Note that this only detects natural loops, and works under the expectation that people aren’t using @goto statements in their code.

Would be cool if someone does something similar for IRCode as well!
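As a starting point for the typed-IR side, here is a hedged sketch using `Core.Compiler.compute_basic_blocks` (a compiler internal, so it may need adjusting across Julia versions). For the reducible control flow that ordinary loops produce, an edge jumping to an earlier-or-equal block index is a back edge closing a natural loop; a robust implementation would additionally confirm this with a dominance check.

```julia
# Hedged sketch: find back edges in the CFG built from a typed CodeInfo.
# compute_basic_blocks is an internal API; a real loop detector would also
# verify that the edge target dominates the source.
function back_edges(ci::Core.CodeInfo)
    cfg = Core.Compiler.compute_basic_blocks(ci.code)
    edges = Tuple{Int,Int}[]
    for (i, bb) in enumerate(cfg.blocks)
        for succ in bb.succs
            # In structured (reducible) code, a jump to an earlier or
            # equal block index closes a loop.
            succ <= i && push!(edges, (i, succ))
        end
    end
    return edges
end

function sumto(n)
    s = 0
    for i in 1:n
        s += i
    end
    return s
end

ci, _ = only(code_typed(sumto, (Int,)))
println(back_edges(ci))  # one entry per loop in the optimized IR
```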

2 Likes

There’s also the built-in https://github.com/FluxML/IRTools.jl/blob/master/src/passes/relooper.jl, but I don’t know who actually uses it in the wild…

The timelines for this really are the biggest thing. If a stable, 1.6-compatible IRCode were available today, we could jettison IRTools completely from libraries like Zygote. As-is I would imagine that is unlikely to materialize :frowning:

1 Like

I just realized you might be asking for custom intrinsic support, rather than inlining, in the context of AD. I think it's better to let DCE etc. run first before doing AD, because we get less and simpler code to optimize and transform for AD. Currently, the way to do custom intrinsics is not very stable, but you can use a custom method table to achieve it (check what GPUCompiler does for GPUs via an abstract interpreter). Basically, I think methods with handwritten AD rules (primals) can be treated as intrinsics in the transformation passes: you extract a method table from CR, implement your own abstract interpreter to produce an IRCode that treats these methods as intrinsics, and then synthesize a gradient IRCode from it.
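A minimal sketch of the custom-method-table piece of this (the names `AD_TABLE` and `myprimal` are ours, not from the thread): `Base.Experimental.@overlay` records an alternative definition in a separate table that only a custom abstract interpreter would consult, which is one way to mark primals as opaque intrinsics without touching normal dispatch.

```julia
# Hypothetical overlay table for an AD transform; a custom abstract
# interpreter would resolve calls against this table first.
Base.Experimental.@MethodTable AD_TABLE

myprimal(x) = sin(x) * x      # imagine this primal has a handwritten gradient

# In the overlay table the primal is shadowed by a marker definition;
# only an interpreter using AD_TABLE sees it.
Base.Experimental.@overlay AD_TABLE myprimal(x::Float64) = error("AD intrinsic")

# Ordinary dispatch is unaffected by the overlay:
println(myprimal(0.0))  # prints 0.0
```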

I like your idea on linearized IR btw, I think this might lead to better-optimized and more stable AD in Julia.

Thanks for the tip! I think the AD case is actually simpler than that: treating a program as a single computation graph opens the door to many optimizations that are simply not possible in a general-purpose compiler. But of course there are other use cases for tracing beyond AD, and I'm enjoying watching the progress of the compiler plugins infrastructure.

Then synthesize a gradient IRCode from it.

Is it actually possible to compile IRCode back to a normal function? I think there was a discussion about this on Slack recently, and the resolution was that conversion to IRCode is a one-way ticket. Or did I just not read it to the end?

1 Like

It is one-way in the sense that you can't transform IRCode back to lowered CodeInfo, but you can transform IRCode to typed CodeInfo and then go through Julia codegen, which will give you the executable.
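One way to see an IRCode become executable again on recent Julia (roughly 1.10+; this leans on unstable internals and may break across versions) is to wrap it in an `OpaqueClosure`:

```julia
# Hedged sketch, relying on unstable internals: wrap a (possibly
# transformed) IRCode in an OpaqueClosure to execute it directly.
ir, rettype = only(Base.code_ircode(+, (Int, Int)))
oc = Core.OpaqueClosure(ir)
println(oc(1, 2))
```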

I didn’t mean a general-purpose compiler, it’s just about reducing the general SSA to a single basic block so you get the simple computational graph. And it’s gonna be simpler if you get the representation after compiler optimization.

1 Like