Maintaining Julia's native GPU codegen vs. relying on MLIR

Apparently, Swift will be getting GPU kernel generation capability through MLIR.

As far as I understand it, Julia already relies on LLVM to generate CUDA code. What are the costs and benefits of maintaining an extra layer on top of that to generate GPU kernels, instead of relying on MLIR, which I assume would reduce the maintenance and code burden?

I’m particularly interested in hearing thoughts from @jekbradbury and @maleadt

I just found this, so it might be that LLVM will include this natively in the future: MLIR Is A New IR For Machine Learning That Might Become Part Of LLVM - Phoronix

I’m not sure how the MLIR -> GPU LLVM IR -> PTX conversion will look, so I’m speculating here, but I assume it will only work for relatively coarse operations as they exist in XLA HLO right now. That would probably yield great GPU performance, but it wouldn’t be usable as a target for compiling general-purpose Julia code; XLA.jl seems like a better fit there. Being able to compile general-purpose code with CUDAnative.jl directly to GPU LLVM IR is still important, as it enables a wider range of applications.
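
To make that concrete, here is a minimal sketch of the CUDAnative.jl path: an ordinary Julia function is JIT-compiled straight to GPU LLVM IR and then PTX, and reflection macros let you inspect that output. This assumes CuArrays.jl for device arrays; the kernel name and launch configuration are just illustrative.

```julia
# Minimal sketch: compile a plain Julia function to a GPU kernel with CUDAnative.jl.
using CUDAnative, CuArrays

function axpy!(y, a, x)
    # Compute this thread's global index.
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] += a * x[i]
    end
    return nothing
end

x = CuArray(rand(Float32, 1024))
y = CuArray(rand(Float32, 1024))

# Launch the kernel; CUDAnative specializes and compiles axpy! for these argument types.
@cuda threads=256 blocks=4 axpy!(y, 2f0, x)

# Inspect the GPU LLVM IR that was generated (use @device_code_ptx for the PTX instead).
@device_code_llvm @cuda threads=256 blocks=4 axpy!(y, 2f0, x)
```

The point is that the kernel body is generic Julia code, not a fixed set of coarse array operations, which is the flexibility an HLO-style pipeline would not obviously provide.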

Cool thanks, that makes sense and answers my question!

MLIR is an exciting but admittedly confusing project (note that I don’t work directly on it), and the best way to keep up with it is to join the mlir@tensorflow.org mailing list. That said, let me try to give some context:

MLIR is a flexible compiler framework/infrastructure that can be used for many different things, but it’s particularly valuable as a shared infra layer for defining legalization paths through multiple levels of new and existing IRs. In TensorFlow it’s being used to improve, simplify, and share code between compiler/converter tools like the TF-to-TF Lite converter, the TF-XLA bridge, and other graph compiler backends. It’s also being investigated as an infra layer within parts of XLA (for codegen), Flang (where FIR would be defined as an MLIR dialect), and other compiler projects.

There’s no single “MLIR IR”; rather, there are a couple of work-in-progress “standard” dialects that implement shared functionality which other dialects at various levels (HLO-like, loop nest, etc.) can reuse. LLVM IR itself is a first-class dialect inside the MLIR infrastructure; at the moment it, the TensorFlow and TF Lite graph dialects, and XLA HLO are among the most complete dialect specifications.

Swift would then be one possible frontend for MLIR dialects and MLIR-based compilers; Julia would be another. I see clear benefits to each, stemming from static compilation and diagnostics on one side and JIT compilation and a fluid value/type domain boundary on the other. I’m giving an “MLIR for Julia developers” talk at JuliaCon, where my main goal will be to provoke the community to try out different things and figure out what MLIR can do for Julia and vice versa.

Very interesting, thanks for the clarification.

What could MLIR entail for layer/function interop between Julia and Swift? Would it be possible for a Julia function to autodiff through a Swift layer, optimize it, and even do interprocedural optimizations?