Maintaining Julia's native GPU codegen vs. relying on MLIR

Apparently, Swift will be getting GPU kernel generation capability through MLIR.

As far as I understand it, Julia already relies on LLVM to generate CUDA code. What are the costs and benefits of maintaining an extra layer on top of that to generate GPU kernels, instead of relying on MLIR, which I assume would reduce the maintenance and code burden?

I’m particularly interested in hearing thoughts from @jekbradbury and @maleadt

I just found this, so it might be that LLVM will include this natively in the future: MLIR Is A New IR For Machine Learning That Might Become Part Of LLVM - Phoronix

I’m not sure how the MLIR -> GPU LLVM IR -> PTX conversion will look, so I’m speculating here, but I assume it will only work for relatively coarse operations as they exist in XLA HLO right now. That would probably yield great GPU performance, but it wouldn’t be usable as a target for compiling general-purpose Julia code; XLA.jl seems like a better fit there. Being able to compile general-purpose code with CUDAnative.jl directly to GPU LLVM IR is still important, as it enables a wider range of applications.
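
To make that concrete, here is a minimal sketch of the CUDAnative.jl path: an ordinary Julia function is JIT-compiled straight to GPU LLVM IR and then PTX, and reflection macros let you inspect that output. This assumes CuArrays.jl for device arrays; the kernel name and launch configuration are just illustrative.

```julia
# Minimal sketch: compile a plain Julia function to a GPU kernel with CUDAnative.jl.
using CUDAnative, CuArrays

function axpy!(y, a, x)
    # Compute this thread's global index.
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] += a * x[i]
    end
    return nothing
end

x = CuArray(rand(Float32, 1024))
y = CuArray(rand(Float32, 1024))

# Launch the kernel; CUDAnative specializes and compiles axpy! for these argument types.
@cuda threads=256 blocks=4 axpy!(y, 2f0, x)

# Inspect the GPU LLVM IR that was generated (use @device_code_ptx for the PTX instead).
@device_code_llvm @cuda threads=256 blocks=4 axpy!(y, 2f0, x)
```

The point is that the kernel body is generic Julia code, not a fixed set of coarse array operations, which is the flexibility an HLO-style pipeline would not obviously provide.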

Cool thanks, that makes sense and answers my question!

MLIR is an exciting but admittedly confusing project (note that I don’t work directly on it), and the best way to keep up with it is to join the mlir@tensorflow.org mailing list. That said, let me try to give some context:

MLIR is a flexible compiler framework/infrastructure that can be used for many different things, but it’s particularly valuable as a shared infra layer for defining legalization paths through multiple levels of new and existing IRs. In TensorFlow it’s being used to improve, simplify, and share code between compiler/converter tools like the TF-to-TF Lite converter, the TF-XLA bridge, and other graph compiler backends. It’s also being investigated as an infra layer within parts of XLA (for codegen), Flang (where FIR would be defined as an MLIR dialect), and other compiler projects.

There’s no single “MLIR IR”; rather, there are a couple of work-in-progress “standard” dialects that implement shared functionality which other dialects at various levels (HLO-like, loop nest, etc.) can reuse. LLVM IR itself is a first-class dialect inside the MLIR infrastructure; at the moment it, the TensorFlow and TF Lite graph dialects, and XLA HLO are among the most complete dialect specifications.

Swift would then be one possible frontend for MLIR dialects and MLIR-based compilers; Julia would be another. I see clear benefits to each, stemming from static compilation and diagnostics on one side and JIT compilation and a fluid value/type domain boundary on the other. I’m giving an “MLIR for Julia developers” talk at JuliaCon, where my main goal will be to provoke the community to try out different things and figure out what MLIR can do for Julia and vice versa.

Very interesting, thanks for the clarification.

What could MLIR entail for layer/function interop between Julia and Swift? Would it be possible for a Julia function to autodiff through a Swift layer, optimize it, and even do interprocedural optimizations?