DAG sorting and op rewriting for IR-level structural diffing — anyone worked on this?

I’m working on a compiler infrastructure project (RepliBuild.jl) that needs to diff two DAGs — one representing Julia IR and one representing C++ IR derived from DWARF metadata — and identify structural mismatches by byte offset. The idea is that when the two DAGs don’t match at a given node, the byte offset delta from DWARF is enough to pinpoint the exact IR location that needs a fixup (a thunk op splice), without ever touching source.

The algorithm I’m thinking through:

  1. Build both DAGs from their respective sources (Julia IR + DWARF)
  2. Walk them in parallel, matching nodes structurally
  3. Where a mismatch is detected, record the byte offset delta
  4. Use that mismatch map to splice AOT thunk ops into the IR at the identified sites
  5. Topo-sort to determine safe lowering order
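
To make steps 2 and 3 concrete, here is a rough sketch of the parallel walk, written in Python only for brevity. Everything in it is an assumption for illustration: the `Node` type, the rule that a "match" means same name, same offset, and same child count, and the `Point` example with its offsets. It is not how RepliBuild.jl does it today.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str      # symbolic name
    offset: int    # byte offset (from DWARF on the C++ side)
    children: list = field(default_factory=list)

def diff_dags(a, b, path="", out=None, seen=None):
    """Walk two DAGs in parallel; map node path -> byte offset delta (b - a)."""
    out = {} if out is None else out
    seen = set() if seen is None else seen
    pair = (id(a), id(b))
    if pair in seen:            # shared subgraph, already compared
        return out
    seen.add(pair)
    here = f"{path}/{a.name}"
    if (a.name != b.name or a.offset != b.offset
            or len(a.children) != len(b.children)):
        out[here] = b.offset - a.offset
    # zip stops at the shorter child list; extra children are covered
    # by the child-count mismatch recorded on the parent above
    for ca, cb in zip(a.children, b.children):
        diff_dags(ca, cb, here, out, seen)
    return out

# Hypothetical layouts: the C++ side pads field y out to offset 8.
julia_ir = Node("Point", 0, [Node("x", 0), Node("y", 4)])
cpp_ir = Node("Point", 0, [Node("x", 0), Node("y", 8)])
mismatches = diff_dags(julia_ir, cpp_ir)
print(mismatches)
```

The resulting map (path to delta) is what step 4 would consume to decide where a thunk op gets spliced.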

Questions I’m curious about:

  • Has anyone implemented DAG structural diffing in a Julia/MLIR context? Any packages I should look at?
  • Are there known algorithms for topo-sort that handle the case where node identity is a composite of (symbolic name + byte offset) rather than just a symbol?
  • Any experience with op rewriting passes that are driven by an externally computed mismatch map rather than pattern matching on op types alone?
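
For the topo-sort question, a minimal sketch of what I mean by composite identity, again in Python for brevity. It uses the standard library's `graphlib.TopologicalSorter`, which accepts any hashable node, so a `(name, offset)` tuple works as a key. The op names and offsets are made up for illustration.

```python
from graphlib import TopologicalSorter

# Node identity is (symbolic name, byte offset); values are hypothetical.
# Each key maps to the set of nodes that must be lowered before it.
deps = {
    ("load", 0): set(),
    ("load", 8): set(),
    ("thunk", 8): {("load", 0), ("load", 8)},
    ("call", 16): {("thunk", 8)},
}

order = list(TopologicalSorter(deps).static_order())
for name, offset in order:
    print(f"{name}@{offset}")
```

Two nodes with the same name but different offsets, like the two loads here, stay distinct. What I'm less sure about is whether anything smarter than plain tuple keys is needed once offsets shift after a splice.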

The broader goal is a system where the DAG diff is the only thing that drives IR rewrites — the source never changes, and the thunk ops exist purely at the IR level.


Ok this is really cool. What I have so far is .dot and SVG files, with Graphviz as the UI. The graph shows the DWARF mismatch and the routing to the MLIR JIT, which does the IR rewrite and caches the thunk for Julia. Now, when the chain makes a thunk, I can open the SVG and see why the call was routed to MLIR instead of being handled with ccall directly. It is ugly though, so if anyone has UI suggestions I need them for sure. I haven't used any Julia rendering packages either.
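
For reference, the kind of DOT output I'm describing looks roughly like what this sketch produces (Python for brevity; the node names, the `routed` map, and the label text are hypothetical, not the real RepliBuild.jl output). Nodes that got routed because of a mismatch are filled and labelled with the reason.

```python
def to_dot(edges, routed):
    """Build DOT text; nodes in `routed` are highlighted with their reason."""
    lines = ["digraph calls {"]
    nodes = sorted({n for edge in edges for n in edge})
    for n in nodes:
        if n in routed:
            # "\\n" emits a literal \n, which Graphviz renders as a line break
            lines.append(
                f'  "{n}" [style=filled, fillcolor=lightcoral, '
                f'label="{n}\\n{routed[n]}"];'
            )
        else:
            lines.append(f'  "{n}";')
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)

edges = [("julia_call", "ccall"), ("julia_call", "mlir_jit")]
routed = {"mlir_jit": "offset delta +4"}
dot = to_dot(edges, routed)
print(dot)
```

Saving that to a file and running `dot -Tsvg calls.dot -o calls.svg` gives the SVG I open to inspect a routing decision.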