I’m working on a compiler infrastructure project (RepliBuild.jl) that needs to diff two DAGs — one representing Julia IR and one representing C++ IR derived from DWARF metadata — and identify structural mismatches by byte offset. The idea is that when the two DAGs don’t match at a given node, the byte offset delta from DWARF is enough to pinpoint the exact IR location that needs a fixup (a thunk op splice), without ever touching source.
The algorithm I’m thinking through:
- Build both DAGs from their respective sources (Julia IR + DWARF)
- Walk them in parallel, matching nodes structurally
- Where a mismatch is detected, record the byte offset delta
- Use that mismatch map to splice AOT thunk ops into the IR at the identified sites
- Topo-sort to determine safe lowering order
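To make steps 2 and 3 concrete, here is a minimal sketch in Python (chosen only for brevity; this is not RepliBuild.jl code). `Node`, `diff_dags`, and the `offset` field are hypothetical names, and the sketch assumes children can be paired by symbolic name, which may be too weak for real IR.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str                      # symbolic name of the IR node
    offset: int                    # byte offset (assumed to come from layout/DWARF info)
    children: list[Node] = field(default_factory=list)


def diff_dags(a: Node, b: Node) -> dict[tuple[str, ...], int | None]:
    """Walk two DAGs in parallel and map each mismatching path to its offset delta.

    A value of None means the node exists on only one side.
    """
    mismatches: dict[tuple[str, ...], int | None] = {}
    seen: set[tuple[int, int]] = set()   # shared sub-DAGs are compared only once
    stack = [(a, b, (a.name,))]
    while stack:
        x, y, path = stack.pop()
        if (id(x), id(y)) in seen:
            continue
        seen.add((id(x), id(y)))
        if x.offset != y.offset:
            mismatches[path] = y.offset - x.offset
        y_children = {c.name: c for c in y.children}
        x_names = {c.name for c in x.children}
        for cx in x.children:
            cy = y_children.get(cx.name)
            if cy is None:
                mismatches[path + (cx.name,)] = None   # missing on the b side
            else:
                stack.append((cx, cy, path + (cx.name,)))
        for name in y_children.keys() - x_names:
            mismatches[path + (name,)] = None          # missing on the a side
    return mismatches


# Toy example: same struct, different padding on the C++ side.
julia_ir = Node("S", 0, [Node("a", 0), Node("b", 4), Node("c", 8)])
cpp_ir = Node("S", 0, [Node("a", 0), Node("b", 8), Node("c", 16)])
mismatch_map = diff_dags(julia_ir, cpp_ir)
```

In this toy case the map holds a delta of 4 for `("S", "b")` and 8 for `("S", "c")`, and nothing for `a`, which is the shape of mismatch map I have in mind for step 4.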
Questions I’m curious about:
- Has anyone implemented DAG structural diffing in a Julia/MLIR context? Are there any packages I should look at?
- Are there known algorithms for topo-sort that handle the case where node identity is a composite of (symbolic name + byte offset) rather than just a symbol?
- Any experience with op rewriting passes that are driven by an externally computed mismatch map rather than pattern matching on op types alone?
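On the topo-sort question, this is the behaviour I am after, again sketched in Python. The standard library's `graphlib.TopologicalSorter` accepts any hashable node, so a `(name, offset)` tuple can serve as the identity directly. The dependency graph below is invented for illustration.

```python
from graphlib import TopologicalSorter

# Each key is a composite identity (symbolic name, byte offset).
# Each value is the set of nodes that must be lowered before the key.
deps = {
    ("S", 0): set(),
    ("S.b", 8): {("S", 0)},
    ("thunk", 8): {("S.b", 8)},
    # Same symbolic name at two offsets: these are distinct nodes.
    ("f", 0): set(),
    ("f", 16): {("f", 0)},
}

lowering_order = list(TopologicalSorter(deps).static_order())
```

Because the tuple is the key, `("f", 0)` and `("f", 16)` are never conflated. What I don't know is whether the Julia graph packages make this equally painless or expect integer vertex ids with a side table.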
The broader goal is a system where the DAG diff is the only thing that drives IR rewrites — the source never changes, and the thunk ops exist purely at the IR level.
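As a rough illustration of what "driven by the mismatch map" means here, the following Python sketch rewrites a flat op list using only the map, never inspecting op types. The op tuples, the `splice_thunks` function, and the `thunk.adjust` op name are all hypothetical; a real pass would operate on actual IR rather than tuples.

```python
def splice_thunks(ops, mismatches):
    """Insert a thunk op before every op whose site has a nonzero offset delta.

    ops:        list of (op_name, site) tuples
    mismatches: dict mapping site -> byte offset delta
    """
    rewritten = []
    for op_name, site in ops:
        delta = mismatches.get(site)
        if delta:  # skip sites with no entry or a zero delta
            rewritten.append(("thunk.adjust", site, delta))
        rewritten.append((op_name, site))
    return rewritten


ops = [
    ("load", ("S", "a")),
    ("load", ("S", "b")),
    ("store", ("S", "c")),
]
site_deltas = {("S", "b"): 4, ("S", "c"): 8}
rewritten_ops = splice_thunks(ops, site_deltas)
```

The input op list is left untouched and the pass is a pure function of (ops, mismatch map), which is the property I want to preserve in the real system.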
