What are the recommended tools these days for working with function definition expressions in macros? I’m aware of ExprTools.jl and MacroTools.jl, but I don’t know how maintained they are and whether those packages or other packages are recommended these days.
Hi! Maintainer of MacroTools.jl and author of `splitdef`/`combinedef` here. MacroTools is not actively developed ATM, but it’s been around for a long time, and basically it does what it says it does. It’s one of the most used packages in the ecosystem, so the bugs have been ironed out.
ExprTools offers a `splitdef`/`combinedef` with very similar interfaces. There was talk of merging the two five years ago, but that didn’t happen. It’s not clear what the benefits would be in 2025. I’m not aware of any outstanding bug in `splitdef`.
There’s also MLStyle.jl, which can do it. It comes with a benchmark! The graph looks off, but looking at the data behind it, it seems that MLStyle is faster for small-ish expressions, but slower for big ones? But that’s only precompilation time, so pretty small potatoes unless you have a widely used macro. It’s used in the SciML space; that’s a good recommendation in its favor.
One important point regardless: if your macro is returning function definitions, consider using `@qq begin` instead of `quote`, so that line numbers are correct (then go-to-definition and stacktraces become much more useful).
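To illustrate why the line numbers matter, here is a minimal Base-only sketch of the idea. A plain `quote` inside a macro records the location of the macro's own definition, whereas the built-in `__source__` variable holds the call site. The `relocate` helper and the `@defgetter` macro are hypothetical names made up for this example; this is not how `@qq` is implemented, only the same effect done by hand.

```julia
# Hypothetical helper: replace every LineNumberNode in an expression with `src`.
relocate(x, src) = x                      # leaves (symbols, literals) are unchanged
relocate(::LineNumberNode, src) = src
relocate(ex::Expr, src) = Expr(ex.head, map(a -> relocate(a, src), ex.args)...)

# Hypothetical macro that defines a zero-argument function returning `val`.
macro defgetter(name, val)
    ex = quote
        $(esc(name))() = $(esc(val))
    end
    # `__source__` is the LineNumberNode of the macro call site, so the
    # generated method is attributed to where the user wrote `@defgetter`.
    return relocate(ex, __source__)
end
```

With this, a method created by `@defgetter answer 42` should be reported at the line of the macro call rather than at the `quote` inside the macro definition.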
Thanks @cstjean!
Some time ago I saw a comment on GitHub from a core developer that mentioned that MacroTools.jl has some issues with latency or invalidation or something like that. So, since then I’ve been somewhat hesitant to use MacroTools.jl. Do you have any insight into that matter?
What are you trying to do? (For a lot of simple things the built-in functions are enough.)
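As a rough idea of what "built-in functions are enough" can look like, here is a small sketch that pulls the function name out of a definition using only `Meta.isexpr` and direct access to `args`. The helper name `def_name` is hypothetical, and the sketch deliberately ignores anonymous functions and `function f end` stubs.

```julia
# Hypothetical helper: extract the function name from a definition expression,
# handling both `function f(...) ... end` and the short form `f(...) = ...`.
function def_name(ex)
    (Meta.isexpr(ex, :function) || Meta.isexpr(ex, :(=))) ||
        error("not a function definition: $ex")
    sig = ex.args[1]
    # Peel off `where` clauses and return-type annotations around the call.
    while Meta.isexpr(sig, :where) || Meta.isexpr(sig, :(::))
        sig = sig.args[1]
    end
    Meta.isexpr(sig, :call) || error("not a function definition: $ex")
    return sig.args[1]
end
```

Once arguments, keyword arguments, defaults and `where` parameters all need to be handled, a dedicated `splitdef` is probably less error-prone than growing a helper like this.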
I remember reading such a comment about it years ago from @tim.holy, and I still don’t get it. In any case, MacroTools is an indirect dependency of a big chunk of the ecosystem, so if there’s a major issue there we should fix it.
I can see how importing it might introduce undesirable methods that affect precompilation time, but I don’t see how using it can possibly be problematic.
I haven’t checked in ages, but in general terms the reason it was problematic was the same reason that CoreLogging was initially problematic: both injected poorly-inferred functions into your code. For example, if you write
```julia
function foo(x)
    x < 0 && @warn "expected $x to be positive"
    return x
end
```
that expands to
```julia
:(function foo(x)
    #= REPL[3]:1 =#
    #= REPL[3]:2 =#
    x < 0 && begin
        #= logging.jl:384 =#
        let
            #= logging.jl:385 =#
            var"#48#level" = Base.CoreLogging.Warn
            #= logging.jl:387 =#
            var"#49#std_level" = var"#48#level"
            #= logging.jl:388 =#
            if (var"#49#std_level").level >= (Base.Threads.Atomic{Int32}(-1000))[]
                #= logging.jl:389 =#
                var"#50#group" = Symbol("REPL[3]")
                #= logging.jl:390 =#
                var"#51#_module" = Main
                #= logging.jl:391 =#
                var"#52#logger" = (Base.CoreLogging.current_logger_for_env)(var"#49#std_level", var"#50#group", var"#51#_module")
                #= logging.jl:392 =#
                if !(var"#52#logger" === Base.CoreLogging.nothing)
                    #= logging.jl:393 =#
                    var"#53#id" = :Main_e4489fe9
                    #= logging.jl:396 =#
                    if Base.CoreLogging.invokelatest(Base.CoreLogging.shouldlog, var"#52#logger", var"#48#level", var"#51#_module", var"#50#group", var"#53#id")
                        #= logging.jl:397 =#
                        var"#54#file" = "REPL[3]"
                        #= logging.jl:398 =#
                        if var"#54#file" isa Base.CoreLogging.String
                            #= logging.jl:399 =#
                            var"#54#file" = (Base.CoreLogging.Base).fixup_stdlib_path(var"#54#file")
                        end
                        #= logging.jl:401 =#
                        var"#55#line" = 2
                        #= logging.jl:402 =#
                        local var"#56#msg", var"#57#kwargs"
                        #= logging.jl:403 =#
                        begin
                            #= logging.jl:373 =#
                            try
                                #= logging.jl:374 =#
                                var"#56#msg" = "expected $(x) to be positive"
                                #= logging.jl:375 =#
                                var"#57#kwargs" = (;)
                                #= logging.jl:376 =#
                                true
                            catch var"#70#err"
                                #= logging.jl:378 =#
                                Base.invokelatest(Base.CoreLogging.logging_error, var"#52#logger", var"#48#level", var"#51#_module", var"#50#group", var"#53#id", var"#54#file", var"#55#line", var"#70#err", true)
                                #= logging.jl:379 =#
                                false
                            end
                        end && Base.CoreLogging.invokelatest(Base.CoreLogging.handle_message, var"#52#logger", var"#48#level", var"#56#msg", var"#51#_module", var"#50#group", var"#53#id", var"#54#file", var"#55#line"; var"#57#kwargs"...)
                    end
                end
            end
            #= logging.jl:409 =#
            Base.CoreLogging.nothing
        end
    end
    #= REPL[3]:3 =#
    return x
end)
```
You’ll notice that the expanded code includes many function calls, some of which are made via `invokelatest`: this was because a lot of the code in `CoreLogging` was uninferrable, and poorly-inferred code is vastly more vulnerable to invalidation than well-inferred code. The nasty part was that if you loaded some package that invalidated code in CoreLogging, that percolated up through your code that made use of it. `invokelatest` forces runtime dispatch, and thus breaks the “chain of invalidation” that would otherwise percolate through the entire caller path.
We spent several days going over CoreLogging finding and fixing type inference where possible, and adding `invokelatest` where not. But I never did that for MacroTools, and I honestly don’t know whether someone else has done so in the meantime.
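The two remedies mentioned above can be sketched with a toy example. Everything here (`settings`, `lookup`, `retries_asserted`, `retries_latest`) is hypothetical and only meant to show the shape of each fix, not what was actually done in CoreLogging.

```julia
# A container with abstractly-typed values: `settings[:retries]` infers as `Any`.
const settings = Dict{Symbol,Any}(:retries => 3)

# Hypothetical poorly-inferred function: its return type is `Any`.
lookup(key::Symbol) = settings[key]

# Fix 1: a type-assert at the call site. Code that uses the result now sees a
# concrete `Int`, so later calls on it can be resolved at compile time.
retries_asserted() = lookup(:retries)::Int

# Fix 2: `Base.invokelatest` always dispatches at runtime, so this caller does
# not depend on the compiled code of `lookup`. The assert keeps the result typed.
retries_latest() = Base.invokelatest(lookup, :retries)::Int
```

The first form is preferable when the type really is known, since it costs almost nothing at runtime; the second trades a dynamic dispatch on every call for isolation from whatever happens to `lookup`.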
One way you can test it is schematized like this:
```julia
using MySmallPackageThatUsesMacroTools
exercise_my_package()   # forces compilation of a lot of your code
using SnoopCompileCore
invalidations = @snoop_invalidations using PkgA, PkgB, PkgC
```
Good candidates for `PkgA` etc. are packages that extend Julia’s own methods: things like new number types (e.g., BFloat16s.jl) or string types (e.g., InlineStrings.jl). Obviously you would need to first check that those packages aren’t already loaded by `MySmallPackageThatUsesMacroTools`, or you’ll fool yourself into thinking you’re safe when you aren’t.
Another more comprehensive way is to use JET on your package, but you’d have to distinguish problems caused by MacroTools from ones that are not.
In normal usage, MacroTools functions are only evaluated at precompile-time, when macros are expanded, not at runtime. So why should type stability matter?
On top of that, macros don’t have backedges, so they cannot cascade into invalidating other functions (to my chagrin).
It’s more a question of what they expand to than the macro itself. If the code they expand to has inference problems that wouldn’t be present without the macro, that’s when it becomes a problem. It’s not very common that a macro would change none of the dispatches in the code (if not, what is it doing?). Even little things like `===` vs `==` can make a big difference for certain operations (e.g., FluxML/MacroTools.jl pull request #167, “Use === in more places to reduce invalidation”).
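A small sketch of the `===` vs `==` point, using hypothetical function names. `==` is an ordinary generic function that any package may extend, while `===` is a builtin with no methods to add.

```julia
# `==` is generic: packages can add methods, and when the type of `x` is not
# inferred this call becomes a runtime dispatch that new methods can invalidate.
is_nothing_eq(x) = x == nothing

# `===` is a builtin, so it cannot be extended and always returns a `Bool`.
is_nothing_egal(x) = x === nothing   # same check as `isnothing(x)`
```

The generic version is not even guaranteed to return a `Bool`: `is_nothing_eq(missing)` gives `missing`, because `==` propagates missing values.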
Perhaps it used to inject calls to internal MacroTools methods and now doesn’t? I last looked at this many years ago so lots could have changed. I only responded here because I got pinged to explain the issue, not because I know anything about the current package state.
MacroTools is only utilities for pattern matching and recursive transformation of expressions. It doesn’t inject any of its own code into expressions as far as I know.
I’m not a MacroTools user, so this isn’t as easy for me as it would be for y’all. But I’m a big fan of just measuring things. Using an example from the manual:
```julia
julia> using MacroTools

julia> macroexpand(Main, :(@capture(:[1, 2, 3, 4, 5, 6, 7], [1, a_, 3, b__, c_])))
quote
    #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:66 =#
    a = MacroTools.nothing
    b = MacroTools.nothing
    c = MacroTools.nothing
    #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:67 =#
    var"#25#env" = MacroTools.trymatch($(Expr(:copyast, :($(QuoteNode(:([1, a_, 3, b__, c_])))))), $(Expr(:copyast, :($(QuoteNode(:([1, 2, 3, 4, 5, 6, 7])))))))
    #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:68 =#
    if var"#25#env" === MacroTools.nothing
        #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:69 =#
        false
    else
        #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:71 =#
        a = MacroTools.get(var"#25#env", :a, MacroTools.nothing)
        b = MacroTools.get(var"#25#env", :b, MacroTools.nothing)
        c = MacroTools.get(var"#25#env", :c, MacroTools.nothing)
        #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:72 =#
        true
    end
end

julia> methods(MacroTools.trymatch)
# 1 method for generic function "trymatch" from MacroTools:
 [1] trymatch(pat, ex)
     @ ~/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:112
```
```julia
julia> using MethodAnalysis, JET

julia> mis = methodinstances(MacroTools.trymatch)
2-element Vector{Core.MethodInstance}:
 MethodInstance for MacroTools.trymatch(::Expr, ::Any)
 MethodInstance for MacroTools.trymatch(::Expr, ::Expr)

julia> report_opt(mis[2])
[ Info: tracking Base
═════ 7 possible errors found ═════
┌ trymatch(pat::Expr, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:113
│┌ match(pat::Expr, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:108
││┌ match(pat::Expr, ex::Expr, env::Dict{Any, Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:99
│││┌ normalise(ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:88
││││┌ unblock(ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/utils.jl:122
│││││ runtime dispatch detected: unblock(%11::Any)::Any
││││└────────────────────
││┌ match(pat::Expr, ex::Expr, env::Dict{Any, Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:101
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:20
││││┌ haskey(h::Dict{Any, Any}, key::Symbol) @ Base ./dict.jl:569
│││││┌ ht_keyindex(h::Dict{Any, Any}, key::Symbol) @ Base ./dict.jl:275
││││││┌ ==(w::Symbol, v::WeakRef) @ Base ./gcutils.jl:36
│││││││ runtime dispatch detected: isequal(w::Symbol, %1::Any)::Bool
││││││└────────────────────
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:21
││││┌ assoc!(d::Dict{Any, Any}, k::Symbol, v::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/utils.jl:9
│││││┌ setindex!(h::Dict{Any, Any}, v::Expr, key::Symbol) @ Base ./dict.jl:392
││││││┌ ht_keyindex2_shorthash!(h::Dict{Any, Any}, key::Symbol) @ Base ./dict.jl:335
│││││││┌ rehash!(h::Dict{Any, Any}, newsz::Int64) @ Base ./dict.jl:194
││││││││ runtime dispatch detected: Base.hashindex(%236::Any, %19::Int64)::Tuple{Int64, UInt8}
│││││││└────────────────────
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:20
││││ runtime dispatch detected: (%12::Any MacroTools.:(==) ex::Expr)::Any
│││└────────────────────
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:20
││││ runtime dispatch detected: MacroTools.:!(%14::Any)::Any
│││└────────────────────
││┌ match(pat::Expr, ex::Expr, env::Dict{Any, Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:104
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Vector{Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:20
││││ runtime dispatch detected: (%12::Any MacroTools.:(==) ex::Vector{Any})::Any
│││└────────────────────
││┌ match(pat::Expr, ex::Expr, env::Dict{Any, Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:105
│││ runtime dispatch detected: MacroTools.match_inner(%91::Any, %92::Any, env::Dict{Any, Any})::Any
││└────────────────────
```
Does that really need to be a `Dict{Any,Any}` or could it be something more specific? Are there type-asserts you can add? Do the returned `Union`s have to be so complicated?
For people who want to fix the problems with MacroTools, these are really easy things to check. Fixing them could be harder, but until you actually look you just don’t know.
I think you’re missing the point here: `@capture` is something that is intended to be used in the body of a macro, not in the generated code. An example given in the manual of a typical pattern is:
```julia
macro foo(ex)
    postwalk(ex) do x
        @capture(x, some_pattern) || return x
        return new_x
    end
end
```
Usually, code operating on expressions (`Expr`) is type-unstable, because expressions contain a `Vector{Any}`.
In any case, none of the `@capture`-generated code executes at runtime, so what does a `Dict{Any,Any}` matter?
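This claim is easy to check on a toy macro. The sketch below uses only Base: `walk` is a hypothetical stand-in for a `postwalk`-style helper and `@swap_plus` is a made-up macro. The helper runs while the macro is being expanded, and the expansion itself contains no trace of it.

```julia
# Hypothetical expansion-time helper: a tiny bottom-up tree walk, similar in
# spirit to `postwalk`. It is type-unstable, like most code working on `Expr`.
walk(f, x) = f(x)
walk(f, x::Expr) = f(Expr(x.head, map(a -> walk(f, a), x.args)...))

# Hypothetical macro: rewrites every `+` in its argument to `-`.
macro swap_plus(ex)
    return esc(walk(x -> x === :(+) ? :(-) : x, ex))
end

# Inspect what the macro leaves behind in the caller's code.
expanded = macroexpand(@__MODULE__, :(@swap_plus 5 + 3))
```

Here `expanded` is just the expression `5 - 3`; neither `walk` nor the closure passed to it appears in it. Whether the same holds for a given real macro is worth confirming with `macroexpand`, as done for `@capture` above.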
Yes, if it never appears in the final code, agreed.
The reason I initially reported issues with MacroTools is not because I was checking it directly: it’s because ages ago (back when I was working on reducing latency in the SciML stack in preparation for Julia 1.8), an invalidation hunt in widely-used packages traced a substantial number of invalidations to MacroTools. I made 2 PRs trying to fix it before deciding it was a bigger job than I was prepared to tackle. I think SciML responded by reducing their usage of MacroTools. That’s really all I know. And MacroTools may have changed since.
Correct. They’re using MLStyle now. I’d be interested to know if it led to any measurable difference in loading/precompilation time. @cryptic.ax ?
It hasn’t.
Thank you for the examples and workflow, that’s a great starting point. I feel like working on `Expr` trees is unavoidably type-unstable, but I’m sure it can be improved.
We still have MacroTools in a couple places iirc. I do not have any timing numbers on me for what MLStyle costs, so can’t comment there, but we’re also gradually shifting to Moshi/ExproniconLite now.
> I feel like working on `Expr` trees is unavoidably type-unstable, but I’m sure it can be improved.
It is, but this issue has also been faced repeatedly in Julia’s type inference and in JuliaInterpreter. In many cases you know what kinds of objects go in which “slots” of a `Vector{Any}` and can add type-asserts or `if isa(x, Symbol)` and eliminate inference failures that way. The more restrictions you can place on what goes into those `Expr`s, the more often you can eliminate inference failures.
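As a sketch of that technique, the hypothetical helper below reads the callee name out of a `:call` expression. Every value pulled from `args` starts out as `Any`, and each `isa` branch narrows it to a concrete type before it is used.

```julia
# Hypothetical helper: name of the function being called in a `:call`
# expression, or `nothing` if it cannot be determined.
function called_name(ex::Expr)
    ex.head === :call || return nothing
    f = ex.args[1]                     # inferred as `Any`
    if f isa Symbol                    # plain call: f(x)
        return f
    elseif f isa Expr && f.head === :. && length(f.args) == 2
        q = f.args[2]                  # qualified call: Mod.f(x)
        if q isa QuoteNode
            v = q.value
            v isa Symbol && return v   # narrowed to `Symbol` here
        end
    end
    return nothing
end
```

Because every return path yields either a `Symbol` or `nothing`, the function should infer as returning `Union{Nothing, Symbol}` rather than `Any`, and it uses `===` for all comparisons.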
That’s less likely to be directly the issue. Instead the issue is this: a package developer might do a nice thing for their users and add a bunch of `@compile_workload`s to minimize TTFX. For the purposes of discussion, let’s say that
```julia
using CoolPackage
do_something_cool(args...)
```
initially had a TTFX (for the `do_something_cool` call) of 15s, and thanks to those `@compile_workload`s the package maintainer was able to knock that down to just 0.2s. Big win! However, if you have invalidation risk then all that lovely precompiled code may have to be thrown out if your users do this instead:
```julia
using CoolPackage
using InlineStrings, BFloat16s
do_something_cool(args...)
```
If the code underlying `do_something_cool` (and all its callees) gets invalidated by loading those extra packages, suddenly you might be right back up to a 15s TTFX. Terrible outcome. But it depends on the specific vulnerabilities in `CoolPackage` and the specific extra packages you load.
Ecosystem composability is the main reason it’s worth eliminating invalidation risk wherever you can.
If type-unstable code working with `Expr` trees, or using MacroTools, only appears in macro bodies, then no invalidations will occur for the generated & precompiled code.
Code doing runtime evaluation of `Expr` trees should be quite unusual: mainly interactive interpreter code (like the REPL or IJulia), parsers like JuliaSyntax, and other parts of a compiler/interpreter toolchain.