Macro utilities for working with function definitions in 2025

What are the recommended tools these days for working with function definition expressions in macros? I’m aware of ExprTools.jl and MacroTools.jl, but I don’t know how maintained they are and whether those packages or other packages are recommended these days.

Hi! Maintainer of MacroTools.jl and author of splitdef/combinedef here. MacroTools is not actively developed ATM, but it’s been around for a long time, and basically it does what it says it does? It’s one of the most used packages in the ecosystem, so the bugs have been ironed out.

ExprTools offers a splitdef/combinedef with very similar interfaces. There was talk of merging the two five years ago, but that didn’t happen. It’s not clear what the benefits would be in 2025. I’m not aware of any outstanding bugs in splitdef.
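
For readers who haven’t used it, here is a minimal sketch of the splitdef/combinedef round trip in MacroTools. The function `add` and the new name `add_renamed` are made up for illustration, and the sketch assumes MacroTools is installed.

```julia
using MacroTools: splitdef, combinedef

# A function definition as an expression (illustrative example).
ex = :(function add(x, y = 1; scale = 2)
           return scale * (x + y)
       end)

# splitdef breaks the definition into a Dict with keys such as
# :name, :args, :kwargs, :body and :whereparams.
def = splitdef(ex)

# Edit one piece and rebuild the definition with combinedef.
def[:name] = :add_renamed
new_ex = combinedef(def)

eval(new_ex)
add_renamed(1, 2)   # behaves like the original `add`, under the new name
```

Inside a real macro you would return the rebuilt expression (suitably escaped) rather than calling eval.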

There’s also MLStyle.jl, which can do it. It comes with a benchmark! The graph looks off, but looking at the data behind it, it seems that MLStyle is faster for small-ish expressions but slower for big ones? Either way, that only affects precompilation time, so it’s pretty small potatoes unless you have a widely used macro. It’s used in the SciML space; that’s a good recommendation in its favor.

One important point regardless: if your macro is returning function definitions, consider using @qq begin instead of quote, so that line numbers are correct (then go-to-definition and stacktraces become much more useful).
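
To illustrate the line-number problem, here is a Base-only sketch. It is not how @qq is implemented; `relocate` and `@define_hello` are hypothetical names. A plain `quote` in a macro body records the macro’s own file and line, so the sketch swaps those nodes for `__source__`, the location of the macro call.

```julia
# Replace every LineNumberNode in `ex` with `src` (hypothetical helper).
relocate(ex, src::LineNumberNode) = ex
relocate(::LineNumberNode, src::LineNumberNode) = src
relocate(ex::Expr, src::LineNumberNode) =
    Expr(ex.head, (relocate(a, src) for a in ex.args)...)

macro define_hello(name)
    body = quote
        function $(esc(name))()
            return "hello"
        end
    end
    # Point the generated definition at the macro call site
    # instead of at this macro's own source lines.
    return relocate(body, __source__)
end

@define_hello greet
greet()
```

Note that this naive version also overwrites line numbers inside any interpolated user code, which is usually not what you want for larger interpolated bodies.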

Thanks @cstjean!

Some time ago I saw a comment on GitHub from a core developer that mentioned that MacroTools.jl has some issues with latency or invalidation or something like that. So, since then I’ve been somewhat hesitant to use MacroTools.jl. Do you have any insight into that matter?

What are you trying to do? (For a lot of simple things the built-in functions are enough.)
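
As an example of what the built-ins can cover, here is a small sketch that pulls the name out of a simple function definition using only `Meta.isexpr` and field access. `simple_function_name` is a hypothetical helper, and it deliberately ignores return types, `where` clauses and anonymous functions.

```julia
# Return the name of a simple function definition, or `nothing`.
# Handles `function f(args...) ... end` and `f(args...) = ...` only.
function simple_function_name(ex)
    Meta.isexpr(ex, :function) || Meta.isexpr(ex, :(=)) || return nothing
    sig = ex.args[1]                      # the signature part, e.g. :(f(x))
    Meta.isexpr(sig, :call) || return nothing
    name = sig.args[1]
    return name isa Symbol ? name : nothing
end

simple_function_name(:(f(x) = x + 1))
simple_function_name(:(function g(a, b) a + b end))
```

Once the definitions you need to handle include `where` clauses, return types or keyword arguments, something like splitdef starts to pay for itself.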

I remember reading such a comment about it years ago from @tim.holy, and I still don’t get it. In any case, MacroTools is an indirect dependency of a big chunk of the ecosystem, so if there’s a major issue there we should fix it.

I can see how importing it might introduce undesirable methods that affect precompilation time, but I don’t see how using it can possibly be problematic.

I haven’t checked in ages, but in general terms the reason it was problematic was the same reason that CoreLogging was initially problematic: both injected poorly-inferred functions into your code. For example, if you write

function foo(x)
    x < 0 && @warn "expected $x to be positive"
    return x
end

that expands to

:(function foo(x)
      #= REPL[3]:1 =#
      #= REPL[3]:2 =#
      x < 0 && begin
              #= logging.jl:384 =#
              let
                  #= logging.jl:385 =#
                  var"#48#level" = Base.CoreLogging.Warn
                  #= logging.jl:387 =#
                  var"#49#std_level" = var"#48#level"
                  #= logging.jl:388 =#
                  if (var"#49#std_level").level >= (Base.Threads.Atomic{Int32}(-1000))[]
                      #= logging.jl:389 =#
                      var"#50#group" = Symbol("REPL[3]")
                      #= logging.jl:390 =#
                      var"#51#_module" = Main
                      #= logging.jl:391 =#
                      var"#52#logger" = (Base.CoreLogging.current_logger_for_env)(var"#49#std_level", var"#50#group", var"#51#_module")
                      #= logging.jl:392 =#
                      if !(var"#52#logger" === Base.CoreLogging.nothing)
                          #= logging.jl:393 =#
                          var"#53#id" = :Main_e4489fe9
                          #= logging.jl:396 =#
                          if Base.CoreLogging.invokelatest(Base.CoreLogging.shouldlog, var"#52#logger", var"#48#level", var"#51#_module", var"#50#group", var"#53#id")
                              #= logging.jl:397 =#
                              var"#54#file" = "REPL[3]"
                              #= logging.jl:398 =#
                              if var"#54#file" isa Base.CoreLogging.String
                                  #= logging.jl:399 =#
                                  var"#54#file" = (Base.CoreLogging.Base).fixup_stdlib_path(var"#54#file")
                              end
                              #= logging.jl:401 =#
                              var"#55#line" = 2
                              #= logging.jl:402 =#
                              local var"#56#msg", var"#57#kwargs"
                              #= logging.jl:403 =#
                              begin
                                      #= logging.jl:373 =#
                                      try
                                          #= logging.jl:374 =#
                                          var"#56#msg" = "expected $(x) to be positive"
                                          #= logging.jl:375 =#
                                          var"#57#kwargs" = (;)
                                          #= logging.jl:376 =#
                                          true
                                      catch var"#70#err"
                                          #= logging.jl:378 =#
                                          Base.invokelatest(Base.CoreLogging.logging_error, var"#52#logger", var"#48#level", var"#51#_module", var"#50#group", var"#53#id", var"#54#file", var"#55#line", var"#70#err", true)
                                          #= logging.jl:379 =#
                                          false
                                      end
                                  end && Base.CoreLogging.invokelatest(Base.CoreLogging.handle_message, var"#52#logger", var"#48#level", var"#56#msg", var"#51#_module", var"#50#group", var"#53#id", var"#54#file", var"#55#line"; var"#57#kwargs"...)
                          end
                      end
                  end
                  #= logging.jl:409 =#
                  Base.CoreLogging.nothing
              end
          end
      #= REPL[3]:3 =#
      return x
  end)

You’ll notice that the expanded code includes many function calls, some of which are made via invokelatest: this was because a lot of the code in CoreLogging was uninferrable, and poorly-inferred code is vastly more vulnerable to invalidation than well-inferred code. The nasty part was that if you loaded some package that invalidated code in CoreLogging, that percolated up through your code that made use of it. invokelatest forces runtime dispatch, and thus breaks the “chain of invalidation” that would otherwise percolate through the entire caller path.
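
A minimal sketch of the two call styles being contrasted; `handler`, `call_direct` and `call_latest` are hypothetical names, and the comments describe the general mechanism rather than anything specific to CoreLogging.

```julia
handler(x) = "handled $x"        # hypothetical callee that other packages might affect

# Ordinary call: the compiler may resolve `handler` statically, which ties
# the compiled code of `call_direct` to that method.
call_direct(x) = handler(x)

# Runtime-dispatched call: `handler` is looked up when the call happens,
# so compiled code for `call_latest` does not depend on a specific method.
call_latest(x) = Base.invokelatest(handler, x)

call_direct(1) == call_latest(1)   # same result, different dispatch path
```

The trade-off is that the invokelatest call pays for a dynamic dispatch every time, which is why it is only used where inference could not be fixed.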

We spent several days going over CoreLogging finding and fixing type inference where possible, and adding invokelatest where not. But I never did that for MacroTools, and I honestly don’t know whether someone else has done so in the meantime.

One way you can test it is schematized like this:

using MySmallPackageThatUsesMacroTools
exercise_my_package()   # forces compilation of a lot of your code
using SnoopCompileCore
invalidations = @snoop_invalidations using PkgA, PkgB, PkgC

Good candidates for PkgA etc. are packages that extend Julia’s own methods: things like new number types (e.g., BFloat16s.jl) or string types (e.g., InlineStrings.jl). Obviously you would need to first check that those packages aren’t already loaded by MySmallPackageThatUsesMacroTools, or you’ll fool yourself into thinking you’re safe when you aren’t.
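
One way to do that check, as a sketch: look the candidate names up in `Base.loaded_modules` before running the invalidation test. `already_loaded` is a hypothetical helper, `Base.loaded_modules` is an internal (unexported) table, and the package names are the same placeholders as above.

```julia
# Report which of the candidate packages are already loaded in this session.
# `Base.loaded_modules` is an internal table keyed by Base.PkgId.
function already_loaded(names)
    loaded = Set(id.name for id in keys(Base.loaded_modules))
    return [n for n in names if n in loaded]
end

candidates = ["PkgA", "PkgB", "PkgC"]   # placeholder names
already_loaded(candidates)              # should be empty before @snoop_invalidations
```

If the result is non-empty, those packages were pulled in as dependencies and loading them again will not tell you anything.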

Another more comprehensive way is to use JET on your package, but you’d have to distinguish problems caused by MacroTools from ones that are not.

In normal usage, MacroTools functions are only evaluated at precompile-time, when macros are expanded, not at runtime. So why should type stability matter?

On top of that, macros don’t have backedges, so they cannot cascade into invalidating other functions (to my chagrin).

It’s more a question of what they expand to than the macro itself. If the code they expand to has inference problems that wouldn’t be present without the macro, that’s when it becomes a problem. It’s not very common that a macro would change none of the dispatches in the code (if it didn’t, what would it be doing?). Even little things like === vs == can make a big difference for certain operations (e.g., “Use === in more places to reduce invalidation”, pull request #167 by timholy in FluxML/MacroTools.jl).
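
A quick Base-only illustration of why === and == differ here; `is_call` is a hypothetical helper, the rest is plain introspection.

```julia
# `===` is a builtin: it has no method table that packages can extend.
(===) isa Core.Builtin      # true

# `==` is an ordinary generic function with many methods, and packages add more.
(==) isa Core.Builtin       # false
length(methods(==)) > 1     # true

# In expression-handling code, comparing against a Symbol with `===`
# gives the same answer as `==` without going through generic dispatch.
is_call(ex) = ex isa Expr && ex.head === :call
```

When the argument types are poorly inferred, a call to == can be affected by any package that adds a == method, whereas === cannot.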

Perhaps it used to inject calls to internal MacroTools methods and now doesn’t? I last looked at this many years ago so lots could have changed. I only responded here because I got pinged to explain the issue, not because I know anything about the current package state.

MacroTools is only utilities for pattern matching and recursive transformation of expressions. It doesn’t inject any of its own code into expressions as far as I know.

I’m not a MacroTools user, so this isn’t as easy for me as it would be for y’all. But I’m a big fan of just measuring things. Using an example from the manual:

julia> using MacroTools

julia> macroexpand(Main, :(@capture(:[1, 2, 3, 4, 5, 6, 7], [1, a_, 3, b__, c_])))
quote
    #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:66 =#
    a = MacroTools.nothing
    b = MacroTools.nothing
    c = MacroTools.nothing
    #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:67 =#
    var"#25#env" = MacroTools.trymatch($(Expr(:copyast, :($(QuoteNode(:([1, a_, 3, b__, c_])))))), $(Expr(:copyast, :($(QuoteNode(:([1, 2, 3, 4, 5, 6, 7])))))))
    #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:68 =#
    if var"#25#env" === MacroTools.nothing
        #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:69 =#
        false
    else
        #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:71 =#
        a = MacroTools.get(var"#25#env", :a, MacroTools.nothing)
        b = MacroTools.get(var"#25#env", :b, MacroTools.nothing)
        c = MacroTools.get(var"#25#env", :c, MacroTools.nothing)
        #= /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/macro.jl:72 =#
        true
    end
end

julia> methods(MacroTools.trymatch)
# 1 method for generic function "trymatch" from MacroTools:
 [1] trymatch(pat, ex)
     @ ~/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:112

julia> using MethodAnalysis, JET

julia> mis = methodinstances(MacroTools.trymatch)
2-element Vector{Core.MethodInstance}:
 MethodInstance for MacroTools.trymatch(::Expr, ::Any)
 MethodInstance for MacroTools.trymatch(::Expr, ::Expr)

julia> report_opt(mis[2])
[ Info: tracking Base
═════ 7 possible errors found ═════
┌ trymatch(pat::Expr, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:113
│┌ match(pat::Expr, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:108
││┌ match(pat::Expr, ex::Expr, env::Dict{Any, Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:99
│││┌ normalise(ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:88
││││┌ unblock(ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/utils.jl:122
│││││ runtime dispatch detected: unblock(%11::Any)::Any
││││└────────────────────
││┌ match(pat::Expr, ex::Expr, env::Dict{Any, Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:101
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:20
││││┌ haskey(h::Dict{Any, Any}, key::Symbol) @ Base ./dict.jl:569
│││││┌ ht_keyindex(h::Dict{Any, Any}, key::Symbol) @ Base ./dict.jl:275
││││││┌ ==(w::Symbol, v::WeakRef) @ Base ./gcutils.jl:36
│││││││ runtime dispatch detected: isequal(w::Symbol, %1::Any)::Bool
││││││└────────────────────
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:21
││││┌ assoc!(d::Dict{Any, Any}, k::Symbol, v::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/utils.jl:9
│││││┌ setindex!(h::Dict{Any, Any}, v::Expr, key::Symbol) @ Base ./dict.jl:392
││││││┌ ht_keyindex2_shorthash!(h::Dict{Any, Any}, key::Symbol) @ Base ./dict.jl:335
│││││││┌ rehash!(h::Dict{Any, Any}, newsz::Int64) @ Base ./dict.jl:194
││││││││ runtime dispatch detected: Base.hashindex(%236::Any, %19::Int64)::Tuple{Int64, UInt8}
│││││││└────────────────────
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:20
││││ runtime dispatch detected: (%12::Any MacroTools.:(==) ex::Expr)::Any
│││└────────────────────
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Expr) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:20
││││ runtime dispatch detected: MacroTools.:!(%14::Any)::Any
│││└────────────────────
││┌ match(pat::Expr, ex::Expr, env::Dict{Any, Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:104
│││┌ store!(env::Dict{Any, Any}, name::Symbol, ex::Vector{Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:20
││││ runtime dispatch detected: (%12::Any MacroTools.:(==) ex::Vector{Any})::Any
│││└────────────────────
││┌ match(pat::Expr, ex::Expr, env::Dict{Any, Any}) @ MacroTools /home/tim/.julia/packages/MacroTools/Ar0jT/src/match/match.jl:105
│││ runtime dispatch detected: MacroTools.match_inner(%91::Any, %92::Any, env::Dict{Any, Any})::Any
││└────────────────────

Does that really need to be a Dict{Any,Any} or could it be something more specific? Are there type-asserts you can add? Do the returned Unions have to be so complicated?

For people who want to fix the problems with MacroTools, these are really easy things to check. Fixing them could be harder, but until you actually look you just don’t know.

I think you’re missing the point here: @capture is intended to be used in the body of a macro, not in the generated code. A typical pattern, taken from an example in the manual, is:

macro foo(ex)
  postwalk(ex) do x
    @capture(x, some_pattern) || return x
    return new_x
  end
end

Usually, code operating on expressions (Expr) is type-unstable, because expressions contain a Vector{Any}.

In any case, none of the @capture-generated code executes at runtime, so what does a Dict{Any,Any} matter?
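
The distinction between macro-body code and generated code can be checked directly. In this Base-only sketch (`EXPANSIONS`, `@double` and `twice` are made-up names), the macro body bumps a counter when it runs, and the counter does not move when the function that uses the macro is called.

```julia
# Counts how many times the macro body has run (illustrative global).
const EXPANSIONS = Ref(0)

macro double(ex)
    EXPANSIONS[] += 1            # executed when the macro call is expanded
    return :(2 * $(esc(ex)))     # only this expression ends up in the caller
end

twice(x) = @double(x)            # the macro body runs here, at definition time

before = EXPANSIONS[]
results = [twice(1), twice(2), twice(3)]
after = EXPANSIONS[]             # unchanged: calling `twice` does not re-run the macro body
```

Anything the macro body calls, such as @capture or postwalk, runs at that expansion step in the same way.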

Yes, if it never appears in the final code, agreed.

The reason I initially reported issues with MacroTools is not because I was checking it directly: it’s because ages ago (back when I was working on reducing latency in the SciML stack in preparation for Julia 1.8), an invalidation hunt in widely-used packages traced a substantial number of invalidations to MacroTools. I made 2 PRs trying to fix it before deciding it was a bigger job than I was prepared to tackle. I think SciML responded by reducing their usage of MacroTools. That’s really all I know. And MacroTools may have changed since.

Correct. They’re using MLStyle now. I’d be interested to know if it led to any measurable difference in loading/precompilation time. @cryptic.ax ?

It hasn’t.

Thank you for the examples and workflow, that’s a great starting point. I feel like working on Expr trees is unavoidably type-unstable, but I’m sure it can be improved.

We still have MacroTools in a couple places iirc. I do not have any timing numbers on me for what MLStyle costs, so can’t comment there, but we’re also gradually shifting to Moshi/ExproniconLite now.

“I feel like working on Expr trees is unavoidably type-unstable, but I’m sure it can be improved.”

It is, but this issue has also been faced repeatedly in Julia’s type inference and in JuliaInterpreter. In many cases you know what kinds of objects go in which “slots” of a Vector{Any}, and you can add type-asserts or if isa(x, Symbol) checks and eliminate inference failures that way. The more restrictions you can place on what goes into those Exprs, the more often you can eliminate inference failures.
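
A small sketch of that technique; `call_name` is a hypothetical helper. Elements of `ex.args` are inferred as Any, but after the isa check the compiler knows it is looking at a Symbol, so the return type narrows to Union{Nothing, Symbol}.

```julia
# Return the function name of a call expression when it is a plain Symbol.
function call_name(ex::Expr)
    ex.head === :call || return nothing
    f = ex.args[1]               # inferred as Any: args is a Vector{Any}
    f isa Symbol && return f     # inside this branch, f is known to be a Symbol
    return nothing
end

call_name(:(foo(1, 2)))
call_name(:(a.b(1)))
```

A type-assert such as `ex.args[1]::Symbol` does the same narrowing but throws when the assumption is wrong, so it only suits slots whose contents you control.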

That’s less likely to be directly the issue. Instead the issue is this: a package developer might do a nice thing for their users and add a bunch of @compile_workloads to minimize TTFX. For the purposes of discussion, let’s say that

using CoolPackage
do_something_cool(args...)

initially had a TTFX (for the do_something_cool call) of 15s, and thanks to those @compile_workloads the package maintainer was able to knock that down to just 0.2s. Big win! However, if you have invalidation risk then all that lovely precompiled code may have to be thrown out if your users do this instead:

using CoolPackage
using InlineStrings, BFloat16s
do_something_cool(args...)

If the code underlying do_something_cool (and all its callees) gets invalidated by loading those extra packages, suddenly you might be right back up to a 15s TTFX. Terrible outcome. But it depends on the specific vulnerabilities in CoolPackage and the specific extra packages you load.

Ecosystem composability is the main reason it’s worth eliminating invalidation risk wherever you can.

If type-unstable code working with Expr trees, or using MacroTools, only appears in macro bodies, then no invalidations will occur for the generated & precompiled code.

Code doing runtime evaluation of Expr trees should be quite unusual — mainly interactive interpreter code (like the REPL or IJulia), parsers like JuliaSyntax, and other parts of a compiler/interpreter toolchain.