Macro expanding to macro: @macroexpand1 works while @macroexpand does not

While doing some repetitive data evaluation, I had the idea to write a simple templating macro to reduce the boilerplate a bit. However, my solution does not work and I don’t understand why. Stranger still, it does work if I expand the expression once with @macroexpand1 and then evaluate the result. Here is an MWE:

using DataFrames, DataFramesMeta, Statistics

df = DataFrame(:A => [rand(2,3,4), rand(2,3,4)])

## what I want to do: compute mean and variance of a matrix-valued column
## ofc in the real application there are many columns to be transformed this way...
@rtransform(df, :A_mean = mean(:A), :A_var = var(:A)) # works

## macro-solution
macro mytrafo(df, field)
    # need to use these "Meta.quot" to get the ":" in front of symbols
    avg_clause = :($(Meta.quot(Symbol(field,"_mean"))) = mean($(Meta.quot(field))))
    var_clause = :($(Meta.quot(Symbol(field,"_var"))) = var($(Meta.quot(field))))
    return quote
        @rtransform($df, $avg_clause, $var_clause)
    end |> esc
end

@mytrafo(df, A) # ArgumentError: Malformed expression on LHS in DataFramesMeta.jl macro
eval(@macroexpand1 @mytrafo(df, A)) # works?!
@macroexpand @mytrafo(df, A) # ArgumentError: Malformed expression on LHS in DataFramesMeta.jl macro

The result from @macroexpand1 looks a bit off but actually works when copy-pasting:

julia> @macroexpand1 @mytrafo(df, A)
quote
    @rtransform df :A_mean = mean(:A) :A_var = var(:A)
end
julia> @rtransform df :A_mean = mean(:A) :A_var = var(:A)
# works

Full error message:
julia> @macroexpand @mytrafo(df, A)
ERROR: ArgumentError: Malformed expression on LHS in DataFramesMeta.jl macro
Stacktrace:
 [1] fun_to_vec(ex::Expr; gensym_names::Bool, outer_flags::NamedTuple{(Symbol("@byrow"), Symbol("@passmissing"), Symbol("@astable")), Tuple{Base.RefValue{Bool}, Base.RefValue{Bool}, Base.RefValue{Bool}}}, no_dest::Bool)
   @ DataFramesMeta ~/.julia/packages/DataFramesMeta/MrIOy/src/parsing.jl:378
 [2] fun_to_vec
   @ ~/.julia/packages/DataFramesMeta/MrIOy/src/parsing.jl:314 [inlined]
 [3] (::DataFramesMeta.var"#44#45"{NamedTuple{(Symbol("@byrow"), Symbol("@passmissing"), Symbol("@astable")), Tuple{Base.RefValue{Bool}, Base.RefValue{Bool}, Base.RefValue{Bool}}}})(ex::Expr)
   @ DataFramesMeta ./none:0
 [4] iterate(::Base.Generator{Vector{Any}, DataFramesMeta.var"#44#45"{NamedTuple{(Symbol("@byrow"), Symbol("@passmissing"), Symbol("@astable")), Tuple{Base.RefValue{Bool}, Base.RefValue{Bool}, Base.RefValue{Bool}}}}})
   @ Base ./generator.jl:47
 [5] rtransform_helper(::Symbol, ::Expr, ::Vararg{Expr})
   @ DataFramesMeta ~/.julia/packages/DataFramesMeta/MrIOy/src/macros.jl:1599
 [6] var"@rtransform"(__source__::LineNumberNode, __module__::Module, x::Any, args::Vararg{Any})
   @ DataFramesMeta ~/.julia/packages/DataFramesMeta/MrIOy/src/macros.jl:1638
 [7] #macroexpand#63
   @ ./expr.jl:119 [inlined]
 [8] top-level scope
   @ REPL[15]:1

So my core questions here are:

  1. Why does @macroexpand1 work and @macroexpand fail? In my mental model, the latter just repeatedly calls the former until converged. I know that macros are expanded inside out, but precedence should never matter here, since at first there is only my macro and then only the macro from DataFramesMeta.jl
  2. How do I work around this? (Maybe clearer with the answer to the question above)

I am probably overlooking something simple, but the @macroexpand(1) thing really throws me off :sweat_smile:

EDIT: Happens both under my 1.9.3 and a freshly downloaded 1.10.0

I’m not sure I really understand your code, to be honest. Maybe someone with a better understanding of Meta.quot can help out.

But I am curious why you are looking for a macro-based solution.

DataFramesMeta.jl is really a thin wrapper around the excellent and robust DataFrames.jl transformation API, where you see stuff like transform(df, src => fun => dest). Attempts to do very complicated stuff programmatically should use this framework.
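As a minimal sketch of what that looks like for several columns at once: the toy data frame and the column list `cols` below are made up for illustration, and the `src => fun => dest` pairs are built with an ordinary comprehension instead of a macro.

```julia
using DataFrames, Statistics

# Toy data, purely illustrative (not the data from the question).
df = DataFrame(A = [1.0, 2.0, 3.0], B = [4.0, 5.0, 6.0])

# One `src => fun => dest` pair per column; `cols` is a hypothetical list.
cols = [:A, :B]
pairs = [c => mean => Symbol(c, "_mean") for c in cols]

# Splat the pairs into a single transform call.
out = transform(df, pairs...)
```

Because `mean` returns a scalar here, `transform` repeats it for every row of the new column.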

Or is the issue that you want to write A instead of :A? If that’s the case, I would suggest you learn to love :A instead of A for its clarity in distinguishing what’s a column in a data frame versus a local variable.


You are probably right that the easiest solution should perhaps target DataFrames.jl directly. When I started with DataFrames I rather skipped learning much of its syntax and directly went to DataFramesMeta.jl because I find its syntax much more convenient :smiley:

I went for the macro-based solution since I knew exactly what output code I wanted and it really is just filling in a very simple template. So I wanted to save myself some copy-pasting. I think for a one-shot purpose a small macro is not unreasonable - at least in Common Lisp it would be totally acceptable :laughing: (and no, I don’t care about : :wink: actually, the more complete and complex version of this MWE takes keywords instead of naked symbols)

This here would work for my purposes :slight_smile:

macro mytrafo2(df, field)
	avg_clause = :($(Meta.quot(field)) => ByRow(x->mean(x;dims=(2,3))) => $(Meta.quot(Symbol(field,"_mean"))))
	var_clause = :($(Meta.quot(field)) => ByRow(x->var(x;dims=(2,3))) => $(Meta.quot(Symbol(field,"_var"))))
	return quote
		transform($df, $avg_clause, $var_clause)
	end |> esc
end

I would see how far you can get using $ to escape variables in DataFramesMeta.jl. I spent some time trying to mess with Meta.quot and having a macro call @transform and I didn’t get anywhere. It’s not recommended.

I think you might be stuck in lisp-mode a little bit. Working with expressions is fine and all in Julia, but it’s no substitute for passing functions around. I would strongly recommend this instead.

julia> function fun(df, x)
           @transform df $"$(x)_mean" = mean($x) 
       end;

The outer $ is DataFramesMeta.jl-specific interpolation. The inner $ is normal Julia string interpolation.

Or the vanilla data frames version instead

julia> function fun(df, x)
           transform(df, x => mean => "$(x)_mean")
       end;

Notice that both of these are functions, not macros. This is good.


Also, are you sure you want mean to be working with ByRow? That would only work if each element of :A is itself a collection, like a vector.
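To make the point concrete, here is a small sketch with made-up data in which every cell of :A holds a whole matrix, so ByRow hands `mean` one matrix at a time. The values are illustrative only.

```julia
using DataFrames, Statistics

# Each cell of :A is a 2x3 matrix (toy values, not the data from the question).
df = DataFrame(A = [[1.0 2.0 3.0; 4.0 5.0 6.0],
                    [10.0 20.0 30.0; 40.0 50.0 60.0]])

# ByRow applies `mean` to one cell (one matrix) at a time,
# giving one scalar per row of the data frame.
out = transform(df, :A => ByRow(mean) => :A_mean)
```

If the cells were plain numbers instead, `ByRow(mean)` would just return each number unchanged, which is rarely what is wanted.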


Ah, very interesting. I missed that DataFramesMeta.jl has some interpolation capacity, and also that I can address columns by their names as Strings. With this I can at least do the analysis iteratively per field, which should be fine. My datasets are not so huge that performance is a problem, so no point in worrying :slight_smile:

Yes I have matrices in my DataFrame :sweat_smile: (I simulate quantum systems and get a time trace for each set of parameters)

You probably have a point there :slight_smile: I thought I understood how Julia’s evaluation works well enough :sweat_smile:

Thanks a lot, that solves my problem nicely from an engineering point of view. Now it’s just the academic in me left wondering why the macro did not work and why @macroexpand1 makes a difference…
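One possible lead on that, not confirmed anywhere in this thread: `Meta.quot(x)` builds `Expr(:quote, x)`, whereas the parser represents a literal like `:A_mean` as a `QuoteNode`. That much can be checked in plain Julia, as sketched below. It is only an assumption that DataFramesMeta.jl accepts just the `QuoteNode` form on the left-hand side, and that the non-recursive @macroexpand1 path happens to normalize one form into the other before `eval` sees it.

```julia
# What the macro in the question builds for the left-hand side:
built = Meta.quot(:A_mean)

# What the parser builds when the same text is typed or pasted:
parsed = Meta.parse(":A_mean")

println(typeof(built))   # type of the macro-built node
println(typeof(parsed))  # type of the parser-built node

# Building the parser's form directly inside a template
# (assumption: this is the form DataFramesMeta.jl expects):
field = :A
lhs = QuoteNode(Symbol(field, "_mean"))
clause = :($lhs = mean($(QuoteNode(field))))
```

If the assumption holds, swapping `Meta.quot` for `QuoteNode` in @mytrafo would be the change to try first.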

Using Strings or anything else should not impact performance. DataFramesMeta.jl should have excellent performance, since everything parses down to normal Julia functions and uses function barriers for inference. There should be no performance considerations beyond the actual meat of your code.
