Using variables from call space in macro return execution

TL;DR: I have a macro in an external module. How do I enable the expression it returns to see variables declared in the scope in which the macro is called? Background at the end if you are interested.

I have this module

module test_module
export @test_macro

macro test_macro(ex)
    return :(x -> $ex)
end
end

Basically, the macro returns a function corresponding to the input expression. If I then have this code:

using test_module
f(x) = 2x
a = 2

f1 = @test_macro(2x)
f2 = @test_macro(f(x))
f3 = @test_macro(a*x)
println(f1(2))
println(f2(2))
println(f3(2))

I would expect all three to print the value 4. However, only the first does; the other two give me the messages UndefVarError: f not defined and UndefVarError: a not defined. How should I write the macro if I want it to use code from the scope where I am calling it?

If I instead define the macro in the same scope it works fine:

macro test_macro(ex)
    return :(x -> $ex)
end
f(x) = 2x
a = 2

f1 = @test_macro(2x)
f2 = @test_macro(f(x))
f3 = @test_macro(a*x)
println(f1(2))
println(f2(2))
println(f3(2))

writes

4
4
4

, just as I would expect.

Background:
I have a program which lets me input chemical reaction networks and then returns systems of differential equations to be used for simulating the network. After some initial interpretation of the input, it returns:

return :(network_model($f, $g, $jumps))

(network_model is a structure containing the functions f and g as well as a tuple of constant rate jumps.)
When I define the reaction system, some of the reaction rates might be written as k_d1 etc. However, when the macro returns the functions, they cannot see the values of variables like k_d1. If, rather than a macro, I have a function which returns f, g and jumps as expressions, and I then evaluate those expressions using eval, they can include variables like k_d1. But there should be a way to do this in the macro without using a function and eval?

This is related to macro hygiene.

In your case, to get the answer you expect you should write the macro as follows:

macro test_macro(ex)
    return esc(:(x -> $ex))
end

You will have to be more careful when escaping expressions in a macro like this. I would carefully read the section on metaprogramming in the Julia manual, which I linked to above; it is well explained there.

Don’t esc the whole thing. That won’t do the right thing if ex is :x. Use esc on ex only instead.

If you escape only on ex, for the first example (f1 = @test_macro(2x)) you get this result

ERROR: LoadError: UndefVarError: x not defined

The @test_macro is fundamentally unhygienic. It’s also pretty much useless, and the real macro shouldn’t do this.
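To make the tension between the two suggestions concrete, here is a minimal sketch of a variant in which the caller names the function argument explicitly. The names ArgDemo and @fn_of are hypothetical and not from the thread. Since every identifier in the result then comes from the caller, escaping the whole lambda once does not leak any name introduced by the macro itself.

```julia
module ArgDemo
export @fn_of

# Hypothetical variant of @test_macro: the caller supplies the argument name,
# so the macro introduces no identifiers of its own.
macro fn_of(arg, body)
    arg isa Symbol || error("Expected a variable name as the first argument")
    # Both pieces are user input, so the whole lambda is escaped exactly once.
    return esc(:($arg -> $body))
end
end

using .ArgDemo

f(x) = 2x
a = 2

f1 = @fn_of x 2x
f2 = @fn_of x f(x)
f3 = @fn_of y a * y

println(f1(2))
println(f2(2))
println(f3(2))
```

All three calls return 4 here, because f and a are resolved where the macro is called and the argument name is one the caller chose.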

Fair enough. I was simply trying to answer the question as posed.

It probably isn’t the best use of a macro, but I have run into problems that could have been simplified into something very similar.

I have read that section many, many times, and I still don’t have a good mental model of macro hygiene in Julia. And I have programmed for years in Common Lisp, and used macros heavily. But I still run into surprises. I have seen many experienced Julia programmers also struggle with hygiene bugs (which I occasionally discover in their packages).

I mostly get it “right” (= until I find bugs) by trial and error. I have a vague idea that esc somehow protects from the rewriter mechanism, but it is unclear how to apply it in some cases, it seems to prefer specific parts of expressions.

A detailed tutorial to the topic is sorely needed. The best book on Common Lisp has two chapters on macros, even though CL macros are conceptually much simpler than Julia macros because you just juggle S-expressions, and the language does not try to do hygiene for you.

I still run into surprises. I have seen many experienced Julia programmers also struggle with hygiene bugs (which I occasionally discover in their packages).

Aye. There is some “magic” to esc. The issue I mentioned in my reply above is an example where I definitely struggled in spite of thinking I understood hygiene. So perhaps my statement about it being well explained was wrong. That said, in many simple applications of macros, I have found the manual to be really helpful.

A detailed tutorial to the topic is sorely needed.

Agreed!

When I read about macros and hygiene, I did not find anything I could use for this problem.
My take was that I could use local when declaring variables in the returned expression (to be seen in the call scope).
However, it always took the form

local variable_name = variable value

and since it was not creating new variables in the local scope, but rather using existing ones, I did not understand how to use it.
I tried

return quote
   local :(network_model($f, $g, $jumps))
end

however this yielded error syntax: invalid syntax in "local" declaration
just as

return quote
   local output =  :(network_model($f, $g, $jumps))
end

returned UndefVarError: f not defined

In a similar way I was confused by esc. It was initially introduced by saying “Therefore we must arrange for the code in ex to be resolved in the macro call environment”, which I thought was the exact opposite of what I wanted. I rechecked it now, and you are right that they also show esc being used in a pure return statement, although the text is too short to fully understand what is going on.

Apply it to all user input once and only once.

Gaussia, I don’t understand your goal very well. Can you give a concrete example, i.e. one specific chemical equation you would want a user of your macro to input, and the expected behavior of the macro with respect to that equation?

Yes, but consider

module Test
using MacroTools
macro foo(ex)
    @capture(ex, f_(args__)) || error("Expected a function call")
    quote
        bar($(esc(args)...))
    end
end
bar(x...) = sum(x)              # inane example
end

Then

julia> macroexpand(:(Test.@foo g(a, b, c)))
:($(Expr(:error, MethodError(start, (:($(Expr(:escape, Any[:a, :b, :c]))),), 0x00000000000055a2))))

I now know (suspect?) that the right way is

bar($(map(esc, args)...))

but there must be more to it than your maxim above. Which is not in the manual. And neither is what I learned above.
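A self-contained sketch of the map(esc, args) form, under a few assumptions: the module name HygieneDemo is hypothetical, and plain Expr inspection stands in for MacroTools' @capture so that the example needs no packages. It illustrates that esc wraps one expression at a time, so a list of arguments is escaped element by element, while bar is left unescaped and therefore resolves inside the module that defines the macro.

```julia
module HygieneDemo
export @foo

bar(x...) = sum(x)   # module-internal helper, as in the example above

macro foo(ex)
    # Plain Expr inspection is used here instead of MacroTools' @capture.
    (ex isa Expr && ex.head === :call) || error("Expected a function call")
    args = ex.args[2:end]
    # esc wraps a single expression, so each argument is escaped separately.
    # bar is not escaped, so it refers to HygieneDemo.bar.
    return :(bar($(map(esc, args)...)))
end
end

using .HygieneDemo

a, b, c = 1, 2, 3
result = @foo g(a, b, c)   # g is never called; only its arguments are used
println(result)
```

Here result is 6: a, b and c are looked up where the macro is called, even though bar is not exported from the module.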

Input:

using reaction_reader
a = 2.0
hill(x,n,v,K) = v*(x^n)/(K^n+x^n)
nm = @read_network begin
    (a,hill(XY,2,3,1)), X + Y ⟷ XY
end

This is a very simple reaction network where X and Y associate to XY at the rate a(*[Y]), and XY dissociates to X and Y. This dissociation is activated by XY itself (modelled by a Hill function; it will also be proportional to its own concentration).

the macro looks something like this:

module reaction_reader
using DifferentialEquations
export @read_network
export network_model

struct network_model
    f::Function
    g::Function
    jumps::Tuple{ConstantRateJump,Vararg{ConstantRateJump}}
end

macro read_network(ex::Expr)
    reactions = get_reactions(ex)
    reactants = get_reactants(reactions)

    f = recursive_equify!(get_f(reactions, reactants), reactants)::Expr
    g = recursive_equify!(get_g(reactions, reactants), reactants)::Expr
    jumps = recursive_equify!(get_jumps(reactions, reactants), reactants)::Expr
    return :(network_model($f, $g, $jumps))
end

# (helper functions such as get_reactions and recursive_equify! omitted)
end

Here f, g and jumps are expressions describing what I need for making deterministic, stochastic and Gillespie simulations of the system, respectively (to be used like: prob = ODEProblem(nm.f,[10.0,5.0,3.0],(0.0,5.0))). If I evaluate them, they are anonymous functions.

If I look at e.g. the expression f, it would look something like:

f = :((t, u, du) -> begin
    du[1] = -(a * u[1] * u[2]) + hill(u[3], 2, 3, 1) * u[3]
    du[2] = -(a * u[1] * u[2]) + hill(u[3], 2, 3, 1) * u[3]
    du[3] = a * u[1] * u[2] + -(hill(u[3], 2, 3, 1) * u[3])
end)

(The expression I actually generate is valid: I can fetch f as an expression via a function, write f_fun = eval(f), and then use f_fun as I want and expect.)

However what happens now is that when I do

using reaction_reader
using DifferentialEquations
a = 2.0
hill(x,n,v,K) = v*(x^n)/(K^n+x^n)
nm = @read_network begin
    (a,hill(XY,2,3,1)), X + Y ⟷ XY
end

prob = ODEProblem(nm.f,[10.0,1.0,1.0],(0.0,5.0))
sol = solve(prob,reltol=1e-6,save_everystep=false)

I get the error UndefVarError: a not defined when I try to solve the equation.

I’m still a little lost. In this code

using reaction_reader
a = 2.0
hill(x,n,v,K) = v*(x^n)/(K^n+x^n)
nm = @read_network begin
    (a,hill(XY,2,3,1)), X + Y ⟷ XY
end

What do you want the value of nm to be? If you had to create a network_model by hand (without the macro) how would you write that code? Does it work? (Make sure it does work before trying to fix your macro). (Also, FYI, normally you would use NetworkModel, as that clearly marks the object as a type, while network_model would be a function or variable).

Oh! I think I just figured it out:

So, to simplify: you want your macro to generate a function f(x) = a * otherstuff, where a is a global variable. Right?

Hygiene keeps the names in the expression a macro returns from colliding with names at the call site: variables that the expression assigns are renamed to something unique, like ##a#blablabla, and variables that it only reads, like a here, are looked up as globals in the module where the macro was defined rather than where it was called. This avoids unintentional variable capture (e.g. a macro uses an internally relevant quantity x, and the surrounding context of the macro call just so happens to have another, unrelated variable named x). However, you want the expression to capture a from the surrounding context. So you need to escape a. In essence, anywhere you use $a in your macro, it should be replaced with $(esc(a)). Things can get very complicated with variable capture, as Tamas_Papp is showing above. It’s possible you may want to find a solution that doesn’t involve a macro, and make your life simpler.

A warning: global variables are slow. You ultimately will probably want to wrap the call to @read_network in a function so that a can be passed as an argument within that function and then used within the macro. This would avoid the slowness, and danger, of globals.
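A minimal sketch of both points, assuming a hypothetical macro @rate_fun in a hypothetical module RateDemo (neither is from the original code). The user-supplied rate is escaped so it resolves at the call site, while the argument u, which the macro introduces itself, stays hygienic. Wrapping the macro call in a function lets the rate be a local argument instead of a global.

```julia
module RateDemo
export @rate_fun

macro rate_fun(rate)
    # u is introduced by the macro, so hygiene renames it to a unique name.
    # rate is user input, so it is escaped and resolved where the macro is called.
    return :(u -> $(esc(rate)) * u)
end
end

using .RateDemo

# The escaped rate refers to a global at the call site.
k_d1 = 2.0
decay_global = @rate_fun k_d1

# Inside a function, the escaped rate refers to the local argument k.
make_decay(k) = @rate_fun k
decay_local = make_decay(2.0)

println(decay_global(3.0))
println(decay_local(3.0))
```

Both closures return 6.0 for the input 3.0. The second one captures k as a local variable, so it does not depend on the global k_d1.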

So the (very) simple alternative would be this: rather than, as now, having a macro generate and execute the line

:(network_model($f, $g, $jumps))

(as after the return in the macro)
I could have a function that generates and returns the expression

expr = :(network_model($f, $g, $jumps))

then I could use eval to get my model structure as

nm = eval(expr)

or even simpler:

nm = eval(read_network(reaction syntax))

Now I am slightly unsure of the “goodness” of using eval. I think it would most certainly give me what I want, with minimal change in my code, but would it avoid the slowness that you warn me about? (I think there is something about eval executing in the global scope?)
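A small sketch of what eval does with such an expression, using a hypothetical build_decay function as a stand-in for the real expression builder. eval evaluates at the global scope of the module, so the generated function sees global variables such as k_d1 (and still reads them as globals), but it does not see local variables of the function that calls eval.

```julia
# Hypothetical stand-in for a function that builds the model expression.
build_decay(rate::Symbol) = :(u -> $rate * u)

k_d1 = 2.0
decay = eval(build_decay(:k_d1))   # evaluated at module (global) scope
println(decay(3.0))

# eval does not see local variables of the function that calls it.
function eval_misses_local()
    k_local = 5.0
    try
        eval(:k_local)             # looked up as a global, which does not exist
        return false
    catch err
        return err isa UndefVarError
    end
end

println(eval_misses_local())
```

So the eval route works for rates defined as globals, but it would not pick up rates that exist only as local variables, which is one thing the escaped macro version can do.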

If you want to use a macro, I believe what you need is to escape the function body, e.g. f = esc(recursive_equify!(get_f(.... But that might lead to problematic capture of variables you don’t want (i.e. if you use any variables internally in the code you generate for f).

This is obviously wrong and is not what I said at all.

I guess I should be more explicit and say: escape all parts of user input once and only once before splicing them into expressions. Subexpressions that do not need processing can be escaped as a whole.

Yes. Just to be clear, my problem is not with what you said, but that macro hygiene in Julia depends on implicit rules which are underdocumented, or not documented at all.

People who worked on macroexpand.scm (which of course includes you) know the rules and may find them obvious; others discover them by trial and error and by asking on the forum. At some point, detailed documentation with lots of examples would be great.