Applying a macro to a variable

I created a macro to generate a simple sequence:

macro seq(n::Int64) # n is degree
    f = []
    I = Vector{Int}()
    if n < 0
        print("The input must be equal to or bigger than zero.") 
    else  
        for i in 0:n
            push!(I, i)
        end
    end
    I
end

I can thus execute

@seq 5

and it will return [0, 1, 2, 3, 4, 5]. However, I want to apply the macro to a variable:

   v = 5
   @seq v

This does not work. I read several explanations, but I still do not understand. Any help is appreciated. Thanks.

In your example, a macro does not seem suitable: it should be a function instead. The purpose of a macro is not to operate on variables, but to analyze the expressions that are passed as arguments and replace them with another (usually more complicated) expression based on them.

The problem that you see happens because @seq 5 and @seq v are not evaluated at runtime, but at parse time. That means that the 5 or v that go right after @seq are not taken as the objects "meant" by 5 or v at runtime, but as the expressions :5 and :v, respectively. In the first case that's the number 5 (for which everything looks as if it worked fine), and in the latter it's the symbol :v (where the comparison n < 0 fails).
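To see this concretely, here is a tiny sketch (with a hypothetical @show_arg macro, not part of your code) that prints what a macro actually receives:

```julia
# Hypothetical demonstration macro: it prints the *expression* it was
# handed at expansion time, then expands to `nothing`.
macro show_arg(x)
    println("macro saw: ", repr(x), " (a ", typeof(x), ")")
    return nothing
end

@show_arg 5   # macro saw: 5 (a Int64)
@show_arg v   # macro saw: :v (a Symbol), even if `v` is undefined
```

The second call works even when `v` has never been assigned, which shows that the macro never looks at the variable's value at all.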


@seq v sees v as a Symbol and not as an Integer, thus, your macro definition does not apply.

Also: because macros are evaluated at compile time, you cannot make use of the value assigned to the variable before the macro call, because it could be anything by the time the code actually runs (of course, in this case it is predictable, but in general it might not be).

If you want to make it work you will have to write the macro such that it returns the expression
[ i for i in 0:v ]. Note that this will then be compiled and evaluated at runtime.
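A minimal sketch of such a macro (hypothetical name @seq2, assuming the same intent as the original @seq) escapes the argument and returns the comprehension as an expression:

```julia
# Sketch: instead of computing the sequence at parse time, return an
# expression that computes it at runtime, when `v` has a value.
macro seq2(n)
    return esc(:( [i for i in 0:$n] ))
end

v = 5
@seq2 v   # expands to [i for i in 0:v] and yields [0, 1, 2, 3, 4, 5]
```

The esc is what lets `v` refer to the caller's variable rather than a hygienized name.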


Here is what I am after, and this works:

macro IJ(n::Symbol) # n is degree
    return esc(quote
        f = []
        I = Vector{Int}()
        J = Vector{Int}()
        if $n < 0
            print("The input must be equal to or bigger than zero.") 
        else  
            for i in 0:$n
            for j in 0:$n
                if i+j <= $n
                    push!(I, i)
                    push!(J, j)
                end
            end
            end
        end
        (I, J)
    end)
end

The reason I need this is that I am using Zygote with Tullio.jl to differentiate polynomials dynamically evaluated at many points, and the degree is not known at compile time. Furthermore, the polynomial is multi-dimensional. I wish to create multi-indexed sequences that tell me precisely which terms of the polynomial should be included. Making things more complicated, mutation is not allowed. There is no way to construct my sequences with a function without using mutation, from what I can ascertain.

The following works:

function test()
    n = 5
    I, J = @IJ n
    println("I: ", I)
    println("J: ", J)
end

test()

I know little about Zygote and Tullio, so I don't know if such a macro is the suitable solution for the problem. In any case, let me give some suggestions about that particular macro:

  • The array f = [] does not seem to do anything, so perhaps that line should be removed.
  • Escaping the whole expression returned by a macro has some dangers, because the symbols defined in it (I, J, and the seemingly superfluous f) will be visible in the calling scope, and that can create perplexing situations, e.g.:
using LinearAlgebra

function test()
    n = 5
    X, Y = @IJ n
    @IJ n
    # that's ok:
    println("X: ", X)
    println("Y: ", Y)
    # now let's use the identity matrix from `LinearAlgebra` (`I`):
    println(I * [1,2,3])
end
julia> test()
X: [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5]
Y: [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 0, 1, 2, 3, 0, 1, 2, 0, 1, 0]
ERROR: MethodError: no method matching *(::Vector{Int64}, ::Vector{Int64})

This is a safer solution, escaping only the symbol that is introduced as argument:

macro IJ(n) # n is degree
    n = esc(n)  # <- `n` won't be replaced by an obfuscated name
    return quote
        I = Vector{Int}() # <- `I` will be obfuscated in the code made by the macro
        J = Vector{Int}() # <- Same for `J`
        if $n < 0
            print("The input must be equal to or bigger than zero.") 
        else  
            for i in 0:$n
            for j in 0:$n
                if i+j <= $n
                    push!(I, i)
                    push!(J, j)
                end
            end
            end
        end
        I, J
    end
end
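With this version, the earlier LinearAlgebra example behaves as expected, since I and J no longer leak into the calling scope. A sketch (hypothetical test2, assuming the safer @IJ above is defined):

```julia
# Sketch: the hygienic macro no longer shadows `I` from LinearAlgebra.
using LinearAlgebra

function test2()
    n = 5
    X, Y = @IJ n
    println("X: ", X)
    # `I` is still the UniformScaling identity, not a leaked vector:
    println(I * [1, 2, 3])   # prints [1, 2, 3]
end
```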

Thank you! The f = [] was a remnant from a macro I copied. The suggestions regarding esc and quote are very useful.

Why not just use a normal function? Nothing's actually being computed in the macro; there's no actual expression manipulation happening. Might as well just use a function.

julia> @inline function IJ(n) # n is degree
           I, J = Int[], Int[]
           @assert n ≥ 0 "The input must be equal to or greater than zero."
           for i = 0:n, j = 0:n
               if i+j ≤ n
                   push!(I, i)
                   push!(J, j)
               end
           end
           return I, J
       end
IJ (generic function with 1 method)

julia> @btime (n -> @IJ n)(5);
  231.390 ns (6 allocations: 960 bytes)

julia> @btime (n -> IJ(n))(5);
  227.609 ns (6 allocations: 960 bytes)

julia> let n=5; IJ(n) == @IJ(n) end
true

There are of course optimizations due to the structure of the problem. But no point in using a macro where a macro doesn't help.

A macro is necessary because the push! command implies mutability, which is not allowed by Zygote. Macros were not my first choice.

I don't know Zygote, so I can't say whether this avoids your mutability problem, but would a comprehension be allowed?

I = [i for j = 0:n, i = 0:n if i+j <= n]
J = [j for j = 0:n, i = 0:n if i+j <= n]

This specific i+j <= n condition can be done without an explicit conditional:

I = [i for i = 0:n for j = 0:n-i]
J = [j for i = 0:n for j = 0:n-i]

These still won't work for macros but might work within a function.


We can also construct Tuples, which can be faster:

julia> IJ(n) = [i for i = 0:n for j = 0:n-i], [j for i = 0:n for j = 0:n-i]
IJ (generic function with 1 method)

julia> @btime IJ(5);
  257.353 ns (8 allocations: 1.06 KiB)

julia> IJ2(n) = ((i for i = 0:n for j = 0:n-i)...,), ((j for i = 0:n for j = 0:n-i)...,)
IJ2 (generic function with 1 method)

julia> @btime IJ2(5);
  4.300 ns (0 allocations: 0 bytes)

though you have to avoid the Tuple constructor because it's type-unstable:

julia> IJ3(n) = Tuple(i for i = 0:n for j = 0:n-i), Tuple(j for i = 0:n for j = 0:n-i)
IJ3 (generic function with 1 method)

julia> @btime IJ3(5);
  1.900 μs (11 allocations: 1.77 KiB)

The reason this is fast is that the 5 is constant-propagated, so the compiler knows the exact size of the tuple at compile time. If you properly interpolate into the benchmark as a runtime value, it's much worse:

julia> @btime IJ2($5);
  2.611 μs (91 allocations: 5.81 KiB)

and will leave further type instability in its wake.

It's not that the Tuple constructor itself is to blame; rather, it appears to be blocking the constant propagation. This is probably because IJ3 is subject to less inlining than IJ2.


I think a MWE would be really helpful here. Maybe I'm just lacking imagination, but I can't quite figure out why the code that generates indices needs to be differentiated in the first place.

Furthermore, the proposed macros (except for the one in the opening post) are equivalent to inlined functions.

For example
julia> f() = (@IJ 3)
f (generic function with 1 method)

julia> f()
([0, 0, 0, 0, 1, 1, 1, 2, 2, 3], [0, 1, 2, 3, 0, 1, 2, 0, 1, 0])

julia> @code_lowered f()
CodeInfo(
1 ──       Core.NewvarNode(:(@_2))
β”‚    %2  = Core.apply_type(Main.Vector, Main.Int)
β”‚          I = (%2)()
β”‚    %4  = Core.apply_type(Main.Vector, Main.Int)
β”‚          J = (%4)()
β”‚    %6  = 3 < 0
└───       goto #3 if not %6
2 ──       Main.print("The input must be equal to or bigger than zero.")
└───       goto #11
3 ── %10 = 0:3
β”‚          @_2 = Base.iterate(%10)
β”‚    %12 = @_2 === nothing
β”‚    %13 = Base.not_int(%12)
└───       goto #11 if not %13
4 ┄─ %15 = @_2
β”‚          i = Core.getfield(%15, 1)
β”‚    %17 = Core.getfield(%15, 2)
β”‚    %18 = 0:3
β”‚          @_5 = Base.iterate(%18)
β”‚    %20 = @_5 === nothing
β”‚    %21 = Base.not_int(%20)
└───       goto #9 if not %21
5 ┄─ %23 = @_5
β”‚          j = Core.getfield(%23, 1)
β”‚    %25 = Core.getfield(%23, 2)
β”‚    %26 = i + j
β”‚    %27 = %26 <= 3
└───       goto #7 if not %27
6 ──       Main.push!(I, i)
└───       Main.push!(J, j)
7 ┄─       @_5 = Base.iterate(%18, %25)
β”‚    %32 = @_5 === nothing
β”‚    %33 = Base.not_int(%32)
└───       goto #9 if not %33
8 ──       goto #5
9 ┄─       @_2 = Base.iterate(%10, %17)
β”‚    %37 = @_2 === nothing
β”‚    %38 = Base.not_int(%37)
└───       goto #11 if not %38
10 ─       goto #4
11 β”„ %41 = Core.tuple(I, J)
└───       return %41
)

The logic to construct I,J is still present. All the macro does is wrap the whole machinery in an expression and return it. You could have equivalently copy-pasted that code into test.
If n is dynamic, i.e. its value only known at run-time, there is no way to make this work with a macro, which is expanded and evaluated at parse time.

However, a generated function can work. To quote from the manual

While macros work with expressions at parse time and cannot access the types of their inputs, a generated function gets expanded at a time when the types of the arguments are known, but the function is not yet compiled.

A function to generate I, J:

function IJ(n) # n is degree
    I = Vector{Int}()
    J = Vector{Int}()
    if n < 0
        print("The input must be equal to or bigger than zero.")
    else
        for i in 0:n
        for j in 0:n
            if i+j <= n
                push!(I, i)
                push!(J, j)
            end
        end
        end
    end
    return (I, J)
end

@generated function fgen(::Val{n}) where n
    ij = IJ(n)
    return :( $ij )
end

julia> fgen(Val(3))
([0, 0, 0, 0, 1, 1, 1, 2, 2, 3], [0, 1, 2, 3, 0, 1, 2, 0, 1, 0])

julia> @code_lowered fgen(Val(3))
CodeInfo(
    @ REPL[45]:1 within `fgen`
   β”Œ @ REPL[45] within `macro expansion`
1 ─│ %1 = Core.tuple([0, 0, 0, 0, 1, 1, 1, 2, 2, 3], [0, 1, 2, 3, 0, 1, 2, 0, 1, 0])
└──│      return %1
   β””
)

The downside is that n needs to be wrapped in a value type, and excess code generation and compilation may occur when the function is called for many different values of n.

This feels like hacking around the actual issue though.


Let me restate the original issue that led to my original post. I am interested in constructing a polynomial basis (which could be generalized with additional basis functions) that is multi-dimensional and auto-differentiable using Zygote. Of course, I did not start off with macros, but with an explicit polynomial construction. In two dimensions, I'd be interested in specifying the indices (at run time)

(1,3), (5,3), (42,37)

and constructing the polynomial:

P(x,y) = a x y^3 + b x^5 y^3 + c x^42 y^37

which I could then differentiate with Zygote:

gradP = Zygote.gradient((x,y) -> P(x,y), .3, .4)

I can do this for specific cases, but I was looking for a general solution.

Thanks.
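For a concrete (if naive) sketch of the kind of general construction meant here, one could build the polynomial from runtime index lists with a closure and no macros; make_poly is a hypothetical name, and this deliberately ignores the Zygote/mutation and performance concerns discussed above:

```julia
# Hypothetical sketch: build P(x, y) from runtime index lists I, J and
# coefficients, with no macros involved.
function make_poly(I, J)
    P(coef, x, y) = sum(coef[k] * x^I[k] * y^J[k] for k in eachindex(coef))
    return P
end

P = make_poly([1, 5, 42], [3, 3, 37])
P([2.0, 3.0, 4.0], 1.0, 1.0)   # 2*1*1 + 3*1*1 + 4*1*1 = 9.0
```

A generator inside sum avoids explicit push! mutation, which is the obstacle the thread keeps running into.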


@skleinbo, here is the requested MWE. If you can make this work without macros, I would greatly appreciate it. I would like to specify the characteristics of my polynomial expansion at run time. For example, I'd like to make 3 runs (without recompilation) by placing sets of indices in an input file.
I also would like to tell Zygote (@ChrisRackauckas) not to differentiate with respect to the arrays I and J to save cycles. Of course, I would rather not write my own differentiation rules. For now, I have a working solution, but this MWE is presented as a challenge for all the experts out there! :slight_smile:

using Tullio
using ForwardDiff
using Zygote

# My preference is not to use any macros, such as @Tullio

function generate_polynomial_2D(degree)
    I = (2, 1, 4)   # Ideally, I would have several sets of these two lists
                    # of indices in an input file read by my program.
                    # I wish to avoid any recompilation between runs.
    J = (5, 3, 2)
    Poly(coef, x, y) = @tullio poly[i] := coef[j] * x[i]^I[j] * y[i]^J[j] grad=Dual nograd=x nograd=y nograd=I nograd=J

    return Poly
end

function test_polynomial_generation_2d()
    N = 100
    x = rand(N)
    y = rand(N)
    nb_terms = 3
    coef = rand(nb_terms)
    degree = 4   # placeholder; generate_polynomial_2D does not use it yet
    poly = generate_polynomial_2D(degree)
    loss = (coef, x, y) -> sum(poly(coef, x, y).^2)
    result = Zygote.gradient(loss, coef, x, y)    # <<<< THIS MUST WORK
    return result
end


result = test_polynomial_generation_2d();

Maybe I'm still misunderstanding what it is you want, but the following gets the gradient of five randomly generated polynomials. Only the first call results in compilation. It's basically identical to your MWE. Reading the coefficients from a file should be trivial.

using Tullio
using ForwardDiff
using Zygote

# My preference is not to use any macros, such as @Tullio
function generate_polynomial_2D(IJ)
    I, J = IJ
    Poly(coef, x, y) = @tullio poly[i] := coef[j] * x[i]^I[j] * y[i]^J[j] nograd=I nograd=J
    return Poly
end

function test_polynomial_generation_2d(IJ)
    N = 100
    x = rand(N)
    y = rand(N)
    nb_terms = length(IJ[1])
    coef = rand(nb_terms)
    poly = generate_polynomial_2D(IJ)
    loss = (coef, x, y) -> sum(poly(coef, x, y).^2)
    result = Zygote.gradient(loss, coef, x, y)    # <<<< THIS MUST WORK
    return result
end

for _ in 1:5
    terms = rand(1:100)
    @time test_polynomial_generation_2d((rand(0:10,terms),rand(0:10,terms)))
end

  0.047186 seconds (1.73 k allocations: 86.414 KiB, 97.11% compilation time)
  0.001814 seconds (37 allocations: 11.406 KiB)
  0.000715 seconds (37 allocations: 9.406 KiB)
  0.001958 seconds (37 allocations: 11.406 KiB)
  0.000913 seconds (37 allocations: 9.531 KiB)

I don't know why you are reluctant to use Tullio here. It seems perfect for your use case. It evaluates multi-threaded and, more importantly, performs symbolic differentiation when it can.

You could write your loss function in Tullio too.

function tullio_loss(coef, I, J, x, y)
     @tullio poly[i] := coef[j] * x[i]^I[j] * y[i]^J[j] nograd=I nograd=J
     @tullio loss := poly[i]^2
     return loss
end

for _ in 1:5
    terms = rand(1:100)
    I, J = rand(0:10,terms), rand(0:10,terms)
    x, y = rand(100), rand(100)
    coeff = rand(terms)
    @time Zygote.gradient(tullio_loss, coeff, I, J, x,y)
end

  1.345314 seconds (987.79 k allocations: 49.346 MiB, 18.11% gc time, 99.79% compilation time: 25% of which was recompilation)
  0.000970 seconds (26 allocations: 4.969 KiB)
  0.000611 seconds (26 allocations: 4.812 KiB)
  0.000535 seconds (26 allocations: 4.797 KiB)
  0.001832 seconds (26 allocations: 5.328 KiB)