Using gensym properly

Using ReactiveBasics.jl, I have been using signals for event-driven time series modeling. I find that a reactive style is intuitive and improves the clarity of my code. My question, though, is how to go about using gensym properly.

f = Signal(n)

Now, I want to build a function and/or macro that will let me apply a specific transformation to an arbitrary Signal. In order to get the median value of the last 5 values of the Signal, I need to create an array that exists separately from the signal:

macro medWin(sig, arr, win = 5)
    quote
        $arr = []
        flatmap($sig) do iv
            unshift!($arr, iv)
            length($arr) > win ? pop!($arr) : false
            median($arr) |> Signal
        end
    end
end

med_signal = @medWin(f, arbitrary_array_name, 5)

This works. However, I realized that I do not really care what “arbitrary_array_name” is, just that it is unique. I never reference it outside the function/macro “medWin”. This seems a clear case for gensym(). However, I cannot seem to get my quote notation, $ interpolations, and eval calls set up to make this work properly. Can someone help me see how to implement this more clearly? Alternatively, perhaps there is a better way to go about this task.

  1. The macro is wrong and won’t work on 0.6, since you must escape every user input exactly once. Note also that you didn’t interpolate win. The correct version of your macro is

    macro medWin(sig, arr, win = 5)
        quote
            $(esc(arr)) = []
            flatmap($(esc(sig))) do iv
                unshift!($(esc(arr)), iv)
                length($(esc(arr))) > $(esc(win)) ? pop!($(esc(arr))) : false
                median($(esc(arr))) |> Signal
            end
        end
    end
    
  2. Now, if you don’t need to access arbitrary_array_name in the parent scope, just use a normal variable: a variable that is not accessible to other code in the enclosing scope will automatically be created. (This is described in detail in the docs.) So simply

    macro medWin(sig, win = 5)
        quote
            arr = []
            flatmap($(esc(sig))) do iv
                unshift!(arr, iv)
                length(arr) > $(esc(win)) ? pop!(arr) : false
                median(arr) |> Signal
            end
        end
    end
    

    will work.

  3. If you really want to do it yourself, which is not recommended since it messes up debug info (an error about the variable won’t show a readable name, since you force it to be a randomly generated one), you can do

    macro medWin(sig, win = 5)
        arr = gensym() # or `@gensym arr`
        quote
            $(esc(arr)) = []
            flatmap($(esc(sig))) do iv
                unshift!($(esc(arr)), iv)
                length($(esc(arr))) > $(esc(win)) ? pop!($(esc(arr))) : false
                median($(esc(arr))) |> Signal
            end
        end
    end
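To make the difference between options 2 and 3 concrete, here is a minimal sketch that does not depend on ReactiveBasics. It assumes Julia 1.x, and the macro names `@twice_sum_hygienic` and `@twice_sum_gensym` are hypothetical, chosen only for illustration. Both macros use a temporary named `tmp`, and neither one disturbs a variable called `tmp` in the caller.

```julia
# Sketch for Julia 1.x; both macro names are hypothetical.

# Option 2 style: `tmp` is assigned inside the quote and is not escaped,
# so macro hygiene renames it to a unique variable in the expansion.
macro twice_sum_hygienic(a, b)
    quote
        tmp = $(esc(a)) + $(esc(b))
        tmp * 2
    end
end

# Option 3 style: generate the unique name by hand and escape it.
macro twice_sum_gensym(a, b)
    tmp = gensym("tmp")              # a fresh Symbol, created at expansion time
    quote
        $(esc(tmp)) = $(esc(a)) + $(esc(b))
        $(esc(tmp)) * 2
    end
end

function demo()
    tmp = 100                        # the caller's own `tmp`
    r1 = @twice_sum_hygienic 1 2
    r2 = @twice_sum_gensym 1 2
    return (r1, r2, tmp)
end

@assert demo() == (6, 6, 100)            # neither macro touched the caller's `tmp`
@assert gensym("tmp") != gensym("tmp")   # every call yields a different symbol
```

Running `@macroexpand` on either call shows the generated name that replaces `tmp` in the expansion.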
    

Will option 2 above work, with every update to signal f pushed properly through the macro and arr always holding the last 5 values? I had thought that option 2 would mean that every time a new value is pushed to f, a new empty array “arr” would be created.

What do you mean by that?

That’s what all the versions above mean, your version included.

If you want an array generated at compile time (non-thread-safe and non-reentrant) that is shared across multiple runtime executions of the same macro expansion, you can splice in the array directly (i.e. do arr = [] in the macro and splice in this value, not the name). If you want some function-local sharing, then no, it’s impossible and you shouldn’t do it. Any state that’s not local to a single macro should be explicitly managed.
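The distinction between splicing in a value and assigning a variable inside the quote can be sketched without any signal library. This assumes Julia 1.x; `@shared_hits` and `@fresh_hits` are hypothetical names used only for illustration.

```julia
# Sketch for Julia 1.x; both macro names are hypothetical.

# The array is created once, while the macro is being expanded, and the
# array object itself (not a name) is spliced into the returned expression.
macro shared_hits()
    hits = Int[]
    quote
        push!($hits, 1)
        length($hits)
    end
end

# Here the array is created by the expanded code, so every execution of
# the expansion starts from a new empty array.
macro fresh_hits()
    quote
        hits = Int[]
        push!(hits, 1)
        length(hits)
    end
end

bump_shared() = @shared_hits()   # one expansion, executed on every call
bump_fresh()  = @fresh_hits()

@assert bump_shared() == 1
@assert bump_shared() == 2       # state persists between calls
@assert bump_fresh() == 1
@assert bump_fresh() == 1        # no state is kept
```

As noted above, the shared version is neither thread-safe nor reentrant, because every call to `bump_shared` mutates the same array.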

So, the options you presented all yield the behavior I am looking for; I have simply not expressed myself quite clearly. The macro “@medWin” is only invoked once, but the function inside the macro will be called and need access to “arr” every time an event is fired:

macro medWin(sig,win = 5)
    arr = []
    quote
        flatmap($sig) do iv
            unshift!($arr, iv)
            length($arr) > $win ? pop!($arr) : false
            median($arr) |> Signal
        end
    end
end

ff = Signal(0)
gg = @medWin(ff,3)

Now, we update signal ff and see how this changes our model:

for i=1:10
    push!(ff, i)
    println("ff is ", ff.value)
    println("gg is ", gg.value)
    println("-----")
end

Output:

ff is 1
gg is 0.5
-----
ff is 2
gg is 1.0
-----
ff is 3
gg is 2.0
-----
ff is 4
gg is 3.0
-----
ff is 5
gg is 4.0
-----
ff is 6
gg is 5.0
-----
ff is 7
gg is 6.0
-----
ff is 8
gg is 7.0
-----
ff is 9
gg is 8.0
-----
ff is 10
gg is 9.0
-----

This works as desired, with gg returning the median value of the most recent 3 values of ff. Additionally, arr is not defined in the global scope; as I understand it, it’s in something like a closure. Thank you for your help; I did not realize that I could access the scope within the macro in this manner.
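For comparison, the same rolling window can be kept in an ordinary closure, with no macro involved. This is only a sketch under assumptions: it targets Julia 1.x, where `unshift!` has been renamed `pushfirst!` and `median` lives in the `Statistics` standard library, and `rolling_median` is a hypothetical helper name rather than part of ReactiveBasics.

```julia
using Statistics  # provides `median` on Julia 0.7 and later

# Hypothetical helper: returns a function that remembers the last `win` inputs.
function rolling_median(win::Integer = 5)
    buf = Float64[]                      # captured by the closure below
    return function (x)
        pushfirst!(buf, x)               # `unshift!` on Julia 0.6 and earlier
        length(buf) > win && pop!(buf)   # drop the oldest value when over capacity
        return median(buf)
    end
end

next_median = rolling_median(3)
out = [next_median(x) for x in 0:10]     # same inputs as ff: initial 0, then 1 to 10

@assert out[1:4] == [0.0, 0.5, 1.0, 2.0]
@assert out[end] == 9.0
```

The returned function could then be called inside the `flatmap` do-block from the macro version, so each signal gets its own buffer by calling `rolling_median` once per signal.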

P.S.: Obviously I’m still using 0.5.1. Where can I learn more about why the need for the escapes was added in 0.6? I’m curious about the thought process here.

It actually is defined in a global scope, just with a gensym name so that it’s not named arr anymore.

It’s a bug fix, so that the behavior is now actually consistent with the documentation. Doing this has always been a requirement, as described in the hygiene section I linked above (though it may not make a difference in global scope), but the implementation was wrong when you called the macro in the same module in which it was defined. Arguably, the doc organization might also have encouraged people to use it incorrectly, since it gave multiple wrong examples before introducing hygiene. I’m not really sure how to improve it though…

Hopefully, at some point you won’t need to use esc at all: RFC: WIP: Make macro hygiene easier to use and less error-prone by vtjnash · Pull Request #10940 · JuliaLang/julia · GitHub
