Macros and piping

Why don’t these two examples work?

using Pipe, Chain

macro test(x) x end

@pipe 1 |> @test(_)     #works
@pipe 1 |> @test _      #works
@pipe 1 |> @test        #doesn't work

@chain 1 begin
    @test(_)            #doesn't work
end

You need to tell us which package(s) you are using.

Pipe and Chain

At least for the last example with Chain it’s due to macro hygiene:

julia> @macroexpand1 @chain 1 begin
    local var"##242" = 1
    #= REPL[13]:2 =#
    local var"##243" = #= REPL[13]:2 =# @test(var"##242")

julia> @macroexpand @chain 1 begin
    local var"##246" = 1
    #= REPL[16]:2 =#
    local var"##247" = Main.:(var"##246")

It does work if you change @test to escape its argument:

macro test(x) esc(x) end

I have not checked Pipe yet. The expansion of @chain seems correct though: it just pastes the call to @test into its expansion, which then gets recursively expanded by Julia’s macro expander. With macros there are always several environments one needs to be aware of, namely the environment the macro is defined in and the environment it gets expanded into. In general, arguments passed to a macro almost always need to be escaped, as they come from the environment the macro gets expanded into (not that I have fully understood how macro hygiene works in Julia, though).
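To make the two environments concrete, here is a minimal sketch that needs no piping package. The macro names noesc and withesc and the variable hyg_local are made up for illustration; the only assumption is that no global called hyg_local exists.

```julia
# Hypothetical macro names, chosen to avoid clashing with the thread's @test.
macro noesc(x)
    x            # returned as-is: hygiene resolves symbols in the macro's module
end

macro withesc(x)
    esc(x)       # escaped: symbols are resolved where the macro is called
end

# A local variable is visible through the escaping macro.
escaped_result = let hyg_local = 1
    @withesc hyg_local
end

# The non-escaping macro looks up a *global* hyg_local, which does not exist.
unescaped_error = try
    let hyg_local = 1
        @noesc hyg_local
    end
    nothing
catch err
    err
end

println(escaped_result)                     # 1
println(unescaped_error isa UndefVarError)  # true
```

The second case is the same failure as with @chain: the local that @chain introduces is invisible to a macro that does not escape its argument.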


OK, I have also looked at Pipe now:

  1. The cases that work do so because @pipe does not introduce a local variable and just passes the literal 1 to the macro. It would fail for the same reason as above if you referred to a local variable:
    julia> let x = 1
               @pipe x |> @test _
           end
    ERROR: UndefVarError: `x` not defined
  2. The last example expands as
    julia> @macroexpand1 @pipe 1 |> @test
    :((#= REPL[45]:1 =# @test())(1))
    which is certainly not what you want – and also different from the other two cases.
    It seems to be not the fault of @pipe, but rather due to how macros are parsed:
    @pipe 1 |> @test |> @test(_)
    #               └─┘ ── not a unary operator
    I.e., the parser consumes the next expression as a macro argument when the macro is called without brackets. Since macros only get to see the code after it has been parsed, there is nothing @pipe can do about that.
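The parsing behaviour can be inspected with Meta.parse alone, since parsing does not require @test to be defined. A small sketch follows; macro_args is a hypothetical helper, and f and x are placeholder names.

```julia
# Arguments of a macro call, skipping the macro name and the source-location
# slot that the parser stores right after it.
macro_args(ex::Expr) = ex.args[3:end]

bare    = Meta.parse("1 |> @test")      # macro at the end, no brackets
greedy  = Meta.parse("@test x |> f")    # no brackets, more code follows
bracket = Meta.parse("@test(x) |> f")   # brackets delimit the argument

# The pipe is the outer call and @test receives no arguments at all.
println(bare.head, " ", length(macro_args(bare.args[3])))

# Without brackets the macro call is outermost and swallows the whole pipe.
println(greedy.head, " ", macro_args(greedy)[1] == :(x |> f))

# With brackets the pipe stays outside the macro call.
println(bracket.head, " ", bracket.args[1])
```

So by the time @pipe runs, the shape of the expression has already been decided by the parser.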

Thanks so much.

This works, but is there a simpler way to write it?

macro filter( data, condition )

     :(filter( $condition,  $(esc(data) ) ))

end

@chain begin
    @filter   iseven            #works :)

Writing it as a function instead of a macro would be simpler indeed, but I guess that this answer is so obvious that it’s not the one you were looking for.

Perhaps more context is needed to understand why you want this to be a macro, which often makes things very complicated.

With DataPipes, all of the examples work – aside from 1 |> @test. No matter if literal values or variables:

julia> using DataPipes

julia> macro test(x) x end

julia> let x = 1
           @p x |> @test(__)
       end

julia> let x = 1
           @p x |> @test __
       end

julia> let x = 1
           @p begin
               @test __

I’m a big fan of Chain and DataFramesMeta.
There’s zero wasted characters, and the lack of brackets and commas (as you have with functions) makes it very clear. e.g.

@chain begin

     @rsubset       :a > 5
     @rtransform    :b = mod( :a, 2 )
     @by            :b    :x = sum(:a)
     @select        :b :x

Why not have a set of macros that extend this functionality beyond DataFrames?

@chain begin
    @filter      > 5 

There are lots of functions like filter where the argument you are likely to pass from the previous row is not the first argument. Different piping packages handle this in different ways (e.g., Lazy’s >> or Chain’s _). A nice alternative would be to have macro versions of these functions where the first argument is the one you are likely to pass, so they work with Chain.

foo(a, x) can be wrapped as myfoo(x, a) = foo(a, x) to achieve that, without the difficulties of macros. Of course, you wouldn’t get rid of brackets (and commas, if there are more than two arguments), although I would say that they are worth it (and from my personal point of view, they don’t worsen clarity that much).
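As a minimal sketch of that wrapping idea, using only Base (filter_by is a made-up name, not part of any package):

```julia
# Hypothetical wrapper: same as Base.filter, but with the data argument first.
filter_by(data, pred) = filter(pred, data)

# Direct call with the data in first position.
evens = filter_by(1:10, iseven)
println(evens)   # [2, 4, 6, 8, 10]

# With Base's |> an anonymous function is still needed to place the argument.
total = 1:10 |> (xs -> filter_by(xs, iseven)) |> sum
println(total)   # 30
```

Since Chain inserts the previous result as the first argument when no _ is present, a data-first wrapper like this should also be usable inside @chain as filter_by(iseven), without any macro.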

You mean, it’s better to type @ and space instead of ()?

Like, currently we have
@p tbl filter(_.a > 0)

@p let
    filter(_.a > 0)

and you want

@macro let
    @filter _.a > 0

? Just to clarify the motivation.

DataPipes is interesting: in contrast to the other examples above, it expands macro calls found in its body:

julia> let x = 1
           # Note: Macroexpand1, but @test nevertheless got expanded
           @macroexpand1 @p x |> @test __
    (var"##__#298",) = let
            var"##__#297" = x
            var"##__#298" = var"##__#297"

It is still brittle and might not always work as expected:

module Foo

sqr(x) = x*x

function testf(x) sqr(x) end

# Note: Escapes too little
macro test1(x) :(sqr($x)) end

# Note: Just right, i.e. sqr is Foo.sqr and x escaped into calling environment
macro test2(x) :(sqr($(esc(x)))) end

# Note: Escapes too much, i.e., sees sqr from calling environment
macro test3(x) esc(:(sqr($x))) end

export testf, @test1, @test2, @test3

end


using .Foo
using DataPipes

let x = 3, sqr = sqrt
    @show @p x |> sqr |> testf
    @show @p x |> testf(sqr(__))
end

let x = 3, sqr = sqrt
    @show @p x |> sqr |> @test1 __
    @show @p x |> @test1(sqr(__))
end

let x = 3, sqr = sqrt
    @show @p x |> sqr |> @test2 __
    @show @p x |> @test2(sqr(__))
end

let x = 3, sqr = sqrt
    @show @p x |> sqr |> @test3 __
    @show @p x |> @test3(sqr(__))
end

Here, I would argue that both lines in each pair should give the same result, and that the function version shows the expected behaviour. Only @test2, which escapes its argument but not the rest of its expansion, works the same way as the function.

@Lincoln_Hannah Unfortunately, I don’t think there is a simpler way to define the macro, i.e., one has to be careful about what to escape and what not. Given the small benefit, i.e., saving some brackets, I would not use a macro in that case. With data frames I also usually stick to the functions from DataFrames instead of using any macro solution (DataFramesMeta is nice though, in that it is a very lightweight wrapper which directly expands into the corresponding functions).
One could define a small helper macro though:

macro escaping(exprs...)
    @assert length(exprs) > 0
    vars = exprs[1:(end-1)]
    body = exprs[end]
    @assert all(isa.(vars, Symbol))
    :(let $((:($(esc(var)) = esc($(esc(var)))) for var in vars)...)

and then define test and filter as

macro test(x)
    @escaping x begin

macro filter( data, condition )
    @escaping data condition begin
        :(filter( $condition,  $data ))   

Not sure, which of those lines give unexpected results?

You mean first and second in each pair? That would be strange: @p x |> sqr |> @test1 __ always has to use sqr from the calling env, while in @p x |> @test1(sqr(__)) resolving sqr is up to the @test1 macro.

And anyway, macro handling is just a cherry on top (: The main point of DataPipes is to make generic data manipulation in Julia as boilerplate-free as possible. I think it succeeds in that already, demonstrating that a single macro (@p) is enough – actual operations within the pipe can be regular functions without losing terseness and expressivity.

Yes, that’s what I meant.

Interesting, I was just assuming that the function semantics, i.e., testf(sqr(__)), which resolve the sqr inside the definition to Foo.sqr and the one in its argument to the calling environment, would be the expected ones.
You are right though that a macro can deliberately choose to break that, and there is no reason to assume that sqr |> @test __ and @test(sqr(__)) should be the same. All of the above macros are indeed different with respect to where the identifiers sqr, both inside the macro definition and in the macro argument, are resolved:

  • @test1: Both are resolved as Foo.sqr
  • @test2: Argument resolved in calling environment, other one as Foo.sqr, i.e., like the function
  • @test3: Both are resolved in the calling environment, i.e., macro is deliberately non-hygienic
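The three resolutions listed above can be checked without any piping package. In this sketch HygieneDemo is a hypothetical stand-in for the Foo module, and the literal 3 replaces the piped value so that only the resolution of sqr differs between the macros.

```julia
module HygieneDemo   # hypothetical stand-in for the Foo module above

sqr(x) = x * x

macro test1(x) :(sqr($x)) end          # nothing escaped
macro test2(x) :(sqr($(esc(x)))) end   # only the argument escaped
macro test3(x) esc(:(sqr($x))) end     # everything escaped

export @test1, @test2, @test3

end

using .HygieneDemo

# In the caller, sqr is a local that means sqrt.
results = let sqr = sqrt
    (@test1(sqr(3)), @test2(sqr(3)), @test3(sqr(3)))
end

println(results[1])   # 81: both sqr are HygieneDemo.sqr, i.e. (3*3)*(3*3)
println(results[2])   # about 3.0: HygieneDemo.sqr applied to sqrt(3)
println(results[3])   # about 1.316: sqrt(sqrt(3)), both from the caller
```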

DataPipes is indeed nice and does the right thing here. I wonder whether a macro needs to expand the macros in its body explicitly (as DataPipes does), or whether the other approach, i.e., just interpolating the macro call into the outer expansion, can be made to work as well. (From the above examples, it does not seem to play nicely with hygiene, or it messes up environment handling otherwise?)

The main reason for adding macro expansion to DataPipes was to support cases like these:

# string macro that uses _:
@p 1:10 |> map(f"{_:2f}")
# macro that uses _ itself:
@p 1:10 |> map(@set _ |> abs(_) = 1)

I don’t think it’s possible to handle them without expanding inner macros first…

In the end, I find the current implementation very reliable, even complex nested expressions with __ and _ work.