@chaineach

I’d like a macro @chaineach such that:

@chaineach x begin
    #commands
end

is equivalent to

map( y -> @chain y begin
    #commands
end,  collect(x) )

An example use case: a @chain block processes a DataFrame and creates a GroupedDataFrame.
I’d then like to apply a @chain block to each sub-dataframe.

What about this:

macro chaineach(x, ex)
    y = gensym()
    quote
        map($y -> @chain($y, $ex), collect($x))
    end |> esc
end

Example:

julia> using Chain

julia> r = 1:3; x = 2; @chaineach r begin _ + x; - end
3-element Vector{Int64}:
 -3
 -4
 -5

The macro assumes that @chain is in scope.

using DataFrames, Chain, TidierData


macro chaineach(x, ex)
    y = gensym()
    quote
        map($y -> @chain($y, $ex), collect($x))
    end |> esc
end


@chain begin 

    DataFrame( a=[1,1,2,2],  b=1:4, c=11:14 )

    @group_by a

    @chaineach _ begin
        
        sum(_.b) + sum(_.c)

    end

end 


ERROR: FieldError: type GroupedDataFrame has no field `b`, available fields: `parent`, `cols`, `groups`, `idx`, `starts`, `ends`, `ngroups`, `keymap`, `lazy_lock`

I’m not sure if you can expect nested @chain / @chaineach macros to work. What do @macroexpand and @macroexpand1 give?
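One way to look at this without Chain loaded: @macroexpand1 expands only the outermost macro call, so it shows what @chaineach itself produces and leaves the inner @chain call untouched. A minimal sketch, assuming the macro definition from above (r and x do not need to be defined, since nothing is evaluated):

```julia
# Same definition as above, repeated so the snippet is self-contained.
macro chaineach(x, ex)
    y = gensym()
    quote
        map($y -> @chain($y, $ex), collect($x))
    end |> esc
end

# Expand one level only: @chaineach is rewritten, @chain stays a macro call.
expanded = @macroexpand1 @chaineach r begin _ + x end

# The last statement of the returned block is the call to map.
call = expanded.args[end]
println(call.args[1])   # name of the called function
println(expanded)       # full expansion, with a generated name for the lambda argument
```

For the failing nested case the outermost macro is @chain, so there @macroexpand1 needs Chain loaded and shows what @chain does to the @chaineach call before @chaineach ever runs.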

EDIT: This works

julia> @chaineach [1:2, 3:4] begin @chain _ begin 2*_ end end
2-element Vector{StepRangeLen{Int64, Int64, Int64, Int64}}:
 2:2:4
 6:2:8

but this doesn’t:

julia> @chain [1:2, 3:4] begin @chaineach _ begin 2*_ end end
ERROR: MethodError: no method matching *(::StepRangeLen{Int64, Int64, Int64, Int64}, ::Vector{UnitRange{Int64}})

Could it be that @chain recognizes itself when scanning an expression? It cannot recognize @chaineach.

I think the @chain macro is coded specifically to support nesting.
I guess the same would have to be added specifically for @chaineach.
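To illustrate the guess above, here is a simplified model of such a substitution pass. It is not Chain.jl's actual source: the function name subst and the placeholder name prev are made up for the sketch. It replaces every _ by the previous result, except inside a nested @chain call, where only the starting value is replaced. Any other macro, such as @chaineach, gets no special treatment, so the _ inside its block is replaced too.

```julia
# Simplified model of an underscore-substitution pass (hypothetical, Base only).
subst(x, sym) = x                                  # literals, LineNumberNodes, ...
subst(s::Symbol, sym) = s === :_ ? sym : s
function subst(ex::Expr, sym)
    if ex.head === :macrocall && ex.args[1] === Symbol("@chain")
        # args: macro name, line number, first argument, remaining arguments
        first_arg = ex.args[3] === :_ ? sym : ex.args[3]
        return Expr(:macrocall, ex.args[1], ex.args[2], first_arg, ex.args[4:end]...)
    end
    # every other expression is rewritten recursively
    return Expr(ex.head, (subst(a, sym) for a in ex.args)...)
end

# Macro calls inside a quoted expression are not expanded, so neither macro
# needs to be defined here.
inner_chain = subst(:(@chain _ begin 2 * _ end), :prev)
inner_each  = subst(:(@chaineach _ begin 2 * _ end), :prev)

println(inner_chain.args[4].args[end])   # body of the nested @chain block
println(inner_each.args[4].args[end])    # body of the nested @chaineach block
```

In this model the inner block of @chain keeps its own _, while the block of @chaineach ends up as 2 * prev, which matches the MethodError above (a range multiplied by the whole vector).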

Idea: replace @chaineach by @map @chain. This way an outer @chain can see the inner one and act accordingly.

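macro map(ex)
    # ex is a macro call such as `@chain x begin ... end`:
    # ex.args[1] is the macro name, ex.args[2] the line number, ex.args[3] the first argument
    y = gensym()
    x, ex.args[3] = ex.args[3], y   # swap the collection x for the loop variable y
    quote
        map($y -> $ex, collect($x))
    end |> esc
end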

Nesting seems to work:

julia> @map @chain [1:2, 3:4] begin 2*_ end
2-element Vector{StepRangeLen{Int64, Int64, Int64, Int64}}:
 2:2:4
 6:2:8

julia> @map @chain [1:2, 3:4] begin @chain _ begin 2*_ end end
2-element Vector{StepRangeLen{Int64, Int64, Int64, Int64}}:
 2:2:4
 6:2:8

julia> @chain [1:2, 3:4] begin @map @chain _ begin 2*_ end end
2-element Vector{StepRangeLen{Int64, Int64, Int64, Int64}}:
 2:2:4
 6:2:8

julia> @map @chain [1:2, 3:4] begin @map @chain _ begin 2*_ end end
2-element Vector{Vector{Int64}}:
 [2, 4]
 [6, 8]
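The @map macro relies on the layout of a macro-call expression. The sketch below first inspects that layout and then shows a variant that builds a new expression instead of mutating its argument and rejects input that is not a macro call. The names @mapcall and @double are hypothetical: @double is a Base-only stand-in for @chain so the snippet runs without any packages.

```julia
# A quoted macro call is an Expr with head :macrocall (nothing is expanded here).
ex = :(@chain xs begin 2 * _ end)
@show ex.head                        # kind of expression
@show ex.args[1]                     # macro name
@show ex.args[2] isa LineNumberNode  # source location
@show ex.args[3]                     # first argument, the one @map swaps out

# Hypothetical non-mutating variant of @map with an input check.
macro mapcall(ex)
    (Meta.isexpr(ex, :macrocall) && length(ex.args) >= 3) ||
        error("@mapcall expects a macro call with at least one argument")
    y = gensym("item")
    x = ex.args[3]
    # same macro call, with the first argument replaced by the loop variable
    call = Expr(:macrocall, ex.args[1], ex.args[2], y, ex.args[4:end]...)
    esc(:(map($y -> $call, collect($x))))
end

# Stand-in inner macro (hypothetical), used instead of @chain.
macro double(v)
    :(2 * $(esc(v)))
end

result = @mapcall @double [1, 2, 3]
@show result
```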

Basically you want an easier way to write

map(x) do xi
    @chain xi begin 
        ...
    end
end

?
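For reference, the do-block form is just another spelling of passing an anonymous function as the first argument of map. A minimal Base-only sketch, with data made up for illustration:

```julia
groups = [1:2, 3:4]

# do-block form: the block becomes the first argument of map
totals_do = map(groups) do g
    doubled = 2 .* g
    sum(doubled)
end

# the same call with an explicit anonymous function
totals_fn = map(g -> sum(2 .* g), groups)

@show totals_do totals_fn
```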

FWIW, this is how @Lincoln_Hannah’s example from above would look with DataPipes.jl:

       @p let
           StructArray( a=[1,1,2,2],  b=1:4, c=11:14 )
           group(_.a)
           map() do  __
               sum(__.b) + sum(__.c)
           end
       end

No new macros at all, and quite intuitive behavior: __ always means the result of the previous pipeline step, and map() do __ effectively assigns to it, starting the inner pipeline with this value.

@aplavin, using DataFrames.jl, I believe the following is equivalent to your code:

using DataPipes, DataFrames

@p let
    DataFrame(a=[1,1,2,2], b=1:4, c=11:14)
    groupby(__, :a)
    combine() do  __
        sum(__.b) + sum(__.c)
    end
end

One wrinkle here is that map is not defined for grouped data frames, so this particular example wouldn’t quite work. But overall I think this macro solves your problem:

julia> macro chaineach(iterable, chainarg)
           map_arg = gensym()
           chainblock = Expr(:macrocall, Symbol("@chain"), __source__, map_arg, chainarg)
           out = quote 
               map($iterable) do $map_arg
                   $chainblock
               end
           end
           return esc(out)
       end;

julia> x = [1, 2, 3];

julia> @chaineach x begin
           _ + 1
       end
3-element Vector{Int64}:
 2
 3
 4

julia> df = DataFrame(g = [1, 1, 2, 2], y = [1, 2, 10, 20]);

julia> gd = groupby(df, :g);

julia> gd_vec = [gdi for gdi in gd];

julia> @chaineach gd_vec begin 
           @with begin 
               sum(:y)
           end
       end
2-element Vector{Int64}:
  3
 30

Isn’t it essentially identical to the macro in my first response (modulo the collect that OP wanted to have)?

Ah, you are correct. And your @map solution is pretty good.

I love this syntax. I’m converting tabular historical data to a KeyedArray of vol surfaces.

using DataFrames, Chain, TidierData, AxisKeys

@chain begin

    # Many lines to get date
    @select   histDate volatility money tenor
    @arrange  histDate days money
    @group_by histDate

    @aside histDate = first.(keys(_))
    
    @map @chain _ begin

        wrapdims( :volatility,  :money,  :tenor  )
        extend_surface()
        Vol_Surface{linear_variance}()

    end 

    KeyedArray( histDate )

end

@pdeffebach, your solution of a single macro @chaineach is great too. I assume the collect() call that @matthias314 added could be included so that it works directly on a GroupedDataFrame. Either way, it’s a very clean syntax: minimal brackets, minimal dummy variables.