Apply function By Row without re-stating column names

One note on the syntax: in AsTable(:) => fun => ..., fun does not receive an AbstractDataFrame. It receives a NamedTuple of vectors, one vector per column. In AsTable(:) => ByRow(fun) => ..., fun is called once per row and receives a NamedTuple of that row's values.
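To make the difference concrete, here is a minimal sketch using the same small df as in the example further down. The anonymous functions and the output column names total and c are just illustrations, not anything from the thread.

```julia
using DataFrames

df = DataFrame(a = [5, 6], b = [7, 8])

# Without ByRow, the function runs once and sees a NamedTuple of
# whole column vectors, so nt.a is [5, 6] and nt.b is [7, 8].
col_sums = combine(df, AsTable(:) => (nt -> sum(nt.a) + sum(nt.b)) => :total)

# With ByRow, the function runs once per row and sees a NamedTuple
# of scalars, so nt.a is 5 on the first call and 6 on the second.
row_sums = transform(df, AsTable(:) => ByRow(nt -> nt.a + nt.b) => :c)
```

The first call collapses everything into a single value, while the second adds a per-row column c next to a and b.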

The reason is that we want src => fun => dest to be fast and type stable, which isn't possible if we pass a DataFrame to fun, because the DataFrame type does not record the types of its columns.
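A rough way to see the type-stability point is to ask the compiler what it can infer about a column access in each case. This is only a sketch: the NamedTuple nt is built by hand here to mimic what fun receives, and Base.return_types is an introspection helper whose exact results can vary between Julia and DataFrames versions.

```julia
using DataFrames

df = DataFrame(a = [5, 6], b = [7, 8])

# Hand-built stand-in for the NamedTuple of columns that fun receives.
nt = (a = df.a, b = df.b)

# For the NamedTuple, the column types are part of typeof(nt), so the
# compiler can infer a concrete return type for the field access.
nt_inferred = only(Base.return_types(x -> x.a, (typeof(nt),)))

# For a DataFrame, the column types are not part of the type, so the
# inferred return type of the same access is expected to be abstract.
df_inferred = only(Base.return_types(x -> x.a, (DataFrame,)))

println(isconcretetype(nt_inferred), " ", isconcretetype(df_inferred))
```

On the versions I would expect most people to be running, the first should be concrete and the second should not.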

This is definitely confusing, but you can always define your own function:

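julia> using DataFrames

julia> df = DataFrame(a = [5, 6], b = [7, 8]);

julia> foo(; a, b) = (c = a + b, d = a * b);  # assumed definition of foo, consistent with the output below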

julia> function maprowsdf(f, df)
           map(eachrow(df)) do r
               nt = NamedTuple(r)
               res = f(; nt...)
               merge(nt, res)
           end |> DataFrame
       end;

julia> maprowsdf(foo, df)
2×4 DataFrame
 Row │ a      b      c      d     
     │ Int64  Int64  Int64  Int64 
─────┼────────────────────────────
   1 │     5      7     12     35
   2 │     6      8     14     48

Though I think transform with ByRow is still better.
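For comparison, here is a sketch of what the transform with ByRow version could look like. The definition of foo is an assumption, chosen only so that it matches the c and d columns in the output above; the real foo may differ.

```julia
using DataFrames

df = DataFrame(a = [5, 6], b = [7, 8])

# Hypothetical stand-in for foo: takes the row's values as keyword
# arguments and returns a NamedTuple of new columns.
foo(; a, b) = (c = a + b, d = a * b)

# ByRow passes each row to the function as a NamedTuple, which is
# splatted into keyword arguments. AsTable on the destination side
# expands the returned NamedTuples into the columns c and d.
result = transform(df, AsTable(:) => ByRow(nt -> foo(; nt...)) => AsTable)
```

This keeps the column names out of the call entirely, and transform takes care of keeping a and b alongside the new columns, so no manual merge is needed.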
