jzr
1
Can I specify the name of the output column?
using DataFrames, Chain, Statistics  # Statistics provides `mean`
@chain DataFrame(a=[1,2,3,1,2,3], b=[:x, :y, :z, :x, :y, :x]) begin
groupby(:a)
combine(:b=>unique=>:b, :b => (b->mean.(map(b₀->b₀.==b, unique(b)))) => AsTable)
end
4×3 DataFrame
 Row │ a      b       x1
     │ Int64  Symbol  Float64
─────┼────────────────────────
   1 │     1  x           1.0
   2 │     2  y           1.0
   3 │     3  z           0.5
   4 │     3  x           0.5
sijo
2
Yes, you just need to return a “table” that carries its own column names.
With a named tuple:
combine(:b=>unique=>:b, :b => (b->(; c=mean.(map(b₀->b₀.==b, unique(b))))) => AsTable)
With a Dict:
combine(:b=>unique=>:b, :b => (b->Dict(:c=>mean.(map(b₀->b₀.==b, unique(b))))) => AsTable)
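For reference, here is a self-contained sketch of the named tuple variant applied to the data from the question. The helper name `shares` and the column name `share` are made up for illustration; the computation is the same per-group mean as in the original one-liner.

```julia
using DataFrames, Chain, Statistics

df = DataFrame(a=[1,2,3,1,2,3], b=[:x, :y, :z, :x, :y, :x])

# Hypothetical helper: for each unique value in the group, the fraction of
# rows equal to it. Returning a named tuple lets AsTable pick up the
# column name `share` instead of the auto-generated `x1`.
shares(b) = (; share = [mean(b .== u) for u in unique(b)])

result = @chain df begin
    groupby(:a)
    combine(:b => unique => :b, :b => shares => AsTable)
end
```

Pulling the anonymous function out into a named helper is only a readability choice; the inline form from the post above should behave the same way.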
Additionally, in this case it is enough to give the output column name directly as the target:
julia> @chain DataFrame(a=[1,2,3,1,2,3], b=[:x, :y, :z, :x, :y, :x]) begin
groupby(:a)
combine(:b=>unique=>:b, :b => (b->mean.(map(b₀->b₀.==b, unique(b)))) => :output)
end
4×3 DataFrame
 Row │ a      b       output
     │ Int64  Symbol  Float64
─────┼────────────────────────
   1 │     1  x           1.0
   2 │     2  y           1.0
   3 │     3  z           0.5
   4 │     3  x           0.5
Or, if you do produce a table and want to give its columns new names, pass a vector of names as the target:
julia> @chain DataFrame(a=[1,2,3,1,2,3], b=[:x, :y, :z, :x, :y, :x]) begin
groupby(:a)
combine(:b=>unique=>:b, :b => (b->mean.(map(b₀->b₀.==b, unique(b)))) => [:output])
end
4×3 DataFrame
 Row │ a      b       output
     │ Int64  Symbol  Float64
─────┼────────────────────────
   1 │     1  x           1.0
   2 │     2  y           1.0
   3 │     3  z           0.5
   4 │     3  x           0.5
(The latter is more useful when there is more than one column; for a single vector the former is sufficient.)
What @sijo proposed is the general solution if you want AsTable as the target and want to specify the column names within the transformation function.
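To illustrate the multi-column case, here is a rough sketch in which the transformation returns a two-column table and the vector of names on the right-hand side renames both columns. The helper `count_and_share` and the names `:count` and `:fraction` are hypothetical, and this assumes that target names override the names of a returned named tuple, which is my understanding of how `combine` treats a vector of names as the target.

```julia
using DataFrames, Chain, Statistics

df = DataFrame(a=[1,2,3,1,2,3], b=[:x, :y, :z, :x, :y, :x])

# Hypothetical helper: per-group occurrence count and share of each unique
# value, returned as a named tuple of two equal-length vectors.
function count_and_share(b)
    u = unique(b)
    return (n = [count(==(v), b) for v in u],
            share = [mean(b .== v) for v in u])
end

renamed = @chain df begin
    groupby(:a)
    # The vector of names replaces `n` and `share` in the result.
    combine(:b => unique => :b, :b => count_and_share => [:count, :fraction])
end
```

With `=> AsTable` instead of the vector of names, the columns would keep the names `n` and `share` chosen inside the function.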