I have a question that I believe is of general interest (along with a few related issues), which came out of a specific case I was working on.
I need to transform (via reshapein(...)) the various sub-dataframes obtained from groupby(df, ...) and then combine the results with vcat.
Since applying reshapein(...) to the various subgroups produces data frames with different sets of columns, I cannot use combine(groupby(...), ...), although an option like vcat's cols = :union would be useful there in cases like this.
Therefore I fall back on merging with vcat(..., cols = :union). I wonder whether, in cases like this where I know exactly where the missing values come from, a fillmissing = something argument would be useful.
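To make the situation concrete, here is a minimal sketch of what vcat with cols = :union does when the pieces have different columns. The data frames a and b are made-up stand-ins for two reshapein results, and the final coalesce call is only one possible workaround assumed for illustration, not an existing fillmissing option.

```julia
using DataFrames

# Hypothetical stand-ins for two reshapein results with different columns
a = DataFrame(Page = [1, 1], x = [10, 20])
b = DataFrame(Page = [2], y = ["u"])

# cols = :union keeps every column; cells a piece does not have become missing
merged = vcat(a, b; cols = :union)

# Assumed workaround: fill one column with a known default after the fact
merged.x = coalesce.(merged.x, 0)
```

This works when a single constant is the right replacement for a column; it does not cover the case where the replacement should come from neighbouring rows.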
In the absence of such an argument, I have defined and used two functions, filldown() and fillup(), which replace each missing value with the nearest non-missing value before it (filldown) or after it (fillup).
I also think that functions of this kind are of fairly general use, and it would be good if they were available (not in a naive form like mine) in the DataFrames.jl package.
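using DataFrames

# Replace each missing with the last non-missing value before it
filldown(v) = accumulate((x, y) -> coalesce(y, x), v; init = v[1])
# Replace each missing with the next non-missing value after it
fillup(v) = reverse(filldown(reverse(v)))

# df and reshapein are defined elsewhere; this name shadows Base.reshape
function reshape(grp)
    sgrps = groupby(df[grp, :], :Page)
    rr = mapreduce(reshapein, (x, y) -> vcat(x, y, cols = :union), sgrps)
    transform!(rr, Cols(:) .=> filldown ∘ fillup, renamecols = false)
end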
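As a quick check of what the two helpers do, the sketch below applies them to a made-up vector (the values are purely illustrative). It repeats the two definitions so that it runs on its own with Base Julia only.

```julia
# Same two helpers as above, repeated so the example is self-contained
filldown(v) = accumulate((x, y) -> coalesce(y, x), v; init = v[1])
fillup(v) = reverse(filldown(reverse(v)))

# Illustrative input with missing values at the start, middle and end
v = [missing, 1, missing, missing, 4, missing]

down = filldown(v)             # the leading missing has nothing before it
up = fillup(v)                 # the trailing missing has nothing after it
both = (filldown ∘ fillup)(v)  # fillup runs first, then filldown fills the tail

println(down)
println(up)
println(both)
```

filldown leaves leading missing values untouched and fillup leaves trailing ones, which is why the two are composed in the transform! call above. Both helpers also throw on an empty vector because of v[1], one of the ways in which this version is naive.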