It’s not clear what @Pisistrato means by “not working” but I think the desired output is a vector for each group.
The problem with combine(groupby(df, :B), [:A] => ((a) -> (a=myfun(a))) => AsTable) is that (a=...) is a simple assignment, not a named tuple so you need to write (a=...,), and the second problem is that returning a vector means returning a column. But you want a vector for each row in the result! So you need to return a vector of vectors
Would you mind if you dropped a few lines explaining what Ref() does in that case or simply redirect us to a place with similar examples and an explanation? The manual is definitely clear to CS people but not to me… Based on the manual definition below, I cannot see the connection. I seem to always miss the point with Ref().
Ref{T}
An object that safely references data of type T. This type is guaranteed to point to valid, Julia-allocated memory of the correct type. The underlying data is protected from freeing by the garbage collector as long as the Ref itself is referenced.
Following @Erik_Engheim 's post here, Ref() is creating a pointer-like object, if I understood right.
So, there is definitely something I am missing here, since I don’t see why the :B => Ref => :B works. I find it pretty often in code repos and it is a saver in many situations, but if I don’t understand how it works, I will not be able to use it. Any hint is more than welcome!
I want to emphasize, the vast majority of people on this website are not CS people. We are mostly scientists / data scientists with amateur computer science knowledge.
Further down in the docs give a (short, unsatisfying) answer
Use in broadcasting
Ref is sometimes used in broadcasting in order to treat the referenced values as a scalar.
Basically DataFrames.jl tries to “expand” vectors to fill the full column in combine. Ref stops that. It’s in the docs here, again not a very complete explanation
In all of these cases, function can return either a single row or multiple rows. As a particular rule, values wrapped in a Ref or a 0 -dimensional AbstractArray are unwrapped and then treated as a single row.
Ref is basically a confusing way to treat an object as a scalar during broadcasting. It might be nice if Base Julia had a better name for this, like Scalar. Another alternative which might be more intuitive is to wrap a collection in a length-one tuple in order to treat that collection as a scalar:
It is relieving to hear that there are many people with similar CS background.
So, in other words, if I understood it right, in the case of :B => Ref => :B what Ref does is that it takes the column :B and keeps its contents as an array for every group of :A. In that sense, in the context of broadcasting, Ref functions as a kind of “delaying” mechanism.