While porting GAP code, I translated the function CollectBy (which I find very useful) as:
"""
Group the items of list l according to the corresponding values in list v.

julia> groupby([31,28,31,30,31,30,31,31,30,31,30,31],
               [:Jan,:Feb,:Mar,:Apr,:May,:Jun,:Jul,:Aug,:Sep,:Oct,:Nov,:Dec])
Dict{Int64,Array{Symbol,1}} with 3 entries:
  31 => Symbol[:Jan, :Mar, :May, :Jul, :Aug, :Oct, :Dec]
  28 => Symbol[:Feb]
  30 => Symbol[:Apr, :Jun, :Sep, :Nov]
"""
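function groupby(v::AbstractVector, l::AbstractVector)
    res = Dict{eltype(v),Vector{eltype(l)}}()
    for (k, val) in zip(v, l)
        # the closure runs only when key k is first seen,
        # so no throwaway vector is allocated for every item
        push!(get!(() -> eltype(l)[], res, k), val)
    end
    res
end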
"""
Group the items of list l according to the values taken by function f on them.

julia> groupby(iseven,1:10)
Dict{Bool,Array{Int64,1}} with 2 entries:
  false => [1, 3, 5, 7, 9]
  true  => [2, 4, 6, 8, 10]

Note: in this version l is required to be non-empty, since I do not know how to
access the return type of a function.
"""
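function groupby(f, l::AbstractVector)
    # l should be non-empty: the key type is taken from f applied to the first item
    res = Dict(f(l[1]) => eltype(l)[l[1]])
    for val in Iterators.drop(l, 1)  # skip the first item without copying l
        push!(get!(() -> eltype(l)[], res, f(val)), val)
    end
    res
end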
I chose the name groupby since I saw in some messages that a function of that name,
doing something similar, seems to exist in some package. I have several questions:
- Is there some “standard library” where such a function can be found?
- If not, does my implementation look good? Is it possible to do better/faster?
- In particular, is there a good way to solve the problem of accessing the return type of a function?
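
To make the last question concrete, here is one possible way around the non-empty restriction. It is a sketch, not a definitive answer: the keys are computed first with `map`, and the two-vector method is then reused. It assumes that `map` picks a sensible element type for an empty input (it relies on type inference for that, which is not a guarantee), and it stores the whole key vector temporarily.

```julia
# Two-vector method: group the items of l by the matching keys in v.
function groupby(v::AbstractVector, l::AbstractVector)
    res = Dict{eltype(v),Vector{eltype(l)}}()
    for (k, val) in zip(v, l)
        # create the empty group only the first time key k is seen
        push!(get!(() -> eltype(l)[], res, k), val)
    end
    res
end

# Function method: compute all keys up front, then delegate.
# Assumption: map(f, l) has a usable eltype even when l is empty.
groupby(f, l::AbstractVector) = groupby(map(f, l), l)

evens_and_odds = groupby(iseven, 1:10)   # groups 1:10 by parity
nothing_to_group = groupby(iseven, Int[]) # no error on empty input
```

The trade-off is an extra pass and a temporary vector of keys, in exchange for not having to ask for the return type of f directly.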