Maybe I’m missing something, but at first glance I would say that this solution relies on the specific situation in which the dataframe is already sorted by group.
To get a more meaningful answer, the test should be run on a “shuffled” dataframe.
using Random
s=shuffle(repeat(1:10^6, inner=4))
df=DataFrame(;s,t,r)
@assert combine(groupby(df, :s),:r=>maximum).r_maximum == vec(maximum(reshape(r, 4, :), dims=1))
ERROR: AssertionError: (combine(groupby(df, :s), :r => maximum)).r_maximum == vec(maximum(reshape(r, 4, :), dims = 1))
Stacktrace:
[1] top-level scope
@ c:\Users\sprmn\.julia\environments\v1.8.3\dataframes33.jl:414
Edit:
I have now read the premise of the post more carefully, and it does specify the scope in which the proposed solution is valid.
In any case, the proposal, adapted to the general (unsorted) situation, still compares well.
julia> @btime begin
sort!(df,:s)
maximum(reshape(df.r, 4, :), dims=1)
end
30.534 ms (22 allocations: 114.44 MiB)
1×1000000 Matrix{Float64}:
0.742695 0.952913 0.884315 … 0.771818 0.929275 0.943746
julia> @btime combine(groupby(df, :s),:r=>maximum)
32.264 ms (348 allocations: 55.33 MiB)
1000000×2 DataFrame
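A variant that makes better use of the particular structure (every group has exactly 4 rows, so after sorting by [:s,:r] the 4th row of each column is the group maximum):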
julia> @btime begin
sort!(df,[:s,:r])
reshape(df.r, 4, :)[4,:]
end
25.522 ms (45 allocations: 114.44 MiB)
1000000-element Vector{Float64}:
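For anyone who wants to check the reasoning on a small case, here is a minimal sketch using only Base and Random (no DataFrames), assuming as above that every group has exactly 4 rows and that the keys are 1:n. The names n, ref, p, q, by_sort and by_last are mine, chosen for illustration; they are not from the original post.

```julia
using Random

Random.seed!(1)                      # fixed seed, only so the run is repeatable
n = 5                                # hypothetical small number of groups
s = shuffle(repeat(1:n, inner=4))    # group keys in random order, 4 rows per group
r = rand(length(s))                  # values to reduce

# Reference result: per-group maximum computed without relying on row order.
ref = fill(-Inf, n)
for (k, v) in zip(s, r)
    ref[k] = max(ref[k], v)
end

# Sort rows by key first; each column of the 4×n matrix is then one group.
p = sortperm(s)
by_sort = vec(maximum(reshape(r[p], 4, :), dims=1))

# Sort by (key, value); the last row of each column is already the maximum.
q = sortperm(collect(zip(s, r)))
by_last = reshape(r[q], 4, :)[4, :]

println(by_sort == ref, " ", by_last == ref)
```

Both sorted variants should agree with the order-independent reference, whereas reshaping r directly (without sorting) would mix rows from different groups in the same column.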