Performance optimization of a custom binning function

Here’s the final version that I settled on, thanks again for everyone’s inputs:

"""
    bin(x::Vector{<:Number}, y::Array{<:Number}, xbin_edges::Union{Vector{<:Number},StepRangeLen{}}; method::F = mean) where {F<:Function}

Fast binning of `y` as a function of `x`. Returns `x_binned`, `y_binned`, `bin_count`. `method` 
specifies approached used for aggregating binned values. NOTE: `x_binned` == 0 and 
`y_binned` == 0 when `bin_count` == 0.

# see https://discourse.julialang.org/t/performance-optimization-of-a-custom-binning-funciton/91616
"""
function bin(
    x::Vector{<:Number}, 
    y::Array{<:Number}, 
    xbin_edges::Union{Vector{<:Number},StepRangeLen{}};
    method::F = mean,
    ) where {F<:Function}

    # find bin breakpoints
    p = sortperm(vcat(x, xbin_edges))
    bins = findall(>(length(x)), p)

    # initialize outputs
    bin_count = Int.(diff(bins).-1)
    x_binned = zeros(eltype(x), length(bin_count))
    y_binned = zeros(eltype(y), length(bin_count), size(y, 2))

    # calculate binned metrics
    for i = findall(bin_count .> 0)
        x_binned[i] = method(@view(x[p[bins[i]+1:bins[i+1]-1]]))
        y_binned[i] = method(@view(y[p[bins[i]+1:bins[i+1]-1]]))
    end
    
    return x_binned, y_binned, bin_count 
end
2 Likes

You forgot the @view() around the slices, which here give a some additional speedup and memory savings:

julia> @btime bin($x,$y,$xbin_edges);
  3.500 μs (56 allocations: 8.72 KiB)

after

               x_binned[i] = method(@view(x[p[bins[i]+1:bins[i+1]-1]]))
               y_binned[i] = method(@view(y[p[bins[i]+1:bins[i+1]-1]]))

I get:

julia> @btime bin($x,$y,$xbin_edges);
  2.786 μs (36 allocations: 5.97 KiB)

Good catch… previous post edited

I found that puzzling as well and went on to read the backstory: Make `findall` faster for AbstractArrays by nalimilan · Pull Request #37177 · JuliaLang/julia · GitHub. Food for thought.

2 Likes