Range (statistics)

Thanks, so I guess I'd better define one of the above range functions in my startup.jl, if there is nothing ready-made.

FYI, I found this minmax algorithm here, which seems to perform better than the previous solutions:

function range2(x)
    min = max = first(x)
    for xi in view(x, 2:length(x))
        min > xi ? (min = xi) : (max < xi && (max = xi))
    end
    return max - min
end

using BenchmarkTools     # provides @btime

x = 10 * rand(100)
@btime range2($x)        # 123 ns (0 allocs: 0 bytes)

Since range is one of the important measures of variability, an optimized version should be available in the standard library, IMHO.

For performance, branches (as generated by the ? : operator) are bad. An alternative version:

function range3(x)
    min = typemax(eltype(x))
    max = typemin(eltype(x))
    for xi in x
        min = ifelse(min > xi, xi, min)
        max = ifelse(max < xi, xi, max)
    end
    return max - min
end

is faster on my machine.

julia> @btime range3($x)        # vs 123 ns for range2
  98.743 ns (0 allocations: 0 bytes)

Thanks @Dan for the insights, I see a 40% speedup here!

This just replicates the functionality of extrema followed by a subtraction. If this is faster than extrema, that means there’s something wrong with the latter.
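For reference, the extrema-based equivalent is only a couple of lines (range_extrema is just an illustrative name):

function range_extrema(x)
    lo, hi = extrema(x)   # single pass returning (minimum, maximum)
    return hi - lo
end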


Hmm, I amend my statement a bit. extrema handles NaNs, which range3 does not (nor does it handle empty collections):

julia> range3([1.0, NaN, 2.0])
1.0

julia> extrema([1.0, NaN, 2.0])
(NaN, NaN)

julia> range3(Float64[])
-Inf

julia> range3(Int[])
1

If you want this to be in a stdlib, it should be robust against things like this.


Fixing this does not add much overhead:

function range4(x)
    isempty(x) && error("Does not support empty vectors")
    min = typemax(eltype(x))
    max = typemin(eltype(x))
    hasnan = false
    for xi in x
        hasnan |= isnan(xi)
        min = ifelse(min > xi, xi, min)
        max = ifelse(max < xi, xi, max)
    end
    hasnan && error("Does not support NaNs in vectors")
    return max - min
end

julia> x = 10 * rand(100);

julia> y = 10 * rand(1000000);

julia> @btime range4($x);
  107.822 ns (0 allocations: 0 bytes)

julia> @btime extrema($x);
  318.175 ns (0 allocations: 0 bytes)

julia> @btime range4($y);
  1.397 ms (0 allocations: 0 bytes)

julia> @btime extrema($y);
  3.285 ms (0 allocations: 0 bytes)

There is a ~3x gap here (persisting for large vectors), and this trivial (and popular) operation could probably be optimized further (SIMD, threads, etc.).
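As a rough sketch of what such a further optimization might look like (the name range5 is illustrative, the code is untested here, and it omits the NaN/empty checks for brevity), the branch-free loop can be annotated with @inbounds/@simd:

function range5(x)
    mn = typemax(eltype(x))
    mx = typemin(eltype(x))
    @inbounds @simd for i in eachindex(x)
        xi = x[i]
        # ifelse keeps the updates branch-free, which helps vectorization
        mn = ifelse(mn > xi, xi, mn)
        mx = ifelse(mx < xi, xi, mx)
    end
    return mx - mn
end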


Can you take a look at Base.extrema and see if you can tell what, if anything, is wrong with it, performance-wise?

Because I really think the right approach is to use extrema for this, and fix any problems there.


Dan, sorry to ask, but is hasnan required?
Could we write range4() as follows:

function range4b(x)
    isempty(x) && error("Does not support empty vectors")
    min = typemax(eltype(x))
    max = typemin(eltype(x))
    for xi in x
        isnan(xi) && error("Does not support NaNs in vectors")
        min = ifelse(min > xi, xi, min)
        max = ifelse(max < xi, xi, max)
    end
    return max - min
end

You probably want to avoid branches.

You wouldn’t want to read the vector from memory twice. The memory transfers are critical for performance.
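To illustrate the point (the names below are illustrative, not from the thread): computing the two extremes separately traverses the data twice, whereas a fused reduction reads it once.

range_two_pass(x) = maximum(x) - minimum(x)   # two passes over x
range_one_pass(x) = -(-(extrema(x)...))       # one pass over x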

Whoops, the code changed… but now DNF is right, the branch is bad.

Sorry, you seem to have captured a temporary change in the code, the original snippet is restored.

But benchmarking is the best disinfectant. You are welcome to post results (my machine’s results are up there already).

Had a quick glance, but will look some more. Basically, there is an attempt to capture many reduce-like operations in one template, and perhaps the overhead of reduce with support for the dims option is taking a toll.
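To make the "one template" point concrete, extrema can be expressed as a mapreduce over (min, max) pairs, roughly as sketched below; Base's actual implementation differs in detail, but the generic reduction machinery (including dims support) is the suspected source of overhead compared to the hand-written loops above.

extrema_mr(x) = mapreduce(xi -> (xi, xi),
                          ((lo1, hi1), (lo2, hi2)) -> (min(lo1, lo2), max(hi1, hi2)),
                          x)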

On the NaN / empty-vector front, I think a vector type which prohibits NaNs could actually be useful (i.e. one that throws when operations generate a NaN). Specialized methods could then be dispatched on this type.
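A hypothetical sketch of such a wrapper type (names and details are purely illustrative, not an existing package):

struct NoNaNVector{T<:AbstractFloat} <: AbstractVector{T}
    data::Vector{T}
    function NoNaNVector(data::Vector{T}) where {T<:AbstractFloat}
        any(isnan, data) && throw(ArgumentError("input contains NaN"))
        return new{T}(data)
    end
end
Base.size(v::NoNaNVector) = size(v.data)
Base.getindex(v::NoNaNVector, i::Int) = v.data[i]

A range/extrema method dispatched on NoNaNVector could then skip the NaN check entirely.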

In your examples, which do not contain NaNs: for the small array case range4b() is ~5% faster, and for the large array case range4() is ~2% faster.


…and compared to extrema? Is it still 3x?

Yes, extrema() is still 3x slower.

branch prediction?

One answer is #45581, which seems to be 3-4x faster than the functions above.


Thank you for all the valuable answers. In the absence of a dedicated range function in the standard library, I marked the one from @stillyslalom as the solution, as it is the most compact one that uses extrema(), which, according to @mcabbott, will be greatly accelerated soon.
