Avoiding allocations in `view`s

gdkrmr · July 3, 2019, 1:05pm

Is there a way to avoid allocations when combining view with missing values? If I replace the inner loop with sum(xx), both tests allocate.

using BenchmarkTools

X = randn(10, 1000)
Xm = convert(Array{Union{Float64, Missing}}, X)


function colsums(x::Array{T}) where T
    y = Vector{T}(undef, size(x, 2))
    @inbounds for i in 1:size(x, 2)
        xx = view(x, :, i)
        x0 = 0
        for j in eachindex(xx)
            x0 += xx[j]
        end
        y[i] = x0
    end
    return y
end

@btime colsums(X);
@btime colsums(Xm);

julia> @btime colsums(X);
  9.974 μs (1 allocation: 7.94 KiB)

julia> @btime colsums(Xm);
  247.901 μs (20490 allocations: 329.08 KiB)

tkoolen · July 3, 2019, 1:38pm

One option is UnsafeArrays; see e.g. this post: Array views becoming dominant source of memory allocation - #32 by oschulz. The current rule of thumb is that creating an array view will allocate if the view is used as an argument to a non-inlined function or if it is returned from a function.
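The rule of thumb above can be probed with a small experiment. This is only a sketch: `sum_noinline`, `sum_inline`, `col_noinline`, `col_inline`, and `measure` are hypothetical names made up for illustration, and the reported byte counts depend on the Julia version (newer compilers can often avoid the allocation even in the non-inlined case), so no expected numbers are given.

```julia
# Hypothetical helpers: the same loop, once forced out of line, once inlined.
@noinline function sum_noinline(v)
    s = zero(eltype(v))
    for x in v
        s += x
    end
    return s
end

@inline function sum_inline(v)
    s = zero(eltype(v))
    for x in v
        s += x
    end
    return s
end

# Each wrapper creates a column view and hands it to one of the helpers.
col_noinline(A, i) = sum_noinline(view(A, :, i))
col_inline(A, i) = sum_inline(view(A, :, i))

# Call once so compilation is not counted, then measure allocated bytes.
measure(f, A) = (f(A, 1); @allocated f(A, 1))

A = randn(10, 1000)
println("view passed to non-inlined function: ", measure(col_noinline, A), " bytes")
println("view passed to inlined function:     ", measure(col_inline, A), " bytes")
```

If the first number is nonzero and the second is zero, the view is being heap allocated only when it crosses a non-inlined call boundary, which is the situation the rule of thumb describes.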

ExpandingMan · July 3, 2019, 1:42pm

Your x0 is initialized as an Int and ends up being promoted inside the loop. This is definitely causing a lot of extra allocations, though I’m not quite sure why it goes so much more badly wrong with missing.

function colsums(x::Array)
    y = Vector{eltype(x)}(undef, size(x, 2))
    @inbounds for i ∈ 1:size(x, 2)
        xx = view(x, :, i)
        x0 = zero(eltype(x))
        for j ∈ eachindex(xx)
            x0 += xx[j]
        end
        y[i] = x0
    end
    y
end

julia> @btime colsums(Xm);
  11.640 μs (1 allocation: 8.94 KiB)

with your version I get

julia> @btime colsums(Xm);
  340.882 μs (20490 allocations: 329.08 KiB)
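The promotion can be checked directly in the REPL. A minimal sketch, reusing the `Xm` construction from the original post; it only shows how the accumulator's type changes and says nothing about timings:

```julia
Xm = convert(Array{Union{Float64, Missing}}, randn(10, 1000))

x0 = 0                      # the accumulator starts as an Int
a = x0 + 1.0                # adding a Float64 promotes it to Float64
b = x0 + missing            # adding missing yields missing
z = zero(eltype(Xm))        # zero of Union{Float64, Missing} is the Float64 0.0

println(typeof(x0), " -> ", typeof(a), " or ", typeof(b))
println("zero(eltype(Xm)) isa ", typeof(z))
```

With `x0 = 0` the accumulator can be an Int, a Float64, or missing over the course of the loop, whereas starting from `zero(eltype(x))` removes the Int case.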

This somehow beats vec(sum(Xm, dims=1)) which seems odd. I wonder if it’s worth opening an issue.

julia> @btime vec(sum(X, dims=1));
  4.937 μs (3 allocations: 8.02 KiB)

julia> @btime vec(sum(Xm, dims=1));
  15.836 μs (3 allocations: 8.98 KiB)

gdkrmr · July 3, 2019, 3:11pm

Oh, thanks for spotting this. The mistake was only in the MWE, so fixing it didn’t solve my actual problem. What solved a big part of the problem was removing an unneeded where T from some functions.

sum is probably slower because it does pairwise aggregation for better accuracy and therefore has some overhead.
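For readers unfamiliar with pairwise aggregation, here is a minimal sketch of the idea. `pairwise_sum` and `naive_sum` are hypothetical helpers written for illustration, not Base's implementation, and the block size of 128 is an arbitrary assumption. Whether pairwise summation actually explains the timings of sum(Xm, dims=1) above is not verified here.

```julia
# Hypothetical helper: plain left-to-right accumulation.
function naive_sum(x::AbstractVector)
    s = zero(eltype(x))
    for v in x
        s += v
    end
    return s
end

# Hypothetical helper: split the range in half recursively and add the two
# partial sums; short ranges fall back to a plain loop.
function pairwise_sum(x::AbstractVector, lo::Int=firstindex(x), hi::Int=lastindex(x);
                      blocksize::Int=128)
    if hi - lo < blocksize
        s = zero(eltype(x))
        @inbounds for i in lo:hi
            s += x[i]
        end
        return s
    else
        mid = (lo + hi) ÷ 2
        return pairwise_sum(x, lo, mid; blocksize=blocksize) +
               pairwise_sum(x, mid + 1, hi; blocksize=blocksize)
    end
end

x = fill(0.1f0, 10^6)             # Float32 makes rounding error easy to see
reference = sum(Float64.(x))      # higher-precision reference value
println("naive error:    ", abs(naive_sum(x) - reference))
println("pairwise error: ", abs(pairwise_sum(x) - reference))
```

The recursion adds function-call and bookkeeping overhead compared with a single loop, which is the kind of cost the post is referring to.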

Elrod · July 4, 2019, 1:57am

sum(xx) is slower than a loop for two reasons: it isn’t inlined, so the compiler heap-allocates your views, and 10 rows is too short for vectorization to pay off.
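One way to act on this is to move the inner loop into a small helper marked `@inline`, so the view never crosses a non-inlined call boundary. A sketch under that assumption; `inline_sum`, `colsums_inline`, and `colsums_sum` are hypothetical names, and whether either variant allocates depends on the Julia version, so benchmark both with @btime rather than trusting this example:

```julia
# Hypothetical helper: an explicit loop that the compiler is asked to inline.
@inline function inline_sum(v)
    s = zero(eltype(v))
    @inbounds for j in eachindex(v)
        s += v[j]
    end
    return s
end

# Column sums using the inlined helper on each column view.
function colsums_inline(x::AbstractMatrix)
    y = Vector{eltype(x)}(undef, size(x, 2))
    for i in axes(x, 2)
        y[i] = inline_sum(view(x, :, i))
    end
    return y
end

# Column sums calling Base's sum on each column view, for comparison.
function colsums_sum(x::AbstractMatrix)
    y = Vector{eltype(x)}(undef, size(x, 2))
    for i in axes(x, 2)
        y[i] = sum(view(x, :, i))
    end
    return y
end

X = randn(10, 1000)
Xm = convert(Array{Union{Float64, Missing}}, X)
println(colsums_inline(X) ≈ colsums_sum(X))
```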

What solved a big part of the problem was removing an unneeded where T from some functions.

I keep getting bit by that, and would love to hear if anyone has a good solution.
Once upon a time this triggered an error. I wish that were still the case.

gdkrmr · July 4, 2019, 6:46am

A while ago I opened an issue for LanguageServer.jl asking it to issue a warning.

The code in which I am encountering this problem has some other really strange issues, too (e.g. it reaches the unreachable in Julia 1.0, 1.1, 1.2, and master due to some type-inference bug, and the only thing that helps is manually inlining everything), and I am really struggling to create a reproducible example.

Elrod · July 4, 2019, 7:30am

Cool, following the issue. I should try getting LanguageServer.jl working in my emacs again.
I didn’t try Spacemacs because I wasn’t familiar with the vim keybindings either, and I have things configured in a way I like otherwise. (I’ve been using ergoemacs, which has only slightly different movement keys; maybe I should have gone evil/Spacemacs instead.)
That mostly means treemacs, which is also integrated into Spacemacs, so I’ll probably give it a try sometime.

Having to manually inline everything is worse than having to @inline problem functions, like I do to work around this issue.

How far does the “everything” go in “manually inlining everything”? I’d hope you don’t have to inline getindex calls, for example.

gdkrmr · July 4, 2019, 7:54am

Luckily not getindex. I have to manually inline all the imported functions from a package that I am extending (which is ~3 layers deep). The weird/interesting/annoying part is that it works fine in one case but not in a very similar one.
