I have noticed that a particular calculation in the MixedModels package unexpectedly allocates a lot of storage. The method in question is around lines 340 to 350 of `modelterms.jl`. The calculation involves the columns of a `k` by `n` matrix, where `k << n`, and a `PooledDataArray` that indicates which group each column belongs to. There are `g` groups. In the code, the `n`-vector `A.f.refs` will contain the integers `1:g`, where `g < n`.
The output value is a `g`-vector of `k` by `k` matrices, called `d` in the code. These are zeroed out, then each column is used in a rank-1 update of the corresponding element of `d`. Right now the loop is written

```julia
for i in eachindex(refs)
    ## FIXME: A lot of allocation going on here
    BLAS.syr!('L', one(T), view(Az, :, i), d[refs[i]])
end
```
Checking with `--track-allocation` shows considerable allocation in this loop (which is called many times on a large problem):

```
         0 for i in eachindex(refs)
         -     ## FIXME: A lot of allocation going on here
1690606080     BLAS.syr!('L', one(T), view(Az, :, i), d[refs[i]])
         - end
```
I don't understand where the allocation is coming from. I had thought that a `view` would be lightweight, especially a view of a single column of a matrix. Is there a better way of writing this that avoids all the allocation?
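For anyone who wants to poke at this without the package, here is a minimal standalone sketch of the same pattern. The sizes and the `refs` vector are made up; `update!` is a hypothetical name, and `@allocated` stands in for the `--track-allocation` run:

```julia
using LinearAlgebra
using LinearAlgebra: BLAS

# Made-up sizes; the real problem has k << n and g < n.
k, n, g = 2, 10_000, 10
Az   = randn(k, n)
refs = rand(1:g, n)                  # stand-in for A.f.refs
d    = [zeros(k, k) for _ in 1:g]

# Same shape as the loop in modelterms.jl: a rank-1 update of
# d[refs[i]] from column i of Az, lower triangle only.
function update!(d, Az, refs)
    for i in eachindex(refs)
        BLAS.syr!('L', 1.0, view(Az, :, i), d[refs[i]])
    end
    return d
end

update!(d, Az, refs)                      # warm up / compile first
println(@allocated update!(d, Az, refs))  # bytes allocated per sweep
```

Whether this reports a large number depends on the Julia version, since it hinges on whether the `SubArray` wrapper gets stack-allocated.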
I thought I might be able to avoid the allocation problem by using an `unsafe_wrap` of an array, even though it is, well, unsafe. But it actually made the problem worse:

```
         0 refs = A.f.refs
         0 Apt = pointer(Az)
         0 @inbounds for i in eachindex(refs)
         -     ## FIXME: A lot of allocation going on here
2817676800     BLAS.syr!('L', one(T), unsafe_wrap(Array, Apt, l), d[refs[i]])
         0     Apt += l * sizeof(T)
         - end
```
Actually, I think the "twice as bad" may be because I ran the fit twice and forgot to call `Profile.clear_malloc_data()` after the first fit.
Ideally `view` will be allocation-free some day, but for now the wrapper gets allocated in many cases (https://github.com/JuliaLang/julia/issues/10899). You can try the old ArrayViews package, where the unsafe contiguous view should manage to get optimized away… but it hasn't gotten much love lately.
I think the problem was the calls to `BLAS.syr!`, although I am not sure exactly why that would allocate. In any case, the calculation is simple enough that I just replaced the high-level calls with explicit loops, and that doesn't allocate.
The reason I don't think it was the `view` call that was the problem is that I still had the allocation problem after switching to `unsafe_wrap(Array, ...)`.
Now it looks like

```julia
refs = A.f.refs
@inbounds for i in eachindex(refs)
    dri = d[refs[i]]
    for j in 1:l
        Aji = Az[j, i]
        for k in 1:l
            dri[k, j] += Aji * Az[k, i]
        end
    end
end
```
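One small observation on the rewrite: `syr!('L', …)` only updated the lower triangle, while the explicit loop above fills the full matrix. If only the lower triangle is needed downstream, the inner loop can start at the diagonal. A sketch (the function name `update_lower!` is made up; `Az`, `refs`, `d`, and `l` follow the post):

```julia
# Fill only the lower triangle of each d[refs[i]], matching what
# BLAS.syr!('L', ...) did. Assumes Az is l-by-n and each d element
# is an l-by-l matrix.
function update_lower!(d, Az, refs)
    l = size(Az, 1)
    @inbounds for i in eachindex(refs)
        dri = d[refs[i]]
        for j in 1:l
            Aji = Az[j, i]
            for k in j:l               # k ≥ j: lower triangle only
                dri[k, j] += Aji * Az[k, i]
            end
        end
    end
    return d
end
```

This roughly halves the inner-loop work for larger `l`; for `l = 2` the difference will be negligible.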
Instead of `view`, try using `unsafe_aview` from the ArrayViews package. As long as your code works with `AbstractArray`s, this should be non-allocating. I am using this until we get non-allocating views in Base.
I've noticed that the new inliner in Julia 0.7 seems more likely to optimize out the construction of the `SubArray`; you could give that a try.
My current opinion is that it is not the call to `view` that is causing the allocations but rather `syr!`. Because of the Fortran calling sequence used by the BLAS, I think that all the integers passed to the BLAS subroutine need to be allocated. The `ccall` is

```julia
ccall((@blasfunc($fname), $lib), Void,
      (Ptr{UInt8}, Ptr{BlasInt}, Ptr{$elty}, Ptr{$elty},
       Ptr{BlasInt}, Ptr{$elty}, Ptr{BlasInt}),
      &uplo, &n, &α, x,
      &stride(x, 1), A, &max(1, stride(A, 2)))
```
The point is that this is called many, many times with `n = 2`. Even trivial amounts of allocation within the call will add up.
Pretty sure that passing integers to Fortran with `&` will not allocate.
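One way to separate the two hypotheses is to measure a single `syr!` call on a plain `Vector`, taking `view` out of the picture entirely. A sketch (whether the printed number is zero depends on the Julia version):

```julia
using LinearAlgebra
using LinearAlgebra: BLAS

x = randn(4)
A = zeros(4, 4)

BLAS.syr!('L', 1.0, x, A)                      # warm up / compile
println(@allocated BLAS.syr!('L', 1.0, x, A))  # bytes for the call alone
```

If this reports zero but the loop with `view` does not, the wrapper is the culprit; if it reports nonzero, the allocation is inside the `syr!` wrapper itself.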
I haven't been able to find another explanation, as the allocations persist even when replacing `view(...` with `unsafe_wrap(Array, ...`.
As it is, this problem motivated me to rewrite the code to avoid the calls to `view` and `syr!`, so I guess all is good in the end. The tools for checking allocations and profiling execution have proven their worth again.
`unsafe_wrap` does allocate the array wrapper.
@dmbates did you try `unsafe_aview`?