How to avoid allocations when multiplying numbers with missings

Coming from R, I have a hard time using missings and ensuring type stability.

I guess that the following allocations when multiplying numbers also relate to this issue. How can I avoid the memory allocations in the following code snippet?

x = [missing, 1.0];
y = [missing, 2.0];
f(x,y) = x[1] * y[1] * x[2] * y[2];
@time f(x,y); @time f(x,y)

  0.000006 seconds (2 allocations: 32 bytes)

Please help me so that I can write my own scalar operations within loops without allocating memory in each iteration. (Do not focus on special solutions for the exact formula in the example.) E.g. in the following snippet there are allocations in each iteration.

function f(x,y)
    s = zero(eltype(x))                      # eltype(x) is Union{Missing, Float64}
    for i in axes(x,1), j in axes(y,1)
        sij = x[i] * x[j] + y[i] * y[j]      # may be missing
        if !ismissing(sij)
            s += sij
        end
    end
    s
end

In the example I could check for any missings before evaluating the formula. However, that makes already complex code even more complex, and it still allocates when using any on a tuple.

function f(x,y)
    T = nonmissingtype(eltype(x))            # Float64
    s = zero(T)
    for i in axes(x,1), j in axes(y,1)
        #if !any(ismissing.((x[i], x[j], y[i], y[j])))
        if !ismissing(x[i]) && !ismissing(x[j]) && !ismissing(y[i]) && !ismissing(y[j])
            sij = x[i] * x[j] + y[i] * y[j]
            s += sij::T                      # assert the non-missing type so s stays concrete
        end
    end
    s
end
@time f(x,y); @time f(x,y)
0.000004 seconds (1 allocation: 16 bytes)
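
One thing I considered for readability is a small varargs helper (hypothetical name, and I have not verified whether it actually avoids the allocations on 1.5):

# predicate form of any over a plain tuple -- no broadcasted temporary
anymissing(vals...) = any(ismissing, vals)

# the check in the loop would then read:
# if !anymissing(x[i], x[j], y[i], y[j])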

I guess there is a simpler and more general way of dealing with missings.

Haven’t looked at this closely, but independently of what exactly you’re doing: (1) never benchmark in global scope, and (2) use BenchmarkTools to get more reliable measurements, interpolating any global variables into your benchmarking expression:
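
using BenchmarkTools

# the $ interpolates the global variables into the benchmarked expression,
# so the timing measures f itself rather than access to untyped globals
@btime f($x, $y)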

For your final version of f I get:

julia> @btime f($x, $y)
  12.913 ns (0 allocations: 0 bytes)
5.0

Thanks for showing me how to benchmark properly.
Yes, I managed to create a version without allocation in the loop (also visible with @time, as there it’s only a single allocation).
However, the code gets really obfuscated with all these type annotations and chained && checks.
I want to understand why a simple multiplication of three numbers leads to an allocation and how to rewrite the code in a concise way to avoid this.

My beginner’s question is still: is there a simpler way to deal with missings in this case?

P.S. I have not found out how to edit my original post (to use @btime), or to answer in a comment to @nilshg. Maybe I need to earn some badges before being able to do this.

Your first loop version doesn’t allocate and is quite readable, but it’s not as efficient as the long version that checks for missing values before doing the calculation…

For me it is allocating:

x = [missing, 1.0];
y = [missing, 2.0];

function f1(x,y)
    s = zero(eltype(x))
    for i in axes(x,1), j in axes(y,1)
        sij = x[i] * x[j] * y[i] * y[j]
        if !ismissing(sij) 
            s += sij
        end
    end
    s
end
@btime f1($x,$y)
  153.440 ns (9 allocations: 144 bytes)

Which version of Julia are you using? On 1.6rc1 I get:

julia> @btime f1($x,$y)
  27.192 ns (0 allocations: 0 bytes)

That’s nice. I am using Julia version 1.5.3.

It would be nice to understand what is going on and under what conditions such a multiplication allocates.

You might also like this:

julia> f2(x,y) = sum(skipmissing(x[i] * x[j] + y[i] * y[j] for i in axes(x,1), j in axes(y,1)))
f2 (generic function with 1 method)

julia> @btime f2($x,$y)
  29.772 ns (0 allocations: 0 bytes)
5.0

Thanks, I see that works for the simple example. For the real use case I have autocorrelations. skipmissing would then confound the distances between positions. I need to keep the missing records in the (more complicated) loop and sum.

I don’t understand: As long as you first do the calculations and then reject missing results (like f1 does), it should make no difference to use skipmissing?

Yes, f1 is ok (though allocating in Julia 1.5). I just wanted to point out that f2, which uses skipmissing, does not work for my real use case. skipmissing returns a vector of a different length, e.g. [1.0, 2.0, 3.0] versus [1.0, missing, 2.0, 3.0]. The 1.0 and the 2.0 are at different distances in the two vectors.
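
E.g. (a quick check of the length change):

collect(skipmissing([1.0, missing, 2.0, 3.0]))   # gives [1.0, 2.0, 3.0] -- position 2 is gone
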
Can you refer me to an explanation of why the allocations are there in Julia 1.5?

But the calculations are made as usual with the non-filtered array elements so the “distances” are not affected:

sum(skipmissing(x[i] * x[j] + y[i] * y[j] for i in axes(x,1), j in axes(y,1)))

Here skipmissing doesn’t change which values of x and y are used. It’s a normal function call. Julia first evaluates the argument (x[i] * x[j] + y[i] * y[j] for i in axes(x,1), j in axes(y,1)). This is a generator that does all the calculations directly on x and y and returns the results (some of which might be missing).

These results are later passed to the skipmissing function, which will discard the missing values. So it really works exactly like f1.
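
You can see this by collecting the generator first (a quick sketch with the same x and y as above):

gen = (x[i] * x[j] + y[i] * y[j] for i in axes(x,1), j in axes(y,1))
collect(gen)            # 2×2 matrix [missing missing; missing 5.0] -- positions preserved
sum(skipmissing(gen))   # 5.0 -- the missing results are dropped only from the sum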


I don’t know… For some reason, these two functions allocate in 1.5.3:

f(x) = x[1] * x[2] * x[3]
g(x) = x[1] + x[2] + x[3]

julia> @btime f($[missing, 1.0, 2.0])
  28.780 ns (2 allocations: 32 bytes)

julia> @btime g($[missing, 1.0, 2.0])
  28.334 ns (2 allocations: 32 bytes)

but the following doesn’t:

h(x) = x[1] * x[2] + x[3]

julia> @btime h($[missing, 1.0, 2.0])
  4.558 ns (0 allocations: 0 bytes)

I also don’t know what has changed exactly in 1.6 to avoid these allocations.
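
If you want to dig into it yourself, one starting point (just a suggestion; it shows the inferred types rather than explaining the 1.6 change) is @code_warntype:

using InteractiveUtils   # already loaded in the REPL

# the intermediate results are inferred as Union{Missing, Float64};
# whether such union-typed values end up boxed (allocated) depends on
# the compiler version
@code_warntype f([missing, 1.0, 2.0])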


Understanding these allocations does not seem to be a first-steps question, and the behaviour does not occur with Julia 1.6. Hence I accept sudete’s answer as the solution.

I relocated your post to “Usage > Performance”.
