Can avoiding unnecessary allocations while building and then solving a linear system inside a loop only be achieved by routines that modify in-place?

Hello,

I have some functions that build a linear system. I then want to build and solve the system in a loop while modifying a parameter. My first, straightforward approach resulted in many unnecessary allocations. It was as follows:

function f(n::Integer)
    x = Array{Float64}(undef, n, n)
    @inbounds for i = 1:n
        for k = 1:n
            x[k, i] = randn()
        end
    end
    return x
end

function g(n::Integer, m::Integer)
    y = Array{Float64}(undef, m)
    b = zeros(Float64, n)
    b[1] = 1.0
    @inbounds for i = 1:m
        a = f(n)
        x = a\b
        y[i] = x[1]
    end
    return y
end

The timings:

@time g(2, 4) # 0.000021 seconds (26 allocations: 2.109 KiB)
@time g(2, 5) # 0.000025 seconds (31 allocations: 2.563 KiB)

The only way I could think of to avoid those unnecessary allocations was to use functions that modify their inputs in-place, as follows:

using LinearAlgebra

function f!(x, n::Integer)
    for i = 1:n
        for k = 1:n
            x[k,i] = randn()
        end
    end
end

function g!(y, n::Integer, m::Integer)
    a = Array{Float64}(undef, n, n)
    b = Array{Float64}(undef, n)
    for i = 1:m
        b[1] = 1.0
        for k = 2:n
            b[k] = 0.0
        end
        f!(a, n)
        LAPACK.gesv!(a, b)  # overwrites b with the solution and a with its LU factors
        y[i] = b[1]
    end
end

The timings:

y = zeros(Float64, 5)
@time g!(y, 2, 4) # 0.000018 seconds (10 allocations: 752 bytes)
@time g!(y, 2, 5) # 0.000021 seconds (11 allocations: 848 bytes)

Is there a way to optimize the first approach with an f that does not need to modify the matrix a in-place? I am concerned about the users of my code: engineers with weak programming skills who know just a bit of MATLAB. I fear that the second approach will be too alien to them, undermining my efforts to proselytize Julia.

One thing you could do would be to use the @! macro from https://github.com/simonbyrne/InplaceOps.jl to let you write a \ b but have the actual code run in-place. Whether this is more or less clear to your users is hard to say: a macro can make the syntax look more like what they would expect, but it also introduces a new layer of indirection that they may eventually have to understand.
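Untested, but assuming @! rewrites x = F \ b into an in-place ldiv!(x, F, b) call (check the InplaceOps README for the exact forms it supports), the solve step could look something like:

using LinearAlgebra
using InplaceOps  # provides the @! macro

a = randn(4, 4)
b = randn(4)
x = similar(b)  # preallocated solution buffer

F = lu!(a)      # LU-factorize a in-place, reusing its storage
@! x = F \ b    # intended to lower to ldiv!(x, F, b), so no new solution vector is allocated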

On the other hand, I think your current code is actually fine: there is no magic in it, and it accomplishes its task efficiently. Perhaps your effort would be better spent writing comments to help your readers understand. After all, Julia and Matlab are quite different languages, and good Julia code will usually end up looking quite different from Matlab code.


You can do in-place matrix operations with the lu! and ldiv! functions from the LinearAlgebra module, without dropping down to “raw” LAPACK calls. You can also use in-place “dot assignments” like .=. For example, your function could be rewritten:

function g!(y, n::Integer, m::Integer)
    a = Array{Float64}(undef, n, n)
    b = Array{Float64}(undef, n)
    for i = 1:m
        b .= 0
        b[1] = 1
        a .= randn.()     # refill a in-place with fresh random entries
        ldiv!(lu!(a), b)  # LU-factorize a in-place and overwrite b with the solution
        y[i] = b[1]
    end
end

However, if you are doing zillions of solves and other operations with tiny arrays, e.g. the 2x2 matrices in your code, it should generally be much faster to use the StaticArrays package, which both avoids heap allocation and has optimized, inlined implementations for such small sizes. That said, StaticArrays takes a bit of getting used to in order to get the full benefit (in particular, the array dimensions must be known when a critical function is compiled).
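Roughly (untested), a StaticArrays version of your loop for the 2x2 case might look like the following; the name g_static is just for illustration:

using StaticArrays

function g_static(m::Integer)
    y = zeros(Float64, m)
    b = SVector{2,Float64}(1.0, 0.0)  # small fixed-size vector, stack-allocated
    for i = 1:m
        a = @SMatrix randn(2, 2)      # 2x2 static matrix: no heap allocation
        x = a \ b                     # specialized small-size solve, also allocation-free
        y[i] = x[1]
    end
    return y
end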

PS. It is better to use @btime from the BenchmarkTools package, e.g. @btime g!($y, 2, 5), especially when benchmarking such inexpensive functions.
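For example (the $ interpolation avoids timing the access to the global y itself):

using BenchmarkTools

y = zeros(Float64, 5)
@btime g!($y, 2, 5)  # reports the minimum time and allocation count over many samples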
