Potential small inefficiency in `triu!` and `tril!`?

Seif_Shebl · June 16, 2018, 9:20pm

The definition of `tril!` from dense.jl in LinearAlgebra is copy-pasted below; `triu!` follows the same pattern. At line 181 of that file, in `triu!`, `zero(M[i])` is computed on every iteration of the inner loop. It could be hoisted out of the loop as `ZERO = zero(M[1])` just below `idx = 1`, so that line 181 reads `M[i] = ZERO`. The same thing happens at line 225 in `tril!`.

function tril!(M::AbstractMatrix, k::Integer)
    m, n = size(M)
    if !(-m - 1 <= k <= n - 1)
        throw(ArgumentError(string("the requested diagonal, $k, must be at least ",
            "$(-m - 1) and at most $(n - 1) in an $m-by-$n matrix")))
    end
    idx = 1
    for j = 0:n-1
        ii = min(max(0, j-k), m)
        for i = idx:(idx+ii-1)
            M[i] = zero(M[i])
        end
        idx += m
    end
    M
end
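
For reference, here is a minimal sketch of the hoisted variant applied to the `tril!` shown above. The name `my_tril!` is hypothetical (chosen to mirror `my_triu!`), and the sketch assumes a non-empty matrix with one-based linear indexing whose elements all share the same zero; `zero(M[1])` would throw on an empty matrix.

```julia
using LinearAlgebra

# Hypothetical variant of the tril! above: the zero is computed once,
# outside both loops, instead of on every inner iteration.
# Assumes M is non-empty and every element has the same zero value.
function my_tril!(M::AbstractMatrix, k::Integer)
    m, n = size(M)
    if !(-m - 1 <= k <= n - 1)
        throw(ArgumentError(string("the requested diagonal, $k, must be at least ",
            "$(-m - 1) and at most $(n - 1) in an $m-by-$n matrix")))
    end
    ZERO = zero(M[1])          # hoisted out of the loops
    idx = 1                    # linear index of the first entry of column j
    for j = 0:n-1
        ii = min(max(0, j-k), m)   # number of leading rows to clear in column j
        for i = idx:(idx+ii-1)
            M[i] = ZERO
        end
        idx += m
    end
    M
end

# Sanity check against the non-mutating tril from LinearAlgebra.
a = rand(5, 4)
result = my_tril!(copy(a), 0)
```

Comparing `result` with `tril(a)` is a quick way to confirm that the hoisting did not change behavior before timing anything.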

Benchmarking the original and modified versions shows a small performance improvement of about 8%.

julia> using BenchmarkTools

julia> a = rand(1000,1000);

julia> @btime triu!(a,0);
  187.311 μs (0 allocations: 0 bytes)

julia> @btime my_triu!(a,0);
  173.455 μs (0 allocations: 0 bytes)

Should I open an issue to modify both functions or did I miss something?

EDIT:

Oh, sorry, my fault: I modified `a` between the two calls. The 0.7-alpha compiler seems smart enough that it likely already optimizes the repeated `zero(M[i])` away and constant-propagates that `ZERO`.

julia> a = rand(1000,1000);

julia> b = copy(a);

julia> @btime triu!(a,0);
  173.969 μs (0 allocations: 0 bytes)

julia> @btime my_triu!(b,0);
  173.456 μs (0 allocations: 0 bytes)
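
One way to avoid this kind of mistake when timing a mutating function is to give every timed call a fresh copy of the input. The sketch below is one possible setup, not what was run above; it assumes BenchmarkTools is installed and uses its `setup` and `evals` parameters, and the names `a0`, `x`, and `t` are only for illustration.

```julia
using BenchmarkTools, LinearAlgebra

a = rand(1000, 1000)
a0 = copy(a)   # kept only to confirm that `a` itself is never mutated

# `setup` runs before each sample, and `evals=1` limits each sample to a
# single evaluation, so every timed call to triu! sees a fresh copy of `a`.
t = @belapsed triu!(x, 0) setup=(x = copy($a)) evals=1

println("minimum time: ", t, " s")
```

The same `setup=(x = copy($a)) evals=1` arguments can be passed to `@btime`, so two implementations can be compared on identical inputs.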
