I recently installed Julia version 1.5.0-beta1.0 and was impressed with the non-allocating views. To get a feel for them, I tried writing LU factorisation in a vectorised style, and the performance compared very well with the built-in `generic_lufact!`, which uses straight loops.
```julia
swap!(x, y) = for i in eachindex(x, y) @inbounds x[i], y[i] = y[i], x[i] end

function lufact!(A, p)
    m, n = size(A)
    p .= 1:m
    @views @inbounds begin
        for k = 1:m-1
            _, kp = findmax(i -> abs(A[i,k]), k:m)  # two-argument findmax, #35316
            if k != kp
                swap!(A[k,:], A[kp,:])
                swap!(view(p, k), view(p, kp))
            end
            A[k+1:m,k] ./= A[k,k]
            for j = k+1:n
                A[k+1:m,j] .-= A[k,j] .* A[k+1:m,k]
            end
        end
    end
    A, p
end
```
Note: I used the two-argument `findmax` from https://github.com/JuliaLang/julia/pull/35316#issuecomment-622629895 to find the pivot with the maximum absolute value. I hope it makes its official appearance in Julia shortly, because it seems very useful. But maybe there's a better way to do this that I overlooked.
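For anyone on a release without that PR, here is a rough stand-in I could use instead. The name `findmax_by` is my own invention, not an official API, and unlike the merged `findmax(f, domain)` it returns the maximising *element* itself, which is how `kp` is used as a row index above:

```julia
# Hypothetical fallback for the two-argument findmax.
# Returns (f(x), x) for the element x of itr that maximises f.
function findmax_by(f, itr)
    y = iterate(itr)
    y === nothing && throw(ArgumentError("collection must be non-empty"))
    best_x, state = y
    best_v = f(best_x)
    while (y = iterate(itr, state)) !== nothing
        x, state = y
        v = f(x)
        if v > best_v
            best_v, best_x = v, x  # track the best value and its element
        end
    end
    best_v, best_x
end
```

With this, the pivot search becomes `_, kp = findmax_by(i -> abs(A[i,k]), k:m)`.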
Quick timing test:
```julia
using LinearAlgebra: generic_lufact!

function lutest()
    N = 1000
    A = rand(N, N)
    p = zeros(Int, N)
    B = copy(A)
    @time lufact!(B, p)
    C = copy(A)
    @time generic_lufact!(C)
    @show B ≈ C
end
```
I found the time very competitive with `generic_lufact!` (faster, even), which surprised me.
```
julia> lutest()
  0.115506 seconds
  0.206406 seconds (1 allocation: 7.938 KiB)
B ≈ C = true
```
By comparison, in Julia Version 1.4.2 the results were
```
julia> lutest()
  0.236483 seconds (1.51 M allocations: 91.735 MiB, 1.80% gc time)
  0.210450 seconds (2 allocations: 7.969 KiB)
B ≈ C = true
```
with lots of allocations.
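As a quick sanity check that the difference really is the views, one can measure a single non-escaping view in isolation (a toy example of my own, not taken from the factorisation code):

```julia
# Toy check: does taking a row view allocate on this Julia version?
rowsum(A) = sum(view(A, 1, :))  # sums the first row through a view

A = rand(100, 100)
rowsum(A)             # warm up first so compilation isn't counted
@allocated rowsum(A)  # expect 0 bytes on 1.5.0-beta1, a small allocation on 1.4
```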
I’m new to Julia, so any feedback on style or substance is appreciated. I also wondered whether I overlooked a built-in function to swap the contents of two arrays (like BLAS `swap`).
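For the two scalar entries of `p`, at least, a plain tuple assignment would avoid the zero-dimensional views entirely (this is just standard Julia, not a library routine):

```julia
p = collect(1:5)
k, kp = 2, 4
p[k], p[kp] = p[kp], p[k]  # swap two entries in place, no views needed
p  # [1, 4, 3, 2, 5]
```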
Thanks for your time.