Below is my implementation of the block principal pivoting algorithm for solving a non-negative linear least squares problem. This function is the bottleneck in my code (running on v0.6.2), which calls this in an inner loop with square matrices
AtA of size 200-600 and appropriate vectors
x0 as inputs.
Is there any way to speed this up?
Based on profiling a Matlab version of this, it appears that essentially all the time is spent solving the linear system:
F is a mask that changes significantly every iteration. I have tried prepending
@inbounds, without any effect on the timings (measured using
@code_warntype does not indicate any type instability.
Any advice would be much appreciated!
Here is the full function:
function nnls_bpp(AtA,Atb,x0) n = length(Atb) F = x0 .> 0 x = zeros(n) x[F] = AtA[F,F]\Atb[F] y = AtA*x - Atb xFneg = (x .< 0) .& F yGneg = (y .< 0) .& .!F nInfeasible = sum(xFneg .| yGneg) p = 3 t = n+1 while nInfeasible > 0 if nInfeasible < t t = nInfeasible p = 3 F[xFneg] = false F[yGneg] = true elseif p >= 1 p = p - 1 F[xFneg] = false F[yGneg] = true else idx = findlast(xFneg .| yGneg) F[idx] = .!F[idx] end x = zeros(n) x[F] = AtA[F,F]\Atb[F] y = AtA*x - Atb xFneg = (x .< 0) .& F yGneg = (y .< 0) .& .!F nInfeasible = sum(xFneg .| yGneg) end return x end