Below is my implementation of the block principal pivoting algorithm for solving a non-negative linear least squares problem. This function is the bottleneck in my code (running on v0.6.2), which calls this in an inner loop with square matrices AtA
of size 200-600 and appropriate vectors Atb
and x0
as inputs.
Is there any way to speed this up?
Based on profiling a Matlab version of this, it appears that essentially all the time is spent solving the linear system: x[F]=AtA[F,F]\Atb[F]
, where F
is a mask that changes significantly every iteration. I have tried prepending @views
and/or @inbounds
, without any effect on the timings (measured using @btime
). Also, @code_warntype
does not indicate any type instability.
Any advice would be much appreciated!
Here is the full function:
function nnls_bpp(AtA,Atb,x0)
n = length(Atb)
F = x0 .> 0
x = zeros(n)
x[F] = AtA[F,F]\Atb[F]
y = AtA*x - Atb
xFneg = (x .< 0) .& F
yGneg = (y .< 0) .& .!F
nInfeasible = sum(xFneg .| yGneg)
p = 3
t = n+1
while nInfeasible > 0
if nInfeasible < t
t = nInfeasible
p = 3
F[xFneg] = false
F[yGneg] = true
elseif p >= 1
p = p - 1
F[xFneg] = false
F[yGneg] = true
else
idx = findlast(xFneg .| yGneg)
F[idx] = .!F[idx]
end
x = zeros(n)
x[F] = AtA[F,F]\Atb[F]
y = AtA*x - Atb
xFneg = (x .< 0) .& F
yGneg = (y .< 0) .& .!F
nInfeasible = sum(xFneg .| yGneg)
end
return x
end