Simple Dykstra projection : Could it be faster?

lrnv · July 28, 2021, 12:21pm

Hi,

I’m writting a simple Dykstra sheme for projecting a vector start onto equality constraints Ax = b and inequelity constraints x >= 0. It works great, but I wander if it could be faster ?

Here’s the current version :

using Random
Random.seed!(11)
N,M = 30,50
A = big.(rand(N,M));
x0 = big.(rand(M));
b = A*x0;
# check : 
A*x0 == b;
start = big.(randn(M));

get_AiAA(A) = A'inv(A*A')
proj_eq(x,A,b,AiAA) = x .- AiAA*(A*x-b)
proj_ineq(x::T) where T = x > T(0) ? x : T(0)
function naive_version(start,A,b)
    AiAA = get_AiAA(A)
    x = deepcopy(start)
    old_x = zero(x)
    y = zero(x)
    p = zero(x)
    q = zero(x)
    # iter = 0
    while !(old_x ≈ x)
        old_x .= x
        y .= proj_eq(x.+p,A,b,AiAA)
        p .= x .+ p .- y
        x .= proj_ineq.(y .+ q)
        q .= y + q - x
        # iter += 1
    end
    # println("Converged in $iter iterations")
    return x
end
function update!(old_x,x,p,q,ProjP,Projq)
    old_x .= x
    x .= x .+ p
    p .= ProjP*x .- Projq
    x .= x .- p .+ q
    q .= x
    x[x.<0] .= 0
    q[q.>0] .= 0
end
function Dykstra(start,A,b)
    # Deal with the linear projection
    AiAA = A'inv(A*A')
    ProjP = AiAA*A
    Projq = AiAA*b

    # Dykstra : 
    x = deepcopy(start)
    old_x, p, q = zero(x), zero(x), zero(x)
    while !(old_x ≈ x)
        update!(old_x, x, p, q, ProjP, Projq)
    end
    return x
end

naive_solution = naive_version(start,A,b);
solution = Dykstra(start,A,b);
using Test
using BenchmarkTools
@test solution ≈ naive_solution
@test all(solution .>=0)
@test A*solution ≈ b
@btime Dykstra($start,$A,$b);

I included a naive version that is known to be the right algorithm to compare to. On my machine it takes 2.1s for these array sizes, but it gets slow for bigger arrays.

It seems to be type-stable. I have concerns regarding broadcasting : could’nt it be faster with loops in the update! function ?

kristoffer.carlsson · July 28, 2021, 12:59pm

Profiling is your friend when it comes to questions like this. Here, it looks like almost all time is spent in the matrix-matrix multiplication with big floats. So thinking about how to optimize that or avoid it at all should probably be the first step.

lrnv · July 28, 2021, 1:09pm

I hear you. By profiling, It seems like i’m allocating a lot a BigFloats, and hence should go to MutableArithmetics.jl
However I have troubles understanding how to change my update! function so that it down not allocate…

kristoffer.carlsson · July 28, 2021, 1:54pm

It seems MutableArithmetics have some stuff for matrix multiplication. You could perhaps try that:

julia> a = big.(rand(300, 300));

julia> c = similar(a);

julia> @btime MutableArithmetics.mul_to!(c, a, a);
  1.930 s (180002 allocations: 8.93 MiB)

julia> @btime mul!(c, a, a);
  3.692 s (108720000 allocations: 5.27 GiB)

Sukera · July 28, 2021, 2:02pm

Do you really need the extra precision from BigFloats or is something like a Double64 from DoubleFloats.jl precise enough?

lrnv · July 28, 2021, 3:19pm

I need both Which is why i’m testing with BigFloats and MultiFloats.jl, alongside standard Float64.

Topic		Replies	Views
Could this kernel go faster? Performance	3	809	January 4, 2022
Zygote dozens* of times slower than manually written function Performance zygote , forwarddiff	17	1719	April 21, 2022
Improving Performance of a Loop Performance	9	417	November 3, 2021
Inlining of rand -- Vector vs nx1 Matrix Internals & Design	5	544	May 6, 2021
Could anyone make this run faster? Performance	21	556	December 26, 2022

Simple Dykstra projection : Could it be faster?

Related topics