To assist with formulating a JuMP model for an SDP, I need to create affine expressions y (then the objectives and constraints can be cheaply constructed by getindex into y):
F = qr(m::SparseMatrixCSC) # call to SuiteSparse.SparseQR
r = rank(F)
cols = @view F.Q[:,1:r]
model = JuMP.Model()
@variable(model, z[1:r])
@expression(model, y, cols * z)
# omitted: objective, constraints...
The size of cols is (172410, 1107) for a small instance and around (10_000_000,100_000) for a larger one. How can I speed up the creation of these expressions?
We should really have an efficient getindex method for slicing Q, but no one has implemented one yet. We could similarly implement a more efficient method for working with views of Q.
My naive benchmark of the whole code shows that it is allocating way too much. So you think the bottleneck is with getindex and not matmul of non-sparse matrix with Vector{VariableRef}? Can QR.Q be efficiently multiplied with Vector{Float64} or arbitrary Vector?
Edit: if I happen to know that Q is sparse, would it benefit to explicitly make sparse(Q)?
I didn’t notice that you are working with sparse QR. That is even more tricky. See: Differences in A \ b for sparse and nonsparse rank-deficient A - #5 by stevengj … certainly, you can multiply by a subset of the columns of Q efficiently as explained in that post, but I’m less sure about what is feasible to do with JuMP.
(This may be one of the cases where JuMP is too limiting and you need to call the lower-level optimizers directly.)
Thanks @stevengj the problem is really with getindex. Extracting columns of QR.Q (obtained from sparse qr call) by matrix multiplication with Matrix(I, size(QR.Q,1), r) is really fast!!!
Btw, some of these FAQs should really be in the documentation of Sparse Linear Algebra.
As I mentioned above, however, you have to be careful because sparse QR also includes a permutation of the input and output rows, e.g. as employed here. That is, F.Q may not mean what you think it means.
Thanks but I got that part sorted out after struggling for some time. Basically from the documentation F.Q*F.R = A[F.prow,F.pcol] which is equivalent to “index on the left” as in (F.Q*F.R)[invperm(F.prow),invperm(F.pcol)] = A or also to F.Q[invperm(F.prow),invperm(F.pcol)]*F.R[invperm(F.prow),invperm(F.pcol)] = A (distribute the indexing inside).