Speed-up graphical lasso code


I would like to get you feedback to accelerate my routine glasso for graphical lasso computation (the code follows Friedman et al. 2007 https://arxiv.org/pdf/0708.3517.pdf) where S is the sample covariance matrix and ρ is the scaling factor upfront of the L1 penalty term.

using LinearAlgebra, Statistics, Lasso, SparseArrays, PDMats

function glasso(S, ρ; tol = 1e-3, Niter = 500)
    Nx = size(S, 1)
    W = S + ρ*I
    if Nx == 1
         return W
    W11 = zeros(Nx-1, Nx-1)
    b = zeros(Nx-1)
    βj = zeros(Nx-1)
    Wold = copy(W)
    jminus = zeros(Int64, Nx-1)
    # Iterate until convergence
    for i=1:Niter
        for j = 1:Nx
            jminus .= setdiff(1:Nx, j)
            # Compute a square-root factorization of W11
            W11 .= W[jminus, jminus]
            W11sqrt = sqrt(W11) #W_11^(1/2)
            b .= W11sqrt \ S[jminus,j] # W_11^(-1/2) * s_12
            # We need this rescaling by ρ/(Nx-1) due to the definition of the function fit in Lasso.jl, 
            # that includes a factor 1/(Nx-1) upfront of the log-likelihood term
            # fit(LassoPath, X, y, d=Normal(), l=canonicallink(d); ...)
            #  fits a linear or generalized linear Lasso path given the design
            #  matrix `X` and response `y`:
            #  ``\underset{\beta,b_0}{\operatorname{argmin}} -\frac{1}{N} \mathcal{L}(y|X,\beta,b_0) +     \lambda\left[(1-\alpha)\frac{1}{2}\|\beta\|_2^2 + \alpha\|\beta\|_1\right]``

            # Nx-1 is the number of rows for W11sqrt
            βj .= fit(LassoPath, W11sqrt, b, λ=[ρ/(Nx-1)],
                         standardize=false, intercept=false).coefs
            # Update W using that βj is a sparse vector
            W[j, jminus] = W11*sparse(βj)
            W[jminus, j] = W[j, jminus]
        # Stop criterion
        if norm(W - Wold,2) < tol
        copy!(Wold, W)
    return W

Minimal working example:

Nx = 40
Ne = 100
θX = PDMat(Symmetric(Matrix(2.0*sprand(Nx,Nx,0.3) + 5.0*I)))
ΣX = inv(θX)

πX = Distributions.MvNormal(μX, ΣX)
X = rand(πX, Ne)
S = cov(X')
W = glasso(S, 0.1)