# Optimizing many linear solves

I have a function that runs many linear solves on subsets of columns from a matrix I give it. Something like:

``````julia
function score(data, parents, child)
    # Subset columns from the data matrix without copying
    @views begin
        X = data[:, parents]
        y = data[:, child]
    end

    # Solve the least-squares system
    b = X \ y

    # Predicted values
    ŷ = X * b

    # Return the mean squared error
    return sum((yᵢ - ŷᵢ)^2 for (yᵢ, ŷᵢ) in zip(y, ŷ)) / length(y)
end
``````

An example of calling this function:

``````julia
data = rand(15, 10)
parents = [1, 2, 3]
child = 4
result = score(data, parents, child)
``````

My question is whether there are any further optimizations I can add to make this function as performant as possible. In my case, the size of the `parents` vector can change which I think makes things tricky.
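Before micro-optimizing, it may help to measure a baseline so each change can be compared against it. A sketch using BenchmarkTools (assuming the package is installed; the `score` definition is repeated here so the snippet runs on its own):

``````julia
using BenchmarkTools, LinearAlgebra  # assumes BenchmarkTools is installed

function score(data, parents, child)
    X = @view data[:, parents]
    y = @view data[:, child]
    b = X \ y
    ŷ = X * b
    return sum(abs2, y .- ŷ) / length(y)
end

data = rand(15, 10)
parents = [1, 2, 3]
child = 4

# Interpolate arguments with $ so global-variable access
# doesn't distort the timing.
@btime score($data, $parents, $child)
``````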

If you compute the QR factorization of `data` once (assuming it has full column rank), you can re-use it for repeated least-squares solves on subsets of the columns. See e.g. [QRupdate.jl](https://github.com/mpf/QRupdate.jl): column and row updates to a "Q-less" QR decomposition, including stable least-squares solves.
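A related way to share work across calls, since the data matrix is fixed, is to precompute the Gram matrix `data' * data` once and answer each `(parents, child)` query from its entries via the normal equations. This is a hypothetical sketch of the reuse idea, not QRupdate.jl's method; note that the normal equations square the condition number, so it is less numerically stable than a QR-based solve:

``````julia
using LinearAlgebra

# Build a closure that reuses the p×p Gram matrix for every query.
function make_scorer(data)
    G = data' * data          # computed once, O(n·p²)
    n = size(data, 1)
    function score_fast(parents, child)
        A = G[parents, parents]            # X'X without touching the data rows
        b = G[parents, child]              # X'y
        β = cholesky(Symmetric(A)) \ b     # requires full column rank
        # MSE = (y'y - 2β'X'y + β'X'Xβ) / n, entirely from Gram entries
        return (G[child, child] - 2 * dot(β, b) + dot(β, A * β)) / n
    end
    return score_fast
end
``````

Each query then costs only a small `k×k` Cholesky solve (for `k = length(parents)`), independent of the number of rows.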


Oh, this looks really interesting. In my case there's no reason the data wouldn't be full rank. I tried implementing this method and it gave a little more than a 2x speedup. However, it occasionally gives different/worse solutions (relative to `X\y`). It looks like the `csne()` function might need more iterations to converge (?). I'll have to read the reference they provide to better understand the method.
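When a cheaper solve drifts from `X\y`, one generic fix (sketched here independently of QRupdate.jl's actual API, under the assumption that the inaccuracy comes from the semi-normal equations) is to apply an extra step of iterative refinement, reusing the same factorization:

``````julia
using LinearAlgebra

# One correction step for a least-squares solution:
# solve for an update from the normal-equations residual,
# reusing the existing factorization F of X'X.
function refine(F, X, y, β)
    r = X' * (y - X * β)   # residual of the normal equations
    return β + F \ r
end

X = rand(200, 5)
y = rand(200)
F = cholesky(Symmetric(X' * X))
β = F \ (X' * y)          # fast but less accurate solve
β = refine(F, X, y, β)    # one correction step, reusing F
``````

Each extra `refine` call costs one matrix-vector product plus a triangular solve, which is cheap relative to refactorizing.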