Thank you, I’ll check it out! I can sketch the problem:
We are interested in approximating a (costly) function f(x_1, ..., x_p) by simulating a grid of points (x_1^{(n)},...,x_p^{(n)}),~n = 1,...,N, and evaluating the function on the grid to obtain values y^{(n)} = f(x_1^{(n)},...,x_p^{(n)}),~n = 1,...,N. We then regress these values on features \phi_1(x_1,...,x_p),...,\phi_m(x_1,...,x_p). Neither the grid-points nor the features exhibit anything like a nice product structure, and the resulting feature matrix is dense. (Both the construction of the grid-points and the “correct” choice of features depend on the problem at hand.)
At different points in the algorithm, we wish to evaluate different functions f, but it is valid (and I believe even desirable for stability) to re-use the same grid-points (and thus the same feature matrix), which is why running the QR decomposition once up-front was so useful.
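In case it helps to make this concrete, here is a minimal NumPy sketch of what I mean (the grid, the feature functions, and the test functions below are placeholders I made up; the point is only that the feature matrix and its QR factorization are computed once and re-used for every f):

```python
import numpy as np

# Placeholder grid and features; in practice both are problem-specific.
rng = np.random.default_rng(0)
N, p, m = 1000, 3, 6
X = rng.uniform(-1.0, 1.0, size=(N, p))          # grid points x^{(n)}, n = 1..N

def features(X):
    """Evaluate phi_1, ..., phi_m at each grid point (toy polynomial features)."""
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([np.ones(len(X)), x1, x2, x3, x1 * x2, x1**2])

Phi = features(X)                                 # N x m feature (design) matrix
Q, R = np.linalg.qr(Phi)                          # thin QR, computed once up-front

def fit(f):
    """Least-squares coefficients beta for one costly function f."""
    y = f(X)                                      # evaluate f on the fixed grid
    return np.linalg.solve(R, Q.T @ y)            # solve R beta = Q^T y

# Re-use the same factorization for several different functions f.
beta_a = fit(lambda X: np.sin(X[:, 0]) + X[:, 1] * X[:, 2])
beta_b = fit(lambda X: np.exp(-np.sum(X**2, axis=1)))
```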
What is also important to note is that, for technical reasons related to the validity of our algorithm, the approximation must be of the form f \approx \sum_{i = 1}^m \beta_i \phi_i(x_1,...,x_p).
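Continuing the sketch above, that constraint just means the fitted surrogate is always evaluated as a linear combination of the same feature functions:

```python
def approx(beta, X_new):
    """Evaluate the surrogate f_hat(x) = sum_i beta_i * phi_i(x) at new points."""
    return features(X_new) @ beta

f_hat = approx(beta_a, rng.uniform(-1.0, 1.0, size=(5, p)))
```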
One thing that comes to mind is to split the sample into K batches, obtain K estimates \beta^1,...,\beta^K, and take their average, but this should be suboptimal and I don’t know whether it is done in practice.
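For reference, here is what I mean by the batch idea, in the same toy setup as above (each batch gets its own least-squares fit, and the K coefficient vectors are averaged; again purely illustrative):

```python
def fit_batched(f, K=5):
    """Split the grid into K batches, fit beta on each, and average the estimates."""
    y = f(X)
    idx_batches = np.array_split(rng.permutation(N), K)
    betas = [np.linalg.lstsq(Phi[idx], y[idx], rcond=None)[0] for idx in idx_batches]
    return np.mean(betas, axis=0)

beta_avg = fit_batched(lambda X: np.sin(X[:, 0]) + X[:, 1] * X[:, 2])
```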