Thank you, I’ll check it out! I can sketch the problem:
We are interested in approximating a (costly) function f(x_1, ..., x_p) by simulating a grid of points (x_1^{(n)},...,x_p^{(n)}),~n = 1,...,N, and evaluating the function on the grid to obtain values y^{(n)},~n = 1,...,N. To this end, we regress on features \phi_1(x_1,...,x_p),...,\phi_m(x_1,...,x_p). Neither the grid points nor the features exhibit anything like a nice product structure, and everything is dense. (Both the construction of the grid points and the “correct” choice of features depend on the problem at hand.)
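To make the setup concrete, here is a minimal sketch in NumPy; the grid points, the costly function f, and the features \phi_j below are all placeholders I made up, since in the real problem they are problem-specific:

```python
import numpy as np

# Placeholder setup: p = 2 inputs, m = 3 illustrative features.
rng = np.random.default_rng(0)
N, p = 500, 2
X = rng.uniform(-1.0, 1.0, size=(N, p))            # grid points x^{(n)}

def f(X):                                            # stand-in for the costly function
    return np.sin(X[:, 0]) + X[:, 1] ** 2

features = [
    lambda X: np.ones(X.shape[0]),                   # phi_1
    lambda X: X[:, 0],                               # phi_2
    lambda X: X[:, 1] ** 2,                          # phi_3
]

Phi = np.column_stack([phi(X) for phi in features])  # N x m feature matrix
y = f(X)                                             # y^{(n)} = f(x^{(n)})

beta, *_ = np.linalg.lstsq(Phi, y, rcond=None)       # beta_1, ..., beta_m
```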
At different points in the algorithm we wish to evaluate different functions f, but it is valid (and, I believe, even desirable for stability) to re-use the same grid points (and thus the same feature matrix), which is why running the QR decomposition once up front was so useful.
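The QR re-use pattern I have in mind is roughly the following (again just a sketch with placeholder data; solve_triangular is from SciPy):

```python
import numpy as np
from scipy.linalg import solve_triangular

# Placeholder feature matrix Phi and several response vectors y_k,
# one per function f_k evaluated on the same grid.
rng = np.random.default_rng(1)
N, m = 500, 3
Phi = rng.standard_normal((N, m))
Ys = [rng.standard_normal(N) for _ in range(4)]

Q, R = np.linalg.qr(Phi)                 # thin QR, computed once up front

betas = []
for y in Ys:
    # Solve R beta = Q^T y by back-substitution; no refactorisation per f_k.
    betas.append(solve_triangular(R, Q.T @ y, lower=False))
```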
It is also important to note that, for technical reasons related to the validity of our algorithm, we must use an approximation of the form f \approx \sum_{i = 1}^m \beta_i \phi_i(x_1,...,x_p), i.e. linear in the coefficients \beta_i.
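Once a \beta has been fitted, evaluating the cheap surrogate is then just \sum_i \beta_i \phi_i(x), e.g. (same placeholder features as above, hypothetical coefficients):

```python
import numpy as np

features = [
    lambda X: np.ones(X.shape[0]),
    lambda X: X[:, 0],
    lambda X: X[:, 1] ** 2,
]
beta = np.array([0.1, 0.7, -0.3])        # hypothetical fitted coefficients

def f_hat(X_new):
    # Surrogate evaluation: sum_i beta_i * phi_i(x) for each row of X_new.
    Phi_new = np.column_stack([phi(X_new) for phi in features])
    return Phi_new @ beta

# e.g. f_hat(np.array([[0.2, -0.4]]))
```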
One thing that comes to mind is to split the sample into K batches, obtain K estimates \beta^1,...,\beta^K, and take their average, but this should be suboptimal and I am not sure whether it is done in practice.
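For reference, the batch-averaging idea I mean would look roughly like this (placeholder data, arbitrary K):

```python
import numpy as np

# Split the N grid points into K batches, fit beta on each batch, average.
rng = np.random.default_rng(2)
N, m, K = 600, 3, 4
Phi = rng.standard_normal((N, m))
y = Phi @ np.array([1.0, -0.5, 0.25]) + 0.1 * rng.standard_normal(N)

batches = np.array_split(rng.permutation(N), K)
beta_batches = [
    np.linalg.lstsq(Phi[idx], y[idx], rcond=None)[0] for idx in batches
]
beta_avg = np.mean(beta_batches, axis=0)          # average of the K batch estimates

# For comparison, the full-sample least-squares estimate:
beta_full = np.linalg.lstsq(Phi, y, rcond=None)[0]
```

My intuition for why this should be suboptimal: under the usual linear-model assumptions the full-sample least-squares fit is the minimum-variance linear unbiased estimator, so averaging the K batch estimates can only be noisier.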