Is there a way to compile away a step of a calculation inside a function that is called from a closure, without manually decomposing that function into its constituent calculations within the closure?
Below I provide an MWE using OLS because this is the simplest example I could think of: suppose we are going to run a loop of regressions using the same covariates but different outcome variables. We use a closure to create a new function that treats the covariates as a constant.
my_pinv(X) = (X'X)^-1 * X' # yes I know pinv is faster, this is just to drive a larger computational time discrepancy
ols(X, Y) = my_pinv(X) * Y
N = 1_000
K = 100
Y_vars = 100
X = hcat([rand(N) for k in 1:K]...)
Ys = hcat.([rand(N) for k in 1:Y_vars])
pre_computable_time = @timed pinv1 = my_pinv(X)
time_faster_step = @timed pinv1 * Ys[1]
# want to avoid:
slow_time = Y_vars * (pre_computable_time.time + time_faster_step.time)
Ys .|> y -> ols(X, y)
# want to achieve:
pre_computable_time.time + (Y_vars * time_faster_step.time)
closed_ols = let x = X
    y -> ols(x, y)
end
@time Ys .|> closed_ols
# yields no improvement over slow_time.
# Although it works, I do not want to resort to rewriting the ols function in my let block
closed_ols2 = let x = X
    x_pinv = my_pinv(x)
    y -> x_pinv * y
end
@time Ys .|> closed_ols2
Naively, I would expect that on the first call of closed_ols the compiler would calculate my_pinv(X) and reuse that result in the method it generates for all subsequent calls of closed_ols. The let block fixes x as a constant, so the my_pinv calculation within ols is identical on every call.
Is there any way to force the closure to precompute my_pinv(x) given x=X, so that it is not recomputed inside the loop, without explicitly rewriting the function ols within the closure? Is this even feasible? Obviously, I could manually rewrite the ols function within the closure (final example above), or I could create datatypes that store intermediate calculations, but I'd like a convenient way to abstract away from those things.
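For concreteness, the "datatype that stores intermediate calculations" route I would like to avoid might look like the minimal sketch below. The names `PrecomputedOLS` and `precompute_ols` are made up for this illustration and do not come from any package; the point is only that the expensive step is paid once, when the object is built.

```julia
using LinearAlgebra

my_pinv(X) = (X'X)^-1 * X'
ols(X, Y) = my_pinv(X) * Y

# Hypothetical type (not from any package): holds the precomputed
# pseudo-inverse of the covariates.
struct PrecomputedOLS{M<:AbstractMatrix}
    x_pinv::M
end

# Hypothetical helper: runs the expensive step exactly once.
precompute_ols(X) = PrecomputedOLS(my_pinv(X))

# Making the struct callable lets it stand in for `y -> ols(X, y)`.
(f::PrecomputedOLS)(y) = f.x_pinv * y

X = rand(200, 5)
Ys = [rand(200) for _ in 1:3]

fast_ols = precompute_ols(X)   # my_pinv(X) is evaluated here, once
betas = map(fast_ols, Ys)      # each call is only a matrix-vector product
```

This works, but it is exactly the kind of manual restructuring of ols that I am hoping to avoid.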
This question was motivated by previous discussion on anonymous closures. If I can force the compiler to precompute pinv for closed_ols, then I could achieve performance goals with:
Ys .|> (let x=X; y -> ols(x, y) end)
while also maximizing reuse of package code and abstraction layers provided by package code. Creating the closure within the piped function would also allow the closure to be garbage collected sometime after ols is calculated.
This would be even better with syntactic sugar, which other people have proposed in other threads:
Ys .|> ols(X, _)
That would allow high performance, high readability, and high code reuse.
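For comparison, partial application of this shape can already be written with `Base.Fix1`, which fixes the first argument of a two-argument function. The sketch below uses small made-up sizes for X and Ys. As far as I can tell it only stores the function and the argument, so my_pinv(X) is still recomputed on every call; it gives the readability but not the performance.

```julia
using LinearAlgebra

my_pinv(X) = (X'X)^-1 * X'
ols(X, Y) = my_pinv(X) * Y

X = rand(200, 5)
Ys = [rand(200) for _ in 1:3]

# Base.Fix1(f, x) is a callable object that behaves like y -> f(x, y).
fixed_ols = Base.Fix1(ols, X)

# Same results as the explicit closure; nothing is precomputed.
betas = map(fixed_ols, Ys)
```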
This category of problem shows up all the time in my workflows. The MWE used ols, but I am interested in a more general answer.
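One general-purpose workaround I am aware of is memoizing the expensive inner step, sketched below under some loud assumptions. `memoize_by_identity`, `cached_pinv`, and `ols_cached` are hypothetical names for this illustration. The cache is keyed on object identity, so mutating X after the first call would return a stale result, and the cache keeps X alive until it is emptied. It also requires that the function calls the cached version of the inner step, which is not possible for package code I cannot edit.

```julia
using LinearAlgebra

my_pinv(X) = (X'X)^-1 * X'

# Hypothetical helper: wrap a one-argument function so that repeated calls
# with the same object (compared by identity, not by value) reuse the result.
function memoize_by_identity(f)
    cache = IdDict{Any,Any}()
    return function (x)
        if !haskey(cache, x)
            cache[x] = f(x)   # expensive call happens only on a cache miss
        end
        return cache[x]
    end
end

const cached_pinv = memoize_by_identity(my_pinv)

# Assumption: the expensive step can be swapped for its cached version.
ols_cached(X, Y) = cached_pinv(X) * Y

X = rand(200, 5)
Ys = [rand(200) for _ in 1:3]

# my_pinv(X) runs on the first call only; later calls hit the cache.
betas = map(y -> ols_cached(X, y), Ys)
```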