Hi community, I have a follow-up question to this post.
I have a loss function `f(y, X, β)::Number` which requires data `y` and `X`. The matrix `X` changes often, and I want to compute:
1. The last column of the Hessian of `f` with respect to `β` (this is the only thing I don't know how to do yet)
2. The last entry of the Hessian
3. The last entry of the gradient
4. The first `5*5` block of the Hessian (solved in previous post)
Can someone teach me how to accomplish (1)? Below I show code for doing items (2)-(4). Importantly, I want to do these without computing the entire gradient/Hessian.
Examples
As in the original post, let $f(\beta) = \frac{1}{2}\|y - X\beta\|_2^2$ be the least squares loss. The analytical gradient is $\nabla f(\beta) = -X'(y - X\beta)$ and the analytical Hessian is $\nabla^2 f(\beta) = X'X$.
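For this least squares example, the quantities in items (1)-(3) also have closed forms that can serve as reference values. Here $x_p$ denotes the last column of $X$ and $e_p$ the last standard basis vector; both symbols are notation introduced for this note only:

$$
\nabla^2 f(\beta)\, e_p = X'X e_p = X' x_p, \qquad
[\nabla^2 f(\beta)]_{pp} = x_p' x_p, \qquad
[\nabla f(\beta)]_p = -x_p'(y - X\beta).
$$

The first identity is the one item (1) asks for: the last column of the Hessian is the Hessian applied to $e_p$.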
using ForwardDiff
# least squares loss
function f(y::AbstractVector, X::AbstractMatrix, β::AbstractVector)
    0.5*sum(abs2, y - X*β)
end
# convenience methods; they read the globals y, X, β1, β2 defined below
f(β::AbstractVector) = f(y, X, β)
f(y::AbstractVector, X::AbstractMatrix, β1, β2) = f(y, X, [β1; β2])
# f1 varies β1 with β2 held fixed
f1(β1::AbstractVector) = f(y, X, β1, β2)
f1(β1::Number) = f(y, X, β1, β2)
# f2 varies β2 with β1 held fixed
f2(β2::AbstractVector) = f(y, X, β1, β2)
f2(β2::Number) = f(y, X, β1, β2)
# simulate data
n = 3
p = 5
X = randn(n, p)
y = randn(n)
β = randn(p)
β1 = β[1:end-1]
β2 = β[end]
# analytical gradient and Hessian
grad_true = -X'*(y-X*β)
hess_true = X'*X
(2) Compute the last entry of the Hessian:
get_hess_last(β2::Number) = ForwardDiff.derivative(β2 -> ForwardDiff.derivative(f2, β2), β2)
@show get_hess_last(β2) ≈ hess_true[end, end]
(3) Compute the last entry of the gradient:
get_grad_last(β2::Number) = ForwardDiff.derivative(f2, β2)
@show get_grad_last(β2) ≈ grad_true[end]
(4) Compute the first `(p-1)*(p-1)` block of the Hessian:
get_hess_first(x::AbstractVector) = ForwardDiff.hessian(f1, x)
@show all(get_hess_first(β1) .≈ hess_true[1:end-1, 1:end-1])
Can someone teach me how to compute ForwardDiff.hessian(f, β)[:, end] without having to compute the entire Hessian?
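One idea that might work for (1), offered as a sketch and not a definitive answer: a single Hessian column is a Hessian-vector product with a basis vector, which can be written as the derivative of the gradient along that direction, `d/dt ∇f(β + t*e)` at `t = 0`. The helper name `hessian_column` and the function `loss` below are hypothetical, introduced only for illustration; the only ForwardDiff calls used are `ForwardDiff.derivative` and `ForwardDiff.gradient`.

```julia
using ForwardDiff

# Least squares loss, same form as in the post (renamed to keep this block self-contained)
loss(y, X, β) = 0.5 * sum(abs2, y - X * β)

# Hypothetical helper, not part of ForwardDiff:
# k-th column of the Hessian of g at β, computed as d/dt ∇g(β + t*e_k) at t = 0.
# The full Hessian matrix is never formed.
function hessian_column(g, β::AbstractVector, k::Integer = length(β))
    e = zeros(eltype(β), length(β))
    e[k] = one(eltype(β))                      # k-th standard basis vector
    directional_grad(t) = ForwardDiff.gradient(g, β .+ t .* e)
    return ForwardDiff.derivative(directional_grad, zero(eltype(β)))
end

# simulate data as in the post
n, p = 3, 5
X = randn(n, p)
y = randn(n)
β = randn(p)

g = b -> loss(y, X, b)
col = hessian_column(g, β)

# compare against the analytical Hessian X'X
@show col ≈ (X' * X)[:, end]
```

This still evaluates a full gradient (with nested dual numbers), so it does not avoid the gradient itself, but the output has `p` entries anyway and the work should scale like one gradient pass instead of a full `p*p` Hessian.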