Not using PreallocationTools.jl correctly

This isn’t an answer to the question you directly posed, but I think a better solution to your underlying problem is to use Enzyme.jl.

Here’s how I’d do this with Enzyme:

using LinearAlgebra, Enzyme

function setup_loglikelihood(X, Y)
    Q, N = size(X)
    D = size(Y, 1)
    pred_storage = zeros(D, N)  # preallocated buffer for the predictions W * X

    function loglikelihood(param)
        W = reshape(param, D, Q)
        mul!(pred_storage, W, X)  # in-place W * X; no allocation here
        -sum(abs2, Y - pred_storage)
    end
    return loglikelihood
end
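
One caveat: Y - pred_storage still allocates a temporary array on every call. If you want the inner function fully allocation-free (which I gather is why you reached for PreallocationTools.jl in the first place), one option is to fuse the subtraction into the reduction. A minimal sketch of a drop-in replacement for the inner function, which I haven't benchmarked:

function loglikelihood(param)
    W = reshape(param, D, Q)
    mul!(pred_storage, W, X)
    # accumulate -sum(abs2, Y - W*X) without materialising the difference
    s = 0.0
    for i in eachindex(Y, pred_storage)
        s += abs2(Y[i] - pred_storage[i])
    end
    return -s
end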

and then

# create mock data and parameters
X = randn(4, 10)  # mock inputs (Q = 4, N = 10)
Y = randn(5, 10)  # mock outputs (D = 5)
W = randn(4, 5)   # mock parameters; reshaped to D×Q = 5×4 inside the closure
logl = setup_loglikelihood(X, Y)

# Create a 'shadow' of W. After autodiff, this will have the property that
# dW[i] == ∂logl(W)/∂W[i]
dW = make_zero(W)

# Create a 'shadow' of logl. This is necessary because logl is a closure that
# stores data which is mutated during differentiation.
dlogl = make_zero(logl)

# Run the autodiff
Enzyme.autodiff(Reverse, Duplicated(logl, dlogl), Duplicated(W, dW))

Looking at the gradient accumulated into dW, we see

julia> dW
4×5 Matrix{Float64}:
 -20.4806    -1.63579    10.6586    6.69816   1.45061
 -11.3634    16.5473     19.6695   -2.8299   -9.35841
  26.4333    -0.970109  -19.8093    5.12396  23.1772
  -4.74061  -14.9736     -8.79992  13.625     9.38448
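
One thing to keep in mind: reverse-mode shadows accumulate, so before differentiating again you should reset dW and dlogl to zero. If I'm reading Enzyme's docs right, make_zero! does this in place (the closure case is my understanding, not something I've stress-tested):

Enzyme.make_zero!(dW)     # reset the parameter shadow
Enzyme.make_zero!(dlogl)  # reset the shadows of the arrays captured by the closure
Enzyme.autodiff(Reverse, Duplicated(logl, dlogl), Duplicated(W, dW))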

The values in dW seem to agree with what I find using finite differences.
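
For reference, the check I have in mind is roughly the following; fd_gradient is a throwaway helper written here for illustration (plain central differences), not anything from a package:

function fd_gradient(f, W; h=1e-6)
    g = similar(W)
    for i in eachindex(W)
        Wp = copy(W); Wp[i] += h
        Wm = copy(W); Wm[i] -= h
        g[i] = (f(Wp) - f(Wm)) / (2h)
    end
    return g
end

fd_gradient(logl, W) ≈ dW  # should be true, up to finite-difference error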
