The assumption in the MixedModels
package is that the random effects model matrices are generated from the indicators of a categorical covariate. In the simplest case (a simple, scalar random effect) the columns of X and Z would be a set of indicators for two categorical covariates. That is not the case here.
You could create the internal representation used to evaluate the log-likelihood and piggy back on that mechanism or just emulate it. I use BlockArrays
and some general representations of random-effects terms but you can skip that.
For a model like this you can “profile” the log-likelihood so it is a function of two parameters, the ratios of the standard deviations, λ₁ = σ₁/σ₃ and λ₂ = σ₂/σ₃ where, to save myself LaTeXing I translated α → 1, β → 2 and ε → 3.
Start with the X’X type of matrix from hcat(X, Z, Y)
using LinearAlgebra, Random
X = [1 0 0;
1 1 0;
1 1 0;
1 1 1;
1 1 1;
1 1 1]
Z = [0 0 0;
0.2 0 0;
0.2 1 0;
0.2 1.2 0;
0.2 1.3 1;
0.2 2 2]
const n = size(Z, 1)
const α = [0, 1, 2]
const β = [1, 2, 3]
Y = X*α .+ Z*β .+ randn(MersenneTwister(1234321), n)*0.01
XZY = hcat(X, Z, Y);
const A = Symmetric(XZY'XZY, :L)
const L = copy(A.data);
Given λ₁ and λ₂ you form the Cholesky factor of a matrix that I write in shorthand form as Λ'AΛ + I
, where
Λ is block diagonal in 3 diagonal blocks: λ₁I
, λ₂I
, and I
of sizes length(α), length(β) and 1, respectively and the I
is added only to the random-effects part. (In your case both α and β behave like random effects).
I would do it like this
function obj(λ)
k = LinearAlgebra.checksquare(A)
copyto!(L, A)
m = λ[1]
for j in eachindex(α)
for i in 1:k
L[i, j] *= m
L[j, i] *= m
end
L[j, j] += 1
end
offset = length(α)
m = λ[2]
for j in eachindex(β)
jj = offset + j
for i in 1:k
L[i, jj] *= m
L[jj, i] *= m
end
L[jj, jj] += 1
end
cholesky!(Symmetric(L, :L))
objective = 0.0
for j in 1:(offset + length(β)) # evaluate the logarithm of the determinant
objective += 2log(L[j, j])
end
objective += n * (1 + log(2π) + log(abs2(L[end,end])/n))
objective
end
Then minimize this objective with respect to the two-dimensional λ, both components of which are constrained to be non-negative.