The assumption in the MixedModels package is that the random effects model matrices are generated from the indicators of a categorical covariate.  In the simplest case (a simple, scalar random effect) the columns of X and Z would be a set of indicators for two categorical covariates.  That is not the case here.
You could create the internal representation used to evaluate the log-likelihood and piggy back on that mechanism or just emulate it.  I use BlockArrays and some general representations of random-effects terms but you can skip that.
For a model like this you can “profile” the log-likelihood so it is a function of two parameters, the ratios of the standard deviations, λ₁ = σ₁/σ₃ and λ₂ = σ₂/σ₃ where, to save myself LaTeXing I translated α → 1, β → 2 and ε → 3.
Start with the X’X type of matrix from hcat(X, Z, Y)
using LinearAlgebra, Random
X = [1 0 0;
     1 1 0;
     1 1 0;
     1 1 1;
     1 1 1;
     1 1 1]
Z = [0 0 0;
     0.2 0 0;
     0.2 1 0;
     0.2 1.2 0;
     0.2 1.3 1;
     0.2 2 2]
const n = size(Z, 1)
const α = [0, 1, 2]
const β = [1, 2, 3]
Y = X*α .+ Z*β .+ randn(MersenneTwister(1234321), n)*0.01
XZY = hcat(X, Z, Y);
const A = Symmetric(XZY'XZY, :L)
const L = copy(A.data);
Given λ₁ and λ₂ you form the Cholesky factor of a matrix that I write in shorthand form as Λ'AΛ + I, where
Λ is block diagonal in 3 diagonal blocks: λ₁I, λ₂I, and I of sizes length(α), length(β) and 1, respectively and the I is added only to the random-effects part. (In your case both α and β behave like random effects).
I would do it like this
function obj(λ)
    k = LinearAlgebra.checksquare(A)
    copyto!(L, A)
    m = λ[1]
    for j in eachindex(α)
        for i in 1:k
            L[i, j] *= m
            L[j, i] *= m
        end
        L[j, j] += 1
    end
    offset = length(α)
    m = λ[2]
    for j in eachindex(β)
        jj = offset + j
        for i in 1:k
            L[i, jj] *= m
            L[jj, i] *= m
        end
        L[jj, jj] += 1
    end
    cholesky!(Symmetric(L, :L))
    objective = 0.0
    for j in 1:(offset + length(β))  # evaluate the logarithm of the determinant
        objective += 2log(L[j, j])
    end
    objective += n * (1 + log(2π) + log(abs2(L[end,end])/n))
    objective
end
Then minimize this objective with respect to the two-dimensional λ, both components of which are constrained to be non-negative.