Vecp, transition matrix and Kronecker product

Hi everybody,
Given four different matrices \boldsymbol{A}, \boldsymbol{C}, \boldsymbol{D}, \boldsymbol{E}, I want to evaluate the following quantity

\operatorname{vecp}(\boldsymbol{A})^{T} \boldsymbol{B}_{n}^{+}(\boldsymbol{E} \otimes \boldsymbol{C})\left(\boldsymbol{B}_{n}^{+}\right)^{T} \operatorname{vecp}(\boldsymbol{D})

where \operatorname{vecp} is the operator which transforms a symmetric matrix

X=\left(\begin{array}{cccc} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{p1} & x_{p2} & \cdots & x_{pp} \end{array}\right)

in a vector

\operatorname{vecp}(X)=\left(\begin{array}{c} x_{11} \\ x_{12} \\ x_{22} \\ \vdots \\ x_{1 p} \\ \vdots \\ x_{p p} \end{array}\right)

This is different from the vec method implemented in Julia Base, since vecp keeps only the p(p+1)/2 unique elements of the symmetric matrix rather than all p^2 of them.

The second object I am interested in is the transition matrix, defined by

\left(B_{p}\right)_{ij,\,gh}=\frac{1}{2}\left(\delta_{ig} \delta_{jh}+\delta_{ih} \delta_{jg}\right), \qquad i \leq p,\quad j \leq p,\quad g \leq h \leq p
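Written out naively in code, I read this definition as something like the following (surely not an efficient construction, and I may have the row/column conventions transposed with respect to the literature; the name transition_matrix is just mine):

# naive construction of B_p straight from the definition above:
# rows follow the vec ordering of the pairs (i,j),
# columns run over the pairs (g,h) with g ≤ h, in vecp order
function transition_matrix(p::Int)
    B = zeros(p^2, p*(p+1)÷2)
    col = 0
    for h in 1:p, g in 1:h              # columns: g ≤ h
        col += 1
        for j in 1:p, i in 1:p          # rows: vec ordering
            row = (j-1)*p + i
            B[row, col] = ((i==g)*(j==h) + (i==h)*(j==g)) / 2
        end
    end
    return B
end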

Is there any package which implements them? I searched for one, but I did not find anything.
Thank you all,
Marco

If I’m reading this correctly, you need a fairly special p for length(vecp(A)) to be equal to n^2. Then the 5-minute version which is probably full of bugs is:

julia> using LinearAlgebra, Tullio  # Symmetric, Diagonal, kron, @tullio

julia> vecp(x::AbstractMatrix) = [x[i,j] for j in axes(x,2) for i in 1:j];

julia> p = 8; n = 6; A = Symmetric(rand(p,p)); D = Symmetric(rand(p,p)); n^2 == length(vecp(D))
true

julia> actb(x) = (Symmetric(x) .+ Diagonal(x)) ./ 2; # this is action of B, keep changing my mind...

julia> beeA = actb(reshape(vecp(A), n,n)); beeD = actb(reshape(vecp(D), n,n));

julia> E = randn(n,n); C = randn(n,n);

julia> @tullio out := beeA[i,j] * E[i,i'] * C[j,j'] * beeD[i',j']
4.990384713976477

julia> vec(beeA)' * kron(E,C) * vec(beeD)
4.990384713976476

Thank you for your answer, @mcabbott !
I had already written something for vecp which is equivalent to yours, but I did not manage to write something (easily) for \boldsymbol{B}_n^+.
I do not completely understand what you did (from an algebraic point of view), especially since (shame on me: I was not clear at the beginning) the four matrices all have the same dimension. It is possible that I am not understanding something, since you have verified an identity similar to the one I know

\operatorname{Tr}[\boldsymbol{A} \boldsymbol{C} \boldsymbol{D} \boldsymbol{E}]=\operatorname{vecp}(\boldsymbol{A})^{T} \boldsymbol{B}_{n}^{+}(\boldsymbol{E} \otimes \boldsymbol{C})\left(\boldsymbol{B}_{n}^{+}\right)^{T} \operatorname{vecp}(\boldsymbol{D})

However (for some silly physics motivations) I do need to evaluate \boldsymbol{B}_{n}^{+}(\boldsymbol{E} \otimes \boldsymbol{C})\left(\boldsymbol{B}_{n}^{+}\right)^{T}. I believed that the easiest way was to create \boldsymbol{B}_{n}, but maybe there is a smarter solution.
I also have a Python code to create it, but before translating it into Julia I was wondering whether it had already been implemented by someone and I was just missing it.
Thanks,
Marco

Ok, but then I find the vecp thing a little confusing, since its output has a different dimension. It is, however, pretty similar to B, so perhaps they are somehow writing the same thing twice? If I write (BZ)_{ij} = B_{ij,gh} Z_{gh}, then only the upper triangle of Z matters (exactly the elements vecp keeps), so perhaps it's just saying vec(actb(A))' * kron(E,C) * vec(actb(D))? But I may have lost a transpose on B here.


Sorry for my late answer: I had several deadlines and could not work on this item.
Luckily, I now have a little time.

Mathematical Background

First of all, I want to link some resources:

1. The Elimination Matrix
2. Symmetry, 0-1 Matrices and Jacobians

Given four matrices A, B, C, E, the following identity holds

(\operatorname{vec} A)^{\prime}(E \otimes B) \operatorname{vec} C=\operatorname{tr} A^{\prime} BCE^{\prime}\qquad (1)

if the right-hand side exists.
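
A quick numerical check of (1) in Julia (the sizes are arbitrary, chosen only so that the products are compatible):

using LinearAlgebra

# identity (1): A is m×n, B is m×p, C is p×q, E is n×q
A = rand(2,3); B = rand(2,4); C = rand(4,5); E = rand(3,5)
vec(A)' * kron(E, B) * vec(C) ≈ tr(A' * B * C * E')   # true
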
If A is an m \times n matrix, then \operatorname{vec} A is defined by

\operatorname{vec} A=\left(\begin{array}{c} A_{\cdot 1} \\ \vdots \\ A_{\cdot n} \end{array}\right)

If A is a square symmetric matrix, \operatorname{vecp} A is the collection of all unique elements of A.
\operatorname{vec} A and \operatorname{vecp} A can be mapped to one another using the duplication matrix D and the elimination matrix L (their definitions can be found in references 1 and 2; in particular, reference 1 gives an explicit definition of both):

\operatorname{vec} A = D \operatorname{vecp} A \qquad (2)
\operatorname{vecp} A = L\operatorname{vec} A

In particular, Equation (1) can be rewritten using Equation (2)

(\operatorname{vecp} A)^{\prime} D^{\prime}(E \otimes B) D\operatorname{vecp} C=\operatorname{tr} A^{\prime} BCE^{\prime}\qquad (3)

Implementation

Here is my implementation of the duplication and elimination matrices:

using LinearAlgebra   # for kron

# unit vector of length n(n+1)/2 with a 1 in the slot of entry (i,j), i ≥ j,
# using the lower-triangle, column-major (vech) ordering
function uᵢⱼ(n::Int, i::Int, j::Int)
    u = zeros(floor(Int,0.5*n*(n+1)))
    u[floor(Int64, (j-1)*n+i-0.5*j*(j-1))] = 1
    return u
end

# i-th standard basis vector of length n
function eᵢ(n::Int, i::Int)
    e = zeros(n)
    e[i] = 1
    return e
end

# n×n matrix with a single 1 in position (i,j)
function Eᵢⱼ(n::Int, i::Int, j::Int)
    E = zeros((n,n))
    E[i,j] = 1
    return E
end

# symmetric basis matrix: Eᵢⱼ + Eⱼᵢ off the diagonal, Eᵢᵢ on it
function Tᵢⱼ(n::Int, i::Int, j::Int)
    if i != j
        T = Eᵢⱼ(n,i,j) + Eᵢⱼ(n,j,i)
    else
        T = Eᵢⱼ(n,i,i)
    end
    return T
end

# L of Equation (2): n(n+1)/2 × n², with L * vec(A) = vecp(A)
function EliminationMatrix(n::Int)
    L = zeros((floor(Int64,0.5*n*(n+1)), n*n))
    for i in 1:n
        for j in 1:i
            L += kron(uᵢⱼ(n,i,j), transpose(vec(Eᵢⱼ(n,i,j))))
        end
    end
    return L
end

# D of Equation (2): n² × n(n+1)/2, with D * vecp(A) = vec(A) for symmetric A
function DuplicationMatrix(n::Int)
    D = zeros((n*n, floor(Int64,0.5*n*(n+1))))
    for i in 1:n
        for j in 1:i
            D += transpose(kron(uᵢⱼ(n,i,j), transpose(vec(Tᵢⱼ(n,i,j)))))
        end
    end
    return D
end
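
Before the full check below, here is a quick sanity test of Equation (2) with these two functions, on a random symmetric matrix:

using LinearAlgebra

n = 4
A = rand(n, n); A = A + transpose(A)   # random symmetric matrix
D = DuplicationMatrix(n)
L = EliminationMatrix(n)

L * D ≈ I                   # L is a left inverse of D
D * (L * vec(A)) ≈ vec(A)   # vec A = D vecp A for symmetric A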

Checks

Using some random matrices, I checked that the algebraic properties are satisfied:

using LinearAlgebra

# generating random symmetric matrices
n = 10
A = rand(n,n)
A = A + transpose(A)
B = rand(n,n)
B = B + transpose(B)
C = rand(n,n)
C = C + transpose(C)
E = rand(n,n)
E = E + transpose(E)

AB = zeros(n,n)
CE = zeros(n,n)
ABCE = zeros(n,n)

LinearAlgebra.mul!(AB, transpose(A), B)
LinearAlgebra.mul!(CE, C, transpose(E))
LinearAlgebra.mul!(ABCE, AB, CE)

#result with first technique
LinearAlgebra.tr(ABCE)


kronEB = kron(E,B) 

DᵀkronEB = zeros(floor(Int,n*0.5*(n+1)),n*n)
LinearAlgebra.mul!(DᵀkronEB, Transpose(DuplicationMatrix(n)), kronEB)
DᵀkronEBD = zeros(floor(Int,n*0.5*(n+1)),floor(Int,n*0.5*(n+1)))
LinearAlgebra.mul!(DᵀkronEBD, DᵀkronEB, DuplicationMatrix(n))

vecpA = zeros(floor(Int,n*0.5*(n+1)))
LinearAlgebra.mul!(vecpA, EliminationMatrix(n), vec(A))
vecpC = zeros(floor(Int,n*0.5*(n+1)))
LinearAlgebra.mul!(vecpC, EliminationMatrix(n), vec(C))

vecpAᵀDᵀkronEBD = zeros(1,floor(Int,n*0.5*(n+1)))
LinearAlgebra.mul!(vecpAᵀDᵀkronEBD, transpose(vecpA), DᵀkronEBD)

vecpAᵀDᵀkronEBDvecpC = zeros(1,1)
#result with the second method
LinearAlgebra.mul!(vecpAᵀDᵀkronEBDvecpC, vecpAᵀDᵀkronEBD, vecpC)

The results agree (up to machine precision).

@mcabbott , are you aware of any packages including the duplication and elimination matrices as implemented here? I looked into LinearAlgebra, but I found nothing similar.
Thank you,
Marco

It’s extremely inefficient to use matrix multiplication with matrices that are almost all zeros (some of your matrices have just 1 or 2 nonzero entries, and the nonzero entries are usually 1). You could use sparse-matrix data structures, but your matrices are so structured that this is a prime case for just writing out your own loops that perform the underlying operations directly, without matrices.
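
For example, the action of the elimination matrix is just a loop over the lower triangle. A rough sketch (using the same element ordering as your EliminationMatrix; the names elim_apply and dup_apply are mine):

# vecp(A) = L * vec(A), computed directly without ever forming L
function elim_apply(A::AbstractMatrix)
    n = size(A, 1)
    v = similar(A, n*(n+1)÷2)
    k = 0
    for j in 1:n, i in j:n         # lower triangle, column by column
        k += 1
        v[k] = A[i, j]
    end
    return v
end

# vec(S) = D * vecp(S), rebuilding the symmetric matrix from its triangle
function dup_apply(v::AbstractVector, n::Int)
    S = similar(v, n, n)
    k = 0
    for j in 1:n, i in j:n
        k += 1
        S[i, j] = v[k]
        S[j, i] = v[k]
    end
    return vec(S)
end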

PS. On a random note, things like floor(Int64,0.5*n*(n+1)) can be written simply as (n*(n+1))>>1, avoiding the conversion to/from floating-point. But this is minor relative to the point above.


To be honest, I did not focus on optimization before, since first I wanted to find a method to compute those matrices.
Now that I have found one, I can work on optimization. What is your suggestion? Should I use the structure of these matrices to write them down directly, without allocating intermediate quantities such as u, T and E?
Best regards,
Marco

I would write out the nonzero operations that you actually want to perform — for something like multiplying by a transition matrix, they will be pretty simple. I also think you want to avoid doing the Kronecker products entirely, because it looks like you are computing a bunch of entries that you end up just discarding. i.e. in an expression like B(E \otimes C) B^T, you essentially want to multiply E and C by B and B^T, respectively, before the Kronecker product. (I think it will be clearer if you write B = (e_g e_h^* + e_h e_g^*)/2 as a low-rank sum of outer products of unit vectors, since that way you can do matrix–vector products instead of matrix–matrix products…though even those are suboptimal unless you exploit the sparsity of the vectors.)
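
Concretely, here is a sketch of that reduction (the pair ordering, the treatment of B versus B^+, and the name B_kron_B are my own choices): taking the rows of B to be \frac{1}{2}(e_g \otimes e_h + e_h \otimes e_g)^T for g \leq h, the identity (a \otimes b)^T(E \otimes C)(c \otimes d) = (a^T E c)(b^T C d) collapses each entry of B(E \otimes C)B^T to four scalar terms:

# each entry of B (E ⊗ C) Bᵀ from four scalars; roughly O(n⁴) work
# instead of the O(n⁶) of forming kron(E,C) and multiplying by B
function B_kron_B(E::AbstractMatrix, C::AbstractMatrix)
    n = size(E, 1)
    pairs = [(g, h) for h in 1:n for g in 1:h]   # vecp-style pair ordering
    m = length(pairs)                            # n(n+1)/2
    M = zeros(m, m)
    for (b, (g2, h2)) in enumerate(pairs), (a, (g, h)) in enumerate(pairs)
        M[a, b] = (E[g,g2]*C[h,h2] + E[g,h2]*C[h,g2] +
                   E[h,g2]*C[g,h2] + E[h,h2]*C[g,g2]) / 4
    end
    return M
end

For B_n^+ rather than B_n, I believe the weights on the diagonal pairs differ (the pseudoinverse rescales the rows), but the structure of the computation is the same: the n^2 \times n^2 Kronecker product is never materialized.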

I suspect that the whole thing may actually become clearer — it looks like you’re really obscuring the essence of what you are doing by hiding the interesting details inside the matrices.


So, if I understand correctly, you suggest avoiding the standard Kronecker product (and the removal of redundant elements afterwards) and instead defining a function that directly performs only the relevant operations, am I right?

Yes
