For reference, I applied the same technique as in those notes for the m \times m complex matrix X = X^\dagger with m \times n constraints XA = B (using CR calculus to handle the complex derivatives), and found that you can reduce it to a smaller m \times n Sylvester-transpose equation for the minimum-norm solution (minimizing \Vert X \Vert_F), if it exists:
X=A \Lambda^\dagger + \Lambda A^\dagger
where \Lambda is an m \times n matrix (of Lagrange multipliers) satisfying the Sylvester-transpose equation:
A \Lambda^\dagger A + \Lambda A^\dagger A = B
This equation does not have a solution for a general right-hand side B, but if you have consistent A, B you can solve it quickly by an iterative method. For example:
using LinearAlgebra, KrylovKit
# generate consistent data
m,n = 1000, 123
A = randn(ComplexF64, m,n); B = hermitianpart!(randn(ComplexF64,m,m)) * A;
λ, = linsolve(y -> let Y = reshape(y,m,n); vec(A*(Y'*A)+Y*(A'*A)); end, vec(B));
Λ = reshape(λ, m, n); X = A*Λ' + Λ*A';
gives a solution (verified by small backwards error) very quickly with the GMRES implementation in KrylovKit.jl (for me, it converged in 39 matvecs in the example above).
This has the advantage of being a smaller system (mn unknowns rather than m^2), assuming n \ll m, and the complexity of each matvec (each iterative step) is O(mn^2) rather than O(m^2 n) compared to iteratively solving either your equation or the smaller Sylvester-transpose system YA + Y^\dagger A = B. (And explicitly forming the Kronecker-product matrices is vastly more expensive, of course.)
PS. I was a little surprised in retrospect that KrylovKit doesn’t get confused by the complex conjugations in my operator? It’s not strictly a linear operator unless you reinterpret the input/output vectors in terms of real and imaginary parts, though you can easily do this of course. Maybe the standard GMRES method is actually equivalent to this, however(?), since least-squares and orthogonality in the complex-vector sense maps exactly onto least-squares and orthogonality for a vector of real/imaginary parts. Might be safer to use a reinterpret
to map between complex vectors and real vectors of twice the length.