I’ve been using Yao.jl to simulate quantum evolution and encountered an issue with the time evolution function being unexpectedly slow. Notably, it performs much slower than in Mathematica. Here’s a snippet of my code:

function test(nbit, dt)
ham = sum([kron(nbit,i=>Z,i%nbit+1=>Z) for i in 1:nbit]) + sum([put(nbit, i=>X) for i in 1:nbit]);
state = zero_state(nbit);
state |> time_evolve(ham, dt)
end
@time test(9, 5) # 0.11 sec
@time test(10, 5) # 140 sec

Note that I am running this code on my server and the memory usage is far from being depleted.

Has anyone else experienced similar issues, or does anyone have suggestions for optimizing the performance of the time evolution function in Yao.jl?

I haven’t used Yao.jl so far, but looking at your example there are a few things to clarify that might help find the issue:

What is dt? Is it a global variable? In that case, you might want to make it part of the function, since global variables can affect performance – in general, the section on Performance Tips · The Julia Language has many useful tips.

What are you trying to benchmark exactly? Right now, it looks like you are constructing some time evolution function and then apply it directly to the state. Do you also want to measure the performance of both steps combined? Or just how fast the time_evolution acts on a state?

Here’s an example of avoiding a global dt (again, not sure if this is really the issue) and benchmarking only the application of the time evolution to the state with BenchmarkTools.jl (useful to get more informative results, but @time is also fine):

function make_time_evolution(nbit, d, dt)
ham = sum([kron(nbit,i=>Z,i%nbit+1=>Z) for i in 1:nbit]) + sum([put(nbit, i=>X) for i in 1:nbit]);
return time_evolve(ham, dt)
end
@btime state |> time_evolution setup=(
dt = 10.0;
d = 5;
nbits = 10;
state = zero_state(nbits);
time_evolution = make_time_evolution(nbits, d, dt))

After playing a bit with the parameters, I also see slower performance for nbits = 9 than for nbits = 10, but only about a factor of 10, not 1000 … there do seem to be many allocations going on in the time evolution function though.

What exactly did you do in Mathematica? I suspect that you are conparing different algorithms. Yao.jl’s time_evolve apparently uses Krylov subspace propagation which is a good algorithm for short times (i.e. small dt).

For the system size you are showing here (9 and 10) you can comfortably use exact diagonalization. ED becomes painful at around 14 spins if you cannot exploit symmetries to reduce the mateix size.

Regarding dt, I apologize for the typo in my previous message; it has been corrected.

The function I am benchmarking computes the matrix exponential of a 2^{\text{nbit}} \times 2^{\text{nbit}} matrix applied to a 2^{\text{nbit}}-length vector. Consequently, we expect the computational cost to roughly quadruple when nbit is increased by one.

The primary cost arises from applying the time evolution operator to the state, while there is no computation when constructing the operator itself in Yao.jl.

I’ve confirmed that nbit=10 requires 1000 times more computational resources compared to nbit=9, although there might be some optimization or caching involved during the second execution.

@time test(9, 5) # 6.693782 seconds (1.38 M allocations: 63.818 MiB, 0.52% gc time, 392.13% compilation time)
@time test(10, 5) # 145.690263 seconds (1.48 M allocations: 190.477 MiB, 0.09% gc time, 0.01% compilation time)
@time test(9, 5) # 0.126992 seconds (5.80 k allocations: 9.043 MiB)
@time test(10, 5) # 146.079248 seconds (1.48 M allocations: 190.437 MiB, 0.15% gc time)

Thank you for your insights. I believe the large dt in this specific algorithm is indeed causing the significant slowdown. However, it appears that Yao.jl does not offer the option to use other algorithms for time evolution.

I am now using MathLink.jl to pass the state and the Hamiltonian to Mathematica and complete the time evolution.

What do you do in Mathematica? What is your goal in the end?

The transverse-field Ising model you solve above could be implemented in Julia rather easily as well. I study disordered spin systems and have some libraries for constructing spin models and stuff. You could write an equivalent code to your OP like:

using SpinModels # not registered - see https://github.com/abraemer/SpinModels.jl
function psi_t(N, h, t)
# TFIM
H = NN(PBC(Chain(N))) * ZZ(0.5) + h*X(ones(N))
U = cis(Hermitian(Matrix(-t*H)))
state = zeros(2^N)
state[1] = 1.0
return U*state
end

Of course this is not efficient if you want different times and different states. So what is it you want to achieve in the end

Also if you wanted to exploit some symmetries, you can use SpinSymmetries.jl to construct projectors onto the differen sector. This is usually still worth it even if you state lives in all sectors (as it does in this example)

What I did in Mathematica is MatrixExp[-iHt, state] where H is the Hamiltonian and t is the evolution time. When t is large, Mathematica gives the result much faster, as I mentioned earlier.
I am actually dealing with a chaotic Hamiltonian. So I may expect less acceleration by exploiting any symmetries. Also I decide to continue using Yao.jl because I want to experiment some circuit operations. Anyway, thank you for your recommendation and I may recommend you as the solution

This is due to some issues that related to precision. If you are familiar with developing a package, could you please help us test the performance gain by switching back to ExponentialUtilities?

Maybe we should consider having two backends in Yao.

Hi Jingguo, thank you for your suggestion! I’m not fully prepared to develop the package, but I agree that having dual backends in Yao would be valuable.