One immediate improvement is to change the AD backend in Turing to use ReverseDiff.jl instead of ForwardDiff.jl (the default). Reverse-mode AD generally scales better than forward mode when a model has many parameters:
using Turing, ReverseDiff
Turing.setadbackend(:reversediff)  # switch from the default ForwardDiff backend
Turing.setrdcache(true)            # compile the ReverseDiff tape once and reuse it
See also the Automatic Differentiation page in the Turing documentation.
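Note that `setadbackend` and `setrdcache` belong to older Turing releases. As far as I know, newer releases (roughly 0.30 onward) instead select the backend per sampler through an `adtype` keyword. Here is a minimal sketch, assuming a recent Turing version. The `demo` model and the simulated data are made up purely for illustration and are not from the original question:

```julia
using Turing, ReverseDiff

# Hypothetical toy model, only here to make the example self-contained.
@model function demo(x)
    μ ~ Normal(0, 1)
    σ ~ Exponential(1)
    for i in eachindex(x)
        x[i] ~ Normal(μ, σ)
    end
end

x = randn(100)  # simulated data (assumption)

# Assumption: a Turing version where samplers accept `adtype`.
# `compile=true` plays the role of `setrdcache(true)`: the tape is
# recorded once and reused for later gradient evaluations.
sampler = NUTS(; adtype=AutoReverseDiff(; compile=true))
chain = sample(demo(x), sampler, 1_000)
```

A compiled (cached) tape is only safe when the model has no control flow that depends on parameter values. If it does, `compile=false` is the safer choice, at some cost in speed.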
In my experience, Turing will still be slower than Stan for these types of models, where the number of parameters is large.
We had some discussions in the past which might also be helpful to you: