Why not just @. du = ...? You’re not fusing the whole operation: you forgot a few dots so it’s allocating a few steps. That RHS function is not written in a very fast way, so that will impact the speed you see.
Edit: I see, matmul. Everything else should be dotted those, and it’s not.
Is that the right solver? Sounds like more of a problem for VCABM, QNDF, CVODE_BDF, LSODA, ROCK2, or ROCK4. Did you do a comparative study of methods?
Not sharing code makes it impossible to really debug, but from experience, I would guess that your function is not type-stable because you made p3 a Vector when it should be a Tuple since it’s filled with different typed objects. Did you look at @code_warntype to see?