Making Turing fast with large numbers of parameters?

I completely understand: the first number I look at when running a Stan model is the gradient evaluation time (which CmdStan prints first). It can vary a lot, but it helps me distinguish whether I should work on the coding or the math. If you can get a comparable number out of Turing, you'll be better positioned to understand what's happening. Since it's just Julia, shouldn't you be able to `@btime` your gradient evaluation?
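A minimal sketch of what that benchmark might look like, assuming the current DynamicPPL/LogDensityProblems interface (the toy `demo` model, the choice of ForwardDiff as the AD backend, and the exact wrapper names are my assumptions, not something from the thread):

```julia
# Sketch only: assumes DynamicPPL's LogDensityFunction and the
# LogDensityProblemsAD wrapper are available in your environment.
using Turing, LogDensityProblems, LogDensityProblemsAD, ForwardDiff, BenchmarkTools

# A hypothetical toy model standing in for the poster's actual model.
@model function demo(x)
    μ ~ Normal(0, 1)
    x ~ Normal(μ, 1)
end

model = demo(1.5)

# Wrap the model as a log-density problem, then attach an AD backend.
ℓ  = DynamicPPL.LogDensityFunction(model)
∇ℓ = ADgradient(:ForwardDiff, ℓ)

# Benchmark one gradient evaluation at a random point in unconstrained space,
# which is roughly the number CmdStan prints at startup.
θ = randn(LogDensityProblems.dimension(ℓ))
@btime LogDensityProblems.logdensity_and_gradient($∇ℓ, $θ)
```

Swapping `:ForwardDiff` for `:ReverseDiff` (or another backend) in the same harness would let you compare AD backends on the gradient alone, before any sampling overhead enters the picture.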