Possible speedup in matrix-diagonal products?

Thanks. I’ve learned about that package.

dig deeper in the call chain

I really want to, but I’m having some trouble with the Debugger package.
@which only reveals the top-level method. Are you aware of any other ways to inspect the rest of the call chain?