Improve the performance of multiplication of an arbitrary number of matrices

Thanks!

But this is actually for my package:

https://github.com/ronisbr/ReferenceFrameRotations.jl

I have almost finished a toolbox of functions for satellite simulations. It contains many functions that create rotations between reference frames (J2000, GCRF, PEF, TOD, MOD, TEME, etc.). I want each function to be able to compute the rotation using either quaternions or DCMs. Without this compose_rotation function, I would have to write two functions for each rotation (which is a lot, really…).
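
To illustrate the idea, here is a minimal sketch of how one composition function can serve both representations through dispatch. It is not the code from the package: `Quat`, `Dcm`, `compose`, and `chain3` are hypothetical names made up for this example, and the ordering convention (the first argument is the first rotation applied, so DCMs multiply right-to-left and quaternions left-to-right) is an assumption.

```julia
# Hypothetical minimal quaternion type, only for this sketch.
struct Quat
    q0::Float64
    q1::Float64
    q2::Float64
    q3::Float64
end

# Hamilton product of two quaternions.
Base.:*(a::Quat, b::Quat) = Quat(
    a.q0*b.q0 - a.q1*b.q1 - a.q2*b.q2 - a.q3*b.q3,
    a.q0*b.q1 + a.q1*b.q0 + a.q2*b.q3 - a.q3*b.q2,
    a.q0*b.q2 - a.q1*b.q3 + a.q2*b.q0 + a.q3*b.q1,
    a.q0*b.q3 + a.q1*b.q2 - a.q2*b.q1 + a.q3*b.q0,
)

# Hypothetical alias: a DCM is stored as a plain 3x3 matrix here.
const Dcm = Matrix{Float64}

# Assumed convention: the first argument is the first rotation applied.
# DCMs are therefore multiplied in reverse order...
compose(R::Dcm) = R
compose(R::Dcm, Rs::Dcm...) = compose(Rs...) * R

# ...while quaternions are multiplied in the order given.
compose(q::Quat) = q
compose(q::Quat, qs::Quat...) = q * compose(qs...)

# A chain of three rotations written once; it accepts either
# three DCMs or three quaternions.
chain3(r1, r2, r3) = compose(r1, r2, r3)
```

With something like this, each frame-rotation function only has to build its elementary rotations in the requested representation and pass them to `compose`; the multiplication order is handled in one place.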

I will try to benchmark inside a function and will post the results here!