Craig-Bampton method on large problems

I’ve implemented the Craig-Bampton method for model reduction at Now I would like to perform the reduction on very large models (at least 1 M x 1 M sized matrices), but some of the matrix operations (such as calculating a huge matrix inversion or instead perform huge matrix chain multiplication) would need an enormous amount of computational resources.

Would anyone have an idea how this method is implemented in commercial softwares? Is Julia able to operate on matrices at this scale? If it is, then what would be the trick to perform the matrix operations without getting an OutOfMemoryError or without taking a huge amount of time? Most implementations of this method are for academic use only and they only work on very small examples.

The inputs will be sparse and you never want to form their explicit inverses (these are in general dense). Use e.g. Kmm \ X' instead of inv(Kmm) * X'.

If you do something like Finite element modeling and you accidentally form an ndof x ndof matrix, you are, for any real problem, out of memory.