As of now, OpenBLAS’s multithreading is pretty much entirely separate from Julia’s own threads, so -t won’t have any effect on the matrix operations (technically, that’s only true if you don’t call OpenBLAS from multiple Julia threads, but that’s a different story and irrelevant here). You can query the number of OpenBLAS threads with using LinearAlgebra; BLAS.get_num_threads() and set it with BLAS.set_num_threads(N) (or, before starting Julia, through the environment variable OPENBLAS_NUM_THREADS=N). A natural choice for N is the number of available physical CPU cores.
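For illustration, here is a minimal sketch of querying and changing the OpenBLAS thread count from within a session. The value of n is only an assumption: Sys.CPU_THREADS counts logical CPUs, so halving it is a rough guess at the physical core count on machines with hyperthreading, and the matrix size is arbitrary.

```julia
using LinearAlgebra

# Number of threads the BLAS backend is using right now.
old = BLAS.get_num_threads()

# Assumption: roughly half of the logical CPUs are physical cores.
# Adjust n by hand if you know your machine's actual core count.
n = max(1, Sys.CPU_THREADS ÷ 2)
BLAS.set_num_threads(n)

println("BLAS threads: ", old, " -> ", BLAS.get_num_threads())

# Time a matrix product to compare different thread settings.
A = rand(1000, 1000)
B = rand(1000, 1000)
A * B          # warm-up call, so compilation is not included in the timing
@time A * B
```

Rerunning the last line after a different BLAS.set_num_threads call is a quick way to see whether more threads actually help for your matrix sizes.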
Another thing you could try to speed things up is swapping OpenBLAS for MKL. This is super easy with Julia >= 1.7 thanks to MKL.jl. In fact, it’s just ] add MKL in the REPL followed by using MKL.
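As a sketch of what that looks like in a script (assuming Julia >= 1.7, a platform MKL supports, and network access for the install), the snippet below installs and loads MKL.jl and then checks which BLAS library is actually in use via BLAS.get_config():

```julia
using Pkg
Pkg.add("MKL")        # same as `] add MKL` in the REPL

using MKL             # load this before doing any heavy linear algebra
using LinearAlgebra

# Show the BLAS/LAPACK libraries currently loaded; with MKL.jl active,
# an MKL library should appear here instead of OpenBLAS.
config = BLAS.get_config()
println(config)
```

Note that the thread count for the new backend can still be inspected with BLAS.get_num_threads().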