Apart from the above suggestions, you may also want to take a look at [BLAS performance testing for Julia 1.8]. I have not followed it recently, but as I understand it, there may be some significant changes and automation around BLAS in the 1.8 release.
@carstenbauer Would BLISBLAS.jl work on Neoverse N1 (Ampere Altra)? I have heard that BLIS might be one of the most favorable options on this particular CPU.