Alternative to SharedArrays for multi-node cluster?

If by MPI you mean repeated data transfer between cores, then yes, Julia does have a performance bottleneck there, since the receiving arrays are not pre-allocated the way they are with MPI. This case is different, as there's not much transfer involved. I believe the data-transfer lag might be mitigated to some extent with ArrayChannels.jl, but I haven't used it myself.
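
To make the allocation point concrete, here is a minimal sketch using only the Distributed standard library. It assumes two local worker processes standing in for cluster nodes, and `make_block` is a hypothetical helper I made up for illustration. I haven't benchmarked this; it only shows the pattern where every fetch produces a fresh array on the caller.

```julia
using Distributed

# Assumption: two local worker processes stand in for cluster nodes.
addprocs(2)

# Hypothetical helper, defined on every process: builds an array on the worker.
@everywhere make_block(n, v) = fill(Float64(v), n)

w = first(workers())
totals = Float64[]
for iter in 1:3
    # The result is serialized on the worker, sent over, and deserialized into
    # a newly allocated Array on the caller; no receive buffer is reused.
    block = remotecall_fetch(make_block, w, 1_000, iter)
    push!(totals, sum(block))
end

# Shut the workers down again.
rmprocs(workers())
```

With MPI you would typically receive into a buffer allocated once outside the loop, which is the difference I was referring to. If the loop only runs a handful of times, as in this case, the extra allocations are unlikely to matter much.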