Simple matrix multiplication using MtlArray kills REPL

First off, for those working on the development of Metal.jl, if you read this, thank you!

Now for my issue / bug. While trying out Metal.jl, I ran into the following reproducible problem. Consider the simple matrix-multiply code below. If I go through it line by line in the REPL, the first benchmark run of mat_mul! works just fine. However, if I run the benchmark command a second time, it kills my REPL before I can even read what error it is throwing. This happens every time: in a fresh REPL, the final command benchmarking mat_mul! runs once, and running it a second time kills the REPL. I wish I could provide an error trace, but the REPL dies immediately. Has anyone run into something like this? Any thoughts? Am I just doing something silly here?

All version info is in a screenshot below.

using Metal, BenchmarkTools

Metal.versioninfo()

A, B, C = Float32.(randn(100,100)), Float32.(randn(100,100)), Float32.(randn(100,100));
Am, Bm, Cm = MtlArray(A), MtlArray(B), MtlArray(C);

function mat_mul!(Out,M,N)
    Out .= M*N
end

@benchmark mat_mul!($Cm,$Am,$Bm)

[Image: Screenshot 2023-08-17 at 9.22.00 AM]

I’m not seeing this with Metal.jl v0.5.0 — perhaps try updating?

Hmm. Thanks for trying it.

I can’t seem to get to v0.5.0. I updated Metal.jl before testing this today. Just tried to update again. I’m still on 0.4.1. I’m assuming this is either because of an OSX version issue (though I’m on v13.0 at least), Julia version (on 1.9.0), or some other dependency mismatch.

Any thoughts?
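
One way to see what is holding a package back is to ask Pkg for its "outdated" status report. This is a minimal sketch, assuming Julia 1.8 or later (where the `outdated` keyword is available) and that Metal.jl is installed in the active environment. The `report` variable is just an illustrative name.

```julia
using Pkg

# Collect the status report in a buffer so it can be printed or pasted into a post.
io = IOBuffer()

# With `outdated=true`, Pkg lists packages that are not at their latest version
# and indicates what is restricting the upgrade (for example a compat bound).
Pkg.status("Metal"; outdated=true, io=io)

report = String(take!(io))
print(report)
```

If the report shows nothing for Metal, the installed version is probably already the newest one the resolver can select for this environment.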

This is likely "Command buffer callbacks can cause bus error during thread adoption" (Issue #138, JuliaGPU/Metal.jl on GitHub) or "Crash during MTLDispatchListApply" (Issue #225, JuliaGPU/Metal.jl on GitHub). Both are caused by bugs in Julia that manifest under memory pressure. Julia 1.9.3 will contain fixes for these bugs, and if you want, you can try a build from the release-1.9 branch: "Backports for 1.9.3" by KristofferC (Pull Request #50507, JuliaLang/julia on GitHub).

I’m happy to send the error output, but as I mentioned, whatever error it is, is killing the REPL. Any suggestion for how I can capture the error text before the REPL dies?
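
One possible way to capture the output is to run the failing code in a separate Julia process and send its output to a file, so the crash cannot take the interactive session down with it. This is only a sketch: `repro.jl` and `repro.log` are hypothetical file names, and it assumes the code from the first post has been saved to `repro.jl` in the current directory. Whether anything useful is written depends on how the process dies.

```julia
# Hypothetical file names; adjust as needed.
script  = "repro.jl"    # assumed to contain the code that crashes
logfile = "repro.log"   # stdout and stderr of the child process end up here

# Start a fresh Julia process using the same executable and active project.
project = dirname(Base.active_project())
cmd = `$(Base.julia_cmd()) --startup-file=no --project=$project $script`

open(logfile, "w") do io
    # `ignorestatus` stops `run` from throwing when the child exits abnormally.
    run(pipeline(ignorestatus(cmd); stdout=io, stderr=io))
end

# Show whatever the child managed to print before it died.
print(read(logfile, String))
```

Because `@benchmark` already calls the function many times, a script containing the single benchmark line may be enough to trigger the crash. If not, repeating the call in the script should mimic running it twice in the REPL.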