Hey,

I’m wondering how to implement a multi-threaded method that also mutates external variables in-place. The idea is that each thread (or base case) gets its “own” copy of the external variable, so that the mutations are free of data races.

One solution I came up with is based on `FLoops.jl` (see below). The `@floop` macro can run the for-loop on a single thread or on multiple threads (`ex=SequentialEx()` or `ex=ThreadedEx()`, respectively). The `@init` macro, together with a `deepcopy()`, should give each task its own copy of the variable, so that mutating it is free of data races. Does this approach make sense, and is the code correct and safe?

```
using FLoops
# method to emulate some "runtime"
sleep2 = t -> (b=time(); while b+t > time() end)
function FL_ex3!(y, varexternal, niter; ex=ThreadedEx())
    @floop ex for i in 1:niter
        @init c = deepcopy(varexternal)
        sleep2(0.25)
        c .= (1.0*i, 2.0*i, 3.0*i)
        y[i, :] .= c
    end
    y
end
### sequential and multi-threading give the same output (as a check)
FL_ex3!(zeros(10, 3), zeros(3), 10, ex=SequentialEx()) == FL_ex3!(zeros(10, 3), zeros(3), 10)
# true
### compare performance multi-threaded vs. single-threaded
Threads.nthreads() # 10
using BenchmarkTools
@btime FL_ex3!(zeros(10, 3), zeros(3), 10)
# 250.124 ms (150 allocations: 10.45 KiB)
@btime FL_ex3!(zeros(10, 3), zeros(3), 10, ex=SequentialEx())
# 2.500 s (7 allocations: 960 bytes)
```
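For comparison, here is a minimal sketch of how I understand the same "one private copy per task" idea using only `Threads.@spawn` from Base, without FLoops. The function name `spawn_ex!` and the keyword `ntasks` are made up for illustration, and I left out the `sleep2` call so it runs quickly. It relies on the assumption that each task only writes to its own rows of `y` and to its own buffer `c`.

```julia
# Hypothetical helper (not part of FLoops.jl): one private buffer per task.
function spawn_ex!(y, varexternal, niter; ntasks = Threads.nthreads())
    # Split 1:niter into contiguous chunks, roughly one per task.
    chunksize = max(1, cld(niter, ntasks))
    @sync for chunk in Iterators.partition(1:niter, chunksize)
        Threads.@spawn begin
            # Each task copies the external variable once and only mutates its copy.
            c = deepcopy(varexternal)
            for i in chunk
                c .= (1.0 * i, 2.0 * i, 3.0 * i)
                y[i, :] .= c   # rows are disjoint across tasks
            end
        end
    end
    return y
end

v = zeros(3)
y = spawn_ex!(zeros(10, 3), v, 10)
```

If this is right, `v` should stay untouched after the call, since only the per-task copies are mutated.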

Multi-threading is a bit new to me, so sorry if the terminology is a bit off. Would be very glad for input!