Hi,

I am working with 3D + time MRI datasets (128 × 128 × 96 voxels × 50 time points) that I want to fit voxel by voxel using `curve_fit` from LsqFit.jl with the following model.
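For reference, the model in the code below can be written as follows. The symbol names are only my labels for this post and do not appear in the code: $S_0$ is `p[1]`, $\tau$ is `p[2]`, $\sigma$ is `p[3]`, $t$ runs over the entries of `TE_vec`, and $L$ is the constant passed as the third argument.

```latex
S(t) = \sqrt{\left(S_0 \, e^{-t/\tau}\right)^2 + 2 L \sigma^2}
```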

Memory consumption is very high, particularly with multi-threading, and threading does not reduce the time needed to fit the whole dataset.

```
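using LsqFit  # provides curve_fit

# MRI_image (128×128×96×50 array) and TE_vec (the 50 time points) are defined elsewhere.
# Reshape to voxels × time, then transpose so that each column holds one voxel's signal.
ima = Matrix(reshape(abs.(MRI_image), :, 50)')
T = eltype(ima)

function test_reco(ima, TE_vec, L)
    model_fit(t, p) = sqrt.((p[1] * exp.(-t / p[2])) .^ 2 .+ 2 * L * p[3]^2)
    Threads.@threads for i in axes(ima, 2)
        y = view(ima, :, i)
        p0 = [maximum(y), T(30), minimum(y)]  # initial guess for the three parameters
        fit = curve_fit(model_fit, TE_vec, y, p0, autodiff=:forwarddiff).param
    end
end

@time test_reco(ima, TE_vec, 4)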
```

With `Threads.@threads`, this gives:

232.342639 seconds (584.71 M allocations: 143.190 GiB, 9.08% gc time)

and without `Threads.@threads`:

203.611481 seconds (584.71 M allocations: 143.190 GiB, 8.70% gc time)

Any advice?