Julia multithreading no performance improvement

Be careful about pushing to res from inside a threaded loop: push! on a shared vector is not thread-safe, so concurrent calls can race with each other. It is safer to preallocate res and have each iteration write to its own index:

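    function mythread(df)
        n1 = length(df)
        nstep = 1000
        nseg = n1 ÷ nstep   # number of full segments
        # Preallocate one slot per full segment, plus one for the remainder
        res = Vector{Vector{Float64}}(undef, nseg + 1)
        Threads.@threads for i = 1:nseg
            # Each iteration writes only to res[i], so threads never share a slot
            res[i] = evaluate(rbx, reduce(hcat, grids[((i-1)*nstep+1):(i*nstep)]))
        end
        # Remainder: everything after the last full segment (empty if nstep divides n1)
        res[end] = evaluate(rbx, reduce(hcat, grids[(nseg*nstep+1):end]))
        return rbx
    end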

Out of curiosity, do you really need to fill res if you don’t return it?
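
If you want to try the preallocate-and-index pattern without rbx and grids, here is a minimal self-contained sketch. The names colsums and chunked_threaded are hypothetical, and colsums is only a stand-in for evaluate(rbx, ...), which I can't reproduce here. It also lets Iterators.partition compute the index ranges, so the shorter last chunk needs no special case.

```julia
using Base.Threads

# Hypothetical stand-in for evaluate(rbx, ...): sums each column of a matrix.
colsums(m::AbstractMatrix) = vec(sum(m; dims=1))

function chunked_threaded(grids::Vector{Vector{Float64}}, nstep::Int)
    # Index ranges 1:nstep, nstep+1:2nstep, ...; the last one may be shorter.
    chunks = collect(Iterators.partition(eachindex(grids), nstep))
    # One preallocated slot per chunk.
    res = Vector{Vector{Float64}}(undef, length(chunks))
    @threads for i in eachindex(chunks)
        # Each iteration writes only to its own slot, so no locking is needed.
        res[i] = colsums(reduce(hcat, grids[chunks[i]]))
    end
    return res
end

grids = [rand(3) for _ in 1:2500]
res = chunked_threaded(grids, 1000)
```

Remember that the loop only runs in parallel if Julia was started with more than one thread (for example julia -t 4); Threads.nthreads() tells you how many you have.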