Nondeterministic results for subset sum using @threads

I’ve encountered nondeterministic results for a naive subset sum implementation using @threads. This algorithm fills the dynamic programming table column by column from left to right. There’s one column for each integer in set s, and the length of each column is given by the value of the target sum S, plus one. Computation of elements of each column are independent, since they depend only on values of the previous column to the left. Therefore there should be no race conditions when I parallelize the inner loop:

import Base.Threads.@threads
function subsetSumThreads(s, S)
    n = length(s)
    F = falses(S+1,n)
    for i in 1:n
        F[1,i] = true
    end
    s[1]≤ S && (F[s[1]+1,1] = true)
    @inbounds for j in 2:n
        @threads for i in 2:S+1
            F[i,j] = F[i,j-1]
            i > s[j] && (F[i,j] = F[i,j] || F[i-s[j],j-1])
        end
    end
    return F
end

However, when I test this function, it often doesn’t work. For instance, for the set s = [6, 5, 3, 1, 8, 10] and the target sum S=43, I sometimes get wrong values in the F matrix:

1 1 1 1 1 1  1 1 1 1 1 1 
0 0 0 1 1 1  0 0 0 1 1 1 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 1 1 1 1  0 0 1 1 1 1 
0 0 0 1 1 1  0 0 0 0 0 0 
0 1 1 1 1 1  0 1 1 1 1 1 
1 1 1 1 1 1  1 1 1 1 1 1 
0 0 0 1 1 1  0 0 0 1 1 1 
0 0 1 1 1 1  0 0 1 1 1 1 
0 0 1 1 1 1  0 0 1 0 1 1 
0 0 0 1 1 1  0 0 0 1 1 1 
0 1 1 1 1 1  0 1 1 1 1 1 
0 0 0 1 1 1  0 0 0 1 1 1 
0 0 0 0 1 1  0 0 0 0 1 1 
0 0 1 1 1 1  0 0 1 1 1 1 
0 0 0 1 1 1  0 0 0 0 1 1 
0 0 0 0 1 1  0 0 0 0 1 1 
0 0 0 0 1 1  0 0 0 0 0 1 
0 0 0 0 1 1  0 0 0 0 1 1 
0 0 0 0 1 1  0 0 0 0 1 1 
0 0 0 0 1 1  0 0 0 0 1 1 
0 0 0 0 0 1  0 0 0 0 0 1 
0 0 0 0 1 1  0 0 0 0 1 1 
0 0 0 0 1 1  0 0 0 0 0 1 
0 0 0 0 0 1  0 0 0 0 0 1 
0 0 0 0 0 1  0 0 0 0 0 1 
0 0 0 0 0 1  0 0 0 0 0 1 
0 0 0 0 0 1  0 0 0 0 0 0 
0 0 0 0 0 1  0 0 0 0 0 1 
0 0 0 0 0 1  0 0 0 0 0 1 
0 0 0 0 0 1  0 0 0 0 0 1 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 1  0 0 0 0 0 1 
0 0 0 0 0 1  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 
0 0 0 0 0 0  0 0 0 0 0 0 

On the left is the F matrix from the sequential computation (code above with @threads removed), and on the right is the F matrix from the multithreaded computation with two threads.

versioninfo()

Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin18.6.0)
CPU: Intel® Core™ i5-7360U CPU @ 2.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
JULIA_EDITOR = atom -a
JULIA_NUM_THREADS = 2

I’m assuming sequential consistency for data-race-free programs, so I can’t make sense of this behaviour.

BitArrays are not threadsafe (e.g. falses(...)).

2 Likes

As it happens, this was recently merged: https://github.com/JuliaLang/julia/pull/33754/files.

1 Like

Of couse! Thanks!