Search in CUDA vector

I have a vector of binary values. I need to find all indices whose value is one, randomly select one of them, and update another vector at the selected index.

Please provide a minimum working example (MWE) of your code, as I requested in the #gpu channel in Slack.

For reference, the following code was provided in Slack:

# These are the binary vectors for candidate flips as columns in a matrix
candflips = abs.(sOld .- s′)
# Iterate over columns
for j in 1:size(candflips, 2)
    ff = findall(isone, candflips[:, j])
    if isempty(ff)
        # No candidate flips (stuck at a local minimum): add the offset
        Eoffset[j] += offset
    else
        # Pick one candidate index at random
        idx = rand(ff)
        # Implement the flip here
        sNew[idx, j] = abs(one(eltype(sOld)) - sOld[idx, j])
    end
end

This is the CPU implementation:

N, trials = 300, 10000
Eoffset = zeros(trials)
offset = 0.1
# These are the binary vectors for candidate flips as columns in a matrix
candflips = rand([0, 1], N, trials)
# Iterate over columns
for j in 1:size(candflips, 2)
    ff = findall(isone, candflips[:, j])
    if isempty(ff)
        # No candidate flips (stuck at a local minimum): add the offset
        Eoffset[j] += offset
    else
        # Pick one candidate index at random
        idx = rand(ff)
        # Implement the flip here
        sNew[idx, j] = abs(one(eltype(sOld)) - sOld[idx, j])
    end
end

How are you initializing sNew and sOld?

N, trials = 300, 10000
sNew = rand([0, 1], N, trials)
sOld = rand([0, 1], N, trials)
Eoffset = zeros(trials)
offset = 0.1
# These are the binary vectors for candidate flips as columns in a matrix
candflips = rand([0, 1], N, trials)
# Iterate over columns
for j in 1:size(candflips, 2)
    ff = findall(isone, candflips[:, j])
    if isempty(ff)
        # No candidate flips (stuck at a local minimum): add the offset
        Eoffset[j] += offset
    else
        # Pick one candidate index at random
        idx = rand(ff)
        # Implement the flip here
        sNew[idx, j] = abs(one(eltype(sOld)) - sOld[idx, j])
    end
end
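
For comparing against a GPU port, the loop above could be wrapped in a function so that it does not run in global scope. This is only a sketch: the name `flip_random_candidate!` is hypothetical, `@view` is used to avoid copying each column, and starting `sNew` as a copy of `sOld` is an assumption rather than something stated in the thread.

```julia
# Hypothetical helper (not from the thread): same logic as the loop above,
# wrapped in a function and using a view to avoid copying each column.
function flip_random_candidate!(sNew, sOld, Eoffset, candflips, offset)
    for j in axes(candflips, 2)
        ff = findall(isone, @view candflips[:, j])
        if isempty(ff)
            # No candidate flip in this trial: add the offset
            Eoffset[j] += offset
        else
            # Pick one candidate index at random and flip that entry
            idx = rand(ff)
            sNew[idx, j] = abs(one(eltype(sOld)) - sOld[idx, j])
        end
    end
    return sNew, Eoffset
end

N, trials = 300, 10000
sOld = rand([0, 1], N, trials)
sNew = copy(sOld)  # assumption: sNew starts as a copy of sOld
Eoffset = zeros(trials)
candflips = rand([0, 1], N, trials)
flip_random_candidate!(sNew, sOld, Eoffset, candflips, 0.1)
```

With `sNew` starting as a copy of `sOld`, each column of `sNew` differs from `sOld` in at most one entry after the call.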

I solved it with the following:

using CUDA
N, trials = 300, 10000
sNew = cu(rand([0, 1], N, trials))
sOld = cu(rand([0, 1], N, trials))
Eoffset = CUDA.zeros(trials)
offset = 0.1f0
# These are the binary vectors for candidate flips as columns in a matrix
candflips = cu(rand([0, 1], N, trials))
# Find which trials (columns of candflips) have at least one candidate flip (1 × trials)
hascandflips = sum(candflips; dims=1) .> zero(eltype(candflips))
# Use the negated mask to add the offset only to trials without a candidate flip
Eoffset .+= offset .* vec(.!hascandflips)
# Generate a random matrix that is nonzero only where there is a candidate flip,
# then take the column-wise maximum of these random values; the entry equal to
# the maximum is the selected flip and is used as a mask to update sNew
mask = CUDA.rand(N, trials) .* candflips
maxes = maximum(mask; dims=1)
maskInv = (mask .== maxes) .& hascandflips
# Flip the selected entry of each column; all other entries are copied from sOld
# (unlike the CPU loop, which leaves the other entries of sNew untouched)
sNew .= ifelse.(maskInv, one(eltype(sOld)) .- sOld, sOld)
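
The random-maximum selection does not depend on the GPU, so its logic can be checked with plain `Array`s before moving to `CuArray`s. The following is a minimal sketch under that assumption; `select_one_per_column` is a hypothetical name, not part of CUDA.jl.

```julia
# Hypothetical helper: pick one random candidate (entry equal to 1) per column
# using only broadcasts and a column-wise reduction, as in the GPU version.
function select_one_per_column(candflips::AbstractMatrix)
    # Random weights that are nonzero only at candidate positions
    weights = rand(Float32, size(candflips)) .* candflips
    # Column-wise maximum (1 × number of columns)
    maxes = maximum(weights; dims=1)
    # True where the weight equals its column maximum and the entry is a candidate
    return (weights .== maxes) .& (candflips .== 1)
end

candflips = [0 1 0;
             1 1 0;
             1 0 0]
picked = select_one_per_column(candflips)

# Number of selected entries per column
counts = vec(sum(picked; dims=1))
```

Barring an exact tie between two random weights, `counts` has a one for every column that contains a candidate and a zero for the all-zero column. Combining the comparison with `candflips .== 1` means columns without candidates need no separate mask.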