I am trying to parallelize an algorithm which looks like this. I couldn’t express my code right in my previous post

The following algorithm takes an element from an odd column and adds it to the adjacent even coloumn.

```
using BenchmarkTools, Random, StatsBase, Distributions
using FLoops, FoldsThreads, ThreadsX
function test()
rows, coloumns = 990, 99990
#########Initializing Matrix A####################################
A = zeros(Int64, rows, coloumns)
########Filling only 75% of rows##################################
entries :: Vector{Int} = sample(1:floor(Int,0.75*rows), coloumns; replace=true)
for i in axes(A,2)
A[1:entries[i],i] .= shuffle(1:entries[i])
end
#########Declaring Dummy Variables####################################
dummy_A = deepcopy(A)
dummy_entries = deepcopy(entries)
# println(dummy_A')
@floop for i in 1:2:coloumns-1
index1 :: Int = rand(1:dummy_entries[i])
index2 :: Int = rand(1:dummy_entries[i+1])
temp :: Int = dummy_A[index1,i]
###############Removing an element from ith coloumn#########################
for j in index1:dummy_entries[i]
dummy_A[j,i] = dummy_A[j+1,i]
end
dummy_entries[i] = dummy_entries[i] - 1
###############Adding an element in i+1th coloumn#########################
for j in dummy_entries[i+1]:-1:index2
dummy_A[j+1,i+1] = dummy_A[j,i+1]
end
dummy_A[index2,i+1] = temp
dummy_entries[i+1] = dummy_entries[i+1] + 1
###############Updating only if condition satisfies#########################
if rand() <= 0.5
entries[i], entries[i+1] = dummy_entries[i], dummy_entries[i+1]
A[1:entries[i],i] .= dummy_A[1:entries[i],i]
A[1:entries[i+1],i+1] .= dummy_A[1:entries[i+1],i+1]
end
end
# println(dummy_A')
end
@benchmark begin
test()
end
```

Results for the Serial Version

```
BenchmarkTools.Trial: 6 samples with 1 evaluation.
Range (min … max): 784.253 ms … 877.906 ms ┊ GC (min … max): 2.06% … 10.24%
Time (median): 871.206 ms ┊ GC (median): 10.31%
Time (mean ± σ): 856.444 ms ± 36.009 ms ┊ GC (mean ± σ): 9.12% ± 3.41%
█ █ ██ ██
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁██▁██ ▁
784 ms Histogram: frequency by time 878 ms <
Memory estimate: 1.91 GiB, allocs estimate: 149748.
```

Results for Parallel Version with 10 threads

```
BenchmarkTools.Trial: 6 samples with 1 evaluation.
Range (min … max): 749.625 ms … 1.022 s ┊ GC (min … max): 2.24% … 25.19%
Time (median): 886.070 ms ┊ GC (median): 16.99%
Time (mean ± σ): 888.100 ms ± 114.955 ms ┊ GC (mean ± σ): 16.82% ± 10.66%
█ █ █ █ █ █
█▁▁▁▁▁▁▁▁█▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁█▁▁▁▁▁█ ▁
750 ms Histogram: frequency by time 1.02 s <
Memory estimate: 1.91 GiB, allocs estimate: 149580.
```

Can someone please let me know where am I making a mistake…?