# Is there a simple way to parallelize or distribute a function call that updates a local variable within a loop?

Hi!

I am working on implementing the hill-climbing algorithm to solve the one-dimensional bin packing problem, but I have hit a bottleneck. The function, which I called `main` for lack of a better name, repeats the same set of actions until the number of attempts has been consumed or until it finds the optimum, which is given beforehand as part of an instance.

I want to know whether this type of loop can be run in parallel or distributed, and if so, whether there is a simple way to implement it. I have zero experience with parallel and distributed computing. The idea is simple: loop, and if the "solution" at the current `attempt` is better than the best one found so far, replace it; if the optimal "solution" has been found, stop; otherwise, continue.

``````julia
function main(instance::BPPInstance, attempts::Int; earlybreak = false)
    # Objective function
    f(x) = sum(λ -> (λ.fullness)^2, x)
    # First attempt:
    # Get a random sorting of the bins, then search the search space, and then compute the score!!!
    bestbins = randomsort_bins(instance)
    climb!(bestbins, instance.item_weights, instance.bin_capacity)
    fbest = f(bestbins)
    bins_count = length(bestbins)
    # Remaining attempts
    for attempt in 2:attempts
        # Return the found configuration early and the attempts made to find it
        if earlybreak && bins_count == instance.optimal_solution
            return bestbins, attempt
        end
        # Get a random sorting of the bins, then search the search space, and then compute the score!!!
        bins = randomsort_bins(instance)
        climb!(bins, instance.item_weights, instance.bin_capacity)
        fbins = f(bins)
        if fbins > fbest
            bestbins = bins
            fbest = fbins
            bins_count = length(bestbins)
        end
    end
    return bestbins, attempts
end
``````

Some time ago I came across Dagger; I wonder whether that package could be used in this case.

First of all: did you profile your code, and are you sure this is the slowest part? Optimizing for performance only makes sense for the parts where most of the time is spent. If you use VSCode, you can run `@profview myfunction()` to get a flame graph of the performance. Bar size indicates time spent, and bars are arranged so that you can see time spent in subfunctions beneath each bar.
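As a side note, if you are not in VSCode, you can get a similar (text-based) picture from Julia's built-in `Profile` standard library. This is a generic sketch, not part of your code: `busy_work` is just a hypothetical stand-in for your `main(instance, attempts)` call.

```julia
using Profile

# Hypothetical stand-in for the function you actually want to profile
function busy_work(n)
    s = 0.0
    for i in 1:n
        s += sqrt(i)
    end
    return s
end

busy_work(10)            # run once first so compilation time isn't profiled
Profile.clear()
@profile busy_work(10^7)
# Flat listing, hottest lines first; the same data @profview visualizes
Profile.print(format=:flat, sortedby=:count)
```

The samples collected here are exactly what `@profview` renders as a flame graph, so either view will tell you whether `climb!` or something else dominates the runtime.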

Parallelization is sadly not magic. As your algorithm looks right now, you can't apply parallelization in a straightforward manner. You'll need to change it (slightly).

Right now your loop iterations share variables (e.g. `bestbins`, `fbest`, …) and maybe depend on each other in some other way through `climb!`. I don't know what `climb!` does, but assuming it does not modify `instance`, the only issue I see is the shared variables; each iteration is essentially independent, right? So the simplest thing you could do is run a couple of `main`s in parallel and at the end take the best solution across them. You lose the early-stopping feature, but that's maybe acceptable (and could be fixed if really needed). In code:

``````julia
function main_parallel(instance, attempts)
    # Objective function
    f(x) = sum(λ -> (λ.fullness)^2, x)