Hi everybody,

I have a problem when using parallelization in a `for` loop. The computation is extremely slow, even slower than the same loop without parallelization. The `for` loop code is as follows:

```
N = 20
d_u = 0.5*rand(N)
d_v = 0.5*rand(N)
cost_update = SharedArray(Float64, N)
@parallel for k in eachindex(d_u)
    a = p_1 # dynamic lower bound
    b = p_2 # dynamic upper bound
    x_1 = a + (1-gr)*(b-a)
    x_2 = a + gr*(b-a)
    # run golden section search
    cost = zeros(2,1)
    while norm(b-a) > tol
        # compute cost for upper and lower bounds
        lambda_12 = [norm(x_1-p_1) / l_0;
                     norm(x_2-p_1) / l_0]
        theta_gnd = [atan2(x_1[2]-p_i[2], x_1[1]-p_i[1]); # backward propagation
                     atan2(x_2[2]-p_i[2], x_2[1]-p_i[1])]
        x_gnd = [x_1[1]; x_2[1]] - vertex_min[1] + 1
        y_gnd = [x_1[2]; x_2[2]] - vertex_min[2] + 1
        cost = [norm(x_1-p_i); norm(x_2-p_i)] .*
               cost_profile_wind(x_wf, y_wf, u_wf, v_wf, d_u[k], d_v[k],
                                 x_gnd, y_gnd, V_gnd, theta_gnd) +
               lambda_12.*u_2 + (1-lambda_12).*u_1
        # update upper or lower bounds as necessary
        if cost[1] < cost[2]
            b = x_2
            x_2 = x_1
            x_1 = a + (1-gr)*(b-a)
        else
            a = x_1
            x_1 = x_2
            x_2 = a + gr*(b-a)
        end
    end
    # update cost to return
    cost_update[k] = mean(cost)
end
```

The variables `x_wf`, `y_wf`, `u_wf`, `v_wf` define a wind vector field. Inside the `for` loop there is a `while` loop, as well as a call to the function `cost_profile_wind()`, which computes a cost due to wind.
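
To show the structure of the inner `while` loop without the wind model, here is a minimal sketch of a golden section search on a scalar interval. The names `toy_cost` and `golden_section` are hypothetical and only for illustration, and the sketch assumes `gr` is the inverse golden ratio. In the loop above the bounds are points rather than scalars, and the cost comes from `cost_profile_wind()`.

```julia
# Hypothetical stand-in for the real cost; NOT the actual cost_profile_wind().
toy_cost(x) = (x - 0.3)^2

# Minimal golden section search for a minimum of f on the interval [a, b].
function golden_section(f, a, b; tol = 1e-6)
    gr = (sqrt(5) - 1) / 2            # assumed value: inverse golden ratio
    x_1 = a + (1 - gr) * (b - a)      # lower interior point
    x_2 = a + gr * (b - a)            # upper interior point
    while abs(b - a) > tol
        if f(x_1) < f(x_2)
            # minimum lies in [a, x_2]: shrink from the right
            b = x_2
            x_2 = x_1
            x_1 = a + (1 - gr) * (b - a)
        else
            # minimum lies in [x_1, b]: shrink from the left
            a = x_1
            x_1 = x_2
            x_2 = a + gr * (b - a)
        end
    end
    return (a + b) / 2                # midpoint of the final bracket
end

x_min = golden_section(toy_cost, 0.0, 1.0)
```

Each pass shrinks the bracket by the factor `gr`, so the number of `while` iterations depends only on `tol` and the initial distance between the bounds.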

Can anybody help me get it to run faster? If more code is needed, I can provide it.

Thanks!