# Why is this loop type not supported by LoopVectorization?

I am trying to understand why I can’t use `@turbo` in part of my code. I can reduce the issue to the following example:

``````julia
using LoopVectorization

f = [15, 13, 12, 10, 9, 7, 5, 4, 2, 1]
y0 = zeros(16)
y1 = zeros(16)

for i = 1:10
    y0[f[i]] += 1.0
    y0[f[i] + 1] += 2.0
end

@turbo for i = 1:10
    y1[f[i]] += 1.0
    y1[f[i] + 1] += 2.0
end
``````

I then get

``````
julia> [y0 y1]
16×2 Matrix{Float64}:
1.0  1.0
3.0  2.0
2.0  2.0
1.0  1.0
3.0  2.0
2.0  2.0
1.0  1.0
2.0  2.0
1.0  1.0
3.0  2.0
2.0  2.0
1.0  1.0
3.0  2.0
2.0  2.0
1.0  1.0
2.0  2.0
``````

So I am not getting the same output when I use `@turbo`. Why is it not appropriate to use it in this case? It was not obvious to me from reading the LoopVectorization.jl documentation on its limitations, so I suspect I may have misunderstood something. Thanks.

From the `@turbo` documentation:

> It assumes that loop iterations are independent.

Here, the loop iterations are not independent (though they can be executed in any order), because different iterations modify the same element of `y1`, which means that they cannot be executed in parallel (SIMD is instruction-level parallelism). And you can see that it is precisely for these elements (the ones with value `3.0` after the scalar loop) that the results differ.
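You can find the overlapping elements mechanically. Here is a small check (the names `writes` and `conflicts` are just illustrative, not anything from LoopVectorization):

```julia
f = [15, 13, 12, 10, 9, 7, 5, 4, 2, 1]

# Every array index the loop writes to: f[i] from the first statement
# and f[i] + 1 from the second.
writes = vcat(f, f .+ 1)

# Indices written by more than one iteration -> potential conflicts.
conflicts = sort([k for k in unique(writes) if count(==(k), writes) > 1])
# conflicts == [2, 5, 10, 13] -- exactly the elements that are 3.0
# after the scalar loop but 2.0 under @turbo
```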


What is the definition of independence?


If you can rearrange the order of the iterations without changing the result of the calculation.

For example, what happens if you run this many times:

``````julia
using Random  # for shuffle

for i = shuffle(1:10)
    y0[f[i]] += 1.0
    y0[f[i] + 1] += 2.0
end
``````

Do you always get the same result?


I would guess that invariance to shuffling is not enough either: the different iterations should also be able to execute simultaneously, so check for possible data races.

In this case, yes, shuffling always gives the same result, because addition is commutative. But the iterations cannot execute in parallel without data races.
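One way to see the race concretely is to emulate what a vectorized gather/update/scatter would do if every iteration read `y` before any iteration wrote back. This is a sketch of one possible reordering consistent with the observed output, not necessarily the exact code `@turbo` generates:

```julia
f = [15, 13, 12, 10, 9, 7, 5, 4, 2, 1]
y = zeros(16)

# Gather the old values for both updates before writing anything back,
# the way a wide SIMD gather would:
a = y[f]        # old values of y[f[i]]
b = y[f .+ 1]   # old values of y[f[i] + 1]

# Scatter the updated values. Wherever f[i] + 1 collides with some f[j],
# the second scatter overwrites the first: the += 1.0 update is lost.
y[f]      = a .+ 1.0
y[f .+ 1] = b .+ 2.0
```

This reproduces the `@turbo` column above: the conflicting elements end up as `2.0` instead of `3.0`.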
