I have a use case where an integer array contains only values between 1 and some `maxValue`

, and I would like the histogram of the array values. I found that `counts(array, maxValue)`

was too slow, and as best as I could understand, it scans the array `maxValue`

times to count each possible value separately, instead of scanning the array once.

I thought then to implement this by hand with

```
using LoopVectorization
function mycounts(array, maxValue)
count = zeros(Int64, maxValue)
@turbo for i in eachindex(array)
count[array[i]] += 1;
end
return count;
end
```

However, I found that this produces the wrong results. For example `mycounts(ones(Int64, 1024)`

gives `32`

instead of `1024`

, and in fact increasing the size of the array, I always get 1-32th of the correct value. So perhaps this might have to do with the loop unrolling factor of `@turbo`

? I can confirm that replacing it with `@inbounds @simd`

, I get the correct results. Is there some nuance to the safe operation of `@turbo`

relative to `@simd`

that I am missing, or is this an actual bug that I should report?

(Since I have `maxValue`

of order 100, the speed savings with `@simd`

are already significant; yet the (wrong) implementation with `@turbo`

gains another x5 performance boost compared to `@simd`

which would be nice to have.)