# What's the most performant way to return some bit of an Int in Bool?

Shifting left and right takes two operations and the result needs to be converted. Is there a method with one operation or that can skip the conversion?

If the bit position is known beforehand:

```julia
f(x) = isodd(x >>> 61)
```

which compiles to a `shr` and an `and` instruction.
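
As a quick sanity check, this hypothetical `f` (same definition as above) returns `true` exactly when bit 61 (0-based) is set:

```julia
# Test bit 61 (0-based) of a 64-bit integer.
f(x) = isodd(x >>> 61)

f(UInt64(1) << 61)  # true: bit 61 is set
f(UInt64(1) << 60)  # false: bit 61 is clear
```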

If the bit position is not known, but it's known to be in `0:N-1`, where `N` is the number of bits in the integer, then

```julia
f(x, y) = isodd(x >>> (y & 63))
```

(where `63` is `N-1`), which compiles to a `bt` and a `setb` instruction.

If you don't know the bit position, and it might be lower than 0 or higher than `N-1`:

```julia
f(x, y) = isodd(x >>> y)
```

which compiles to significantly more complicated assembly.
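
For what it's worth, a quick check (a sketch; the function names here are arbitrary) confirms that the masked and unmasked variants agree whenever `y` is already in `0:63`:

```julia
f_masked(x, y) = isodd(x >>> (y & 63))  # position forced into 0:63
f_any(x, y)    = isodd(x >>> y)         # arbitrary position

x = 0x0123456789abcdef
all(f_masked(x, y) == f_any(x, y) for y in 0:63)  # true
```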

Though, whether that ends up as a conversion in the generated assembly depends on a lot of factors. It might just end up as a bitmask and a nonzero check. Do you have a particular example you're interested in?

For example, say I have a vector `v` of length 64. Then I have an Int `n` which works like a filter: the entries of `v` map to the bits of `n`, and the entries that map to the zero bits are deleted or ignored. Then a `reduce` operation is applied to what's left.
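
For concreteness, a minimal sketch of that operation (the name `masked_reduce` and the 0-based bit convention are my own choices here, not an established API):

```julia
# Keep v[i] when bit i-1 of the mask n is set, then reduce what's left.
function masked_reduce(op, v::AbstractVector, n::Integer)
    acc = nothing
    for i in eachindex(v)
        if isodd(n >>> (i - 1))  # test bit i-1 of the mask
            acc = acc === nothing ? v[i] : op(acc, v[i])
        end
    end
    return acc
end

masked_reduce(+, collect(1:64), 0b1011)  # keeps v[1], v[2], v[4]: 1 + 2 + 4 == 7
```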

The two implementations I would consider would be things like

```julia
checkbit_1(x::Base.BitInteger, b::Integer) = x & (one(x) << b) != zero(x)
checkbit_2(x::Base.BitInteger, b::Integer) = (x >>> b) & one(x) != zero(x)
```

(you might want to use `(b-1)` in place of `b`, depending on how you want to index bits). An inspection of the standalone `@code_native` suggests `checkbit_2` (which should be basically identical to the suggested `isodd(x >>> y)` above) may be slightly better.

A primitive function like this will vary considerably in performance based on its context, so a straight answer to "what is best?" is difficult. For example, if `b` is known at compile time then this compiles down to 2 or 3 instructions, while if it's a variable it's more like 12-15. If it's unknown but has a known range, it might land anywhere in between. What you do with the `Bool` afterward can also have a big impact. For example, whether you assign the value to a memory location, add it to an accumulator, or use it as the condition of an `if` statement can dramatically change the compiled code.
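
One way to see the compile-time-known case in code (a sketch; dispatching on `Val` is just one way to bake the position into the compiled method):

```julia
# Runtime bit position: the shift amount lives in a register.
checkbit(x, b) = isodd(x >>> b)

# Compile-time bit position: B is a type parameter, so the shift
# amount is a constant in the specialized method.
checkbit(x, ::Val{B}) where {B} = isodd(x >>> B)

checkbit(0b0100, 2)       # true
checkbit(0b0100, Val(2))  # true, but compiles to a fixed shift
```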

All that said, it's difficult to imagine that any of the options I gave, or anything anyone else suggested, actually give a significant performance difference in a real use case. Compilers are quite good at the basic bit operations involved here and will probably get pretty close to optimal from most sane starting points. Further, whatever actual work you do based on this outcome could easily dwarf any possible performance difference.

You'll likely need to assemble a few candidates and then benchmark in your actual (or a truly representative) context to find what is best. Even then, the differences may be negligible or within the noise.
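
A minimal starting point for such a comparison (a sketch using Base's `@elapsed` for brevity; for real measurements BenchmarkTools.jl's `@btime` is the usual tool, and you'd want to benchmark inside your actual reduction rather than in isolation):

```julia
checkbit_1(x, b) = x & (one(x) << b) != zero(x)
checkbit_2(x, b) = (x >>> b) & one(x) != zero(x)

xs = rand(UInt64, 2^16)

# Crude wall-clock timing; differences this small are often lost in noise.
t1 = @elapsed count(x -> checkbit_1(x, 7), xs)
t2 = @elapsed count(x -> checkbit_2(x, 7), xs)
```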

Ok, but that's still not enough to deduce what ultimately happens once it's compiled. For one, you didn't specify the element type of the vector, which is crucial. If it's a `Vector{UInt8}`, it's possible a SIMD reduction happens (including preamble and postamble), or just a regular loop in the assembly if the reduction or the element type isn't SIMDable.

There's no blanket statement that covers all possibilities without a concrete example. What you are describing sounds a lot like you're asking about SIMD intrinsics, in which case SIMD.jl will be of interest to you.

Thanks for that helpful information. SIMD seems to be quite useful for my project. However, in terms of the original question, @jakobnissen's answer is sufficient so I'm giving that the credit. @mikmoore's answer is also helpful. I should have phrased the question better.