Best way to convert packed bits to vector

I need to expand a bit sequence packed in UInt32 words into a vector that will eventually be complex floating point, with the sequence elements represented by ±1 in the real part.

Currently I’m doing something like the following, for an example with a length-51 sequence stored in two words, w1 and w2:

bstr = bits(w1) * bits(w2);
bcod = zeros(Complex64, 51);
for i in 1:51
  bcod[i] = 2.0f0*(49.0f0 - Float32(Cchar(bstr[i]))) - 1.0f0;
end

This works but seems kludgy and inefficient. Is there a way to expand the sequence to Int or Float values without going through the intermediate string representation, or bit-shifting and -anding in a loop?

Bit-shifting is definitely the way to go. I would do something like this:

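function expandbits2(w1::UInt32, w2::UInt32)
    # Combine the two words into one 64-bit value, with w1 in the high half.
    bstr = (UInt64(w1) << 32) | w2
    mask = one(UInt32)
    # Note: on v0.6, Complex32 is Complex{Float16}; Complex64 is Complex{Float32}.
    bcod = Array{Complex32}(51)
    for i in 1:51
        # Extract bit i-1, counting from the least significant bit of w2.
        b = (bstr >>> (i - 1)) & mask
        bcod[i] = 2.0f0*(49.0f0 - Float32(b)) - 1.0f0;
    end
    return bcod
end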

On Julia v0.6.3 this is, oddly, slightly slower than the string-based version (expandbits below), even though it allocates less:

julia> @btime expandbits($w1, $w2);
  375.176 ns (6 allocations: 896 bytes)

julia> @btime expandbits2($w1, $w2);
  470.459 ns (1 allocation: 336 bytes)

If I understand the purpose of the math in there correctly, the 49 compensates for the character code coming out of Cchar ('1' is 49) and the rest maps the bit to -1 or 1. With bit-shifting, b is already 0 or 1, so the subtraction from 49 should go away (left in, it produces 95 and 97 rather than ±1). You can also throw in an @inbounds on the loop.
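To make the arithmetic concrete, here is a minimal sketch of the two mappings side by side. The names frombitchar and frombit are made up for illustration, and I am assuming the sign convention of the string-based loop in the question ('0' gives +1, '1' gives -1):

```julia
# Mapping used in the question: the character '0' has code 48 and '1' has
# code 49, so 49 - code is 1 for '0' and 0 for '1'.
frombitchar(c::Char) = 2.0f0 * (49.0f0 - Float32(Cchar(c))) - 1.0f0

# Same result when the bit is already an integer 0 or 1: no offset needed.
frombit(b::Integer) = 1.0f0 - 2.0f0 * Float32(b)
```

Both return a Float32, so assigning the result into a complex array fills the real part and leaves the imaginary part at zero.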

Additionally, the following version is several times faster for me on v0.7.0-beta2, if you are using that:

julia> function expandbits2(w1::UInt32, w2::UInt32)
           bstr = (widen(w1) << 32) | w2
           mask = one(UInt32)
           bcod = Array{ComplexF32}(undef, 51)
           @inbounds for i in 1:51
               b = (bstr >>> (i - 1)) & mask
               bcod[i] = ComplexF32(b)
           end

           bcod .= 2.0f0.*bcod .- 1.0f0

           return bcod
       end
expandbits2 (generic function with 1 method)

julia> @btime expandbits2($w1, $w2);
  82.415 ns (1 allocation: 544 bytes)
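
One thing to check before swapping this in: as far as I can tell, the shift-based loops above read bits starting from the least significant bit of w2, and the v0.7 version maps a set bit to +1, whereas the string-based loop in the question starts from the most significant bit of w1 and maps '1' to -1. If the original ordering and sign matter, something along these lines should reproduce them. This is only a sketch: expandbits_msb and the n argument are my own names, it is written for v0.7 or later, and I have not benchmarked it.

```julia
# Hypothetical helper, not from the thread. Element i is the i-th bit of the
# concatenated words, counted from the most significant bit of w1, with
# bit 0 -> +1 and bit 1 -> -1 (the convention of the string-based loop).
function expandbits_msb(w1::UInt32, w2::UInt32, n::Integer = 51)
    1 <= n <= 64 || throw(ArgumentError("n must be between 1 and 64"))
    word = (UInt64(w1) << 32) | UInt64(w2)   # w1 in the high half
    bcod = Vector{ComplexF32}(undef, n)
    @inbounds for i in 1:n
        b = (word >>> (64 - i)) & 0x01       # MSB-first: i = 1 is bit 63
        bcod[i] = ComplexF32(1.0f0 - 2.0f0 * Float32(b))
    end
    return bcod
end
```

A quick sanity check is to compare the result against the string-based expansion (bitstring(w1) * bitstring(w2) on v0.7 and later) for a few inputs.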