I am happy to announce BitPermutations.jl, a package to efficiently permute the bits in some bitstring, given an arbitrary permutation.

Given a permutation over **n** bits, we can reshuffle a bitstring in **O(log n)** time by judiciously choosing operations that move many bits in parallel.
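To give a flavor of what moving bits in parallel means, here is a minimal sketch of a delta swap, a standard bit-twiddling primitive that exchanges every pair of bits selected by a mask in a handful of instructions. A Beneš network on **n** bits can be built from 2·log2(n) − 1 layers of such swaps, which is where the logarithmic cost comes from. The name `deltaswap` and the masks below are chosen for illustration and are not necessarily how the package implements this internally.

```julia
# Swap each bit selected by `mask` with the bit `shift` positions above it.
# Assumption: `mask` only selects positions whose partner (position + shift)
# is in range and is not itself selected.
function deltaswap(x::T, mask::T, shift::Integer) where {T<:Unsigned}
    t = ((x >> shift) ⊻ x) & mask   # 1 wherever the two partner bits differ
    return x ⊻ t ⊻ (t << shift)     # flip both partners where they differ
end

# 0x55 selects positions 0, 2, 4, 6, so this swaps adjacent bit pairs.
deltaswap(0x08, 0x55, 1)   # 0x04

# 0x0f with shift 4 swaps the two nibbles of a byte.
deltaswap(0xf0, 0x0f, 4)   # 0x0f
```

All pairs are exchanged at once, regardless of how many bits the mask selects.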

This is described in more detail here as well as in the package README.

In practice, this gives a speedup if one has to apply the same permutation to many different bitstrings.

In typical situations, you can expect a speedup of around 10x to 50x compared to applying `permute!` on a `BitVector`.

The largest speedups are achieved by using intrinsics available on modern x86 CPUs.

If such operations are available, the permutation will be performed not with a traditional Beneš network, but by using reshuffling stages known as GRP, described originally in Donald Knuth’s **The Art of Computer Programming**.
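As a rough illustration of what such a stage does, the sketch below emulates in plain Julia a parallel bit extract (the operation the x86 BMI2 `PEXT` instruction provides) and builds a GRP-like step from it: bits selected by a mask are packed to one end of the word and the remaining bits to the other, each group keeping its relative order. The names `pext_soft` and `grp_soft` are hypothetical, and which group ends up on which side is an assumption here. This is not the package's implementation, which relies on the hardware instruction.

```julia
# Software stand-in for PEXT: gather the bits of `x` selected by `mask`
# and pack them contiguously at the low end of the result.
function pext_soft(x::T, mask::T) where {T<:Unsigned}
    y = zero(T)
    k = 0
    while mask != zero(T)
        low = mask & -mask              # isolate the lowest set bit of mask
        if (x & low) != zero(T)
            y |= one(T) << k
        end
        k += 1
        mask ⊻= low                     # clear that bit and continue
    end
    return y
end

# GRP-like step (assumed convention): unselected bits go to the low end,
# selected bits go above them.
function grp_soft(x::T, mask::T) where {T<:Unsigned}
    nlow = count_zeros(mask)            # number of unselected positions
    return pext_soft(x, ~mask) | (pext_soft(x, mask) << nlow)
end

pext_soft(0xb4, 0xcc)   # 0x09
grp_soft(0xb4, 0xcc)    # 0x9c
```

With a hardware `PEXT`, each such step costs only a few instructions, whereas the loop above takes time proportional to the number of selected bits.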

Here is a usage example taken from the README.

We first create a permutation `p`:

```
using BitPermutations
v = [2,6,5,8,4,7,1,3]
p = BitPermutation{UInt8}(v)
```

We can now apply it to a `UInt8` bitstring:

```
x = 0x08 # 0 0 0 1 0 0 0 0 (LSB -> MSB)
bitpermute(x, p) # 0 0 0 0 1 0 0 0, or 0x10
p(x) # idem
```
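To make the convention in this example explicit, here is a bit-by-bit reference written in plain Julia, without the package. It assumes that output bit `i` is taken from input bit `v[i]` (1-indexed from the LSB), which is inferred from the example above rather than from the package source. The function name `naive_bitpermute` is hypothetical.

```julia
# Reference implementation: O(n) work per bitstring, one bit at a time.
# Assumed convention: output bit i (1-indexed from the LSB) = input bit v[i].
function naive_bitpermute(x::T, v::AbstractVector{<:Integer}) where {T<:Unsigned}
    y = zero(T)
    for (i, src) in enumerate(v)
        bit = (x >> (src - 1)) & one(T)   # read input bit `src`
        y |= bit << (i - 1)               # write it to output bit `i`
    end
    return y
end

v = [2, 6, 5, 8, 4, 7, 1, 3]
naive_bitpermute(0x08, v)   # 0x10, matching the example above
```

A loop like this is handy for cross-checking results on small inputs, but it touches every bit individually, which is the cost the network-based approach avoids.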

Contributions, feature requests, and general suggestions are very welcome.