Given a non-zero UInt8, I would like to randomly and uniformly select the index of a bit that is one. For example, if I get UInt8(14), the bitstring would be 00001110, and thus it would return UInt8(2), UInt8(3), and UInt8(4) at a uniform rate. My solution thus far has been
- Select a number from one to the number of ones in the inputted UInt8
- Find the index of the bit corresponding to that many ones (from the right)
This second step is similar to the bit twiddling hack shown here: Bit Twiddling Hacks but this is built very much with 64-bit integers in mind so I spun my own. Here’s my solution so far:
const rng = Xoshiro() # step 2 @inline function get_decisions(d1::UInt8, d2::UInt8, i::UInt8, j::UInt8) to_return_1, to_return_2 = 0x08, 0x08 for n = 0x00:0x06 to_return_1 -= isodd(d1 >> n) && Base.ctpop_int(d1 >> n) == i ? 0x07 - n : 0x00 to_return_2 -= isodd(d2 >> n) && Base.ctpop_int(d2 >> n) == j ? 0x07 - n : 0x00 end return to_return_1, to_return_2 end # step 1 function select_random_decision(d1::UInt8, d2::UInt8) a, b = Base.ctpop_int(d1), Base.ctpop_int(d2) return get_decisions(d1, d2, a == 0x01 ? 0x01 : a == 0x02 ? rand(rng, (0x01, 0x02)) : a == 0x03 ? rand(rng, (0x01, 0x02, 0x03)) : a == 0x04 ? rand(rng, (0x01, 0x02, 0x03, 0x04)) : rand(rng, 0x01:a), b == 0x01 ? 0x01 : b == 0x02 ? rand(rng, (0x01, 0x02)) : b == 0x03 ? rand(rng, (0x01, 0x02, 0x03)) : b == 0x04 ? rand(rng, (0x01, 0x02, 0x03, 0x04)) : rand(rng, 0x01:b) ) end
This can be checked for correctness visually with this plot:
using Plots a = rand(0x01:0xff) b = rand(0x01:0xff) N = 1_000_000 results = map(x -> select_random_decision(a, b), 1:N) histogram(map(x -> results[x], 1:N) .- 1/8, xlims = (0, 9), label = "", xticks = (1:8, reverse(split(bitstring(a), "")) .* reverse(split(bitstring(b), ""))), bar_width = 1/4) histogram!(map(x -> results[x], 1:N) .+ 1/8, label = "", bar_width = 1/4, dpi = 300, format = :png, grid = false, yaxis = false, yticks = , bgcolor = :transparent)
and benchmarked with this:
using BenchmarkTools @benchmark select_random_decision(a, b) setup = (a = rand(0x01:0xff); b = rand(0x01:0xff))
BenchmarkTools.Trial: memory estimate: 0 bytes allocs estimate: 0 -------------- minimum time: 10.060 ns (0.00% GC) median time: 14.625 ns (0.00% GC) mean time: 14.621 ns (0.00% GC) maximum time: 22.273 ns (0.00% GC) -------------- samples: 10000 evals/sample: 999
This blog post (What you can learn from accessing an element of an array in Julia) mentions that tuples >= 4 in length are highly optimized for random selection in Julia. This lines up with my benchmarking, as both going to 3 and 5 are slower. However, I don’t understand why this is, since the base Julia seems to have optimizations for 1, 2, 3, and N-sized tuples (lines 500-553 of julia/generation.jl at 0ff49d0457300013f9e802850a49e64569426a0b · JuliaLang/julia · GitHub).
Why do these tuple optimizations work up to 4, can I write my own implementations for 5-8 (8 seems not that far away from 4, perhaps naively), and are there other ways to optimize this function?