What is the most efficient binary representation of unsigned integers in Julia?

Hello everybody,

I would like to know what is the most efficient binary representation in Julia? My use case is to convert an unsigned integer into a binary representation, then to mutate the elements of this binary representation and convert it back to an unsigned integer. The focus should be that is fast and memory efficient.

I know, that I can use bitstring() or Base.bin(UInt8(2), 8,false), but as far as I know, I can’t change the single elements of the resulting string type.

Best regards
Volker

I would not convert it at all but just work with the integer using bitwise operators

7 Likes

I haven’t thought about it. It is probably the fastest way, but a little bit uncomfortable and difficult to read, if you have in mind to mutate the elements by some genetic operators e.g. single point crossover, etc…


v1 = Base.bin(UInt8(15), 4,false)
v2 = Base.bin(UInt8(0), 4,false)

v1b=BitVector([parse(Bool, elem) for elem in v1])
v2b=BitVector([parse(Bool, elem) for elem in v2])

singlepoint(v1b,v2b)


"""
    Code Snippet of Evolutionary.jl
    singlepoint(v1, v2)
Single point crossover between `v1` and `v2` individuals.
"""
function singlepoint(v1::T, v2::T) where {T <: AbstractVector}
    l = length(v1)
    c1 = copy(v1)
    c2 = copy(v2)
    pos = rand(1:l)
    for i in pos:l
        vswap!(c1, c2, i)
    end
    return c1, c2
end

# Utilities
# =========
function vswap!(v1::T, v2::T, idx::Int) where {T <: AbstractVector}
    val = v1[idx]
    v1[idx] = v2[idx]
    v2[idx] = val
end

But I will think about how to make use of your suggestion and how to make the code still looks pretty.

What do you mean by “binary representation”? Are you aware that primitive types are already represented in the machine native way in Julia?

6 Likes

What’s your maximum bit vector length?

No, how I can just edit the bits? I think my maximum length will be 32 when I even consider Float32.

Here is a simple example how to encode a standard cross-over (I am writing it out of my head so there might be bugs or corner cases would have to be handled in some special way).

function crossover(a::UInt32, b::UInt32, n)
    low = UInt32(1<<n - 1)
    high = ~low
    return (a & low) | (b & high), (b & low) | (a & high)
end

and here is a single point cross-over:

function crossover(a::UInt32, b::UInt32, n)
    mask = UInt32(1<<(n - 1))
    notmask = ~mask
    return (a & mask) | (b & notmask), (b & mask) | (a & notmask)
end

7 Likes

Your original post talked about “efficient”, and the standard storage with bitwise operators will be the most efficient by far. Anything else will orders of magnitudes slower.

11 Likes

Thank you guys for replying so fast.

You’re right.

1 Like

If you’re getting used to bit manipulation, two references are

  • Bit Twiddling Hacks from Stanford Graphics
  • and Hacker’s Delight, which is a book on bit manipulation algorithms.
    Have fun. It’s a great game.
2 Likes

Julia has two built-in 32 bit Integer types, one is unsigned (UInt32) and one is signed (Int32). They are equally fast. If you are working at the bit level then UInt32 is most appropriate.

fyi: Julia has a function bitstring which will print out the 32 bits as a sequence of ‘1’ s and '0’s. It is not easy to see which bit is where that way, you may prefer

julia> bits(x::UInt32) =
          join((bitstring(x)[i:i+3] for i=1:4:32), "_")

julia> x = 0x0a4a4fc5
0x0a4a4fc5
julia> typeof(x)
UInt32
julia> bitstring(x)
"00001010010010100100111111000101"
julia> bits(x)
"0000_1010_0100_1010_0100_1111_1100_0101"
4 Likes

@JeffreySarnoff, printing pretty bits seems to be a good candidate for the awesome PrettyTables.jl package. Not sure if you like it but, see herein a basic pretty_bits() function, that hopefully is also correct.

PS: further edited after feedback from @JeffreySarnoff

using PrettyTables

function pretty_bits(x)
    bx = reverse(bitstring(x))
    N = length(bx);
    Nc = min(16, N)  # max columns to display = 16 bits
    Nr = N÷Nc
    data = [parse(Int,bx[Nc*i+j]) for i in 0:Nr-1, j in 1:Nc][:,end:-1:1]
    hl1 = Highlighter((data,i,j) -> data[i,j]==1, crayon"red bold")
    hl2 = Highlighter((data,i,j) -> data[i,j]==0, crayon"cyan bold")
    if Nr>1
       pretty_table(data, [x for x = Nc-1:-1:0], header_crayon = crayon"light_gray bold italics",
           vlines = collect(1:4:Nc),
           row_names = [string(Nc*i-1," - ",Nc*(i-1)) for i in 1:Nr],
           row_name_alignment = :c,
           formatters = ft_printf("%i",1:Nc), highlighters=(hl1,hl2))
    else
        pretty_table(data, [x for x = Nc-1:-1:0], header_crayon = crayon"light_gray bold italics",
           vlines = collect(4:4:Nc-4),
           formatters = ft_printf("%i",1:Nc), highlighters=(hl1,hl2))
    end
end

x = 100   # Int64
pretty_bits(Int8(x))
pretty_bits(Int16(x))
pretty_bits(Int32(x))
pretty_bits(x)   # Int64
pretty_bits(Int128(x))
pretty_bits(convert(UInt8,x))  # 0x64
pretty_bits(convert(UInt32,x)) # 0x00000064

A more compact variant, pretty_bits2(), produces:

function pretty_bits2(x)
    bx = reverse(bitstring(x))
    N = length(bx);
    Nc = min(16, N)  # max columns to display = 16 bits
    Nr = N÷Nc
    data = [bx[Nc*i+j:Nc*i+j+3] for i in 0:Nr-1, j in 1:4:Nc][:,end:-1:1]
    hl1 = Highlighter((data,i,j) ->  occursin("1",data[i,j]), crayon"red bold")
    hl2 = Highlighter((data,i,j) -> !occursin("1",data[i,j]), crayon"cyan")
    if Nr>1
       pretty_table(data, ["$x-$y" for (x,y) in zip(Nc-1:-4:0,Nc-4:-4:0)],
           header_crayon = crayon"light_gray bold italics",
           vlines = [1],  row_name_alignment = :c,
           row_names = [string(Nc*i-1," - ",Nc*(i-1)) for i in 1:Nr],
           formatters = ft_printf("%i",1:Nc), highlighters=(hl1,hl2))
    else
        pretty_table(data, ["$x-$y" for (x,y) in zip(Nc-1:-4:0,Nc-4:-4:0)],
           vlines = :none, header_crayon = crayon"light_gray bold italics",
           formatters = ft_printf("%i",1:Nc), highlighters=(hl1,hl2))
    end
end

x = 100   # Int64
pretty_bits2(Int8(x))
pretty_bits2(Int16(x))
pretty_bits2(Int32(x))
pretty_bits2(x)   # Int64
pretty_bits2(Int128(x))
pretty_bits2(convert(UInt8,x))  # 0x64
pretty_bits2(convert(UInt32,x)) # 0x00000064
10 Likes

(thanks for the feedback)

the column headings are off-by-one because bit positions (generally) are numbered from 0 (the least significant bit) up to e.g. f ==15 (0b1111). Some bit diagrams prepend msb and postpend lsb. That is usually done when some or all of the explicit positions are not given as positions (even when the corresponding bit values are present).

the multirow tables need something … maybe pretty_nibbles (hex digits)
would read more clearly – or else using hex digits in the leftmost column

Most people can pick-up and discern groups of 4 adjacent bits (more than that may be challenging, depends on one’s genescape).

3 Likes

@JeffreySarnoff, thanks for your advice. Made some edits above, accordingly.

PS: the road ahead is straight but the slope is steep…

looks much better
one more suggestion, when the left hand column has entries like
nn - nn make sure all of the - are aligned by adding white space before digits on the right of the - and as appropriate before the digits on the left e.g.

  15 -  0
  31 - 16

and, perhaps

  15 -  0 
  31 - 16
..
  95 - 80
 111 - 96
 127 - 112

or if preferred by PrettyTables

  15 -   0 
  31 -  16
..
  95 -  80
 111 -  96
 127 - 112
2 Likes

@JeffreySarnoff, thank you. Code above was further edited accordingly. (PS: got also rid of a meta-programming line which was not needed)

1 Like

This is awesome! I had to dig bits a lot of times and I have never thought about using PrettyTables for this activity! Thanks!! :slight_smile:

P.S.: I think it would be very nice to have a package for those dumps!

1 Like

@Ronis_BR, You have made it pretty, now everybody is flirting with it.
:wink:

2 Likes