iskyd
December 1, 2022, 4:39pm
1
I’m looking for a way to convert a BigInt number to bytes representation (in python this is done via (int).to_bytes()).
The only solution I came up with is doing BigInt → Hex → Bytes.
For example:
number::BigInt = parse(BigInt, "100001111001101100000100011100110110101110110110101010100001011100110011000110001100101001111111011111100100110011111101001110101011100111000011010011101100100010000011110011011100011001100110101111101011001111111110111011101110101001000100011001110100011011001010", base=2)
hex2bytes(string(number, base=16))
Is there any better approach?
Other than BigInt → bytes is there any way to just perform binary → bytes so that starting with a binary string I can came up with it’s binary representation in one way?
Thanks.
quinnj
December 1, 2022, 4:48pm
2
iskyd:
number::BigInt = parse(BigInt, "100001111001101100000100011100110110101110110110101010100001011100110011000110001100101001111111011111100100110011111101001110101011100111000011010011101100100010000011110011011100011001100110101111101011001111111110111011101110101001000100011001110100011011001010", base=2)
julia> string(number; base=2)
"100001111001101100000100011100110110101110110110101010100001011100110011000110001100101001111111011111100100110011111101001110101011100111000011010011101100100010000011110011011100011001100110101111101011001111111110111011101110101001000100011001110100011011001010"
Sukera
December 1, 2022, 4:49pm
3
Do you mean you want the bytes making up that BigInt
, so a Vector{UInt8}
?
Dan
December 1, 2022, 5:28pm
4
digits(UInt8, number; base=256)
does the job. It is least significant bytes first. You may want to reverse(digits(UInt8, number; base=256))
if the other direction is better.
1 Like
A low-level way to do this is to call the low-level mpz_export
function of GMP:
function to_bytes(n::BigInt; bigendian::Bool=true)
bytes = Vector{UInt8}(undef, (Base.GMP.MPZ.sizeinbase(n, 2) + 7) ÷ 8)
order = bigendian ? Cint(1) : Cint(-1)
count = Ref{Csize_t}()
@ccall "libgmp".__gmpz_export(bytes::Ptr{UInt8}, count::Ref{Csize_t}, order::Cint,
1::Csize_t, 1::Cint, 0::Csize_t, n::Ref{BigInt})::Ptr{UInt8}
@assert count[] ≤ length(bytes)
return resize!(bytes, count[])
end
This is by far the fastest of the methods discussed so far:
julia> using BenchmarkTools
julia> n = factorial(big(999));
julia> @btime to_bytes($n, bigendian=true);
2.120 μs (1 allocation: 1.14 KiB)
julia> @btime hex2bytes(string($n, base=16));
5.632 μs (3 allocations: 3.39 KiB)
julia> @btime reverse!(digits(UInt8, $n; base=256));
667.791 μs (10670 allocations: 1.79 MiB)
julia> to_bytes(n, bigendian=true) == reverse!(digits(UInt8, n; base=256))
true
(The digits
function seems terribly slow, probably could be improved?)
Note that the hex2bytes
method fails if string(n, base=16)
has an odd number of digits (in which case you need to pad with a zero).
5 Likes
Interestingly enough, we already had some discussions of this, resulting in an optimized digits
for base ≤ 62
(faster digits(::BigInt) by stevengj · Pull Request #37075 · JuliaLang/julia · GitHub ). Seems like we could just add an additional fast path when base == 256 * sizeof(T)
ispow2(base) && base-1 ≤ typemax(T)
.
Update : pull request submitted (faster digits(::BigInt, base=2^n) via mpz_export by stevengj · Pull Request #47774 · JuliaLang/julia · GitHub )
2 Likes