Hello guys,
I need to convert the output of sha256 to a UInt256. What is the optimal way to do that performance-wise?
using SHA, BitIntegers
x=sha256("okok")
# convert(UInt256, x)
For context, sha256 returns a Vector{UInt8}.
Laconically today,
julia> using BitIntegers
julia> reinterpret(UInt256, rand(UInt8, 64))
2-element reinterpret(UInt256, ::Vector{UInt8}):
0x1a12ed72cbc1aa4f71e3dd40bd6a1af939664d1cbbc8fa06aa4de57f60da72cd
0x8350a89b2786c47309bbf6eeb71a21a3dec3bc73aca23396ffc10356223e9570
Thanks a lot, this works:
reinterpret(UInt256, x)[1]
This might not use the byte order you want:
julia> h = rand(UInt8, 32);
julia> bytes2hex(h)
"b9a868cba886baa10e08c841e8c0eabac34b83de68207c8bad6f2c869bc4a547"
julia> reinterpret(UInt256, h)
1-element reinterpret(UInt256, ::Vector{UInt8}):
0x47a5c49b862c6fad8b7c2068de834bc3baeac0e841c8080ea1ba86a8cb68a8b9
Because the hash is returned in big-endian order, in principle you can use ntoh(reinterpret(UInt256, h)[1]).
Currently ntoh fails because bswap is not implemented for UInt256, but that seems like an oversight in the BitIntegers.jl package (BitIntegers.jl#26) that could be easily remedied.
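To see the byte-order effect in isolation, here is a small sketch that uses Base's UInt128 instead of UInt256, so it does not depend on bswap being available in BitIntegers.jl. The variable names are only illustrative, and the assumption is that the same pattern carries over to UInt256 once bswap is defined for it.

```julia
# Base-only illustration with UInt128 (16 bytes) standing in for UInt256 (32 bytes).
bytes = collect(0x00:0x0f)            # Vector{UInt8}: 00 01 02 ... 0f
hex   = bytes2hex(bytes)              # hex string in the same order as the bytes

raw = reinterpret(UInt128, bytes)[1]  # bytes read in host byte order
be  = ntoh(raw)                       # bytes read as a big-endian number

# `be` matches the hex string; on a little-endian host `raw` is its byte-reversed twin.
expected = parse(UInt128, hex; base = 16)
println(be == expected)
```

On a little-endian machine ntoh is just bswap, which is why the missing bswap method is what blocks the UInt256 version.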
Ah thanks, that saved me a lot of debugging time! Also, thanks for opening the issue.
Not a great solution, but in the meantime this works:
reinterpret(UInt256, reverse(x))[1]
I think I would go with pointer manipulation for speed, and bypass reinterpret:
GC.@preserve x bswap(unsafe_load(Ptr{UInt256}(pointer(x))))
Really cool! This is almost 4 times faster.
Would you mind explaining what's going on here?
Specifically, what does GC.@preserve do, and is unsafe_load actually safe to use in all cases?
AFAIU it instructs the GC not to interfere with the unsafe_load (otherwise you could end up in undefined-behavior land), and I'm really unsure whether we should recommend this to unsuspecting users…
No, exactly (the problem being that it could look safe for a long time ;)).
If performance is of utmost importance in the meantime, would you propose an MWE?
Haha, that's what I thought. I read about it and will stick with the other (more Julian) option for now, thanks.
Performance is important in my case, but not at the cost of readability or safety, and in this example the hash function takes most of the execution time anyway.
I also saw that if you put GC.@preserve inside a function and then define it differently, the first definition stays, which could add some confusion for users.
My MWE would be the following:
using BitIntegers, SHA
f(x) = reinterpret(UInt256, reverse(sha256(x)))[1]
Nice!
I tried to tuple a bit and found this one:
x = sha256("The quick brown fox jumps over the lazy dog")
y = (x...,)
@show typeof(x)
@show reinterpret(UInt256, x)
@show typeof(y)
@show reinterpret(UInt256, y)
yielding
typeof(x) = Vector{UInt8}
reinterpret(UInt256, x) = UInt256[0x92e5c937bfd0022d76db3c6de451568d4f2e08b0bc9aca699480d707b3fba8d7]
typeof(y) = NTuple{32, UInt8}
ERROR: bitcast: expected primitive type value for second argument
Reasonable (or in other words: since when is Vector primitive)?
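If the bytes do end up in a tuple, one way around reinterpret is to fold them into the integer by hand. The helper below is a hypothetical sketch (be_int is not part of Base, SHA, or BitIntegers); it is shown with Base integer types and assumes the target type supports <<, |, and conversion from UInt8, which UInt256 is expected to do.

```julia
# Hypothetical helper: interpret an iterable of bytes as a big-endian unsigned integer.
# Works for tuples and vectors alike; the first byte becomes the most significant one.
be_int(::Type{T}, bytes) where {T<:Unsigned} =
    foldl((acc, b) -> (acc << 8) | T(b), bytes; init = zero(T))

t = (0x12, 0x34, 0x56, 0x78)      # NTuple{4, UInt8}
println(be_int(UInt32, t))        # same value as the literal 0x12345678
```

This reads the bytes in big-endian order directly, so no reverse or bswap is needed afterwards. It is likely slower than reinterpret for large inputs, since it shifts one byte at a time.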
unsafe_load is safe to use when you know that the pointer points at a valid object; here, after obtaining the pointer via pointer(x), x might be garbage collected before unsafe_load retrieves the value, so GC.@preserve x ... makes sure x is kept alive for the duration of the expression.
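As a sketch of that pattern with the safety checks spelled out, here is a Base-only version that loads a UInt128 rather than a UInt256. The function name load_be_u128 is made up for illustration, and the length check is an added precaution that the one-liner earlier in the thread does not have.

```julia
# Hypothetical helper: read the first 16 bytes of `bytes` as a big-endian UInt128.
function load_be_u128(bytes::Vector{UInt8})
    # unsafe_load does no bounds checking, so verify the buffer is long enough first.
    length(bytes) >= sizeof(UInt128) ||
        throw(ArgumentError("need at least 16 bytes, got $(length(bytes))"))
    # Keep `bytes` rooted while the raw pointer is in use.
    GC.@preserve bytes begin
        p = Ptr{UInt128}(pointer(bytes))
        ntoh(unsafe_load(p))   # ntoh converts from big-endian to host order
    end
end

println(load_be_u128(collect(0x00:0x0f)) == 0x000102030405060708090a0b0c0d0e0f)
```

The pointer must not escape the GC.@preserve block; only the loaded value does.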
If you want to avoid unsafe_load, and if performance is important, you might be better off with
bswap(reinterpret(UInt256, (x))[1])
rather than
reinterpret(UInt256, reverse(sha256(x)))[1]
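For a side-by-side check, the two variants can be wrapped in functions and compared. The names hash_via_reverse and hash_via_bswap are made up for this sketch, and it assumes a BitIntegers.jl version in which bswap is defined for UInt256, plus a little-endian host.

```julia
using SHA, BitIntegers

# Variant 1: reverse the bytes (allocates a second vector), then reinterpret.
hash_via_reverse(s) = reinterpret(UInt256, reverse(sha256(s)))[1]

# Variant 2: reinterpret in place, then swap the byte order of the integer itself.
hash_via_bswap(s) = bswap(reinterpret(UInt256, sha256(s))[1])

# Both should yield the same integer for the same input.
println(hash_via_reverse("okok") == hash_via_bswap("okok"))
```

The bswap variant skips the temporary array that reverse creates. As noted earlier in the thread, the hash itself dominates the runtime, so the gain may be small in practice.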