A little tangential: Python just (unhelpfully, I might add) displays byte “strings”/arrays in a weird string-y fashion.
An equivalent output to unhexlify can be obtained with just hex2bytes (i.e. hex2bytes is exactly Python's binascii.unhexlify), which already gives you a vector of bytes. If you want to compare it to Python's representation of those bytes, you can map(Char, s) the result (but this will give you a vector of characters, not bytes):
julia> s = hex2bytes("038decd6ff8f45d6b523c25eb5cc669fa1fc2fd33aaa5f56408abad126aa3e68");
julia> map(Char, s)
32-element Vector{Char}:
'\x03': ASCII/Unicode U+0003 (category Cc: Other, control)
'\u8d': Unicode U+008D (category Cc: Other, control)
'ì': Unicode U+00EC (category Ll: Letter, lowercase)
'Ö': Unicode U+00D6 (category Lu: Letter, uppercase)
'ÿ': Unicode U+00FF (category Ll: Letter, lowercase)
'\u8f': Unicode U+008F (category Cc: Other, control)
'E': ASCII/Unicode U+0045 (category Lu: Letter, uppercase)
'Ö': Unicode U+00D6 (category Lu: Letter, uppercase)
'µ': Unicode U+00B5 (category Ll: Letter, lowercase)
'#': ASCII/Unicode U+0023 (category Po: Punctuation, other)
'Â': Unicode U+00C2 (category Lu: Letter, uppercase)
'^': ASCII/Unicode U+005E (category Sk: Symbol, modifier)
'µ': Unicode U+00B5 (category Ll: Letter, lowercase)
'Ì': Unicode U+00CC (category Lu: Letter, uppercase)
'f': ASCII/Unicode U+0066 (category Ll: Letter, lowercase)
'\u9f': Unicode U+009F (category Cc: Other, control)
'¡': Unicode U+00A1 (category Po: Punctuation, other)
'ü': Unicode U+00FC (category Ll: Letter, lowercase)
'/': ASCII/Unicode U+002F (category Po: Punctuation, other)
'Ó': Unicode U+00D3 (category Lu: Letter, uppercase)
':': ASCII/Unicode U+003A (category Po: Punctuation, other)
'ª': Unicode U+00AA (category Lo: Letter, other)
'_': ASCII/Unicode U+005F (category Pc: Punctuation, connector)
'V': ASCII/Unicode U+0056 (category Lu: Letter, uppercase)
'@': ASCII/Unicode U+0040 (category Po: Punctuation, other)
'\u8a': Unicode U+008A (category Cc: Other, control)
'º': Unicode U+00BA (category Lo: Letter, other)
'Ñ': Unicode U+00D1 (category Lu: Letter, uppercase)
'&': ASCII/Unicode U+0026 (category Po: Punctuation, other)
'ª': Unicode U+00AA (category Lo: Letter, other)
'>': ASCII/Unicode U+003E (category Sm: Symbol, math)
'h': ASCII/Unicode U+0068 (category Ll: Letter, lowercase)
There are some characters that my terminal and Julia happily display, unlike Python. In the example above, that would be \xec, i.e. 'ì', which is the third byte in the vector obtained via hex2bytes and the third byte in your byte string:
b'\x03\x8d\xec [...]
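If you want to see the Python-style rendering from within Julia, a rough sketch could look like the following. Note that pyrepr is a hypothetical helper written for this post, not something Julia or Python ships. It only approximates Python's repr: it always uses single quotes and writes every non-printable byte as \xNN, whereas Python uses short escapes such as \n and \t for a few of them.

```julia
# Hypothetical helper: render a byte vector roughly the way Python
# displays a bytes object (printable ASCII as-is, everything else as \xNN).
function pyrepr(bytes::AbstractVector{UInt8})
    io = IOBuffer()
    print(io, "b'")
    for b in bytes
        if b == UInt8('\\') || b == UInt8('\'')
            print(io, '\\', Char(b))             # escape backslash and quote
        elseif 0x20 <= b <= 0x7e
            print(io, Char(b))                   # printable ASCII
        else
            print(io, "\\x", string(b, base=16, pad=2))
        end
    end
    print(io, "'")
    return String(take!(io))
end

println(pyrepr(hex2bytes("038dec")))  # prints b'\x03\x8d\xec'
```

The point is that the underlying bytes are identical in both languages; only the way they are displayed differs.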
What your original unhexlify did was create a String of all those characters joined together:
julia> inp = "038decd6ff8f45d6b523c25eb5cc669fa1fc2fd33aaa5f56408abad126aa3e68"
"038decd6ff8f45d6b523c25eb5cc669fa1fc2fd33aaa5f56408abad126aa3e68"
julia> function unhexlify(str) # had to fix a few issues before I could get it to run
           result = ""
           for i in 1:2:length(str) # more idiomatic, assumes ascii though
               result *= Char(parse(Int64, str[i:i+1], base=16)) # the base is a keyword
           end
           return result
       end
unhexlify (generic function with 1 method)
julia> join(map(Char, hex2bytes(inp))) == unhexlify(inp)
true
Additionally, Julia Strings are UTF-8 encoded by default and are nothing like the byte “strings” Python has. One important distinction is that Julia's Char is really a Unicode code point, not a single byte.
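To make that distinction concrete, here is a small sketch using only Base functions and the first three bytes of the example (the variable names are just for illustration). Joining the characters yields a string with three characters but five bytes, because every code point above U+007F takes more than one byte in UTF-8.

```julia
bytes = hex2bytes("038dec")          # UInt8[0x03, 0x8d, 0xec]

# Each byte is interpreted as a code point, then encoded as UTF-8.
joined = join(map(Char, bytes))
length(joined)                       # 3 characters
ncodeunits(joined)                   # 5 bytes
codeunits(joined) == [0x03, 0xc2, 0x8d, 0xc3, 0xac]   # true

# Wrapping the raw bytes directly keeps them unchanged
# (copy, because String(v) takes ownership of v and empties it).
raw = String(copy(bytes))
codeunits(raw) == bytes              # true
```

So if the goal is to pass the data on as bytes, the vector returned by hex2bytes is usually the thing to keep, not a String built from Chars.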
Also, ranges in Julia include both of their endpoints, so str[i:i+1] covers exactly two characters.
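A quick illustration of the inclusive endpoints, using a shortened version of the input string (hexstr is just an illustrative name):

```julia
hexstr = "038dec"

collect(1:3)        # [1, 2, 3], both 1 and 3 are included
collect(1:2:6)      # [1, 3, 5], the start index of each hex pair
hexstr[3:4]         # "8d", indices 3 and 4, i.e. two characters
```

This is why the loop in unhexlify steps by 2 and slices with i:i+1 instead of i:i+2.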