Odd byte length primitive types and reinterpret()


#1

julia> primitive type Int24 24 end
julia> Int24(x::Int) = Core.Intrinsics.trunc_int(Int24, x)
Int24
julia> Int(x::Int24) = Core.Intrinsics.zext_int(Int, x)
Int64
julia> x = b"\x10\x45\x12\x20\x30\x40"
6-element Array{UInt8,1}:
0x10
0x45
0x12
0x20
0x30
0x40

julia> y = reinterpret(Int24,x)
2-element Array{Int24,1}:
Int24(0x124510)
Int24(0x004030)


The 4th byte (0x20) is skipped here. Julia appears to step through the array 4 bytes at a time and truncate each group to an Int24, so the first 3-byte number consumes the first 4 bytes and the second one starts at the 5th byte. That is not what I want.

This suggests that Julia forces reinterpret() to align elements on standard integer boundaries even when the primitive type has an odd byte length. Is this a design constraint of reinterpret(), or is there a way to define Int24 so that reinterpreting 6 bytes as two integers does not drop the 4th byte of the UInt8 array?


#2

You can use NTuple{3,UInt8} (or create a struct with a single NTuple{3,UInt8} field):

julia> x = b"\x10\x45\x12\x20\x30\x40"
6-element Array{UInt8,1}:
 0x10
 0x45
 0x12
 0x20
 0x30
 0x40

julia> reinterpret(NTuple{3,UInt8}, x)
2-element Array{Tuple{UInt8,UInt8,UInt8},1}:
 (0x10, 0x45, 0x12)
 (0x20, 0x30, 0x40)
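
If you then need ordinary integers rather than tuples, the bytes of each tuple can be combined by hand. The following is only a minimal sketch: the helper name u24 is hypothetical (not part of Base), and it assumes the bytes are stored little-endian, which matches the Int24(0x124510) value shown in the first post.

```julia
# Hypothetical helper (not in Base): combine a 3-byte tuple into one UInt32,
# assuming little-endian byte order (first byte is the least significant).
u24(t::NTuple{3,UInt8}) = UInt32(t[1]) | (UInt32(t[2]) << 8) | (UInt32(t[3]) << 16)

x = UInt8[0x10, 0x45, 0x12, 0x20, 0x30, 0x40]

# View the bytes as 3-byte tuples, then widen each tuple to a UInt32.
vals = map(u24, reinterpret(NTuple{3,UInt8}, x))
```

With this, the 4th byte (0x20) ends up as the low byte of the second value instead of being skipped.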

#3

I think this is a bug: the offset/alignment computation gets confused somewhere, and you end up reading or writing out-of-bounds memory. Could you open an issue, or should someone else in this thread do so?

versioninfo()
#Julia Version 0.6.2

primitive type Int24 24 end
A=UInt8[i for i =1:12];

Ab=reinterpret(Int24, A)
#4-element Array{Int24,1}:
# Int24(0x030201)
# Int24(0x070605)
# Int24(0x0b0a09)
# Int24(0x000000) This is oob memory, potentially belonging to a different object.

unsafe_load(pointer(Ab,2))
#Int24(0x060504) correct

unsafe_load(pointer(Ab)+3)
#Int24(0x060504) correct
unsafe_load(pointer(Ab),2)
#Int24(0x070605) WRONG
Ab[2]
#Int24(0x070605) WRONG

Fixed on current master (0.7), but a test case would still be good, especially since @Keno plans to do something to reinterpret.

Edit: To be more precise, this issue has nothing to do with reinterpret and also happens on 0.7:

primitive type Int24 24 end
Int24(x::Int) = Core.Intrinsics.trunc_int(Int24, x)
Base.zero(::Type{Int24})=Int24(0)

A=zeros(Int24, 1000)
#1000-element Array{Int24,1}:
#0.6:
# corrupted double-linked list 
#signal (6): Aborted

#0.7:
#corrupted double-linked list
#signal (6): Aborted

#4

Ah, thanks for clarifying the bug. I have a workaround that handles only 3 bytes at a time. Would you be willing to open the issue?
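
For anyone reading later, a workaround of that kind could look roughly like the sketch below. It is not the exact code I use: the function name decode_int24 is made up for illustration, it assumes little-endian packed data and signed 24-bit values, and it uses Julia 1.x syntax (Vector{Int32}(undef, n)). It avoids a 24-bit primitive type entirely by widening each 3-byte group to an Int32.

```julia
# Illustrative helper (hypothetical name): decode packed little-endian signed
# 24-bit integers from a byte vector, 3 bytes at a time, into Int32 values.
function decode_int24(bytes::Vector{UInt8})
    length(bytes) % 3 == 0 ||
        throw(ArgumentError("byte length must be a multiple of 3"))
    out = Vector{Int32}(undef, div(length(bytes), 3))
    for k in eachindex(out)
        i = 3k - 2  # index of the first byte of the k-th 3-byte group
        u = UInt32(bytes[i]) | (UInt32(bytes[i+1]) << 8) | (UInt32(bytes[i+2]) << 16)
        # Sign-extend from 24 to 32 bits: move the value to the top of the word,
        # then shift back with an arithmetic (sign-preserving) right shift.
        out[k] = reinterpret(Int32, u << 8) >> 8
    end
    return out
end

decoded = decode_int24(UInt8[0x10, 0x45, 0x12, 0x20, 0x30, 0x40])
```

Since no Array{Int24} is ever created, this does not depend on how arrays of odd-sized primitive types are laid out.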