Reinterpret returns wrong values

I’m working on the parsing a binary file header. The structure looks like Adummy.
First, I tried to use the reinterpret but some values (p9 and p10) were incorrect.
If I sequentially read the data one by one, I got correct values.

Is this an expected behavior or, this is a bug?

using StaticArrays
struct Adummy
    p1::SVector{4, UInt8}
    p2::SVector{2, UInt8}
    p3::UInt16
    p4::UInt32
    p5::UInt32
    p6::UInt32
    p7::UInt32
    p8::UInt8
    p9::UInt32
    p10::UInt32
end

src = [
    80, 67, 79, 32,
    3, 0,
    33, 0,
    10, 0, 0, 0,
    12, 0, 0, 0,
    4, 0, 0, 0,
    3, 0, 0, 0,
    32,
    33, 0, 0, 0,
    80, 0, 0, 0,
    43, 0, 0,
]

ioBuff = IOBuffer(UInt8.(src))

## read sequentially
seekstart(ioBuff)
seSt = []
for T in fieldtypes(Adummy)
    push!(seSt, read(ioBuff, T))
end

## reinterpret data
seekstart(ioBuff)
data = read(ioBuff, sizeof(Adummy))
reSt = reinterpret(Adummy, data)[1]

## show results
for (n, val) in zip(propertynames(reSt), seSt)
    v = getproperty(reSt, n)
    println("$n should be $val but $v")
end

>> p1 should be UInt8[0x46, 0x44, 0x49, 0x46] but UInt8[0x46, 0x44, 0x49, 0x46]
>> p2 should be UInt8[0x03, 0x00] but UInt8[0x03, 0x00]
>> p3 should be 33 but 33
>> p4 should be 10 but 10
>> p5 should be 12 but 12
>> p6 should be 4 but 4
>> p7 should be 3 but 3
>> p8 should be 32 but 32
>> p9 should be 33 but 20480
>> p10 should be 80 but 11008
1 Like

Google “struct padding” to explain what you are seeing

See also GitHub - JuliaIO/StructIO.jl: Experimental new implementation of StrPack.jl-like functionality

5 Likes

To be specific, the memory layout of Adummy apparently has 3 bytes of padding between p8::UInt8 and p9::UInt32. Here’s a terrible hack that makes this particular example work:

using StaticArrays

struct Adummy
    p1::SVector{4, UInt8}
    p2::SVector{2, UInt8}
    p3::UInt16
    p4::UInt32
    p5::UInt32
    p6::UInt32
    p7::UInt32
    p8::UInt8
    p9::UInt32
    p10::UInt32
end

src = [
    80, 67, 79, 32,
    3, 0,
    33, 0,
    10, 0, 0, 0,
    12, 0, 0, 0,
    4, 0, 0, 0,
    3, 0, 0, 0,
    32, 0, 0, 0,
    33, 0, 0, 0,
    80, 0, 0, 0,
]

ioBuff = IOBuffer(UInt8.(src))

# read sequentially
seekstart(ioBuff)
seSt = []
for T in fieldtypes(Adummy)
    push!(seSt, read(ioBuff, T))
    if T === UInt8
        read(ioBuff, T)
        read(ioBuff, T)
        read(ioBuff, T)
    end
end

## reinterpret data
seekstart(ioBuff)
data = read(ioBuff, sizeof(Adummy))
reSt = reinterpret(Adummy, data)[1]

## show results
for (n, val) in zip(propertynames(reSt), seSt)
    v = getproperty(reSt, n)
    println("$n should be $val but $v")
end

## Output:
# p1 should be UInt8[0x50, 0x43, 0x4f, 0x20] but UInt8[0x50, 0x43, 0x4f, 0x20]
# p2 should be UInt8[0x03, 0x00] but UInt8[0x03, 0x00]
# p3 should be 33 but 33
# p4 should be 10 but 10
# p5 should be 12 but 12
# p6 should be 4 but 4
# p7 should be 3 but 3
# p8 should be 32 but 32
# p9 should be 33 but 33
# p10 should be 80 but 80
2 Likes

@stevengj
Thank you for your comments. I literally googled it and watched a video about the idea of the struct padding. :grin:

@danielwe
Thank you for specifying the problem clearly. I appreciate it.

Since files that I’d like to read have the format like Adummy already (not having the 3 bytes padding), I would use sequential reading but I’m concerned about the performance penalty of Venctor{Any}.
Could we take another way to avoid this penalty?

could you apply

like this, waiting for better solutions?

datap=similar(data)
copyto!(datap,1,data,1,24)
padd=[data[25];zeros(UInt8,3);data[26:end-3]]
copyto!(datap,25,padd,1,12)

reSt = reinterpret(Adummy, datap)[1]


I was curious what the performance difference might be so I quickly did a benchmark.

Setup Code (identical to OP)
using StaticArrays, BenchmarkTools
struct Adummy
    p1::SVector{4, UInt8}
    p2::SVector{2, UInt8}
    p3::UInt16
    p4::UInt32
    p5::UInt32
    p6::UInt32
    p7::UInt32
    p8::UInt8
    p9::UInt32
    p10::UInt32
end

src = [
    80, 67, 79, 32,
    3, 0,
    33, 0,
    10, 0, 0, 0,
    12, 0, 0, 0,
    4, 0, 0, 0,
    3, 0, 0, 0,
    32,
    33, 0, 0, 0,
    80, 0, 0, 0,
    43, 0, 0,
]

ioBuff = IOBuffer(UInt8.(src))
function read_seq(ioBuff)
	## read sequentially
	seekstart(ioBuff)
	seSt = []
	for T in fieldtypes(Adummy)
		push!(seSt, read(ioBuff, T))
	end
	return Adummy(seSt...)
end

## reinterpret data
function read_reinterpret(ioBuff)
	seekstart(ioBuff)
	data = read(ioBuff, sizeof(Adummy)-3)
	padd = [data[1:25];zeros(UInt8,3);data[26:end]]
	reSt = reinterpret(Adummy, padd)[1]
end

@show @belapsed read_seq($ioBuff) # 2.15μs
@show @belapsed read_reinterpret($ioBuff) # 145ns

The reinterpret version with padding roughly 10x faster (median 145ns vs. 2.15μs). I tried some other things without hardcoding the padding but they all were slower than read_seq. If you don’t plan to reduce millions of these object then the flexibility might outweigh the gain in speed. Beyond these functions Julia should be able to infer that a Adummy type is returned, so everything using one of these methods should behave exactly identically.

if the src structure is always the one with the extra 3 bytes (result of some sort of anti-padding?), you can use this space to perform the padding with a shift of the data tail

function reipadd(ioBuff)
    seekstart(ioBuff)
    data = read(ioBuff, sizeof(Adummy))
    copyto!(data,28,data,25,9)
    reinterpret(Adummy, data)[1]
end

julia> @btime reipadd($ioBuff)
  35.146 ns (2 allocations: 120 bytes)
Adummy(UInt8[0x50, 0x43, 0x4f, 0x20], UInt8[0x03, 0x00], 0x0021, 0x0000000a, 0x0000000c, 0x00000004, 0x00000003, 0x20, 0x00000021, 0x00000050)

You cannot rely on the padding being there and you mustn’t read it. That is undefined behavior. The padding can change depending on the platform the code runs on.

2 Likes

The original src is strange, it’s written in rows as if it stores the bytes of the fields without padding, but it has length 36 just like the padded sizeof(Adummy). Moreover, there are 11 rows, while Adummy only has 10 fields. The p1 report is inconsistent with the original src too, when I run the original code I get UInt8[0x50, 0x43, 0x4f, 0x20]. Other comments either omit the last row (3 elements) from src or don’t read it, and they seem to work with the values I got.

Making an unrestricted heterogenous array that you splat into the Adummy constructor requires overhead that may not be necessary. You can combine strategies, push unpadded bytes to an array, add padding as needed, and use reinterpret at the end. The following read_pad_reinterpret combines both functions in abraemer’s code, but it adapts to any type’s fields’ padding, not hardcoded byte-skipping or padding for a specific type. It’s more efficient than read_seq because the array is not heterogenous, but it’s slower than read_reinterpret, I assume because data reads must jump from field to field instead of a few large chunks.

function read_pad_reinterpret(ioBuff, ::Type{T}) where T
  seekstart(ioBuff)
  paddedbytes = UInt8[]
  sizehint!(paddedbytes, sizeof(T)) # allocate at once
  for i in 1:fieldcount(T)
    minsize = sizeof(fieldtype(T, i))
    for _ in 1:minsize
      push!(paddedbytes, read(ioBuff, UInt8))
    end
    if i != fieldcount(T)
      actualsize = Int(fieldoffset(T, i+1) - fieldoffset(T, i))
    else
      actualsize = sizeof(T) - Int(fieldoffset(T, i))
    end
    pad = actualsize - minsize
    if pad > 0
      for _ in 1:pad
        push!(paddedbytes, zero(UInt8))
      end
    end
  end
  reinterpret(Adummy, paddedbytes)[1]
end
julia> @btime read_seq($ioBuff);
  2.587 μs (10 allocations: 800 bytes)

julia> @btime read_reinterpret($ioBuff);
  318.601 ns (6 allocations: 424 bytes)

julia> @btime read_pad_reinterpret($ioBuff, $Adummy);
  547.112 ns (3 allocations: 160 bytes)
1 Like

I’m not sure I understand the remark, but I submit this variation for comments.


function reipadd(ioBuff)
    seekstart(ioBuff)
    data = read(ioBuff, sizeof(Adummy))
    resize!(data,sizeof(Adummy))
    copyto!(data,28,data,25,9)
    reinterpret(Adummy, data)[1]
end
1 Like

That’s still not correct. It will still read bytes from the data stream that shouldn’t be read, because they only exist in the in-memory representation of the Adummy struct.

Padding bytes aren’t part of the struct - they are inserted to make some other work of the compiler easier, for example by allowing the compiler to assume a certain in-memory alignment of objects. Your code, as written, isn’t portable between 32 and 64 bit architectures, because it assumes that the 3 bytes inserted on 64 bit will always be there (which they won’t). Moreover, resizing an array to the same size it’s already at won’t fix the underlying issue.

On top of all of this, you can’t use reinterpret to (properly) construct objects - it doesn’t use the constructor after all, and so you are able to create invalid objects that may have relied on the constructor to do some checking for them.

I’m actually a bit surprised reinterpret worked for this in the first place - it should be impossible to create padding out of real values.

3 Likes

Do you think my method read_pad_reinterpret adapts to each architecture? I think so, just wanted to check. Also this means that writing an array of struct values to a file as bytes isn’t portable because of field padding, right?

Simple example, an inner constructor could demand that all integers must be >10. If it cannot be guaranteed that ioBuff stores valid unpadded fields of Adummy, abraemer’s read_seq using Adummy constructor is the only safe way, and even then you must assume the stored fields are recursively valid to read instead of construct. In this case, integers don’t have invalid bytes, and neither does Adummy.

Possibly, but I still err on the side of “don’t do that” in general because it circumvents the constructor.

Also, it’s very possible that what you’ve written only works for isbitstype fields and doesn’t anymore for non-isbitstypes or mutable structs.

You can only write the “plain bits” if the struct has no padding and all its fields are also plain bits. Otherwise, yes, you need to serialize field-by-field, storing a type tag alongside. That’s also what the Serialization stdlib does and why it does that.

Yes, that is correct, though that mustn’t be confused with being padding-free. It’s also important to note that you still have to check the invariants from the constructor - you don’t get to skirt around that requirement just through being conscious about the padding.

1 Like

Necessarily this way. This reinterpret method requires the result type and the input array eltype to both be isbits. Incidentally, this method doesn’t show up in the website docs, I assume because it’s actually an inner constructor method for ReinterpretArray.

read and unsafe_read however does work with Ptr and Ref in the source code, not really sure how that’s used in practice.

Okay. Now I’m sure I didn’t understand: the topic is too complex/abstract. Maybe I’ll come back to it when I’ve increased my knowledge.

In my intentions, the resize was meant to handle the case in which there weren’t the 3 extra bytes at the end of src.
In this sense, I had understood simplistically your observation.

The 3 bytes aren’t there due to src, they’re there because sizeof(Adummy) already includes them in its byte count, because sizeof returns the number of bytes an object of that type takes up in the currently running Julia session (non-recursively). So first reading in sizeof(Adummy) bytes and then resizing the resulting array to sizeof(Adummy) bytes doesn’t do anything - sizeof returns the same value either way.

The discussion became too advanced for me but I’d like to check my understanding.
Even though read_pad_reinterpret seems working if the structure is isbitstype, it’s unsafe since it bypasses the constructor. The required check by the constructor won’t be performed.
So, the only safe way to read them is read_seq even it’s 10 times slower than read_reinterpret.

## Read each field sequentially and store them into Vector{Any}
function read_seq(ioBuff, ::Type{T}) where T
	seekstart(ioBuff)
	seSt = []
	for t in fieldtypes(T)
		push!(seSt, read(ioBuff, t))
	end
	return T(seSt...)
end

## Reinterpret data (UNSAFE)
function read_reinterpret(ioBuff, ::Type{T}) where T
	data = read(ioBuff, sizeof(T))
	return reinterpret(T, data)[1]
end

## Reinterpret padded data (UNSAFE)
function read_pad_reinterpret(ioBuff, ::Type{T}) where T
	paddedbytes = UInt8[]
	sizehint!(paddedbytes, sizeof(T)) # allocate at once
	for i in 1:fieldcount(T)
		minsize = sizeof(fieldtype(T, i))
		for _ in 1:minsize
			push!(paddedbytes, read(ioBuff, UInt8))
		end
		i != fieldcount(T)
			actualsize = Int(fieldoffset(T, i+1) - fieldoffset(T, i))
		else
			actualsize = sizeof(T) - Int(fieldoffset(T, i))
		end
		pad = actualsize - minsize
		if pad > 0
			for _ in 1:pad
				push!(paddedbytes, zero(UInt8))
			end
		end
	end
	reinterpret(T, paddedbytes)[1]
end
1 Like

Yes, that is (in general) correct. reinterpret really is a very difficult-to-use-correctly low level tool.

1 Like

I’ve learned a lot! Thank you very much.

1 Like

Also, since you mention that you’re writing a parser - this is likely going to be of interest:

https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/