Preferred method of loading binary files?

Paethon · July 10, 2019, 10:56am

I am currently writing code to read from a binary file format. The file has a fixed structure, I have structs that mirror it, and I would like to read the data from the file and fill my structs with it.

What is the preferred way of doing that? I can load the data straight into a struct using read! with a Ref{} to that struct, but that seems like it might fail catastrophically if Julia decides to add padding bytes to the struct.

Should I just loop through the fields and read the corresponding types from the file one at a time?

c42f · July 10, 2019, 11:24am

The best technique depends on the file format specification.

If you know that the file format is a direct dump of structs which follow the C ABI, then the padding will be in the file and you can just read it directly as you’re suggesting. For this to work the file needs to be read and written on the same architecture / OS so it’s a little brittle.
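As a rough sketch of that first option: the struct below is a made-up example (not from any real format), and the buffer stands in for a file written on the same machine. It shows the padding bytes being part of what read! consumes.

```julia
# Hypothetical layout, for illustration only.
struct PaddedHeader
    tag::UInt8
    count::UInt32
end

# Julia lays out isbits structs the way C does, so `count` is aligned to
# 4 bytes and the struct occupies 8 bytes rather than 5.
@show sizeof(PaddedHeader) fieldoffset(PaddedHeader, 2)

# Stand-in for a file produced by dumping the struct on this machine:
# tag, 3 padding bytes, then count in host byte order.
io = IOBuffer()
write(io, UInt8(1), zeros(UInt8, 3), UInt32(42))
seekstart(io)

ref = Ref{PaddedHeader}()
read!(io, ref)      # copies sizeof(PaddedHeader) raw bytes into the Ref
header = ref[]
```

If the file had been written without the three padding bytes, the same call would read `count` from the wrong offset, which is the failure mode you are worried about.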

If you know that all structs in the file were written without padding, I’d suggest just reading the fields sequentially one by one. After you’ve read the fields, pass them to the constructor for your struct type.
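A minimal sketch of the field-by-field approach, again with a made-up struct. It assumes the file stores integers in little-endian order, which you would need to check for your format.

```julia
# Hypothetical layout, for illustration only.
struct PackedHeader
    tag::UInt8
    count::UInt32
end

function read_packed_header(io::IO)
    tag = read(io, UInt8)
    # Assumption: the file is little-endian; ltoh converts to host order.
    count = ltoh(read(io, UInt32))
    return PackedHeader(tag, count)
end

# Five bytes, no padding: tag = 1, count = 42.
io = IOBuffer(UInt8[0x01, 0x2a, 0x00, 0x00, 0x00])
header = read_packed_header(io)
```

Because each field is read separately, the in-memory padding of the struct never comes into play.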

This second option is very flexible but it can be rather verbose if you need to implement it for many different structs. If that’s the case you could consider some code generation using reflection facilities like fieldnames and fieldtype.
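One way this could look, as a sketch rather than a polished implementation: a generic reader that walks fieldnames and fieldtype. The name read_packed and the MapTile struct are invented for the example, and it assumes every field is a primitive integer stored little-endian with no padding.

```julia
# Generic reader for structs whose fields are all primitive integers.
# Assumptions: fields are stored in declaration order, little-endian,
# with no padding between them.
function read_packed(io::IO, ::Type{T}) where {T}
    values = Any[]
    for name in fieldnames(T)
        push!(values, ltoh(read(io, fieldtype(T, name))))
    end
    return T(values...)
end

# Hypothetical record type, for illustration only.
struct MapTile
    id::UInt16
    flags::UInt8
    height::Int32
end

io = IOBuffer(UInt8[0x34, 0x12, 0x07, 0xfe, 0xff, 0xff, 0xff])
tile = read_packed(io, MapTile)
```

This version favours clarity over speed (it collects the values in a Vector{Any}); nested structs, strings, or arrays would each need their own handling.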

Which format are you trying to parse?

Paethon · July 10, 2019, 11:41am

Thanks! I am trying to load some archive/map data of some old games for visualization, so those formats are obviously not well specified.

Guess I will just have a look at how they behave and write code accordingly. Since the main goal of all of this is to get into Julia a bit more, I would prefer not to just hack something together but to have a nice Julian solution.

Some code generation based on fieldnames and fieldtype seems like an elegant solution.

c42f · July 11, 2019, 2:36am

That makes sense; you’ll basically have to reverse engineer these formats then.

My suggestion would be to start by writing out the loader code explicitly as a long series of read(io, T) calls while you work on understanding the format. Once you start to see patterns, you’ll know whether it’s possible to attack it with a code generator or not. The reason I say this is that many ad hoc binary formats are rather “clever” in being optimized for file size, access time, or backward compatibility, and there can be a ton of special cases which are not well suited to code generation. YMMV
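To give a flavour of what I mean by explicit loader code, here is a sketch for an entirely invented archive index (magic bytes, an entry count, then length-prefixed names). The format, the function name, and the little-endian assumption are all hypothetical.

```julia
# Invented format, for illustration only:
#   4 bytes   magic "ARCH"
#   UInt16    number of entries
#   per entry: UInt8 name length, name bytes, UInt32 data size
function read_archive_index(io::IO)
    magic = read(io, 4)
    magic == b"ARCH" || error("unexpected magic bytes: $magic")
    n = ltoh(read(io, UInt16))
    entries = Tuple{String,UInt32}[]
    for _ in 1:n
        namelen = read(io, UInt8)
        name = String(read(io, Int(namelen)))   # variable-length field
        datasize = ltoh(read(io, UInt32))
        push!(entries, (name, datasize))
    end
    return entries
end

# Build a tiny in-memory archive index with one entry and read it back.
io = IOBuffer()
write(io, "ARCH", htol(UInt16(1)), UInt8(3), "map", htol(UInt32(1024)))
seekstart(io)
entries = read_archive_index(io)
```

The length-prefixed name is exactly the kind of special case that a simple field-by-field generator would not handle without extra work.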
