How to read big-endian data

danielc · September 24, 2019, 8:27pm

Hello,

I have an IOStream and I want to read a series of floats that are stored in big-endian order, while my computer is usually little-endian. So there are a few parts to my question:

If I know that the computer is little-endian, can I read a float like this?

value = bswap(read(io,Float32))

My thinking is that read(io,Float32) will at least give me the correct set of bytes, and bswap() seems to preserve the type.

What I wrote above is fine for a small file, but I eventually want to read several GiB of data. Is there a way to read many floats and then convert them all to little-endian? I know you can read all the bytes with

bytes = read(io,sizeof(Float32)*ncells)

But I don’t know what to do with those bytes once I have them. How do I put them into an array of floats?

To make my code more portable, it would be nice if it could test endianness. How do I determine whether the computer I’m currently running on is little-endian or big-endian?

Thanks!

jling · September 24, 2019, 8:31pm

I think bswap costs almost nothing, so you shouldn’t need to worry about its performance as the data grows.

danielc · September 24, 2019, 8:41pm

The cost wouldn’t be with bswap but with disk IO. I don’t want to read a large file 4 bytes at a time. But in any case, I think I found a solution to that part of my problem: It looks like you can pre-allocate an array and use read!() to populate that array:

data = zeros(Float32,ncells)
read!(io,data)
for j in 1:ncells
    data[j] = bswap(data[j])
end

That seems to work. So read!() must be working out how many bytes to read, and how to interpret them, from the element type and length of the data array.
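For anyone who wants to try this without a real data file, here is a minimal sketch of the same read!() approach against an in-memory IOBuffer. The sample values and the name `original` are made up for illustration, and it uses ntoh instead of bswap so it should also behave on a big-endian host.

```julia
# Build a small in-memory "file" of big-endian Float32 values.
# `original` is hypothetical test data, not part of the question above.
original = Float32[1.0, -2.5, 3.14159]
io = IOBuffer()
write(io, hton.(original))   # hton: host byte order -> big-endian (network) order
seekstart(io)

# Read all values in one call, then fix the byte order in place.
ncells = length(original)
data = Vector{Float32}(undef, ncells)  # uninitialized; read! overwrites every element
read!(io, data)                        # reads sizeof(Float32) * ncells bytes
data .= ntoh.(data)                    # byte swap on little-endian hosts, no-op on big-endian
```

Allocating with `undef` instead of zeros() just skips filling the array with values that read!() is about to overwrite.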

jling · September 24, 2019, 8:42pm

In any case, you can always read all the bytes you need at once and then bswap when interpreting them as needed.

stevengj · September 24, 2019, 10:53pm

x = reinterpret(Float32, read(io, sizeof(Float32)*ncells))
x .= ntoh.(x)

(Note that you can use ntoh rather than bswap, which will do the right thing for big-endian data regardless of whether you are running on a big-endian or little-endian machine.)
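On the question of testing endianness directly: as far as I know, Base exports the constant ENDIAN_BOM for this, which is 0x04030201 on little-endian hosts and 0x01020304 on big-endian ones. A small sketch, where `is_little_endian` is just a hypothetical helper name and not something Base provides:

```julia
# ENDIAN_BOM is 0x04030201 on little-endian hosts, 0x01020304 on big-endian hosts.
is_little_endian() = ENDIAN_BOM == 0x04030201   # hypothetical helper

x = 1.5f0

# What ntoh is expected to do on this host: swap bytes only if the host is little-endian.
expected = is_little_endian() ? bswap(x) : x
same = ntoh(x) === expected      # === compares the Float32 bit patterns

# Converting to big-endian and back should return the original value on any host.
roundtrip = ntoh(hton(x)) === x
```

With ntoh you rarely need the explicit check. For data stored little-endian there is the matching pair ltoh and htol.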

danielc · September 24, 2019, 11:13pm

Aha! Thanks!
