How to read big-endian data

Hello,

I have an IOStream and I want to read a series of floats that are stored in big-endian order, while my computer is usually little-endian. So there are a few parts to my question:

  1. If I know that the computer is little-endian, can I read a float like this?
value = bswap(read(io,Float32))

My thinking is that read(io,Float32) will at least give me the correct set of bytes, and bswap() seems to preserve the type.

  2. What I wrote above is fine for a small file, but I eventually want to read several GiB of data. Is there a way to read many floats and then convert them all to little-endian? I know you can read all the bytes with
bytes = read(io,sizeof(Float32)*ncells)

But I don’t know what to do with those bytes once I have them. How do I put them into an array of floats?

  3. To make my code more portable, it would be nice if it could test endianness. How do I determine whether the computer I’m currently running on is little-endian or big-endian?

Thanks!

I think bswap costs almost nothing (it typically compiles down to a byte-swap instruction), so you shouldn’t need to worry about its performance as the data grows.

The cost wouldn’t be with bswap but with disk IO. I don’t want to read a large file 4 bytes at a time. But in any case, I think I found a solution to that part of my problem: It looks like you can pre-allocate an array and use read!() to populate that array:

data = zeros(Float32,ncells)
read!(io,data)
for j in 1:ncells
    data[j] = bswap(data[j])
end

That seems to work. So the read!() function must be inferring a few things from the type and length of the data array.
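
To check that idea without a multi-gigabyte file, here is a minimal sketch that uses an in-memory IOBuffer as a stand-in for the real stream. The helper name read_be_float32s and the sample values are made up for illustration. read!() fills the whole pre-allocated array in one call (it reads sizeof(Float32)*length(data) bytes), and the loop can be written as an in-place broadcast:

```julia
# Hypothetical helper (not from the thread): read `n` big-endian Float32 values from `io`.
function read_be_float32s(io::IO, n::Integer)
    data = Vector{Float32}(undef, n)  # no need to zero-fill; read! overwrites every element
    read!(io, data)                   # one bulk read of n * sizeof(Float32) bytes
    data .= ntoh.(data)               # in-place conversion from big-endian to host order
    return data
end

# Build a fake big-endian stream in memory (stand-in for the real file).
expected = Float32[1.0, -2.5, 3.25]
io = IOBuffer()
write(io, hton.(expected))            # hton converts host order to big-endian
seekstart(io)

result = read_be_float32s(io, length(expected))
```

Using undef instead of zeros() just skips an initialization pass that read!() would overwrite anyway.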

In any case, you can always read all the bytes you need at once and then bswap when interpreting them as needed.

x = reinterpret(Float32, read(io, sizeof(Float32)*ncells))
x .= ntoh.(x)

(Note that you can use ntoh rather than bswap, which will do the right thing for big-endian data regardless of whether you are running on a big-endian or little-endian machine.)
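
On the question of testing endianness directly: a minimal sketch, assuming the ENDIAN_BOM constant exported by Base, which is documented as 0x04030201 on little-endian hosts and 0x01020304 on big-endian hosts. The variable names are only illustrative. It also shows why the explicit test is rarely needed once ntoh is used:

```julia
# ENDIAN_BOM is a Base constant: 0x04030201 on little-endian hosts,
# 0x01020304 on big-endian hosts.
is_little_endian = (ENDIAN_BOM == 0x04030201)

x = 1.5f0

# ntoh ("network to host") swaps bytes on a little-endian host
# and leaves the value unchanged on a big-endian host.
converted = ntoh(x)
same_as_manual = is_little_endian ? (converted === bswap(x)) : (converted === x)

# hton is the inverse direction, so a round trip returns the original bits.
roundtrip = ntoh(hton(x))
```

For little-endian files, ltoh and htol are the corresponding pair.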

Aha! Thanks!