Reading different types of data from single line in binary file

Edit: I changed the description in a following answer to clarify the issue.


I need to read a binary file which contains more than one type of variable written in a single line, as if it was written with

i = 1 ; x = 1e0 ; y = 1e0 ; z = 1e0
f = open("file","w")
write(f,i,x,y,z)
close(f)

I could not find out how to use read to read this data back. More generally, suppose that instead of x, y, z there is a vector of n Float64 values. How would one read that?

Thank you.

You can try, e.g., the following: read the file in as a vector of UInt8:
f = open(read,"filename")
Make an IOBuffer out of it that can be parsed:
io = IOBuffer(f)
and then you can read specified types from it:
n = read(io,Float64)

This is the most basic way to approach this I think. You can try memory mapping, or move to a proper binary serialization format (like BSON) if possible.
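
For reference, here is a minimal sketch that puts the steps above together so they can be run end to end. The file name `iobuffer_demo.bin` and the values written are made up for illustration, and the types are assumed to be known in advance.

```julia
# Hypothetical file name and values, used only for this demonstration.
path = joinpath(mktempdir(), "iobuffer_demo.bin")

# Write an Int64 followed by three Float64 values in raw binary form.
open(path, "w") do f
    write(f, Int64(1), 1.0, 2.0, 3.0)
end

bytes = read(path)       # whole file as a Vector{UInt8}
io = IOBuffer(bytes)     # in-memory stream over those bytes

# Read the values back in the same order and with the same types.
i = read(io, Int64)
x = read(io, Float64)
y = read(io, Float64)
z = read(io, Float64)
```

After the four reads the buffer is exhausted, which `eof(io)` can confirm.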

Your code will not work since you do not save the IO stream from open. But if you fixed that, your file would contain

11.01.01.0

so I am not sure what you are after here; it may be difficult to parse this unambiguously.

v = Vector{Float64}(undef, n)
read!("file", v)

See read!. Generally I think you would benefit from reading https://docs.julialang.org/en/v1/manual/networking-and-streams/.

Are you trying to read fixed-width ASCII files? I.e., files where each variable occupies fixed column positions and there is some kind of schema that describes the mapping of columns to variables?

Do you mean a binary file or a text file? The code you wrote would generate a text file. If you actually mean a binary file you could read the individual values as follows, assuming you know what their types are a priori, e.g.

results = open("file","r") do stream
  i = read(stream,Int)
  x = read(stream,Float64)
  y = read(stream,Float64)
  z = read(stream,Float64)

  (i,x,y,z)
end
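
If the list of types is known up front, the same idea can be written once as a small helper. This is only a sketch: `read_record` is a hypothetical name, not a Base function, and the file name is made up. It spells out `Int64` rather than `Int`, since `Int` is 32 bits wide on 32-bit systems and the reader has to match what was written.

```julia
# Hypothetical helper: read one value of each listed type, in order.
read_record(io::IO, types...) = Tuple(read(io, T) for T in types)

# Hypothetical file, written here only so the example is self-contained.
path = joinpath(mktempdir(), "record_demo.bin")
open(path, "w") do f
    write(f, Int64(1), 1.0, 1.0, 1.0)
end

record = open(path, "r") do stream
    read_record(stream, Int64, Float64, Float64, Float64)
end
```

The result is a tuple, so it can be destructured as `i, x, y, z = record`.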

Hi all,

really sorry, I had to leave for a reunion and didn’t realize that I had posted the wrong code. I meant using “write”, to write a binary file:

i = 1 ; x = 1e0 ; y = 1e0 ; z = 1e0
file = open("file.out","w")
write(file,i,x,y,z)
close(file)

That generates a binary file.

I want to read the data from that file, knowing in advance that it consists of the types Int64, Float64, Float64, Float64, in that order (in one “line”).

This situation is typical of Fortran unformatted binary files.
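
One caveat if the file really comes from Fortran: sequential unformatted files usually wrap each record in length markers, which a file written by Julia's `write` does not have. The sketch below emulates one such record in Julia and reads it back. It assumes 4-byte markers giving the record length in bytes (the gfortran default, as far as I know; other compilers or flags may differ), and the file name is made up.

```julia
# Hypothetical file that mimics one Fortran sequential unformatted record.
path = joinpath(mktempdir(), "fortran_like.bin")

# Payload: Int64 + 3 Float64 = 32 bytes, wrapped in assumed 4-byte markers.
open(path, "w") do f
    write(f, Int32(32), Int64(1), 1.0, 1.0, 1.0, Int32(32))
end

i, xyz = open(path, "r") do f
    len_before = read(f, Int32)        # leading marker: record length in bytes
    i = read(f, Int64)
    xyz = Vector{Float64}(undef, 3)
    read!(f, xyz)
    len_after = read(f, Int32)         # trailing marker should match the leading one
    len_before == len_after || error("record markers differ")
    (i, xyz)
end
```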

The generalization was if the data was written with, for example:

n = 1000
x = rand(n)
file = open("file.out","w")
write(file,n,x)
close(file)

Thank you again, and I apologize for the wrong post there.

Thanks, that clarifies the request. I think @Tamas_Papp gave you a good answer for reading in a Vector, and my example should show you how to do it for a heterogeneous set of types. Are those answers clear?

You could change read!("file", v) to read!(stream, v) (as in my use of open) if you want to read that vector in as only part of the file’s contents.
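
A small sketch of that, assuming a made-up layout of a count, then the vector, then one trailing value, to show that `read!` on a stream consumes only the vector's bytes and leaves the rest of the file available:

```julia
# Hypothetical file and layout: Int64 count, Float64 vector, one extra Float64.
path = joinpath(mktempdir(), "vector_demo.bin")
x = rand(5)
open(path, "w") do f
    write(f, Int64(length(x)), x, 42.0)
end

n, v, last_value = open(path, "r") do stream
    n = read(stream, Int64)
    v = Vector{Float64}(undef, n)
    read!(stream, v)                     # fills v, consuming n * 8 bytes
    last_value = read(stream, Float64)   # the stream continues after the vector
    (n, v, last_value)
end
```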

Thank you. It was easier than I thought. For my records (and for dummies like me):

# Writing
n = 10
x = rand(n)
file = open("file.out","w")
write(file,n,x)
close(file)
# Reading
file = open("file.out","r")
n = read(file,Int64)
y = Vector{Float64}(undef,n)
read!(file,y) # with ! to modify "y"
close(file)

Thank you very much.
