Hi everyone,
I'm writing a simple simulation program and trying to work out a good format for my input files. The simulation calculates the effect of multiple types of sources (each defined in a different manner) on a set of observation points in 3D space. The number of sources and the number of observation points will be very large. Not every type of source will be present in every input file. The data are all Float64.
The naive implementation I came up with is as follows:
- Use a CSV file with comment lines as "flags" that tell the program the type of the data rows that follow
- Use CSV.jl to read the data line by line and react to the flags as required
For example, with just a few sample data lines:
# observation-pts (number, x, y, z)
1,0,0,0
2,0,0,1
3,1,2,3
# source-type-1 (x0,y0,z0,x1,y1,z1,a)
0,0,0,1,1,1,1000
0,1,2,4,5,6,2000
# source-type-2 (x0,y0,z0,r,b)
10,20,30,50,4000
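To make the idea concrete, a line-by-line reader for this layout could look roughly like the sketch below. It uses only Base Julia rather than CSV.jl so that it runs on its own. `read_sections` and `to_matrix` are hypothetical names chosen for illustration, and the sketch assumes that the first word after `#` names the section and that every row within a section has the same number of columns.

```julia
# Hypothetical helper: copy a vector of equal-length rows into a Matrix{Float64}.
function to_matrix(rows::Vector{Vector{Float64}})
    m = Matrix{Float64}(undef, length(rows), length(first(rows)))
    for (i, row) in enumerate(rows)
        m[i, :] = row   # throws if a row has the wrong number of columns
    end
    return m
end

# Hypothetical reader: one pass over the input, switching section on each flag line.
function read_sections(io::IO)
    rows = Dict{String, Vector{Vector{Float64}}}()
    current = ""
    for line in eachline(io)
        s = strip(line)
        isempty(s) && continue
        if startswith(s, "#")
            # Flag line: the first word after '#' is taken as the section name.
            words = split(s[2:end])
            isempty(words) && continue
            current = String(first(words))
            get!(rows, current, Vector{Float64}[])
        else
            isempty(current) && error("data row found before any section flag")
            push!(rows[current], parse.(Float64, split(s, ',')))
        end
    end
    # Sections that were flagged but contain no rows are dropped.
    return Dict(name => to_matrix(r) for (name, r) in rows if !isempty(r))
end

# The sample data from above, read from an in-memory buffer instead of a file.
sample = """
# observation-pts (number, x, y, z)
1,0,0,0
2,0,0,1
3,1,2,3
# source-type-1 (x0,y0,z0,x1,y1,z1,a)
0,0,0,1,1,1,1000
0,1,2,4,5,6,2000
# source-type-2 (x0,y0,z0,r,b)
10,20,30,50,4000
"""
data = read_sections(IOBuffer(sample))
```

For a real file, `open(read_sections, "input.csv")` would do the same thing, where `input.csv` is a placeholder path. Each section ends up as one matrix keyed by its flag name, for example `data["observation-pts"]`, and source types missing from the file are simply absent from the dictionary.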
My concern is that with very large quantities of data, reading line-by-line is a very inefficient method of accessing the data from the file and loading it into memory.
Are there standard/typical methods for handling data files like this?
Thanks in advance!