Hi
I’m trying to read a vector from a .dat file. The structure of the file is not complicated, and I have managed to read the rows and columns I need with the following code:
using DelimitedFiles

function get_data(path_to_file::String)
    data = readdlm(path_to_file)
    z = data[1145:1183, 1]    # first column
    pdz = data[1145:1183, 2]  # second column
    return z, pdz
end
As you can see, I only need some specific rows (the ones enclosed between two text separators).
The file is structured as follows:
#... data percentile .....
0 1 2 3 4 5 6
(I don't need them, 1 row multiple columns)
# ... data 1...
0.1000 1.825E-029
0.3000 6.247E-016
0.5000 3.227E-007
0.7000 4.726E-008
0.9000 3.678E-008
... (data that I need, multiple rows, 2 col)
...
#.... data 2 ....
(I don't need them)
... and so on
This works, but I find my solution quite inelegant.
For reference, in Python with numpy this can be done in a single line of code.
Have you tried CSV.jl? I believe it should be able to do what you want.
As a side note, it might be easier to clean up your data file first and then read the nicely formatted result than to read a messily formatted file directly. That might require a temporary file, but since you know the start row, it sounds like you only want to read a single file, so that should not be a problem.
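A rough sketch of what that clean-up step could look like, in case it helps. It copies only the wanted rows into a temporary file and then reads that file with readdlm. The helper name extract_block and the marker text "data 1" are assumptions made for illustration; the sketch also assumes that every separator line starts with '#' and that the block contains only two-column numeric rows.

```julia
using DelimitedFiles

# Hypothetical helper: copy the rows that follow the first '#' line containing
# `marker` into a temporary file, stopping at the next line that starts with '#'.
function extract_block(path_to_file::AbstractString, marker::AbstractString)
    tmp_path, tmp_io = mktemp()
    in_block = false
    open(path_to_file) do io
        for line in eachline(io)
            is_comment = startswith(lstrip(line), "#")
            if in_block
                is_comment && break                    # next separator: done
                isempty(strip(line)) || println(tmp_io, line)
            elseif is_comment && occursin(marker, line)
                in_block = true                        # found the wanted section
            end
        end
    end
    close(tmp_io)
    return tmp_path
end

# The marker default "data 1" is an assumption about the separator text.
function get_data(path_to_file::AbstractString; marker::AbstractString="data 1")
    clean_path = extract_block(path_to_file, marker)
    data = readdlm(clean_path, Float64)   # whitespace-delimited numeric matrix
    rm(clean_path)                        # remove the temporary file
    return data[:, 1], data[:, 2]
end
```

With this, the hard-coded range 1145:1183 is no longer needed, at the cost of relying on the separator text staying the same.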
Also, do not worry about performance unless you have to. Premature optimization can take a lot of time and make the code less readable and/or less general. If the difference is 1 vs 5 seconds, as a one-time cost (or 1 vs 5 milliseconds more realistically), then there is little to actually be gained.
I’m looking at CSV.File, but it doesn’t work as I expect.
I don’t know whether I can unpack the results, and for some reason it tries to generate the columns from the first uncommented line of the file rather than from the lines I’m reading.
data = CSV.File(path_to_file, skipto=1144, limit=40, header=1144, ignorerepeated=true, comment="#")
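If header detection keeps getting in the way, one alternative that sidesteps it entirely is to parse the block by hand in plain Julia, locating it by its separator text instead of by line number. This is only a sketch: the function name read_block and the marker string are assumptions, and it assumes the wanted rows are exactly two whitespace-separated numbers each.

```julia
# Hypothetical helper: collect the two columns found between the first '#'
# line containing `marker` and the next line that starts with '#'.
function read_block(path_to_file::AbstractString, marker::AbstractString)
    z = Float64[]
    pdz = Float64[]
    in_block = false
    open(path_to_file) do io
        for line in eachline(io)
            is_comment = startswith(lstrip(line), "#")
            if in_block
                is_comment && break              # reached the next separator
                fields = split(line)             # split on any whitespace
                length(fields) == 2 || continue  # skip blank or malformed rows
                push!(z, parse(Float64, fields[1]))
                push!(pdz, parse(Float64, fields[2]))
            elseif is_comment && occursin(marker, line)
                in_block = true
            end
        end
    end
    return z, pdz
end
```

It would be called as z, pdz = read_block(path_to_file, "data 1"), which returns the two columns already unpacked as Float64 vectors.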