Reading complex text files with vectors

Hi,

I apologize ahead of time for this fairly general question, which is similar to others that have been asked. I am trying to read files like the one below, which contain velocity vectors probed at the locations listed in the very long header. Is there an easy way to read these vectors with commas as vectors into an array or DataFrame? More generally, what is the easiest way to store them so they are actually usable? If I have a vector of vectors and want to take a norm or a mean, how can I index them all at once?

This is my best attempt so far:

function readFoamVectors(filename, skipstart)
    #open it
    f = open(filename)
    #read the lines
    lines = readlines(f)
    
    #matrix to hold the final product
    data = Any[]

    #loop through the rows
    for i = skipstart:length(lines)
        rowdata = split(lines[i],r"\)               \(|\(|\)")
        rowadd = Any[]
        append!(rowadd,parse(Float64,rowdata[1]))
        for j = 2:length(rowdata)-1
            append!(rowadd,[readdlm(IOBuffer(string(rowdata[j])))])
        end
        push!(data, rowadd')
    end

    return data
end

It has crashed my computer, and it seems generally sloppy and is too slow. Any ideas are great - thanks.

Here is some sample data. In reality there might be 30 probes, but a file like that would exceed the character limit here. The files are typically large, about 400 MB in this case:

# Probe 0 (0 0.05 0.1)
# Probe 1 (-0.1 0.3 0)
#           Probe                 0                 1
#            Time
   0.001694915254                 (-5.123201266 0.05644463619 0.247042423)                 (-5.181674475 -0.001740347889 0.0001904027387)
   0.001715438236                 (-5.14262357 0.0563816763 0.2243838269)                 (-5.207857987 -0.002015886388 0.0003267140884)
    0.00173599556                 (-5.133826491 0.05655950346 0.203092524)                 (-5.204641937 -0.002208989055 0.0004473137306)
   0.001756639258                 (-5.134623229 0.05663423636

It’s not quite clear to me what the desired result is. Are the lines starting with # part of the data? The last line seems to be missing a closing parenthesis, right? Maybe you could show the data structure that you would like to obtain based on this dataset.

Really? Normally one gets an error message and a stacktrace, so this may not be a Julia issue.

I don’t see a single comma in the example data you provided.

Please do put some effort into providing an example dataset and the expected result.

Your question seems similar to: https://discourse.julialang.org/t/dataframes-csv-how-to-read-vectors-from-csv/

line = "0.001694915254                 (-5.123201266 0.05644463619 0.247042423)                 (-5.181674475 -0.001740347889 0.0001904027387)"
x = split(line)
for i in eachindex(x)
    x[i] = strip(x[i], ['(', ')'])   # drop the surrounding parentheses
end
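
Building on the split/strip idea above, here is a minimal sketch of turning one line into a time value plus one vector per probe. The name parseprobeline is hypothetical (it is not from this thread), and the sketch assumes every probe vector sits inside its own pair of parentheses, as in the sample data.

```julia
# Hypothetical helper: parse one data line into (time, vector of probe vectors).
function parseprobeline(line::AbstractString)
    # The time is the first whitespace-separated token on the line.
    t = parse(Float64, first(split(line)))
    # Every "( ... )" group holds the components of one probe vector.
    vecs = Vector{Float64}[parse.(Float64, split(m.captures[1]))
                           for m in eachmatch(r"\(([^()]*)\)", line)]
    return t, vecs
end

line = "0.001694915254                 (-5.123201266 0.05644463619 0.247042423)                 (-5.181674475 -0.001740347889 0.0001904027387)"
t, vecs = parseprobeline(line)
```

Here t is a Float64 and vecs is a Vector{Vector{Float64}} with one entry per probe, so vecs[2] is the velocity at the second probe.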

@scone, in case it helps, find below a simple file reader. It assumes the input file has no headers/comments:

function readprobes(file::String, nprobes)
    data = Matrix{Array{Float64}}(undef,0,nprobes+1)
    open(file) do io
        while !eof(io)
            str = strip.(split(readline(io), ('(',')')))
            str = str[.!isempty.(str)]   # removes empty elements
            v = []
            for i in 1:nprobes+1
                push!(v, parse.(Float64, split(str[i])))
            end
            data = vcat(data, permutedims(v))
        end
    end
    return data
end

file = raw"C:\..\filename.txt"

a = readprobes(file, 2)   # 2 probes

3×3 Matrix{Any}:
 [0.00169492]  [-5.1232, 0.0564446, 0.247042]   [-5.18167, -0.00174035, 0.000190403]
 [0.00171544]  [-5.14262, 0.0563817, 0.224384]  [-5.20786, -0.00201589, 0.000326714]
 [0.001736]    [-5.13383, 0.0565595, 0.203093]  [-5.20464, -0.00220899, 0.000447314]
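
For the norm/mean part of the question, and for files that still have the # header, something along these lines might work. This is only a sketch under stated assumptions: readprobes_sketch is a hypothetical name, header lines are assumed to start with #, and rows that do not contain exactly nprobes complete vectors (such as a truncated last line) are simply skipped. It grows plain vectors with push! rather than calling vcat on a matrix for every line, which should matter for files of several hundred MB.

```julia
using LinearAlgebra   # norm
using Statistics      # mean

# Hypothetical reader: returns the times and, per time step, one vector per probe.
function readprobes_sketch(io::IO, nprobes::Integer)
    times = Float64[]
    vels = Vector{Vector{Float64}}[]
    for line in eachline(io)
        s = strip(line)
        # Skip blank lines and '#' header/comment lines.
        (isempty(s) || startswith(s, "#")) && continue
        row = Vector{Float64}[parse.(Float64, split(m.captures[1]))
                              for m in eachmatch(r"\(([^()]*)\)", s)]
        # Skip incomplete rows, e.g. a truncated last line.
        length(row) == nprobes || continue
        push!(times, parse(Float64, first(split(s))))
        push!(vels, row)
    end
    return times, vels
end

sample = """
# Probe 0 (0 0.05 0.1)
# Probe 1 (-0.1 0.3 0)
#           Probe                 0                 1
#            Time
   0.001694915254                 (-5.123201266 0.05644463619 0.247042423)                 (-5.181674475 -0.001740347889 0.0001904027387)
   0.001715438236                 (-5.14262357 0.0563816763 0.2243838269)                 (-5.207857987 -0.002015886388 0.0003267140884)
"""

times, vels = readprobes_sketch(IOBuffer(sample), 2)

probe0  = [row[1] for row in vels]   # all velocity vectors of the first probe
speeds  = norm.(probe0)              # velocity magnitude at every time step
meanvel = mean(probe0)               # component-wise mean velocity
```

For a real file, open(filename) do io ... end can be used in place of the IOBuffer. Broadcasting norm over a vector of vectors applies it to each inner vector, which is one way to "index them all at once".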

Thanks Rafael! That is better than what I had! I think I said “commas” instead of “parentheses” earlier; Tamas will have to forgive me for writing while tired, haha.