Reading csv files with missing lines

I have to read the following csv file:

#Turbine    Time(s)    dt(s)    nacelle yaw angle (degrees)
0 20000 0.5 255
1 20000 0.5 255
2 20000 0.5 255
3 20000 0.5 255
4 20000 0.5 255
5 20000 0.5 255
6 20000 0.5 255
7 20000 0.5 255
8 20000 0.5 255

0 20000.5 0.5 255
1 20000.5 0.5 255
2 20000.5 0.5 255
3 20000.5 0.5 255
4 20000.5 0.5 255
5 20000.5 0.5 255
6 20000.5 0.5 255
7 20000.5 0.5 255
8 20000.5 0.5 255

0 20001 0.5 255
1 20001 0.5 255

If I try to read it using CSV.jl and DataFrames.jl, I get the following error:

julia> include("examples/main.jl")
ERROR: LoadError: BoundsError: attempt to access 21610×7 DataFrame at index [2:9223372036854775807, :]
Stacktrace:
 [1] getindex
   @ ~/.julia/packages/DataFrames/kcA9R/src/dataframe/dataframe.jl:607 [inlined]
 [2] importSOWFAFile(filename::String, dataLines::UnitRange{Int64})
   @ FLORIDyn ~/repos/FLORIDyn.jl/src/settings.jl:228
 [3] importSOWFAFile
   @ ~/repos/FLORIDyn.jl/src/settings.jl:217 [inlined]
 [4] prepareSimulation(wind::FLORIDyn.Wind, con::FLORIDyn.Con, paramFLORIDyn::FLORIDyn.FloriDyn, paramFLORIS::FLORIDyn.Floris, turbProp::@NamedTuple{…}, sim::FLORIDyn.Sim)
   @ FLORIDyn ~/repos/FLORIDyn.jl/src/floridyn_cl/prepare_simulation.jl:106
 [5] top-level scope
   @ ~/repos/FLORIDyn.jl/examples/main.jl:21
...

I can read this file with Matlab.

I am not sure if this is a valid .csv file. Are empty lines in a .csv file allowed? Should they be skipped?

I use this code to read the .csv file:

nacelleYaw = importSOWFAFile(joinpath(vel_file_dir, "SOWFA_nacelleYaw.csv"))

And the function importSOWFAFile looks like this:

function importSOWFAFile(filename::String, dataLines::Union{UnitRange{Int}, Vector{Tuple{Int,Int}}} = 2:typemax(Int))

    # Read full table first
    df = CSV.read(filename, DataFrame;
        delim=' ',
        ignorerepeated=true,
        missingstring="",
        header=[:Turbine, :Times, :Var3, :nacelle],
        types=Dict(:Turbine=>Float64, :Times=>Float64, :nacelle=>Float64, :Var3=>String),
        silencewarnings=true
    )

    # Filter rows if needed
    if dataLines isa UnitRange
        df = df[dataLines, :]
    elseif dataLines isa Vector
        keep_rows = falses(nrow(df))
        for (start, stop) in dataLines
            keep_rows[start:min(stop, nrow(df))] .= true
        end
        df = df[keep_rows, :]
    end

    # Select specific columns
    selected_df = df[:, [:Turbine, :Times, :nacelle]]

    # Convert to Matrix
    nacelleYaw = Matrix(selected_df)
    return nacelleYaw
end

The exception happens in the CSV.read(...) call.
Any idea?


You likely want to set ignoreemptyrows=true (see the Examples page of the CSV.jl documentation).

Thank you!

But that did not solve the problem. I am still getting a similar error, but now in this line:

        df = df[dataLines, :]

EDIT:
It solved the problem after I changed:

  if dataLines isa UnitRange

to

  if dataLines isa UnitRange && dataLines != 2:typemax(Int)

Thank you so much!
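
A possible alternative to special-casing the default value, offered only as a sketch: the BoundsError in the stack trace comes from indexing with 2:typemax(Int), which runs far past the last row, so the requested range could instead be clamped to the rows that exist. The helper name clamp_rows is hypothetical (it is not part of FLORIDyn.jl, CSV.jl or DataFrames.jl), and a plain matrix stands in for the DataFrame so that the snippet runs without any packages.

```julia
# Hypothetical helper (an assumption, not an existing API): restrict a requested
# row range to the rows that actually exist, so that an open-ended default such
# as 2:typemax(Int) no longer causes a BoundsError.
clamp_rows(rows::UnitRange{Int}, nrows::Int) =
    max(first(rows), 1):min(last(rows), nrows)

# A small matrix stands in for the DataFrame read from the file.
table = [0 20000.0 255;
         1 20000.0 255;
         2 20000.0 255]

# The open-ended default is cut down to the available rows.
rows = clamp_rows(2:typemax(Int), size(table, 1))

# Indexing with the clamped range is always in bounds.
subset = table[rows, :]
```

Inside importSOWFAFile this would presumably become df[clamp_rows(dataLines, nrow(df)), :]. One difference from the != check above: the clamped range still drops the first row when the default 2:typemax(Int) is used, whereas the != check skips the row filtering entirely in that case. Which behavior is wanted depends on whether the first row of the DataFrame holds data.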