Revisited: Loading the first few lines from a data file

I am revisiting a question I asked earlier, because none of the suggested solutions work on my Julia 1.1.0 installation. I have a data file with 100 rows of data of the form

1 2
3 4
5 6.1
...

I want to load the first few lines, say 10 of them, as a 10×2 array. The three solutions suggested to me now all yield errors, and I am at a loss on how to proceed. I do not want to read the entire file and then truncate the number of rows. Below I list the three suggested methods along with the error messages they produce.

Method I : using head -n

julia> x = open(`head -n$100 $input_file`) do io
                  readdlm(io);
                         end;
ERROR: IOError: could not spawn `head -n100 Rot_golden_Complex_N=1E6_DeltaT=1E-2.txt`: no such file or directory (ENOENT)
Stacktrace:
 [1] _spawn_primitive(::String, ::Cmd, ::Array{Any,1}) at .\process.jl:400
 [2] setup_stdios(::getfield(Base, Symbol("##505#506")){Cmd}, ::Array{Any,1}) at .\process.jl:413
 [3] _spawn at .\process.jl:412 [inlined]
 [4] #open#514(::Bool, ::Bool, ::Function, ::Cmd, ::Base.DevNull) at .\process.jl:657
 [5] open at .\process.jl:648 [inlined] (repeats 2 times)
 [6] open(::getfield(Main, Symbol("##275#276")), ::Cmd) at .\process.jl:678
 [7] top-level scope at none:0

Method II : using open(readlines …

julia> x = open(readlines, `head -n10 $(input_file)`)
ERROR: IOError: could not spawn `head -n10 Rot_golden_Complex_N=1E6_DeltaT=1E-2.txt`: no such file or directory (ENOENT)
Stacktrace:
 [1] _spawn_primitive(::String, ::Cmd, ::Array{Any,1}) at .\process.jl:400
 [2] setup_stdios(::getfield(Base, Symbol("##505#506")){Cmd}, ::Array{Any,1}) at .\process.jl:413
 [3] _spawn at .\process.jl:412 [inlined]
 [4] #open#514(::Bool, ::Bool, ::Function, ::Cmd, ::Base.DevNull) at .\process.jl:657
 [5] open at .\process.jl:648 [inlined] (repeats 2 times)
 [6] open(::typeof(readlines), ::Cmd) at .\process.jl:678
 [7] top-level scope at none:0

Method III : using CSV package

julia> CSV.read("input_file", delim = ' ', rows = 10)
ERROR: MethodError: no method matching CSV.File(::String; delim=' ', rows=10)
Closest candidates are:
  CSV.File(::Any; header, normalizenames, datarow, skipto, footerskip, limit, transpose, comment, use_mmap, ignoreemptylines, threaded, select, drop, missingstrings, missingstring, delim, ignorerepeated, quotechar, openquotechar, closequotechar, escapechar, dateformat, decimal, truestrings, falsestrings, type, types, typemap, categorical, pool, strict, silencewarnings, debug, parsingdebug, allowmissing) at C:\Users\iamsu\.julia\packages\CSV\76SRf\src\CSV.jl:262 got unsupported keyword argument "rows"
Stacktrace:
 [1] kwerr(::NamedTuple{(:delim, :rows),Tuple{Char,Int64}}, ::Type, ::String) at .\error.jl:125
 [2] (::getfield(Core, Symbol("#kw#Type")))(::NamedTuple{(:delim, :rows),Tuple{Char,Int64}}, ::Type{CSV.File}, ::String) at .\none:0
 [3] #read#68(::Bool, ::Base.Iterators.Pairs{Symbol,Any,Tuple{Symbol,Symbol},NamedTuple{(:delim, :rows),Tuple{Char,Int64}}}, ::Function, ::String) at C:\Users\iamsu\.julia\packages\CSV\76SRf\src\CSV.jl:1156
 [4] (::getfield(CSV, Symbol("#kw##read")))(::NamedTuple{(:delim, :rows),Tuple{Char,Int64}}, ::typeof(CSV.read), ::String) at .\none:0
 [5] top-level scope at none:0

Which operating system are you using? head is a Linux/Mac utility. It does not work on Windows.

Oh I see. The problem did arise on my Windows computer. It works fine on my Linux computer at work, but we are on lockdown :(
I am a bit surprised to hear that the syntax is OS dependent. Could you suggest a way around this?

Given that you’re on a relatively outdated Julia version (I would recommend moving to 1.4.2) you might also be on an old CSV version, so I can’t promise that this works, but when using CSV you should use the limit keyword, not rows. See the docs here, you probably want:

DataFrame(CSV.File("input_file", delim = ' ', limit = 10))

You might also want to consider header=false as additional kwarg given that you say you want to get a 10x2 array from the first 10 rows (implying there’s no header to be read in).

Thanks, I am going to try it out. The documentation of CSV.jl here says that rows is a valid optional argument.
Also, is there a command to update my Julia version? This question was posted here but no answer was provided.

Note that the docs you are linking to are for CSV version 0.1.1, released in 2016 - not your fault, but you have to be careful when googling for docs, as Google will randomly serve up some ancient version of the documentation. Make sure that you see stable somewhere in the URL ideally.

If you’re on Arch Linux you could use the AUR julia-bin repo, which will automatically update your Julia version. Other than that, I don’t think there are ways to do this directly, so you’d just have to download the latest release from Download Julia.

I see. Henceforth I will look for the keyword stable in the URL. Also, your suggestion of using CSV worked, thank you. It looks to me like things are much easier to do on Linux. I will use your method for the moment, until my office reopens.

FWIW I use Windows at work and it isn’t particularly burdensome to update Julia - just download the .exe from the website, click and install, and if desired copy your 1.2 environments folder and rename it 1.4 to move all your installed packages to the new version.

@nilshg has got the correct answer for you.

To explain what is happening, this line says ‘run the OS command called head -n$100 $input_file’:
julia> x = open(`head -n$100 $input_file`) do io

Look closely - the quotes are ` and are commonly called backticks.
https://docs.julialang.org/en/v1/manual/running-external-programs/

A search says that this PowerShell command is the Windows equivalent of head:

gc -head 10 filename

I would really advise using the method @nilshg suggests. Writing code for a particular OS is inflexible.
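
For completeness, here is a minimal OS-independent sketch that uses only Base Julia, so it needs neither head nor CSV. It reads lazily and stops after the requested number of lines instead of loading the whole file. Note that read_first_rows is a hypothetical helper name made up for this example, and the sketch assumes every line holds whitespace-separated numbers with the same number of columns, no header, and at least one line in the file.

```julia
# Hypothetical helper (not part of Julia or any package): read the first `n`
# lines of a whitespace-delimited numeric file into a matrix.
# Assumes no header, equal column counts on every line, and a non-empty file.
function read_first_rows(path::AbstractString, n::Integer)
    open(path) do io
        # eachline is lazy, so Iterators.take stops reading after n lines
        rows = [parse.(Float64, split(line)) for line in Iterators.take(eachline(io), n)]
        # hcat stacks each row as a column; permutedims flips that back to rows
        permutedims(reduce(hcat, rows))
    end
end
```

With the file from the original question, read_first_rows(input_file, 10) should give a 10×2 Array{Float64,2}. If the file has fewer than n lines, you simply get however many rows are there.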