Reading headers of delimited files

It's better to pass the open stream f directly to CSV.read, since it will begin reading from the current position in the stream:

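using CSV, DataFrames

f = open(filename)
header = first(eachline(f), 7) # read the 7 header lines, advancing the stream past them
data = CSV.read(f, DataFrame, header=false) # parse the rest, starting at the current position
close(f)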

More generally, you can take any string s (or vector of bytes), representing any chunk of the file you want to read from, and call CSV.read on IOBuffer(s).
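As a rough sketch of the IOBuffer approach, assuming the chunk of interest is everything after two header lines (the string contents and the names `s`, `parts`, and `body` are made up for illustration):

```julia
using CSV, DataFrames

# Hypothetical file contents: two free-form header lines followed by CSV rows.
s = """
# instrument: demo
# units: arbitrary
1,2.5,a
2,3.5,b
"""

# Isolate the chunk to parse: everything after the second newline.
parts = split(s, '\n'; limit=3)   # [header line 1, header line 2, rest]
body = parts[3]

# CSV.read accepts an IO, so an in-memory buffer can stand in for a file.
data = CSV.read(IOBuffer(body), DataFrame; header=false)
```

With header=false the columns should come back with auto-generated names (Column1, Column2, ...).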

Sure, just isolate the section you want and wrap an IOBuffer around it.

I would try to avoid reading the file line-by-line if you can. For example, you could mmap the whole file, then wrap an IOBuffer around a @view of the section of the file you want to read from. You can use StringViews.jl to perform regular-expression searches of a memory-mapped file, or just search for bytes like UInt8('\n') (line endings).
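Something along these lines might work for the mmap variant. This is only a sketch: `offset_after_lines` is a hypothetical helper, the demo file is invented so the example is self-contained, and it searches for raw newline bytes rather than using StringViews.jl:

```julia
using Mmap

# Hypothetical helper: return the byte index just past the n-th newline.
function offset_after_lines(bytes::AbstractVector{UInt8}, n::Integer)
    pos = 0
    for _ in 1:n
        pos = findnext(==(UInt8('\n')), bytes, pos + 1)
        pos === nothing && error("fewer than $n lines in input")
    end
    return pos + 1
end

# Write a small demo file (made-up contents) to a temporary path.
path = tempname()
write(path, "header 1\nheader 2\n1,2\n3,4\n")

bytes = Mmap.mmap(path)                 # Vector{UInt8} backed by the file
start = offset_after_lines(bytes, 2)    # skip the 2 header lines
io = IOBuffer(@view bytes[start:end])   # wraps the view; the file data is not copied here
body = read(io, String)                 # only to inspect the selected section
```

In real use you would skip the final read and hand the buffer to the parser instead, e.g. CSV.read(io, DataFrame, header=false).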
