Suppose I have a large table whose rows look like
3 1 2.5 4.0 10.1
and I know for sure that the first two columns are Ints and the next three are Float64s.
Is there a way to use that information to write a fast parsing function?
So far, I've come up with something like this:
    function parseline(types, line)
        buf = IOBuffer(line)
        tokens = ntuple(length(types)) do _
            readuntil(buf, ' ')
        end
        map(parse, types, tokens)
    end

    julia> parseline((Int, Int, Float64, Float64, Float64), "3 1 2.5 4.0 10.1")
    (3, 1, 2.5, 4.0, 10.1)
which works faster than
    map((t, token) -> parse(t, token), types, split(line))
Is there a fairly easy way to make the operation faster and type-stable?
In general, I don’t know how many columns the table has and which types they have until I read the header, so hardcoding the types is not an option.
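To make the constraint concrete, here is a rough sketch of one possible direction: build a tuple type from the header once at run time, then pass it through a function barrier so the per-row code is compiled for that specific type. The names `rowtype`, `parserow`, and `parserows` are hypothetical, and the vector of column types stands in for whatever the header parsing would produce. Whether this is actually faster or fully inferred would still need checking with `@code_warntype` and a benchmark.

```julia
# Assumption: reading the header yields a vector of column types at run time.
# `rowtype`, `parserow`, and `parserows` are hypothetical names.
rowtype(types) = Tuple{types...}

# Parse one whitespace-separated line into a tuple of type T.
function parserow(::Type{T}, line::AbstractString) where {T<:Tuple}
    tokens = split(line)
    n = fieldcount(T)
    length(tokens) == n ||
        throw(ArgumentError("expected $n columns, got $(length(tokens))"))
    # Val lets ntuple unroll over the column count taken from T.
    return ntuple(i -> parse(fieldtype(T, i), tokens[i]), Val(fieldcount(T)))
end

# Function barrier: the caller pays one dynamic dispatch on T, and the
# loop below runs in a method specialized for that tuple type.
function parserows(::Type{T}, lines) where {T<:Tuple}
    rows = Vector{T}()
    for line in lines
        push!(rows, parserow(T, line))
    end
    return rows
end

T = rowtype([Int, Int, Float64, Float64, Float64])
rows = parserows(T, ["3 1 2.5 4.0 10.1", "7 2 0.5 1.0 2.25"])
```

The point of passing `Tuple{Int, Int, Float64, ...}` as a type, rather than a tuple of types as a value, is that a value like `(Int, Float64)` only has type `Tuple{DataType, DataType}`, so the compiler cannot see the column types from it.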