Somewhat faster text/numeric io

I’m going to announce a package soon based on these ideas. But for a preview, see

https://github.com/dgleich/NumbersFromText.jl

Here is some sample code.

using NumbersFromText
M = readmatrix("myfile.txt") # reads a matrix of data
M = readmatrix(Int, "myfile.txt") # reads a matrix of Ints
m = readarray("myfile.txt") # just reads a list of Float64s from myfile.txt
m = readarray(Int, "myfile.txt") # just reads a list of Ints from myfile.txt
m = readarray!("myfile.txt", rand(Int, 5)) # read Ints into an existing array
aint, afloat = readarrays("myfile.txt", Int, Float64) # reads alternating Ints and Float64s
aint, afloat = readarrays!("myfile.txt", rand(Int,5), rand(Float64,5)) # read into existing arrays

Everything works with IO streams as well.

In my in-memory processing tests, this is about 2x as fast as CSV.jl (which is otherwise the fastest I’ve seen).

I can read about 32 million integers in about 2.7-2.9 seconds on my computer (so roughly 11M integers/sec). Note that reading from disk is still not the limit: this data is about 700MB, so we need about 250MB/sec, which isn’t hard for an SSD. (These tests were done quickly, so I apologize if I made a mistake.)
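As a quick sanity check on that arithmetic, here is a minimal sketch in plain Julia (no packages needed). The helper name `throughput` is made up for illustration, and the inputs are just the approximate figures quoted above, not new measurements.

```julia
# Hypothetical helper (not part of NumbersFromText): items processed per second.
throughput(count, seconds) = count / seconds

n_integers = 32_000_000   # integers parsed, as quoted above
file_mb    = 700          # approximate input size in MB, as quoted above

# Evaluate both ends of the quoted 2.7-2.9 second range.
for t in (2.7, 2.9)
    ints_per_sec = throughput(n_integers, t)
    mb_per_sec   = throughput(file_mb, t)
    println("t = ", t, " s: ",
            round(ints_per_sec / 1e6, digits=1), " M integers/sec, ",
            round(mb_per_sec, digits=0), " MB/sec")
end
```

Under these assumptions the rates come out at roughly 11 to 12 million integers per second and about 240 to 260 MB/sec, in the same ballpark as the rounded figures quoted above and well within what a typical SSD can sustain for sequential reads.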

I’m still hunting for bugs, so be warned.