Read input stream until the next white space?

The function readuntil lets you read from an IO stream up to a specified delimiter, e.g. a space ' '. Is there a similar function that reads until any whitespace (space, tab, newline, etc.)? I can always read a line and run split, but this can be inefficient if the line is extremely long. For example, what’s the easiest way to read a million integers, separated by arbitrary whitespace, one by one, without ever allocating memory for a long line or any other large chunk of the file?

P.S. This is easy with scanf() in C if I’m not mistaken, since it terminates at whitespace by default.

Here’s my own attempt.

using StaticArrays

function read_int64_until_whitespace(io::IO)::Union{Int64, Nothing}
    v = MVector{20, Char}(undef) # Char vector of length 20 is enough to hold a string of any positive or negative Int64
    len = 0 # actual length used

    c = ' '
    while !eof(io) && isspace(c)
        c = read(io, Char)
    end # read until first non-whitespace character
    while !isspace(c) # store non-whitespace characters in Char array
        len += 1
        v[len] = c
        if eof(io) # last token reached the end of the stream
            break
        end
        c = read(io, Char)
    end

    if (len == 0) # if only whitespace is found, return `nothing`
        return nothing
    else
        truncated = @view v[1:len]
        return parse(Int64, String(truncated))
    end
end

Example usage:

julia> io = IOBuffer("23\n-45 78");

julia> read_int64_until_whitespace(io)
23

julia> read_int64_until_whitespace(io)
-45

julia> read_int64_until_whitespace(io)
78

But my code seems too verbose compared with the C version:

#include <inttypes.h>
scanf("%" SCNd64, &your_variable);

or the C++ version:

cin >> your_variable;

Any suggestions?

julia> using Scanf

julia> b = IOBuffer(" \t\v 3289\n\ncraoeuhrcaohuec")
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=25, maxsize=Inf, ptr=1, mark=-1)

julia> @scanf(b, " %d ", Int)
(1, 3289)
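
If adding a package is not an option, a Base-only version is also possible. What follows is only a rough sketch under some assumptions: the names `next_int64` and `is_ascii_space` are made up for illustration, only ASCII whitespace is treated as a separator, and `peek(io)` requires Julia 1.5 or later. It parses digits straight from the stream, so no intermediate string or Char buffer is built.

```julia
# Hypothetical helper, not part of Base: true for the six ASCII whitespace bytes
# (space, \t, \n, \v, \f, \r).
is_ascii_space(b::UInt8) = b == 0x20 || (0x09 <= b <= 0x0d)

# Hypothetical helper, not part of Base: read the next whitespace-delimited
# Int64 from `io`, or return `nothing` if only whitespace is left.
function next_int64(io::IO)::Union{Int64, Nothing}
    # Skip leading whitespace, consuming one byte at a time.
    while !eof(io) && is_ascii_space(peek(io))
        read(io, UInt8)
    end
    eof(io) && return nothing

    # Optional leading sign.
    negative = false
    b = peek(io)
    if b == UInt8('-') || b == UInt8('+')
        negative = b == UInt8('-')
        read(io, UInt8)
    end

    # Accumulate as a negative number so that typemin(Int64) fits;
    # the checked operations throw OverflowError on out-of-range input.
    value = Int64(0)
    ndig = 0
    while !eof(io)
        b = peek(io)
        UInt8('0') <= b <= UInt8('9') || break
        read(io, UInt8)
        value = Base.checked_sub(Base.checked_mul(value, Int64(10)), Int64(b - UInt8('0')))
        ndig += 1
    end
    ndig == 0 && throw(ArgumentError("expected a digit"))
    return negative ? value : Base.checked_neg(value)
end
```

Calling `next_int64(io)` in a loop until it returns `nothing` walks through the integers one by one. Like scanf, it stops at the first non-digit byte after the number and leaves that byte in the stream, so input such as "12abc" yields 12 and then raises an error on the next call.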
