The function readuntil allows you to read from an IO stream up to a specified delimiter, e.g. the space character ' '. Is there a similar function that reads until any whitespace (space, tab, newline, etc.)? I can always read a line and run split, but this can be inefficient if the line is extremely long. For example, what's the easiest way to read a million integers, separated by arbitrary whitespace, one by one, without ever allocating memory for a long line or some other large chunk of the file?
P.S. This is easy with scanf() in C, if I'm not mistaken, since it terminates at whitespace by default.
Here's my own attempt:
using StaticArrays
function read_int64_until_whitespace(io::IO)::Union{Int64, Nothing}
    v = MVector{20, Char}(undef) # Char vector of length 20 is enough to hold a string of any positive or negative Int64
    len = 0 # actual length used
    c = ' '
    while !eof(io) && isspace(c)
        c = read(io, Char)
    end # read until first non-whitespace character
    while !isspace(c) # store non-whitespace characters in Char array
        len += 1
        v[len] = c
        if eof(io)
            break
        end
        c = read(io, Char)
    end
    if len == 0 # if only whitespace is found, return `nothing`
        nothing
    else
        truncated = @view v[1:len]
        parse(Int64, String(truncated))
    end
end
Example usage:
julia> io = IOBuffer("23\n-45 78");
julia> read_int64_until_whitespace(io)
23
julia> read_int64_until_whitespace(io)
-45
julia> read_int64_until_whitespace(io)
78
But my code seems too verbose compared with the C version:
#include <inttypes.h>
scanf("%" SCNd64, &your_variable);
or the C++ version:
cin >> your_variable;
Any suggestions?
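If staying in Base matters, one possibility is to skip the Char buffer entirely and accumulate the digits as they are read. The following is only a sketch under some assumptions: the helper name next_int is made up for illustration, it needs Julia 1.5 or later for peek on a generic IO, it only handles ASCII decimal digits with an optional sign, and it does no overflow checking. Unlike the parse-based version above, it stops at the first non-digit instead of rejecting the whole token.

```julia
# Hypothetical helper (not part of Base): read the next whitespace-delimited
# integer from `io` byte by byte, without building an intermediate String.
function next_int(io::IO)::Union{Int, Nothing}
    # Skip leading whitespace without consuming the first useful byte.
    while !eof(io) && isspace(Char(peek(io)))
        read(io, UInt8)
    end
    eof(io) && return nothing          # only whitespace (or nothing) was left

    # Optional sign.
    negative = false
    b = peek(io)
    if b == UInt8('-') || b == UInt8('+')
        negative = (b == UInt8('-'))
        read(io, UInt8)
    end

    # Accumulate decimal digits. Assumption: the value fits in an Int.
    n = 0
    seen = 0                           # number of digits consumed
    while !eof(io)
        b = peek(io)
        UInt8('0') <= b <= UInt8('9') || break
        read(io, UInt8)
        n = 10 * n + Int(b - UInt8('0'))
        seen += 1
    end
    seen == 0 && error("expected an integer")
    return negative ? -n : n
end

io = IOBuffer("23\n-45 78")
a = next_int(io)
b = next_int(io)
c = next_int(io)
d = next_int(io)   # input is exhausted here
```

Called in a loop until it returns nothing, this never holds more than one byte of lookahead, so the length of a line should not matter.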
julia> using Scanf
julia> b = IOBuffer(" \t\v 3289\n\ncraoeuhrcaohuec")
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=25, maxsize=Inf, ptr=1, mark=-1)
julia> @scanf(b, " %d ", Int)
(1, 3289)
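To read many integers one by one, the same call can be placed in a loop. This is a sketch rather than a tested recipe: sum_ints is a made-up name, and the loop assumes only what the output above shows, namely that the first element of the returned tuple is the number of values assigned. It stops as soon as that count is not 1, which should cover both end of input and a token that is not an integer.

```julia
using Scanf

# Hypothetical example: sum whitespace-separated integers from `io`,
# reading one value per @scanf call instead of loading whole lines.
function sum_ints(io::IO)
    total = 0
    while true
        r, x = @scanf(io, " %d ", Int)  # r = number of values assigned
        r == 1 || break                 # stop at end of input or on a non-integer token
        total += x
    end
    return total
end

io = IOBuffer("23\n-45\t78  \n")
total = sum_ints(io)
```

The format string is the one from the example above; the surrounding spaces let each call consume any run of whitespace around the number.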