I was curious and played around a little bit. I noticed that a significant amount of time is spent on iterating over eachsplit, or split in general. It's really just a guess since I am not really a string expert, but Julia natively supports Unicode, so maybe that additional overhead causes the difference between Rust's split and Julia's split, although Rust also supports UTF-8 in str.
Btw. a couple of years ago we had ASCIIString in Julia, which was deprecated somewhere before v1; I am wondering if that would make any difference… but again, that's just a guess.
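One cheap way to probe that Unicode guess (just a sketch; input.txt is a placeholder for the gist content, which should be plain ASCII) is to compare a raw byte scan against a UTF-8-aware character iteration over the same data:

using BenchmarkTools

data = read("input.txt", String)           # placeholder path for the gist content
isascii(data)                              # should be true for this input
@btime count(==(0xA), codeunits($data))    # raw byte scan, no UTF-8 decoding
@btime count(==('\n'), $data)              # Char iteration, UTF-8 aware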
If you compare this (data is the content of the provided gist file above):
function solve(data)
    maximum(eachsplit(data, "\n\n")) do block
        sum(parse(Int64, num) for num in eachsplit(block, "\n"))
    end
end
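For the record, the numbers below come from BenchmarkTools, roughly like this (data as loaded above):

@btime solve($data)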
I get
491.542 μs (3757 allocations: 234.77 KiB)
and this one
@btime collect(eachsplit($data, "\n\n"))
already yields
167.167 μs (2264 allocations: 140.75 KiB)
which is 5 times slower than Rust's total run time. Then we even have another nested split inside solve, which adds another 150 μs or so to the overall time in the case of the Julia implementation.
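A rough way to look at that inner cost in isolation (again just a sketch, reusing data; it times the nested split plus the parsing over already-collected blocks):

blocks = collect(eachsplit(data, "\n\n"))
@btime map(b -> sum(parse(Int64, n) for n in eachsplit(b, "\n")), $blocks)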
I am not familiar with the Rust implementation, but to me it looks like split and probably also parse are simply doing a better job at string processing; still, 10x faster is really suspicious.
I did a quick check with a crude bytes-parsing implementation:
function splitbytes(bytes, seq)
    out = Vector{Vector{UInt8}}()
    seq_n = length(seq)
    seq_idx = 1
    chunk = Vector{UInt8}()
    for byte ∈ bytes
        # sequence found, pushing everything before it to `out`
        if byte == seq[seq_idx] && seq_n == seq_idx
            push!(out, chunk[1:end-(seq_n-1)])
            seq_idx = 1
            chunk = Vector{UInt8}()
            continue
        end
        # reset sequence matcher if consecutive byte is not matching
        if byte != seq[seq_idx] && seq_n > 1
            seq_idx = 1
        end
        # increase sequence index to match next time
        if byte == seq[seq_idx]
            seq_idx += 1
        end
        push!(chunk, byte)
    end
    push!(out, chunk)
    out
end
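As a quick sanity check (just a sketch; bytes is simply the same gist content as raw bytes), the byte splitter should agree with the string-based split:

bytes  = Vector{UInt8}(codeunits(data))   # same input, as a mutable byte vector
chunks = splitbytes(bytes, [0xA, 0xA])
parts  = split(data, "\n\n")
length(chunks) == length(parts) &&
    all(String(copy(c)) == s for (c, s) in zip(chunks, parts))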
Running that over the very same data, I only get a tiny improvement (20% or so) in runtime, even though the Int64 parsing and the maximum determination are still missing:
function solvebytes(bytes)
    splitbytes.(splitbytes(bytes, [0xA, 0xA]), [0xA])
end
@btime solvebytes($bytes)
399.708 μs (7504 allocations: 460.39 KiB)
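Out of curiosity, here is a rough sketch of what those missing pieces could look like while staying on raw bytes (it assumes ASCII digits and single-newline separators inside each block, as in the gist input, and is not meant as a polished implementation):

function solvebytes_full(bytes)
    maximum(splitbytes(bytes, [0xA, 0xA])) do block
        total = 0
        num = 0
        for b in block
            if b == 0xA                     # newline terminates a number
                total += num
                num = 0
            else
                num = 10num + (b - 0x30)    # accumulate ASCII digit
            end
        end
        total + num                         # last number may lack a trailing newline
    end
end

Timing @btime solvebytes_full($bytes) should then be roughly comparable to what the Rust version computes end to end, since nothing is left out anymore.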
Anyway, are you sure that Rust is not somehow doing some caching behind the scenes during compilation? To me that's the only plausible explanation, since I don't know how to do this iteration, including the comparisons, almost 10x faster…