Performance of splitting string and parsing numbers

I was curious and played around a little bit.

I noticed that a significant amount of time is spent iterating over eachsplit (and split in general).

It’s really just a guess since I am not a string expert, but Julia natively supports Unicode, so maybe the additional overhead explains the difference between Rust’s split and Julia’s split, although Rust also supports UTF-8 in its str type.
Btw., a couple of years ago we had ASCIIString in Julia, which was deprecated sometime before v1.0. I am wondering if that would make any difference :wink:

…but again, that’s just a guess.
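
That said, you can already get at the raw bytes of a String without any ASCIIString-style type, for example:

# zero-copy view of the string's UTF-8 code units as an AbstractVector{UInt8}
bytes = codeunits(data)

which is the kind of byte vector the experiment further down operates on.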

If you compare this (where data is the content of the gist file provided above):

function solve(data)
    maximum(eachsplit(data, "\n\n")) do block
        sum(parse(Int64, num) for num in eachsplit(block, "\n"))
    end
end
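
benchmarked with BenchmarkTools, roughly like this:

using BenchmarkTools
@btime solve($data)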

I get

  491.542 μs (3757 allocations: 234.77 KiB)

and this one

@btime collect(eachsplit($data, "\n\n"))

already yields

167.167 μs (2264 allocations: 140.75 KiB)

which on its own is already 5 times slower than Rust’s total run time. Then we also have the nested split inside solve, which adds roughly another 150 μs to the overall time of the Julia implementation.
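
A rough way to isolate that nested part is to collect the outer blocks once and time only the inner line splits, something like:

blocks = collect(eachsplit(data, "\n\n"))
# time only the per-block line splits, no parsing
@btime [collect(eachsplit(b, "\n")) for b in $blocks];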

I am not familiar with the Rust implementation, but it looks to me like its split, and probably also its parse, are doing a better job at the string processing. Still, 10x faster is really suspicious.
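
To get a feeling for how much parse itself contributes, one can pre-split everything once and time only the integer parsing, for example:

# pre-split all lines once, then time only the Int parsing
lines = [l for l in eachsplit(data, "\n") if !isempty(l)]
@btime sum(l -> parse(Int64, l), $lines)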

I did a quick check with a crude byte-level splitting implementation:

function splitbytes(bytes, seq)
    out = Vector{Vector{UInt8}}()
    seq_n = length(seq)
    seq_idx = 1
    chunk = Vector{UInt8}()
    for byte ∈ bytes
        # sequence found, pushing everything before it to `out`
        if byte == seq[seq_idx] && seq_n == seq_idx
            push!(out, chunk[1:end-(seq_n-1)])
            seq_idx = 1
            chunk = Vector{UInt8}()
            continue
        end
        # reset sequence matcher if consecutive byte is not matching
        if byte != seq[seq_idx] && seq_n > 1
            seq_idx = 1
        end
        # increase sequence index to match next time
        if byte == seq[seq_idx]
            seq_idx += 1
        end
        push!(chunk, byte)
    end
    push!(out, chunk)
    out
end
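
Quick sanity check on a tiny made-up input:

demo = Vector{UInt8}("12\n34\n\n56\n")
splitbytes(demo, [0x0a, 0x0a])  # → 2 chunks: the bytes of "12\n34" and of "56\n"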

Running that over the very same data, I only get a small improvement (20% or so) in runtime, and that is still without the Int64 parsing and the maximum computation:

function solvebytes(bytes)
    splitbytes.(splitbytes(bytes, [0xA, 0xA]), [0xA])
end
@btime solvebytes($bytes)
  399.708 μs (7504 allocations: 460.39 KiB)
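
For a fairer comparison, the missing integer parsing and maximum could be bolted on with a hand-rolled decimal parser, roughly like this (parseint and solvebytes_full are just names I picked for the sketch):

# naive ASCII-decimal parser for a chunk of digit bytes (no sign/overflow handling)
function parseint(chunk)
    n = 0
    for b in chunk
        n = 10n + (b - UInt8('0'))
    end
    n
end

function solvebytes_full(bytes)
    maximum(splitbytes(bytes, [0x0a, 0x0a])) do block
        # an empty trailing chunk (e.g. after a final newline) parses to 0, harmless for the sum
        sum(parseint, splitbytes(block, [0x0a]))
    end
end

@btime solvebytes_full($bytes)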

Anyway, are you sure that Rust is not somehow doing some caching behind the scenes during compilation? To me that is the only plausible explanation, since I don’t know how to make this iteration, comparisons included, almost 10x faster…