Nerd Sniping: How can I improve the read in of a human-readable file?

Fun solution. To show off my packages MemoryViews.jl and BufferIO.jl, I wrote a solution using them. On my computer, it’s 8 times faster than your code and parses the 83 MB example file in 160 milliseconds.

You can find the code here: play/ParseChallenge at master · jakobnissen/play · GitHub

Some notes and limitations with the current code

  • This solution relies on how line_views currently works internally in BufferIO, which is not yet documented behaviour. However, I’ve wanted to document this specific behaviour for some time, so I’ll make a release that guarantees it.
  • Error handling is not great. Production code would have better error messages and a dedicated exception type.
  • The parsing is fairly strict:
    • You must know the number of blocks and the number of lines of integers per block in advance
    • An integer must match exactly the regex [0-9]{1,2}, and be < 41
    • An integer line must be 0-10 integers, separated by one space, with one optional trailing space, nothing less, nothing more
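
To make the strictness rules above concrete, here is a minimal sketch of a line parser that enforces them. It is not the code from the linked repository and it does not use MemoryViews.jl or BufferIO.jl: the function name parse_int_line is hypothetical, it uses only Base, and it assumes the line is passed as a 1-based vector of bytes without the trailing newline. A line consisting of only a space is rejected here, which is one possible reading of the rules.

```julia
# Hypothetical helper illustrating the strict rules; not the author's code.
# `line` is one line of bytes without the newline, assumed to be 1-indexed.
function parse_int_line(line::AbstractVector{UInt8})
    out = UInt8[]
    n = length(line)
    i = 1
    while i <= n
        # Read one or two ASCII digits, matching [0-9]{1,2}.
        value = 0x00
        ndigits = 0
        while i <= n && ndigits < 2
            digit = line[i] - 0x30      # UInt8 arithmetic wraps for bytes below '0'
            digit < 0x0a || break       # stop at the first non-digit byte
            value = 0x0a * value + digit
            ndigits += 1
            i += 1
        end
        ndigits == 0 && error("expected a digit at byte $i")
        value < 41 || error("integer $value is not < 41")
        length(out) < 10 || error("more than 10 integers on one line")
        push!(out, value)
        # End of line, or exactly one space (which may be the trailing one).
        i > n && break
        line[i] == UInt8(' ') || error("expected a single space at byte $i")
        i += 1
    end
    return out
end

# Usage: convert a String to bytes with `codeunits`.
parse_int_line(codeunits("12 3 40 "))
```

Working on bytes rather than on a String avoids any UTF-8 decoding, and a third digit, a double space or any other stray byte is reported as an error instead of being silently skipped.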

It could still be improved a little, by SIMD’ing the integer parsing.
