Julia 1.3 order of magnitude slowdown in function

Cross-posted from https://github.com/JuliaLang/julia/issues/34195

I’m noticing an order-of-magnitude slowdown in a custom function to parse “sample list” files (one ASCII value per line) to Float32 arrays. Any ideas what could be causing this?

Offending function here: https://github.com/jpjones76/SeisIO.jl/blob/master/src/Utils/Parsing/streams.jl

Affects my ASCII readers for SLIST and Lennartz SLIST files.

Before 1.3.0, X[i] = stream_float(io, 0x00) was 2-3x faster than v = readline(io); X[i] = parse(Float32, v). Now stream_float is 4x slower than readline + parse. I actually need the old speeds if possible, so any alternate syntax suggestions would be welcomed.

MWE

# install, get a test file
using Pkg
Pkg.add("BenchmarkTools")
Pkg.add("SeisIO")
run(`wget https://github.com/jpjones76/SeisIO-TestData/raw/master/SampleFiles/ASCII/0215162000.c00`)

# MWE
using BenchmarkTools
import SeisIO:stream_float
io = open("0215162000.c00", "r")
nx = 225000
X = Array(Float32, 1){undef, nx} 
@benchmark (seekstart($io); readline($io); @inbounds for i in 1:$nx; $X[i] = stream_float($io, 0x00); end)

Example outputs follow. I benchmarked this on a Dell laptop running Ubuntu 18.04 (bionic), kernel 5.0.0-37-generic, CPU Intel® Core™ i7-6820HQ CPU @ 2.70GHz, 16 GB RAM.

Julia 1.0.5

BenchmarkTools.Trial: 
  memory estimate:  272 bytes
  allocs estimate:  2
  --------------
  minimum time:     25.533 ms (0.00% GC)
  median time:      25.701 ms (0.00% GC)
  mean time:        25.867 ms (0.00% GC)
  maximum time:     30.592 ms (0.00% GC)
  --------------
  samples:          194
  evals/sample:     1

Julia 1.2.0:

  memory estimate:  272 bytes
  allocs estimate:  2
  --------------
  minimum time:     24.738 ms (0.00% GC)
  median time:      24.850 ms (0.00% GC)
  mean time:        24.910 ms (0.00% GC)
  maximum time:     27.172 ms (0.00% GC)
  --------------
  samples:          201
  evals/sample:     1

Julia 1.3.0:

BenchmarkTools.Trial: 
  memory estimate:  272 bytes
  allocs estimate:  2
  --------------
  minimum time:     236.738 ms (0.00% GC)
  median time:      237.059 ms (0.00% GC)
  mean time:        237.833 ms (0.00% GC)
  maximum time:     245.997 ms (0.00% GC)
  --------------
  samples:          22
  evals/sample:     1

The relative contributions of the seekstart and readline commands are trivial.

Referencing my reply in https://github.com/JuliaLang/julia/issues/34195

read is now thread safe which requires acquiring a lock so reading things byte by byte (small reads in general) has slowed down. You could perhaps try https://github.com/BioJulia/BufferedStreams.jl. I do think it would be good to have a goto solution for this in Base (perhaps there already is and I just don’t know about it).

2 Likes