Parsing Julia code with a large array is very slow or results in `StackOverflowError`

It isn’t, nor is it in C. But for measuring parsing time, this doesn’t matter, hence I left it in. Anyway, it’s not like I have this data in C or python around – I only made these alternate “code generators” for C and Python to be able to compare the parser performance there.

Very nice! I wrote a similar wrapper for GAP, a language originally developed for computational group theory (fun fact: around the same time as Python was created). There it looks like this:

using GAP
function gap_code(N::Int)
  io = IOBuffer()
  print(io, "x:=2;")
  print(io, "[")
  for i in 1:N
    print(io, "[$i, x^7+x^6+x^5+x^4+x+1],")
  end
  print(io, "]")
  return String(take!(io))
end
@time GAP.evalstr(gap_code(10_000));

Which gives

julia> @time GAP.evalstr(gap_code(10_000));
  0.054961 seconds (529.63 k allocations: 17.122 MiB)
1 Like

It’s not too hard to parse the input format (polynominals) and output Julia source code as I did it, with sed and/or awk (or perl) in the evalpoly format.

Anyway, I’m just looking at where the slowdown is, and geting rid of parsing tuples for each line gets it also a bit faster:

julia> function badcode(N::Int)
                io = IOBuffer()
                println(io, "x = 2; f(x, c0, c1, c2, c3, c4, c5, c6, c7) = evalpoly(x, (c0, c1, c2, c3, c4, c5, c6, c7))")
                println(io, "ll = [")
                for i in 1:N
                  println(io, "[$i, f(x, 1, 1, 0, 0, 1, 1, 1, 1)],")
                end
                println(io, "]")
                return String(take!(io))
              end
badcode (generic function with 1 method)

julia> bad=badcode(10000) ; @time include_string(Main,bad);
  1.686671 seconds (170.62 k allocations: 9.633 MiB, 0.39% compilation time)

I see; if parse time is the performance we are measuring then this example shows the slowness comes at eval/compile time.

function julia_code(N::Int)
    io = IOBuffer()
    println(io, "function get_it(x)")
    print(io, "(")
    for i in 1:N
      print(io, "($i, x^7+x^6+x^5+x^4+x+1),")
    end
    print(io, ")")
    println(io, "end")
    return String(take!(io))
end
str=julia_code(10_000) ; @time include_string(Main, str);
@time include_string(Main, "get_it(2)");

Just parsing it as a function takes 3.6 seconds. But evaluating it at x=2, took a long time to compile.

julia> str=julia_code(10_000) ; @time include_string(Main, str);
  3.640312 seconds (758.13 k allocations: 49.496 MiB, 1.22% gc time)

julia> @time include_string(Main, "get_it(2)");
1020.222837 seconds (40.05 M allocations: 1.528 GiB, 0.12% gc time, 100.00% compilation time)

Which got me thinking, how to speed up the evaluation? turns out a let block is quite fast? local vs global?

function julia_let_code(N::Int)
  io = IOBuffer()
  println(io, "ll = let x=2")
  print(io, "[")
  for i in 1:N
    println(io, "($i, x^7+x^6+x^5+x^4+x+1),")
  end
  println(io, "];")
  println(io, "end")
  return String(take!(io))
end
str=julia_let_code(10_000) ; @time include_string(Main, str);
julia> str=julia_let_code(10_000) ; @time include_string(Main, str);
  2.897644 seconds (800.39 k allocations: 46.300 MiB, 0.66% gc time, 0.41% compilation time)

@cchderrick thank you for experimenting with this. my guess would be that the function is fully compiled and optimized (which is a waste of time here), while let perhaps is closer to just evaluating the list expression in my original example?

Note that my original timings were like this:

julia> bad=badcode(10000) ; @time include_string(Main,bad);
  2.950018 seconds (708.01 k allocations: 42.467 MiB, 0.13% gc time)

julia> bad=badcode(20000) ; @time include_string(Main,bad);
  5.690603 seconds (1.42 M allocations: 84.963 MiB, 0.54% gc time)

For comparison, here is what I get with your code on my computer (still with Julia 1.6.4 like the original measurements), and it seems quite similar

julia> str=julia_let_code(10_000) ; @time include_string(Main, str);
  2.848505 seconds (758.04 k allocations: 43.652 MiB, 1.09% gc time)

julia> str=julia_let_code(20_000) ; @time include_string(Main, str);
  5.995064 seconds (1.52 M allocations: 87.330 MiB, 2.41% gc time)

You’re basically showing that the (outer) tuple is slow (I don’t think they are meant to be very large). If I only change it to an array, I get very different timings (better, still slow):

julia> function julia_code(N::Int)
           io = IOBuffer()
           println(io, "function get_it(x)")
           print(io, "[")
           for i in 1:N
             print(io, "($i, x^7+x^6+x^5+x^4+x+1),")
           end
           print(io, "]")
           println(io, "end")
           return String(take!(io))
       end

The timings (for get_it) are still superlinear:

julia> str=julia_code(1000) ; @time include_string(Main, str);
  0.488692 seconds (66.11 k allocations: 3.558 MiB)

julia> @time include_string(Main, "get_it(2)");
  3.671691 seconds (3.43 M allocations: 126.870 MiB, 2.41% gc time, 99.99% compilation time)

julia> str=julia_code(10000) ; @time include_string(Main, str);
  4.960584 seconds (678.12 k allocations: 36.435 MiB, 2.10% gc time)

julia> @time include_string(Main, "get_it(2)");
224.110427 seconds (34.08 M allocations: 1.219 GiB, 0.72% gc time, 100.00% compilation time)

but this time, finally some response to lower opt level, with:

$ ~/julia-1.7.0/bin/julia -O0

julia> str=julia_code(10000) ; @time include_string(Main, str);
  4.708956 seconds (678.12 k allocations: 36.435 MiB, 0.83% gc time)

julia> @time include_string(Main, "get_it(2)");
 14.092745 seconds (34.08 M allocations: 1.219 GiB, 12.48% gc time, 99.99% compilation time)

14x slower (for 10x increase in length), but not 61x times slower, as with the default (implying -O2)

At least the parse time is linear.

If the INNER tuples get changed to arrays too, then this gets really slow, even for smaller n:

julia> str=julia_code(100) ; @time include_string(Main, str);
  0.067056 seconds (7.62 k allocations: 458.243 KiB, 8.64% compilation time)

julia> @time include_string(Main, "get_it(2)");
  0.585371 seconds (1.09 M allocations: 51.450 MiB, 1.59% gc time, 99.93% compilation time)

for:
$ ~/julia-1.7.0/bin/julia -O0

julia> str=julia_code(100) ; @time include_string(Main, str);
  0.057833 seconds (7.62 k allocations: 458.243 KiB, 3.16% compilation time)

julia> @time include_string(Main, "get_it(2)");
  0.382963 seconds (1.09 M allocations: 51.450 MiB, 99.89% compilation time)

julia> str=julia_code(1000) ; @time include_string(Main, str);
  0.500095 seconds (66.11 k allocations: 3.560 MiB)

julia> @time include_string(Main, "get_it(2)");
 45.751689 seconds (12.06 M allocations: 558.189 MiB, 0.75% gc time, 100.00% compilation time)