Why is functional style so much slower here?

glebm · December 15, 2023, 11:40am

I wrote my first program in Julia today, which is a solution to Advent of Code day 15.

I wrote it in a functional style at first and was very surprised with how slow it ran.

I narrowed it down to the score functions. Here are 3 versions of it:

# Using .|> and |>
score_pipe((boxnum, box)::Tuple{Int, Vector{Lens}})::Int =
  box |> enumerate .|>
  (((slot, lens)::Tuple{Int, Lens},) -> boxnum * slot * lens.focal)
  |> sum

# Using sum(f, iter)
score_sum_fn((boxnum, box)::Tuple{Int, Vector{Lens}})::Int =
  sum(enumerate(box); init=0) do (slot, lens)::Tuple{Int, Lens}
    boxnum * slot * lens.focal
  end

# Using a for loop
function score_loop((boxnum, box)::Tuple{Int, Vector{Lens}})::Int
  result = 0
  for (slot, lens) in enumerate(box)
    result += boxnum * slot * lens.focal
  end
  result
end

@time sum(Iterators.map(score, enumerate(boxes))) |> println

1.   score_pipe: 0.108080 seconds (237.70 k allocations: 16.374 MiB, 99.81% compilation time)
2. score_sum_fn: 0.082646 seconds (178.00 k allocations: 12.024 MiB, 99.75% compilation time)
3.   score_loop: 0.068665 seconds (112.36 k allocations: 7.630 MiB, 99.73% compilation time)

Why is there such a huge difference between 3 and 1? Is it bad practice to write functional code in Julia? That would be unfortunate.
Why do any of these do allocations at all? They’re simply iterating over a vector and accumulating everything into a single number, there shouldn’t be any heap allocations.

Full code: https://replit.com/@glebm1/Julia-AoC-2023-Day-15#main.jl

Eben60 · December 15, 2023, 12:09pm

For the first question: Most of the time you measure is compilation time anyway.

For benchmarking, use BenchmarkTools.jl

glebm · December 15, 2023, 12:14pm

Oh, do allocations also include allocations for compilation?

Sukera · December 15, 2023, 12:19pm

glebm:

1.   score_pipe: 0.108080 seconds (237.70 k allocations: 16.374 MiB, 99.81% compilation time)
2. score_sum_fn: 0.082646 seconds (178.00 k allocations: 12.024 MiB, 99.75% compilation time)
3.   score_loop: 0.068665 seconds (112.36 k allocations: 7.630 MiB, 99.73% compilation time)

Note how more than 99% of your timing is compilation time. All of your codes are pretty much equal when accounted for that:

julia> 0.108080 * (1.0 - .9981)
0.00020535200000000136

julia> 0.082646 * (1.0 - .9975)
0.00020661499999999558

julia> 0.068665 * (1.0 - .9973)
0.00018539550000000246

@time is a very naive timing mechanism. It includes everything, no matter whether its compilation or GC work that just happened to occur during its invocation. It also only measures once, so is subject to noise on tight measurements.

barucden · December 15, 2023, 12:37pm

Also, someone correct me if I am wrong, but the type annotations in your code are completely unnecessary here. They can be useful to restrict valid input types of the function arguments but the other ones are just distraction.

mkitti · December 17, 2023, 2:55am

Technically they are acting as type assertions for the return type as well. That’s one way to ensure type stability.

ufechner7 · December 17, 2023, 7:02am

Running @time twice gets rid of the compilation time and is often good enough.

lrnv · December 22, 2023, 5:57pm

Well no it does not ensure type stability, but only “hide” type instability behind a barrier. They are still there.

Elrod · December 22, 2023, 6:19pm

Defined distinct notions of type stability and type groundedness.

Type stability (return type inserted) is enough to prevent propagation. Propagating instabilities can of course be quite bad.

The functions that take up most of your runtime being grounded is enough for instabilities to not be a major runtime performance concern (but they can still negatively impact compile times!). E.g., a function barrier is grounded, but the code calling it isn’t.

Topic		Replies	Views
Compare julia sum to a cpp implementation - julia is extremely slow?! Performance question	35	1759	October 7, 2019
Function array performance Performance	5	416	April 2, 2022
Why using `+=` instead of `=` generates slow code? Performance	5	627	May 6, 2021
How to improve performance of sum() Performance question , fast-math , tullio , loopvectorization , sum	19	4735	July 19, 2021
Significant difference in execution speed between similar functions with nested loops Performance question	4	235	March 8, 2023

Why is functional style so much slower here?

Related topics