Constant propagation vs generated functions

Maybe this is some positive news for this thread:

Expr patterns in MLStyle.jl used to be >10x faster than those in Match.jl, but now, without any update, Expr patterns in Match.jl are as fast as those in MLStyle.jl.

This is the benchmark on 1.72:

[Image: vs-match on 1.72]

[Image: vs-match on 1.3 or lower]

Julia is getting smarter. Code is more optimized while latency is reduced.

It’s hard for me to believe that constant propagation can replace generated functions, but related evidence here and there makes me optimistic about the future.
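To make the comparison concrete, here is a minimal sketch of the same computation written both ways. The function names (power_cp, power_gen, and so on) are made up for illustration and do not come from MLStyle.jl or Match.jl. Whether the compiler actually folds the first version depends on the Julia version and its inference heuristics.

```julia
# Constant propagation: the literal 3 is visible at the call site inside
# `cube_cp`, so the compiler may specialise and fold the recursion.
# Nothing here forces it to do so.
power_cp(x, n::Int) = n == 0 ? one(x) : x * power_cp(x, n - 1)
cube_cp(x) = power_cp(x, 3)

# Generated function: the exponent is moved into the type domain with Val,
# and the unrolled product is built explicitly as an expression.
@generated function power_gen(x, ::Val{N}) where {N}
    ex = :(one(x))
    for _ in 1:N
        ex = :($ex * x)   # grows to one(x) * x * x * ... (N factors of x)
    end
    return ex
end
cube_gen(x) = power_gen(x, Val(3))

@assert cube_cp(2.0) == 8.0
@assert cube_gen(2.0) == 8.0
```

Comparing `@code_typed cube_cp(2.0)` with `@code_typed cube_gen(2.0)` on a given Julia version is one way to see how far constant propagation gets on its own.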


There are a lot of different concepts here that overlap but are not synonymous in their use cases, so I’ll try to respond as best I can, but I might need a nudge if anyone is curious about something besides what I jot down here.

The problem with sub-typing in Static.jl is a complex story, and if I had the time I would just give a full write-up. There are some things that should really move to Base because they are a fundamental component in representing concepts that appear in LLVM and in all collections throughout Julia (e.g., things with a size or position known at compile time). This hasn’t happened yet, so changes have been required to keep the things that depend on Static.jl moving. I can answer specific questions if people want, but there are so many related issues and PRs I’ve commented on that I’m not sure what else I could say that active parties haven’t already heard, so I’ll try to keep to the current theme of const prop and generated functions for now.

@tim.holy’s example is a case where we clearly are trying to use constant propagation. I think we’ve done a good job of describing what constant propagation is in documentation and examples. We have not done a good job of describing when we don’t want constant propagation. That same example has the float in the type parameter space at first. But why? We can probably assume it is not intended to be completely inferred through constant propagation, and that exact number is a core part of the type. Even that isn’t enough to require defining a whole new type, though. We probably need to use it frequently enough for it to make sense to do anything more than just an internal __foo(::Val{S}, ::Val{T}) method. @cscherrer has enough knowledge about his field of work to know these float numbers are specific to these types and that they are part of a distinct public interface.
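As a rough sketch of the two options contrasted above (a dedicated type carrying the number versus an internal Val-based method), assuming hypothetical names Scaled, apply, __scale, and halve that are not taken from the example under discussion:

```julia
# Option 1: the float is part of a public type. This makes sense when the
# number is a core part of what the type means and is used often.
struct Scaled{S} end                 # S is a Float64 value, e.g. Scaled{0.5}
scale(::Scaled{S}) where {S} = S
apply(s::Scaled, x) = scale(s) * x

# Option 2: the float only appears in an internal helper via Val.
# No new type is exposed; the constant stays an implementation detail.
__scale(::Val{S}, x) where {S} = S * x
halve(x) = __scale(Val(0.5), x)

@assert apply(Scaled{0.5}(), 4.0) == 2.0
@assert halve(4.0) == 2.0
```

Both versions put the number in the type domain; the difference is whether that fact is part of the interface users see.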

DataFrame construction is a good example where statically known names and constant propagation work against the package’s goals. The developers really don’t want every column name to be known at compile time and to create unique methods. Even if the DataFrame represents some information that is technically known at compile time, we don’t want a new DataFrame method to be generated for every permutation and every column name. There are alternative structures that adhere to a table interface, but at that point you should probably ask yourself whether you are going to spend more time compiling code or running code.
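The trade-off can be illustrated with Base types only. The DynTable struct below is a hypothetical stand-in that keeps column names as runtime values; it is a simplification for illustration, not DataFrames.jl’s actual implementation.

```julia
# Names in the type domain: a NamedTuple of columns. Each distinct set or
# order of names is a distinct type, so any function taking it is compiled
# again for each one.
nt1 = (a = [1, 2], b = [3.0, 4.0])
nt2 = (b = [3.0, 4.0], a = [1, 2])
@assert typeof(nt1) != typeof(nt2)

# Names as runtime values: one concrete type for every table, so methods
# are compiled once, at the cost of type-unstable column access.
struct DynTable
    names::Vector{Symbol}
    columns::Vector{AbstractVector}
end
getcol(t::DynTable, name::Symbol) = t.columns[findfirst(==(name), t.names)]

t1 = DynTable([:a, :b], AbstractVector[[1, 2], [3.0, 4.0]])
t2 = DynTable([:b, :a], AbstractVector[[3.0, 4.0], [1, 2]])
@assert typeof(t1) == typeof(t2)
@assert getcol(t1, :b) == [3.0, 4.0]
```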

Static types and constant propagation can help with @generated, but I doubt they will ever completely replace it. We can probably get around using most @generated functions with the appropriate use of a handful of static types, constant propagation, and probably Base.@assume_effects. That takes a lot of work to do properly, though. Most of my PRs to Static.jl and ArrayInterface.jl in the past 6 months have been aimed at replacing generated functions and reducing invalidations. For each change, I have to check package startup time, invalidations, inference, and benchmarks to ensure none of them suffer. It’s fine to put that sort of work into code that will be used by hundreds of packages, but sometimes you just want to write your code and get some real work done. So use a generated function until you decide that functionality will be needed for a long time and is worth several hours of work.
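As one small illustration of what replacing a generated function can look like, here is a sketch with made-up names (sumsq_gen, sumsq_rec). It is not code from Static.jl or ArrayInterface.jl, and whether the recursive version is unrolled as well as the generated one is exactly the kind of thing that has to be checked per Julia version.

```julia
# Generated version: builds t[1]^2 + t[2]^2 + ... from the tuple length.
@generated function sumsq_gen(t::NTuple{N,Any}) where {N}
    terms = [:(t[$i]^2) for i in 1:N]
    return N == 0 ? :(0) : Expr(:call, :+, terms...)
end

# Non-generated version: recursion over the tuple. The length is part of
# the tuple type, so the compiler can usually unroll this by inlining.
sumsq_rec(::Tuple{}) = 0
sumsq_rec(t::Tuple) = first(t)^2 + sumsq_rec(Base.tail(t))

@assert sumsq_gen((1, 2, 3)) == 14
@assert sumsq_rec((1, 2, 3)) == 14
```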

I think some of the responses concerning improper use of static types are in the same vein as comments that were given for use of Base.@pure. Now we have a more well thought out Base.@assume_effects, so we don’t have to tell people to just stop using @pure. @assume_effects is a similar but more disciplined and clearly defined tool. That’s what we’re doing with Static.jl in a lot of ways. It’s frustrating when we create a bunch of static types and people comment just to say stop it (just like the use of @pure) when we clearly don’t have another tool that does the same thing. So we needed a more refined approach and created a generic approach to static types in Static.jl. It’s not a perfect or fully finished product. Even if it were, the better tool is not always the right one, so there will never be a one size fits all solution.
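For readers who have not used it, a minimal sketch of Base.@assume_effects follows (it requires Julia 1.8 or later). The function name small_factorial is made up for illustration. The annotation is a promise made by the programmer, here only that the loop terminates, and the compiler does not verify it.

```julia
# :terminates_locally asserts that the loop in this method always finishes.
# That is one of the facts the compiler needs before it will consider
# evaluating a call with constant arguments at compile time.
Base.@assume_effects :terminates_locally function small_factorial(x::Int)
    0 <= x < 20 || error("argument out of range")
    res = 1
    while x > 1
        res *= x
        x -= 1
    end
    return res
end

# With a literal argument the call is a candidate for constant folding.
fact5() = small_factorial(5)

@assert fact5() == 120
```

Unlike @pure, each effect is named separately, so only the specific property that actually holds needs to be asserted.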

The original post echoes some of my frustration that a handful of experienced Julia developers still can’t just read my mind and understand why some things are important. Maybe the mind reading tab is in the works for git repos. Until that day, this all sounds like the art of programming in Julia. It takes time and thought to get to the sweet spot when composing your own code. Style guides are good. Tools that make it so we don’t have to think about it are better, but in the end there’s no replacement for experience.


The beauty of @generated is that it guarantees compiler behavior across Julia versions in performance critical code.

The optimiser is not getting better in all cases, and sometimes gets worse (see e.g. multiple compiler optimisation problems in https://github.com/JuliaObjects/Accessors.jl/pull/23). Locking down the most intensive parts of algorithms with a generated function can be a real relief.


Optimal sorting for Tuples up to 25 elements is a start: SortingNetworks.jl
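For anyone unfamiliar with the idea, a sorting network is a fixed sequence of compare-and-swap steps that does not depend on the data, which suits tuples of known length. The hand-written sketch below handles 3 elements; the names cswap and sort3 are made up here and this is not the SortingNetworks.jl API.

```julia
# Compare-and-swap: return the pair in ascending order.
@inline cswap(a, b) = isless(b, a) ? (b, a) : (a, b)

# A 3-element network: compare positions (1,2), then (2,3), then (1,2).
function sort3(t::NTuple{3,Any})
    a, b, c = t
    a, b = cswap(a, b)
    b, c = cswap(b, c)
    a, b = cswap(a, b)
    return (a, b, c)
end

@assert sort3((3, 1, 2)) == (1, 2, 3)
```

Because the comparison sequence is fixed, the whole function is straight-line code with no loops, which is what makes this approach attractive for small tuples.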


As an alternative to the two options, maybe see the work @Keno has been doing on stage 2 of Diffractor: Very WIP: Stage2 revival by Keno · Pull Request #78 · JuliaDiff/Diffractor.jl · GitHub


I think comptime.jl (JuliaCon presentation) is related here.


Oh, that looks cool!
