SciML Style: A Style Guide for Stylish Julia Programmers

Hey everyone,
To address some coding uniformity and correctness issues, we have implemented a SciML Style Guide. Details in the GitHub repo below:

The SciML Style emphasizes the points which are essentially required for robust generic coding. This is the style that SciML has been developed in so that it can support everything from Quaternion numbers on GPUs to reverse-mode automatic differentiation with arbitrary precision numbers. We lay out many general principles, along with the reasons for following these principles, before finally going into more detailed bits about things like the preferred way to do spacing and comments.

@yingboma has added support for the SciML Style into the JuliaFormatter:

This requires the now released JuliaFormatter.jl v1.0. You can see it in action over at the ModelingToolkit repo, for example:

We will soon be going through all of our code bases and running the formatter to ensure everything reformats into this standardized style, along with having our CI bots enforce this style on all pull requests.

The style guide is open, so please make issues and PRs if you disagree with any of the bits. I hope that this style guides people into a direction with more uniform and maintainable code while adhering to generic programming principles.

36 Likes

For newbies like me, could you elaborate on the “avoid closures” guideline? In particular, should I think of it as “the Julia compiler is not smart enough YET, so unnatural/ugly workarounds like Fix2 are required”, or is it a more fundamental limitation of the language that requires fundamentally rethinking my attitude to closures?

Obviously “unnatural” and “ugly” are somewhat subjective judgement calls, but I think it is fair to say that closures (especially short lambdas) are elegant to write and more natural to use in something like a map or a fold, compared to the less readable / less general / not exported Fix1 and Fix2.

6 Likes

I find

map(Base.Fix2(getindex, i), vector_of_vectors)

much harder to read than

map(v -> v[i], vector_of_vectors)

I don’t see where this might lead to type instability.
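
For reference, a minimal sketch of what the two spellings compute, using `Base.Fix1` and `Base.Fix2` from Julia's Base. The data and the index `i` are made up for illustration; this only shows that the results agree, not how either form performs.

```julia
# Base.Fix2(f, x) is a callable struct that behaves like y -> f(y, x);
# Base.Fix1(f, x) behaves like y -> f(x, y).
vector_of_vectors = [[1, 2], [3, 4], [5, 6]]  # illustrative data
i = 2

via_fix2    = map(Base.Fix2(getindex, i), vector_of_vectors)
via_closure = map(v -> v[i], vector_of_vectors)

# Fix1 fixes the first argument instead: here it indexes one vector at each position.
via_fix1 = map(Base.Fix1(getindex, vector_of_vectors[1]), 1:2)
```

Both `via_fix2` and `via_closure` pick the second element of each inner vector; the difference between them is in how the captured `i` is stored, not in the result.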

8 Likes

These are two of the oldest Julia issues. They are not easy to solve, and they will likely be around for a while. If you avoid closures, your compile times, and in some cases your runtime, will be happier. In general it’s good style to avoid constructs that you know can hit these problems, which is why we usually try to keep closures to a minimum. That doesn’t mean never use them, but you should probably double-check when you do, or put a let block around the closure.

1 Like

So another workaround is like this:

function f(vector_of_vectors, i)
    map((let i=i; v -> v[i]; end), vector_of_vectors)
end
julia> @btime map(v -> v[$i], $vector_of_vectors)                          
  24.975 ns (1 allocation: 80 bytes)                                       
3-element Vector{Int64}:                                                   
 2                                                                         
 4                                                                         
 6                                                                         
                                                                           
julia> @btime map(Base.Fix2(getindex, $i), $vector_of_vectors)             
  25.100 ns (1 allocation: 80 bytes)                                       
3-element Vector{Int64}:                                                   
 2                                                                         
 4                                                                         
 6                                                                         
                                                                           
julia> @btime map((let i=$i; v -> v[$i]; end), $vector_of_vectors)         
  25.100 ns (1 allocation: 80 bytes)                                       
3-element Vector{Int64}:                                                   
 2                                                                         
 4                                                                         
 6                                                                         

For me, I also do not like their semantics.

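using Test

let
    x = 4
    f = y -> begin
        x = 3abs(y)  # assigns to the enclosing `x` rather than creating a new local
        return log(x)
    end
    f(-6)
    @test x == 4  # fails: calling `f` changed `x` to 18
end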

I prefer explicit over implicit behavior.
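
A small sketch of the scoping behavior in question, together with one way to make the intent explicit by declaring the variable `local` inside the closure. The function names `leaky` and `contained` are hypothetical and exist only for this illustration.

```julia
# The closure assigns to `x`, which already exists in the enclosing scope,
# so the assignment rebinds the outer variable.
function leaky()
    x = 4
    f = y -> begin
        x = 3abs(y)
        return log(x)
    end
    f(-6)
    return x
end

# Declaring `x` as `local` inside the closure gives it its own variable,
# so the outer `x` is left untouched.
function contained()
    x = 4
    f = y -> begin
        local x = 3abs(y)
        return log(x)
    end
    f(-6)
    return x
end

outer_after_leaky = leaky()
outer_after_contained = contained()
```

Here `leaky()` returns 18 while `contained()` returns 4, since only the first closure writes to the outer `x`.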

Additionally, to avoid constant debates with @benchmarks and tests over whether something is necessary, there are two approaches:

  1. Don’t bring it up until performance problems are noted.
  2. Program defensively to avoid potential performance problems in the first place.

The “avoid closures” rule is in part to say “2.” is better than “1.”.

4 Likes

@PetrKryslUCSD, thank you for posting benchmarks to show that the example was harmless.
Posting that as an example of their safety is a perfect demonstration of the danger of closures, and why they should be avoided.

I can reproduce them (using a tuple of vectors):

using BenchmarkTools

fix_simple(x, i) = map(Base.Fix2(getindex, i), x)
clos_simple(x, i) = map(v -> v[i], x)
tuple_of_vectors = Tuple(rand(4) for _ in 1:10);
@benchmark fix_simple($tuple_of_vectors, 1)
@benchmark clos_simple($tuple_of_vectors, 1)
function clos(x, i)
    i += 1
    i -= 1
    res = map(v -> v[i], x)
    i += 1
    i -= 1
    res
end
function fix(x, i)
    i += 1
    i -= 1
    res = map(Base.Fix2(getindex, i), x)
    i += 1
    i -= 1
    res
end
@benchmark fix($tuple_of_vectors, 1)
@benchmark clos($tuple_of_vectors, 1)

For the simple example, I get

julia> @benchmark fix_simple($tuple_of_vectors, 1)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.875 ns … 16.417 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.959 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.978 ns ±  0.208 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                         █          ▁
  ▂▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▂ ▂
  4.88 ns        Histogram: frequency by time        5.08 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark clos_simple($tuple_of_vectors, 1)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.750 ns … 15.083 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.875 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.899 ns ±  0.222 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                              █         ▁
  ▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▂ ▂
  4.75 ns        Histogram: frequency by time           5 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

But as soon as we make it a little more complicated (our function is still unrealistically trivial compared to where such code may actually be used), the closure version falls apart:

julia> @benchmark fix($tuple_of_vectors, 1)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.834 ns … 19.834 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.959 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.977 ns ±  0.212 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                              █▁        ▄
  ▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁██▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▂ ▂
  4.83 ns        Histogram: frequency by time        5.08 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark clos($tuple_of_vectors, 1)
BenchmarkTools.Trial: 10000 samples with 229 evaluations.
 Range (min … max):  327.690 ns …   3.281 μs  ┊ GC (min … max): 0.00% … 87.27%
 Time  (median):     331.332 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   337.029 ns ± 104.356 ns  ┊ GC (mean ± σ):  1.08% ±  3.14%

  ▁▃▂▁▁▆█▇▆▄▂            ▁▁   ▂▄▂▂                              ▂
  ███████████▇▆▄▅▁▅▅▆▆▆▇█████▇████████▇▆▅▆▅▇▆▇▇▆▆▄▆▆▆▆▅▅▅▆▅▅▅▅▄ █
  328 ns        Histogram: log(frequency) by time        360 ns <

 Memory estimate: 272 bytes, allocs estimate: 12.

If people test code at all (unlikely), they’ll try microbenchmarks and prove their closure is perfectly safe.
Maybe they’ll even put it in a simple function, and prove it is safe.
Then put it in an actual realistic piece of code, and it tanks performance and no one even knows.
And, as the let example from my previous comment shows, aside from randomly tanking performance depending on where you place your closure, it can also randomly start giving you incorrect answers.

Closures are a plague on Julia.

EDIT:
@greatpet’s solution also works, and would probably be acceptable, and is definitely preferred over a closure without the let. The point is defensive programming, so that code doesn’t accidentally get awful performance or wrong answers, and no one later has to spend hours combing through a huge code base to find where all the allocations, type instabilities, etc. come from.

6 Likes

The following is given as an example of how to write generic functions:

function f(A,B)
    @. A = A + B
end

I realize it’s not the point under consideration, but since this function is mutating, wouldn’t it be better to name it f!(A, B)?
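
A minimal sketch of the naming convention being suggested, keeping the generic broadcast from the example. The name `add!` is hypothetical; the trailing `!` follows the common Julia convention for functions that mutate an argument.

```julia
# Hypothetical name `add!`: the `!` signals that the first argument is mutated.
function add!(A, B)
    @. A = A + B  # in-place broadcast, equivalent to A .= A .+ B
    return A      # returning the mutated argument is a common convention
end

A = [1.0, 2.0, 3.0]
B = 1:3           # a range works as well as an array, since only broadcasting is assumed
add!(A, B)
```

After the call, `A` holds the elementwise sums and no new array has been bound to it.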

8 Likes

Why!? Why is let making the closure work better!?

Tangentially relevant: Accessors.jl provides (among other features) a simpler way to write “anonymous functions” that are not actually closures. For example, map(@optic(_[i]), A) and the more complex @optic hypot(_[2].abc, i). They are ComposedFunctions of Base.Fix1/2 and custom objects under the hood.

3 Likes

See here. The TL;DR (and IIUC) is that let introduces new variable bindings which cannot outlive the scope of the let block, thus making it “easier” for type inference to determine their types because they can only change within that block.
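
A small sketch of that effect, modeled on the captured-variable example in the Julia manual's performance tips. The function names `scale_boxed` and `scale_let` are hypothetical, and how a closure stores its captures is an implementation detail that may differ between Julia versions.

```julia
# `r` is reassigned in the enclosing function and captured by the closure,
# so the closure has to hold it in a box whose contents are untyped.
function scale_boxed(r::Int)
    if r < 0
        r = -r
    end
    return x -> x * r
end

# `let r = r` creates a fresh binding that is assigned exactly once,
# so the closure can store the value as a concretely typed field.
function scale_let(r::Int)
    if r < 0
        r = -r
    end
    return let r = r
        x -> x * r
    end
end

f_boxed = scale_boxed(-3)
f_let   = scale_let(-3)

# Closures are structs; their field types show how each capture is stored.
boxed_fields = fieldtypes(typeof(f_boxed))
let_fields   = fieldtypes(typeof(f_let))
```

Both closures return the same values when called. On current Julia releases, `boxed_fields` is expected to contain `Core.Box` while `let_fields` contains `Int`, which is the difference type inference cares about.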

6 Likes

So I guess it is determined then that closures really are a poor man’s objects and not the other way around :stuck_out_tongue:

What are people’s favourite strategies to avoid them? I find that it is easy to end up with code littered with silly structs (e.g. variants of Base.Fix2) which are only useful in a single context. Sometimes I feel the desire to stick them in a submodule just to keep them from being listed when autocompleting.

I suppose the ideal is to just write code which does not make use of indirection at all, or at least not in such a way so that it becomes tempting to close over stuff, but I often find it hard to do so in practice without significantly reducing flexibility.

Naming your callable structs isn’t a bad thing.
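
As a rough sketch of what a named callable struct can look like in place of the closure `v -> v[i]`: the type name `GetIndexAt` is hypothetical and the data is made up for illustration.

```julia
# A named callable struct: the captured value is an ordinary, concretely typed field.
struct GetIndexAt{I}
    i::I
end

# Making instances callable gives them the same call syntax as a closure.
(g::GetIndexAt)(v) = v[g.i]

vector_of_vectors = [[1, 2], [3, 4], [5, 6]]  # illustrative data
picked = map(GetIndexAt(2), vector_of_vectors)
```

Because the struct has a name, it shows up as `GetIndexAt{Int64}` in stack traces and method signatures instead of an auto-generated closure name.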

1 Like

I tend to do this a lot, mainly to make stack traces a little bit easier on the eyes, but I sometimes get a bit worried that someone will call me out for being the OO-programmer-in-rehabilitation that I am :slight_smile:

1 Like