SciML Style: A Style Guide for Stylish Julia Programmers

ChrisRackauckas · May 25, 2022, 7:05pm

Hey everyone,
To address some coding uniformity and correctness issues, we have implemented a SciML Style Guide. Details in the GitHub repo below:

The SciML Style emphasizes the points which are essentially required for robust generic coding. This is the style that SciML has been developed in so that it can support everything from Quaternion numbers on GPUs to reverse-mode automatic differentiation with arbitrary precision numbers. We lay out many general principles, along with the reasons for following them, before going into finer details such as the preferred conventions for spacing and comments.

@yingboma has added support for the SciML Style into the JuliaFormatter:

This requires the now released JuliaFormatter.jl v1.0. You can see it in action over at the ModelingToolkit repo, for example:

We will soon be going through all of our code bases and running the formatter to ensure everything reformats into this standardized style, along with having our CI bots enforce this style on all pull requests.

The style guide is open, so please make issues and PRs if you disagree with any of the bits. I hope that this style guides people into a direction with more uniform and maintainable code while adhering to generic programming principles.

Krastanov · May 28, 2022, 7:06pm

For newbies like me, could you elaborate on the “avoid closures” guideline? In particular, should I think of it as “the julia compiler is not smart enough YET, so unnatural/ugly workarounds like Fix2 are required”, or is it a more fundamental limitation of the language that requires fundamental rethinking of my attitude to closures?

Obviously “unnatural” and “ugly” are somewhat subjective judgement calls, but I think it is fair to say that closures (especially short lambdas) are elegant to write and more natural to use in something like a map or a fold, compared to the less readable / less general / not exported Fix1 and Fix2.

PetrKryslUCSD · May 28, 2022, 9:00pm

I find

map(Base.Fix2(getindex, i), vector_of_vectors)

much harder to read than

map(v -> v[i], vector_of_vectors)

I don’t see where this might lead to type instability.

ChrisRackauckas · May 28, 2022, 10:00pm

github.com/JuliaLang/julia

performance of captured variables in closures

opened 03:19PM - 28 Feb 16 UTC

timholy

performance lowering

``` jl
using Images: realtype
function ifi{T<:Real,K,N}(img::AbstractArray{T,N}…, kern::AbstractArray{K,N}, border::AbstractString, value)
    if border == "circular" && size(img) == size(kern)
        out = real(ifftshift(ifft(fft(img).*fft(kern))))
    elseif border != "inner"
        prepad = [div(size(kern,i)-1, 2) for i = 1:N]
        postpad = [div(size(kern,i), 2) for i = 1:N]
        fullpad = [nextprod([2,3], size(img,i) + prepad[i] + postpad[i]) - size(img, i) - prepad[i] for i = 1:N]
        A = padarray(img, prepad, fullpad, border, convert(T, value))
        krn = zeros(typeof(one(T)*one(K)), size(A))
        indexesK = ntuple(d->[size(krn,d)-prepad[d]+1:size(krn,d);1:size(kern,d)-prepad[d]], N)::NTuple{N,Vector{Int}}
        AF = ifft(fft(A).*fft(krn))
        out = Array(realtype(eltype(AF)), size(img))
    end
    out
end
```

Test:

``` jl
julia> @code_warntype ifi(rand(3,3), rand(3,3), "replicate", 0)
Variables:
  #self#::#ifi
  img::Array{Float64,2}
  kern::Array{Float64,2}
  border::ASCIIString
  value::Int64
  prepad::Box
  ...
```

Now comment out the `indexesK = ...` line (the output of which is not used at all). Suddenly `prepad` is inferred as `Array{Int, 1}`.

These are two of the oldest Julia issues. They are not easy to solve, and they will likely be around for a while. If you avoid closures, your compile times, and in some cases your run times, will be happy. Generally it’s good style to avoid constructs that you know can go wrong, which is why we usually try to keep closures to a minimum. That doesn’t mean don’t use them, but you should probably double check when you do, or put a let block around them.

greatpet · May 28, 2022, 10:33pm

So another workaround is like this:

function f(vector_of_vectors, i)
    map((let i=i; v -> v[i]; end), vector_of_vectors)
end

PetrKryslUCSD · May 28, 2022, 10:45pm

julia> @btime map(v -> v[$i], $vector_of_vectors)                          
  24.975 ns (1 allocation: 80 bytes)                                       
3-element Vector{Int64}:                                                   
 2                                                                         
 4                                                                         
 6                                                                         
                                                                           
julia> @btime map(Base.Fix2(getindex, $i), $vector_of_vectors)             
  25.100 ns (1 allocation: 80 bytes)                                       
3-element Vector{Int64}:                                                   
 2                                                                         
 4                                                                         
 6                                                                         
                                                                           
julia> @btime map((let i=$i; v -> v[$i]; end), $vector_of_vectors)         
  25.100 ns (1 allocation: 80 bytes)                                       
3-element Vector{Int64}:                                                   
 2                                                                         
 4                                                                         
 6

Elrod · May 28, 2022, 11:27pm

For me, I also do not like their semantics.

using Test
let
    x = 4
    f = y -> begin
        x = 3abs(y)  # assigns to the enclosing `x`; no new local is created
        return log(x)
    end
    f(-6)
    @test x == 4  # fails: `x` is now 18
end

I prefer explicit over implicit behavior.
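One way to make that behavior explicit is to declare the inner variable `local`, so the assignment inside the anonymous function cannot reach the enclosing `x`. The sketch below uses hypothetical function names that do not appear in the thread:

```julia
# Hypothetical helper names, for illustration only.

# Without `local`, the assignment inside the closure rebinds the outer `x`.
function outer_x_implicit()
    x = 4
    f = y -> begin
        x = 3abs(y)
        return log(x)
    end
    f(-6)
    return x
end

# With `local`, the closure gets its own `x` and the outer one is untouched.
function outer_x_explicit()
    x = 4
    f = y -> begin
        local x = 3abs(y)
        return log(x)
    end
    f(-6)
    return x
end

outer_x_implicit()  # 18
outer_x_explicit()  # 4
```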

Additionally, to avoid constant debates with @benchmarks and tests over whether something is necessary, there are two approaches:

1. Don’t bring it up until performance problems are noted.
2. Program defensively to avoid potential performance problems in the first place.

The “avoid closures” rule is in part to say “2.” is better than “1.”.

Elrod · May 28, 2022, 11:39pm

@PetrKryslUCSD, thank you for posting benchmarks to show that the example was harmless.
Posting that as an example of their safety is a perfect demonstration of the danger of closures, and why they should be avoided.

I can reproduce them (using a tuple of vectors):

using BenchmarkTools

fix_simple(x, i) = map(Base.Fix2(getindex, i), x)
clos_simple(x, i) = map(v -> v[i], x)
tuple_of_vectors = Tuple(rand(4) for _ in 1:10);
@benchmark fix_simple($tuple_of_vectors, 1)
@benchmark clos_simple($tuple_of_vectors, 1)
function clos(x, i)
    i += 1
    i -= 1
    res = map(v -> v[i], x)
    i += 1
    i -= 1
    res
end
function fix(x, i)
    i += 1
    i -= 1
    res = map(Base.Fix2(getindex, i), x)
    i += 1
    i -= 1
    res
end
@benchmark fix($tuple_of_vectors, 1)
@benchmark clos($tuple_of_vectors, 1)

For the simple example, I get

julia> @benchmark fix_simple($tuple_of_vectors, 1)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.875 ns … 16.417 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.959 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.978 ns ±  0.208 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                         █          ▁                         
  ▂▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▂ ▂
  4.88 ns        Histogram: frequency by time        5.08 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark clos_simple($tuple_of_vectors, 1)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.750 ns … 15.083 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.875 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.899 ns ±  0.222 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                              █         ▁                     
  ▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▂ ▂
  4.75 ns        Histogram: frequency by time           5 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

But as soon as we make it a little more complicated – our function is still unrealistically trivial compared to where such code may actually be used:

julia> @benchmark fix($tuple_of_vectors, 1)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.834 ns … 19.834 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.959 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.977 ns ±  0.212 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                              █▁        ▄                     
  ▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁██▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▂ ▂
  4.83 ns        Histogram: frequency by time        5.08 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark clos($tuple_of_vectors, 1)
BenchmarkTools.Trial: 10000 samples with 229 evaluations.
 Range (min … max):  327.690 ns …   3.281 μs  ┊ GC (min … max): 0.00% … 87.27%
 Time  (median):     331.332 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   337.029 ns ± 104.356 ns  ┊ GC (mean ± σ):  1.08% ±  3.14%

  ▁▃▂▁▁▆█▇▆▄▂            ▁▁   ▂▄▂▂                              ▂
  ███████████▇▆▄▅▁▅▅▆▆▆▇█████▇████████▇▆▅▆▅▇▆▇▇▆▆▄▆▆▆▆▅▅▅▆▅▅▅▅▄ █
  328 ns        Histogram: log(frequency) by time        360 ns <

 Memory estimate: 272 bytes, allocs estimate: 12.

If people test code at all (unlikely), they’ll try microbenchmarks and prove their closure is perfectly safe.
Maybe they’ll even put it in a simple function, and prove it is safe.
Then put it in an actual realistic piece of code, and it tanks performance and no one even knows.
And, as the let example from my previous comment shows, aside from randomly tanking performance depending on where you place your closure, it can also randomly start giving you incorrect answers.

Closures are a plague on Julia.

EDIT:
@greatpet’s solution also works, and would probably be acceptable, and definitely preferred over a closure without the let. The point is defensive programming so code doesn’t accidentally get awful performance or wrong answers, and no one later has to spend hours of their time combing over a huge code base to find where all the allocations, type instabilities, etc come from.
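A defensive check of this kind does not need a benchmark. As a rough sketch, one can ask inference for the return type and test whether it is concrete. `infers_concretely` is a hypothetical helper, and `Base.return_types` is a reflection utility rather than a stable public API, so treat this as an assumption-laden illustration. On Julia versions where the linked issue is still open, `fix` should pass and `clos` should not:

```julia
# The same two functions as in the benchmark above.
function clos(x, i)
    i += 1
    i -= 1
    res = map(v -> v[i], x)
    i += 1
    i -= 1
    res
end
function fix(x, i)
    i += 1
    i -= 1
    res = map(Base.Fix2(getindex, i), x)
    i += 1
    i -= 1
    res
end

# Hypothetical helper: true when inference finds a single concrete return type.
function infers_concretely(f, args...)
    rts = Base.return_types(f, map(typeof, args))
    return length(rts) == 1 && isconcretetype(rts[1])
end

tuple_of_vectors = Tuple(rand(4) for _ in 1:3)
infers_concretely(fix, tuple_of_vectors, 1)   # concrete tuple of Float64
infers_concretely(clos, tuple_of_vectors, 1)  # captured `i` is boxed, so not concrete
```

`Test.@inferred` performs a similar check and throws when the inferred and actual return types differ, which makes it convenient inside a test suite.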

maxkapur · May 29, 2022, 12:31am

The following is given as an example of how to write generic functions:

function f(A,B)
    @. A = A + B
end

I realize it’s not the point under consideration, but since this function is mutating, wouldn’t it be better to name it f!(A, B)?
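For reference, a sketch of the renamed version, following the Julia convention of a trailing `!` on functions that mutate an argument. Returning `A` explicitly is an assumption added here, not something the style guide example does:

```julia
function f!(A, B)
    @. A = A + B   # in-place broadcast; overwrites the contents of A
    return A
end

A = [1.0, 2.0, 3.0]
B = [10, 20, 30]
f!(A, B)
A  # [11.0, 22.0, 33.0]
```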

Krastanov · May 29, 2022, 1:21am

Why!? Why is let making the closure work better!?

aplavin · May 29, 2022, 5:13am

Tangentially relevant: Accessors.jl provides (among other features) a simpler way to write “anonymous functions” that are not actually closures. For example, map(@optic(_[i]), A) and more complex @optic hypot(_[2].abc, i). They are ComposedFunctions of Base.Fix1/2 and custom objects under the hood.
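To give a feel for what such an object looks like, here is a hand-built sketch that uses only Base. It is not the actual expansion of `@optic`, and the names `pick` and `abs_pick` are made up for illustration:

```julia
i = 2
pick = Base.Fix2(getindex, i)   # behaves like v -> v[i], but is a plain struct
abs_pick = abs ∘ pick           # a Base.ComposedFunction wrapping `abs` and `pick`

vs = [[1, -2], [3, -4]]
map(pick, vs)       # [-2, -4]
map(abs_pick, vs)   # [2, 4]
```

Because `Fix2` stores the value of `i` in a field at construction time, later reassignment of `i` cannot affect `pick`.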

helgee · May 29, 2022, 7:40am

See here. The TL;DR (and IIUC) is that let introduces new variable bindings which cannot outlive the scope of the let block, thus making it “easier” for type inference to determine their types because they can only change within that block.
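The effect can be seen by inspecting the closure objects themselves. In this sketch (hypothetical helper names, not from the thread), a closure is treated as what it is after lowering: a struct whose fields are the captured variables. On Julia versions where the linked issue is still open, the variable that is assigned more than once ends up in a `Core.Box`, while the fresh binding from `let` is stored with its concrete type:

```julia
# Hypothetical helpers, for illustration only.

# `i` is assigned twice and captured, so lowering stores it in a `Core.Box`.
function make_plain(n)
    i = n
    i += 1
    return v -> v[i]
end

# The `let` block creates a fresh `i` that is assigned exactly once.
function make_let(n)
    i = n
    i += 1
    return let i = i
        v -> v[i]
    end
end

plain = make_plain(1)
fresh = make_let(1)

# Field types of the two closure structs.
fieldtypes(typeof(plain))  # contains Core.Box
fieldtypes(typeof(fresh))  # (Int64,) on a 64-bit machine
```

Both closures return the same values when called; only the storage of the captured variable differs.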

DrChainsaw · May 31, 2022, 12:45pm

So I guess it is determined then that closures really are a poor man’s objects and not the other way around.

What are people’s favourite strategies to avoid them? I find that it is easy to end up with the code littered with silly structs (e.g. variants of Base.Fix2) which are only useful in a single context. Sometimes I feel the desire to stick them in a submodule just to avoid that they are listed when auto completing.

I suppose the ideal is to just write code which does not make use of indirection at all, or at least not in such a way so that it becomes tempting to close over stuff, but I often find it hard to do so in practice without significantly reducing flexibility.

ChrisRackauckas · May 31, 2022, 12:47pm

Naming your callable structs isn’t a bad thing.

github.com

SciML/SciMLBase.jl/blob/master/src/function_wrappers.jl

mutable struct TimeGradientWrapper{iip, fType, uType, P} <: AbstractSciMLFunction{iip}
    f::fType
    uprev::uType
    p::P
end

function TimeGradientWrapper{iip}(f::F, uprev, p) where {F, iip}
    return TimeGradientWrapper{iip, F, typeof(uprev), typeof(p)}(f, uprev, p)
end
function TimeGradientWrapper(f::F, uprev, p) where {F}
    return TimeGradientWrapper{isinplace(f, 4)}(f, uprev, p)
end

(ff::TimeGradientWrapper{true})(t) = (du2 = similar(ff.uprev); ff.f(du2, ff.uprev, ff.p, t); du2)
(ff::TimeGradientWrapper{true})(du2, t) = ff.f(du2, ff.uprev, ff.p, t)

(ff::TimeGradientWrapper{false})(t) = ff.f(ff.uprev, ff.p, t)

mutable struct UJacobianWrapper{iip, fType, tType, P} <: AbstractSciMLFunction{iip}
    f::fType

This file has been truncated.
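The same pattern in miniature, applied to the `v -> v[i]` example from earlier in the thread. `IndexAt` is a hypothetical type invented for this sketch and is not part of SciMLBase:

```julia
# Hypothetical example type; not part of SciMLBase.
struct IndexAt{I}
    i::I
end

# Making the struct callable gives it the same role as `v -> v[i]`.
(f::IndexAt)(v) = v[f.i]

vector_of_vectors = [[1, 2], [3, 4], [5, 6]]
map(IndexAt(2), vector_of_vectors)  # [2, 4, 6]
```

The captured value lives in a typed field, so there is nothing for lowering to box, and the type name shows up in stack traces.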

DrChainsaw · May 31, 2022, 12:55pm

I tend to do this a lot, mainly for making any stacktraces a little bit easier on the eyes, but I sometimes get a bit worried that someone will call me out for being the OO-programmer-in-rehabilitation that I am.
