Plenty of discussion has occurred about approaches to solving the performance of captured variables in closures (Issue #15276, JuliaLang/julia on GitHub), along with examples of it rearing its ugly head, but I haven’t seen a succinct description of when it occurs or of workarounds to fix the affected code.
My immediate concern is whether, and when, the following pattern of code is caught by #15276, in which case I would be teaching students very bad coding practices…
function wrapper_algorithm()
    # Bunch of calculations
    x = #...construct a vector
    nt = (x=x,) # and some named tuples
    f(y) = x + y # function of variable and things in the function.
    g(z) = z + nt.x # function of variable and things in the function.
    # use f and g closure
end
The reason this is such a nice pattern is that in lectures, Jupyter notebooks, etc. we could show just the inside of the function, where x, nt, etc. are globals, but tell people that they should ALWAYS wrap this exact code in a function when performance is critical.
#Bunch of calculations
x = #...construct a vector
nt = (x=x,) #and some named tuples
f(y) = x + y #function of variable and things in the function.
g(z) = z + nt.x #function of variable and things in the function.
#use f and g closure
It is a very clean way to provide sample code, but it is a bad idea if it leads to systemic performance issues.
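To make the pattern concrete, here is a minimal sketch with made-up values (the vector contents and the helper captures_box are my own assumptions, not anything from the issue). It inspects the field types of the closure objects to see whether a captured variable ended up in a Core.Box, which is the internal container behind the slowdown in #15276. Core.Box is an implementation detail, so treat this as a diagnostic trick rather than a supported API.

```julia
# Hypothetical concrete version of the pattern; the values are illustrative only.
function wrapper_algorithm()
    x = [1.0, 2.0, 3.0]      # assigned exactly once, before any closure is defined
    nt = (x = x,)            # named tuple, also assigned exactly once
    f(y) = x + y             # closure capturing x
    g(z) = z + nt.x          # closure capturing nt
    return f, g
end

# Hypothetical helper: true if any variable captured by `fn` is stored in a
# Core.Box (an internal type, used here only for inspection).
captures_box(fn) = any(T -> T === Core.Box, fieldtypes(typeof(fn)))

f, g = wrapper_algorithm()
println(captures_box(f))         # is x boxed inside f?
println(captures_box(g))         # is nt boxed inside g?
println(f([1.0, 1.0, 1.0]))      # the closures still work as usual
```

As far as I understand, variables that are assigned once and never reassigned, as in this sketch, are captured with a concrete type; running @code_warntype on a function that uses f and g internally is another way to check.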
More generally: assuming it will not be fixed in the next 6 months or so, my question is how we should be teaching introductory users to organize code to avoid the bug. If there is a description of when it does and does not occur, I haven’t seen it, and I really have no sense of when it isn’t an issue. A few points:
- https://docs.julialang.org/en/latest/manual/performance-tips/#man-performance-captured-1 does a good job of explaining why this is a tough problem for compilers, but doesn’t tell me when it is safe or unsafe to use closures. The description in that section also seems to focus on functions that return a closure… does this happen when you just use a closure inside of a function?
- I have also heard people say that if you put the offending arrays (or whatever) in a structure (named tuple as well?) that inference works again…
- Should the advice to beginners right now just be “don’t use closures or comprehensions in performance sensitive code”, for example? If the advice is nuanced, maybe we could collect the details and put it in the performance section of the docs.
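For reference, the case the performance-tips section describes can be sketched as follows. This is modeled on the manual's example of a captured variable that is reassigned, together with its suggested let workaround; the function names scale_boxed and scale_let and the helper captures_box are hypothetical names I made up for illustration.

```julia
# A captured variable that is assigned more than once gets stored in a Core.Box.
function scale_boxed(r::Int)
    if r < 0
        r = -r               # second assignment to a variable the closure captures
    end
    return x -> x * r
end

# Workaround from the performance tips: rebind with `let`, so the closure
# captures a fresh variable that is never reassigned.
function scale_let(r::Int)
    if r < 0
        r = -r
    end
    return let r = r
        x -> x * r
    end
end

# Hypothetical helper: true if any captured variable of `fn` is a Core.Box
# (internal type, inspected here only as a diagnostic).
captures_box(fn) = any(T -> T === Core.Box, fieldtypes(typeof(fn)))

println(captures_box(scale_boxed(-2)))   # reassigned capture
println(captures_box(scale_let(-2)))     # let-rebound capture
println(scale_boxed(-2)(3) == scale_let(-2)(3))
```

Both versions return the same results; the difference is only in whether the compiler can infer the type of r inside the closure. Note that this example uses a closure created inside a function, not only one that is returned, so the same check applies to closures that are just used locally.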