This should be documented in performance tips if not a temporary bug


Here is a drastically simplified and self-contained fragment of code dealing with permutation groups. The point of my post is that without the line let s=s, i=i, e=i^s the compiler cannot infer the type of either s, i or e, as @code_warntype shows. I spent literally days on this kind of problem, almost giving up on Julia along the way, before I stumbled on a similar problem/solution in a post on this (very useful) forum, and discovered that this kind of bug is all over my code and can invariably be fixed in a similar way. So my questions are:

  • What is going on here exactly?
  • Is this a temporary bug which is going to be fixed? (it is on 0.6.3 and on 0.7alpha)
  • If not, why is this not mentioned in the section “Performance tips”?
# a permutation is defined by the list d holding the images of 1:length(d)
struct Perm 

degree(a::Perm)= length(a.d)

# i^p : apply permutation p to integer i
import Base.^
^(n::Int, a::Perm)=if n>degree(a) n else a.d[n] end

import Base.*
function *(b::Perm, a::Perm)
  if degree(b)<=degree(a)
    for i in 1:degree(b) d[i]=a.d[b.d[i]] end
    d=[i^a for i in b.d]

struct PermGroup # Permutation group with generators gens


# for each q in the orbit of point p under g, record
# in d[p] an element of G sending p to q
function orbit_and_representative(G::PermGroup,p)
  while !isempty(new)
    for s in G.gens, i in old
      let s=s, i=i, e=i^s
        get!(d,e) do

@code_warntype orbit_and_representative(G,1)


This is the infamous closure bug:


I am not sure what you mean here. Is it because the variables s,i and e appear in the function argument to get!
that they are “captured in a closure”? If so it seems a dramatic restriction on allowed code in Julia!!


You’re defining an anonymous function:



OK. You did not answer my questions except perhaps the first one. Here are two more:

  • Is there a better way to code this?
  • Is my workaround the best one?


A closure is when you create a function, and it “captures” variables in the scope in which it is defined. The bug is that the compiler has a hard time confirming types are stable when that happens.

Using variables as arguments to functions is fine, but capturing them in functions you define risks the bug.
Note that in Julia 0.6 broadcasts actually define an anonymous function, but I’ve not heard of the bug rearing it’s head in a broadcast. It is definitely a problem, but it doesn’t happen most of the time. (Perhaps someone knows what makes it more likely?) It’s also definitely getting better – a lot of the examples from the issue linked above are now inferred correctly.

And yes, your approach is probably the best when it works, because it’s a lot less cumbersome than something like this:

struct close <: Function
function (c::close)()
    push!(, c.e)
function orbit_and_representative_manual_closure(G::PermGroup,p)
  while !isempty(new)
    for s in G.gens, i in old
        e = i^s
        get!(close(new, e, s, d, i), d,e)

or than creating global constant RefValues or arrays to hold the variables you’re passing in a closure.


So is there some general way one could phrase the “let” hint to compiler? Even if it is most often not necessary?


AFAIK using let is the simplest option at the moment.


Before giving up on the language, it may be better to ask for help first. You can always give up on Julia later :wink:

Hopefully, yes. The issue linked by @Elrod has a 1.x milestone, so you may have to wait a bit.


Before giving up on the language, it may be better to ask for help first. You can always give up on Julia later

I thought the problem was with me, that I had not grasped something in the language. This should definitely be
in the section “Performance tips” of the manual: I can tell you it is not easy to find the origin of the problem and the solution when you are a new user of the language, with the resources available in the manual.

I took some time to make a simplified example and was going to post for help when I stumbled on a solution. I
still posted, hoping my plight will help others.


It’s a pretty subtle problem to explain to the average user. Maybe someone should compile a list of “Performance gotchas for advanced julia usage”. There’s a lot of them, but that’s also true in the other high-level high-performance languages I’ve looked at. You have to know and understand your compiler’s mental-blocks.


I am not sure I agree with this; this is not something inherent to Julia’s design, but a bug that should be fixed in due time. There are about 150 open bugs with the performance label at the moment, they come and go.


I am not sure I agree with this; this is not something inherent to Julia’s design, but a bug that should be fixed in due time.

Technically other gotchas are also “performance bugs” (eg globals, whose performance could be optimized). It makes sense to document the current state of the language, not the ideal implementation allowed by the design (kind of how release notes usually include a “known bugs” section). This issue in particular is very easy to run into and non-trivial to figure out.


I know that this is probably not the answer You seek, but does this work for You?

using AbstractAlgebra

G = PermGroup(3)

function orbit_and_representative(G::Generic.PermGroup, p)
  new = [p]
  orbit = Dict(p=>G())
  for g in elements(G)
    if !(g[p] in keys(orbit))
      orbit[g[p]] = g
  return orbit
orbit_and_representative(G, 1)

Of course you could break the loop when orbit is complete


I am porting thousands of lines of GAP code, including a much more complete implementation of permutations groups, as part of a much larger port, so I did not even look at other Julia implementations.
As a detail, in your implementation is x in keys(d) implemented efficiently for a Dict? I used get! since
it seems to me that it minimizes the number of hashes/accesses. Also, you assume elements(G) is known,
while I am porting basic code like Schreier-Sims base and strong generating set, a preliminary to construct


I implemented all of this in AbstractAlgebra, would be great if You contribute Schreier-Sims to this project!

EDIT: answering Your question about accessing the keys: I didn’t look into this, but this seems a premature optimisation.



Yes, I agree.
A “known performance issues” section, with the best workarounds for each case, if possible, would be great.


If it’s decided to compile such a chapter in the manual, it might be useful to point people to the same bug if their multithreaded loop does not perform as expected. Happened to me a couple of times that I had to unmacro the macro to get to at least some kind of type stability. Maybe that’s fixed though, I didn’t really follow the development


The advanced pitfalls chapter should also contain a link to the function specialization heuristics and workaround, with the warning that @code_warntype up to @code_native show the fully realized code instead of the really existing code.

(solution is some_fun(F::FT, x) where F<:Function instead of some_fun(F::Function, x) if you want any kind of type-stability and F does not appear in head-position of the AST; this is properly documented in the developer-section, but imho deserves a cross-ref because it is initially quite surprising to see all the allocations that are not present in @code_native)