Cost model of weak references

A long time ago, on another system (SBCL), I was bitten by weak references having a quadratic O(n²) GC cost. I'm curious about Julia: is there any admonition against using WeakRef and WeakKeyDict a lot?

The sweep-phase C code looks like it makes an extra pass over all weak refs? Or maybe only over the weak references from that generation?
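For context, this is roughly the kind of heavy usage I have in mind. It's only an illustrative sketch (the `Node` type and the cache are made up, not my real code): one WeakRef per tracked object, plus a WeakKeyDict whose entries should go away once their keys are collected.

```julia
mutable struct Node
    id::Int
end

cache   = WeakKeyDict{Node, String}()  # keys are held weakly by the dict
tracked = WeakRef[]                    # one weak handle per node

nodes = [Node(i) for i in 1:100_000]
for node in nodes
    cache[node] = "metadata for node $(node.id)"
    push!(tracked, WeakRef(node))
end

@show length(cache)                    # 100000 while the nodes are still reachable
nodes = nothing                        # drop the only strong references
GC.gc(); GC.gc()                       # give the keys a chance to be collected/finalized
@show length(cache)                    # most entries should now be gone
@show count(r -> r.value === nothing, tracked)  # most WeakRefs likely read back as nothing
```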


WeakRefs do have some cost since, as you note, the GC has to iterate over all of them, so it's good to avoid using huge numbers of them. The cost is not necessarily O(n²), though; as far as I can tell it is O(n·m), where n is the number of WeakRefs and m is the number of collections, so it's workload-dependent. If the number of WeakRefs is constantly growing, the time could briefly be super-linear, but the GC should eventually increase the collection interval to account for the new objects, making collections less frequent to compensate. Is this the kind of thing you're thinking of?
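To make the workload dependence concrete, here is a rough micro-benchmark sketch (the function name and the sizes are just illustrative, and the absolute timings will vary with Julia version and heap state): if each sweep has to visit every live WeakRef, the time of a forced collection should grow roughly linearly with the number of WeakRefs kept alive, which is the per-collection factor in the O(n·m) picture.

```julia
function gc_time_with_weakrefs(n::Int)
    targets = [Ref(i) for i in 1:n]          # strong refs keep the targets alive
    refs    = [WeakRef(t) for t in targets]  # n live WeakRefs the sweep must visit
    GC.gc()                                  # settle the heap first
    t = @elapsed GC.gc()                     # time one forced full collection
    # Touch both arrays after the timing so they are still live during the GC.
    return t, length(refs), length(targets)
end

for n in (10_000, 100_000, 1_000_000)
    t, _, _ = gc_time_with_weakrefs(n)
    println("n = $n:  forced GC took ≈ $(round(t * 1000; digits = 2)) ms")
end
```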

One fancy feature we haven’t implemented yet is keeping WeakRefs in separate lists based on age, so minor collections only need to sweep young WeakRefs. We have probably not yet come across a workload that really needed this.
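To illustrate the idea, here is a conceptual sketch in Julia (the actual GC is written in C, so this only models the bookkeeping, not the real implementation): weak references are kept in per-age lists, a minor collection walks only the young list, and survivors are promoted so that full collections still see everything.

```julia
struct AgeSegregatedWeakRefs{T}
    young::Vector{T}
    old::Vector{T}
end

AgeSegregatedWeakRefs{T}() where {T} = AgeSegregatedWeakRefs{T}(T[], T[])

# New weak refs always start out young.
register!(lists::AgeSegregatedWeakRefs, ref) = (push!(lists.young, ref); lists)

# Minor collection: visit only the young list; survivors are promoted to
# `old`, so long-lived weak refs stop being scanned on every minor GC.
function sweep_minor!(lists::AgeSegregatedWeakRefs, alive)
    survivors = filter(alive, lists.young)
    append!(lists.old, survivors)
    empty!(lists.young)
    return length(survivors)             # how many refs were promoted
end

# Full collection: visit both generations.
function sweep_full!(lists::AgeSegregatedWeakRefs, alive)
    promoted = sweep_minor!(lists, alive)
    filter!(alive, lists.old)
    return promoted + length(lists.old)  # refs still alive after the sweep
end
```

Here `alive` stands in for the GC's reachability check; in a toy setting it could be any predicate, e.g. `r -> r.value !== nothing`.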
