Right, yes.
When you have tens of millions of “reference” variables, basically things that use pointers under the hood, they make the “Mark” step of the mark and sweep GC slow.
Reference variables are things that are not isbits: Vectors, Strings, mutable structs, structs/tuples with abstractly typed fields, etc., as well as structs with fields that are reference types (like SubString referencing a String).
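A quick way to check which side of the line a type falls on is `isbitstype`. The three struct types in this sketch are made up for illustration; the rest are from Base.

```julia
# Hypothetical example types, not from any package.
struct Point            # immutable, every field is a concrete isbits type
    x::Float64
    y::Float64
end

mutable struct Counter  # mutable, so instances live behind a pointer
    n::Int
end

struct Loose            # abstractly typed field, stored as a reference
    x::Real
end

# Print each type next to whether it is isbits (i.e. invisible to the mark step).
for T in (Int, Point, Tuple{Int,Float64}, String, Vector{Int},
          Counter, Loose, SubString{String})
    println(rpad(string(T), 24), isbitstype(T))
end
```

Only the first three should report `true`; everything else is something the GC has to track.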
Basically how a mark and sweep GC works is: whenever GC is triggered it goes through all the references that exist in memory and tries to find a path connecting each one to something that is definitely in scope (a root). If there is such a path it marks the reference, and then a sweep step removes everything that isn’t marked.
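To make that concrete, here is a toy mark and sweep over a fake heap. This is only a sketch of the idea and not how Julia's GC is implemented; the `heap`, `roots`, `mark` and `sweep!` names are all invented for the example.

```julia
# Toy heap: object id => ids of the objects it references.
heap = Dict(
    1 => [2, 3],
    2 => [4],
    3 => Int[],
    4 => Int[],
    5 => [6],   # 5 and 6 point at each other, but nothing in scope reaches them
    6 => [5],
)
roots = [1]     # things that are definitely in scope

# Mark: walk outward from the roots and record everything reachable.
function mark(heap, roots)
    marked = Set{Int}()
    stack = collect(roots)
    while !isempty(stack)
        id = pop!(stack)
        id in marked && continue
        push!(marked, id)
        append!(stack, heap[id])
    end
    return marked
end

# Sweep: drop every object that was not marked.
function sweep!(heap, marked)
    for id in collect(keys(heap))
        id in marked || delete!(heap, id)
    end
    return heap
end

marked = mark(heap, roots)
sweep!(heap, marked)
println(sort!(collect(keys(heap))))   # prints [1, 2, 3, 4]
```

The cost of `mark` grows with the number of reachable objects and references, which is why tens of millions of live references hurt.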
This is good because rather than scaling with the number of times a variable is used (which is how reference counting scales), it scales with how many variables are declared (which is naturally less than how many times they are used, as the declaration is a single use).
But it is bad that it also scales (multiplicatively) with how often a GC is triggered.
So if you have a lot of references in memory for a long time it scales badly.
To combat this Julia uses a generational GC.
Basically rather than 1 pool of references to mark and sweep, it breaks references into two pools (you can keep extending this BTW, Java has 3).
It applies a heuristic that if something has stuck around for a while (2(?) mark and sweep steps in Julia’s case) it is probably going to stick around for a while more.
So everything starts in the Young pool, and then things that stay around for a while graduate to the Old pool.
Since most variables don’t hang around for long, one can free up a lot of memory by only marking and sweeping the young pool.
But if that doesn’t get you what you need you will need to sweep the old pool too – but this should be rare.
So what should happen is that if you allocate a ton of strings, say at the start of your program, they quickly graduate to the Old pool and then rarely impose load on the Mark and Sweep step.
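If you want to poke at this yourself, `GC.gc(false)` asks for an incremental collection (which may only sweep young objects) and `GC.gc(true)` asks for a full one. The sketch below just times both while a lot of long-lived strings are held; the variable names and the string count are arbitrary, and the timings will vary by machine and Julia version, so treat it as an experiment rather than a benchmark.

```julia
# Allocate many long-lived strings; each one is a separate GC-tracked object.
long_lived = [string("row-", i) for i in 1:1_000_000]

GC.gc(true)                       # full collection once, so the strings have survived a GC
t_full  = @elapsed GC.gc(true)    # full: has to consider old objects as well
t_young = @elapsed GC.gc(false)   # incremental: may only look at young objects

println("full collection:        ", round(t_full * 1000; digits = 1), " ms")
println("incremental collection: ", round(t_young * 1000; digits = 1), " ms")
println("still holding ", length(long_lived), " strings")
```

When the generational part is doing its job, the incremental number should be the smaller one.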
However, Julia’s generational GC doesn’t work very well (but as @Oscar_Smith linked there is a PR to hopefully fix it).
The issue #40644 shows that sometimes Julia will effectively sweep both pools every time, making it functionally non-generational.
ShortStrings basically represents a string not with a pointer to a block of memory (which would need to be tracked by the GC) but with an isbits value (like a UInt) that it reinterprets the bits out of. InlineStrings is ShortStrings but implemented better, basically because @quinnj is smarter than me. WeakRefStrings reexports InlineStrings. WeakRefStrings also has a number of other useful string types, however, including the titular WeakRefString.
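To make the ShortStrings idea concrete before getting to WeakRefString, here is a minimal sketch of packing a short string into a single unsigned integer. `TinyStr` and `unpack` are hypothetical names, and the layout (length in the lowest byte, up to 7 bytes of text above it) is just an assumption for the example; it is not the actual layout used by ShortStrings or InlineStrings.

```julia
# Hypothetical toy type: up to 7 bytes of text packed into one UInt64,
# with the byte length stored in the lowest byte.
struct TinyStr
    bits::UInt64
end

function TinyStr(s::String)
    n = ncodeunits(s)
    n <= 7 || throw(ArgumentError("TinyStr holds at most 7 bytes"))
    bits = UInt64(n)
    for (i, b) in enumerate(codeunits(s))
        bits |= UInt64(b) << (8 * i)   # byte i goes into bits 8i .. 8i+7
    end
    return TinyStr(bits)
end

# Reinterpret the bits back into an ordinary String.
function unpack(t::TinyStr)
    n = Int(t.bits & 0xff)
    bytes = UInt8[UInt8((t.bits >> (8 * i)) & 0xff) for i in 1:n]
    return String(bytes)
end

t = TinyStr("julia")
println(isbitstype(TinyStr))   # no pointer inside, so nothing for the GC to mark
println(unpack(t))

# A vector of these stores the values inline, with no per-element reference.
ids = [TinyStr("id$i") for i in 1:1000]
println(unpack(ids[42]))
```

The trade-off is the hard size limit, which is why the real packages offer several fixed-width types.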
The WeakRefString was common in older (versions of) packages but has gone away in favor of safer constructs. However, unlike much of what replaced it, it is isbits, so it didn’t incur load on the GC.
Basically, it was a SubString but it didn’t actually contain a Julia reference to the original string; rather it contained just a raw Ptr to a position in the string to start reading from (+ a length).
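Roughly, the shape of that idea looks like the sketch below. `RawSlice` and `read_slice` are hypothetical names for illustration and this is not the real WeakRefStrings implementation. It also shows why the approach is unsafe: the struct does not keep the parent string alive, so the caller has to (here via `GC.@preserve`).

```julia
# Hypothetical toy type: a raw pointer into someone else's string plus a length.
struct RawSlice
    ptr::Ptr{UInt8}
    len::Int
end

# Read `len` bytes starting at byte index `start` of `text`.
# GC.@preserve keeps `text` alive while the raw pointer is in use;
# RawSlice itself gives the GC no reason to keep it around.
function read_slice(text::String, start::Int, len::Int)
    GC.@preserve text begin
        slice = RawSlice(pointer(text, start), len)
        unsafe_string(slice.ptr, slice.len)
    end
end

println(isbitstype(RawSlice))            # just a Ptr and an Int
println(isbitstype(SubString{String}))   # holds a real reference to its parent
println(read_slice("hello, world", 8, 5))
```

If the parent string were collected while a `RawSlice` was still in use, the pointer would dangle, which is the safety problem the newer constructs avoid.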