Functional programming is not capable of achieving absolute top performance

No. GC pressure is typically not the only issue with persistent data structures like HAMTs; there is also slower read access, bad cache locality, and larger memory consumption from all the small heap objects.
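To make the read-side cost concrete, here is a minimal sketch (the names `Cons`, `Nil` and `listsum` are made up for illustration): traversing a chain of small heap-allocated cells means one pointer dereference per element, while a contiguous `Vector` is read linearly and prefetches well.

```julia
# Minimal persistent list: each cell is a separate small heap object.
struct Nil end
struct Cons{T}
    head::T
    tail::Union{Cons{T},Nil}
end

# Summing chases one pointer per element (cache-unfriendly).
function listsum(xs)
    s = 0
    while xs isa Cons
        s += xs.head
        xs = xs.tail
    end
    return s
end

xs = foldr(Cons{Int}, 1:10^5; init=Nil())  # 10^5 separate allocations
v  = collect(1:10^5)                       # one contiguous allocation
listsum(xs) == sum(v)                      # same result, very different memory traffic
```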

Don’t get me wrong, these are very nice techniques and amazing datastructures – a mere 3x slowdown is an incredible accomplishment, if achievable.

Persistent datastructures tend to be quite a bit more complicated than their mutating counterparts – unsurprisingly, since you have to program a much weaker machine where memory can only be written once. Compare Scala/Clojure Vector with a simple array.
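As a toy illustration of that write-once constraint, reusing the hypothetical `Cons`/`Nil` cells from the sketch above: "updating" element `i` of a persistent list cannot overwrite a cell, so it must copy every cell on the path to it, whereas a mutable array writes a single slot in place.

```julia
# Persistent "setindex": copies the whole prefix up to position i,
# because existing cells must never be mutated.
psetindex(xs::Nil, v, i) = error("index out of bounds")
psetindex(xs::Cons, v, i) =
    i == 1 ? Cons(v, xs.tail) : Cons(xs.head, psetindex(xs.tail, v, i - 1))

# Mutable counterpart: one O(1) in-place write.
# a = collect(1:10); a[3] = 42
```

A real persistent Vector (the 32-way trie in Scala/Clojure) exists precisely to shrink that copied path from O(n) to O(log32 n), at the price of considerably more implementation complexity than a plain array.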

Worse: persistent datastructures are often nontrivial to reason about, performance-wise. It is super easy to mess up and produce accidentally quadratic code when e.g. working with singly-linked lists.
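For instance, again with the toy `Cons`/`Nil` list from above: appending to the *end* of a singly-linked list copies every existing cell, so doing it in a loop is accidentally O(n^2), while prepending stays O(n) overall.

```julia
# O(n) per call: must copy every existing cell to attach at the end.
# (Keep n modest here: the recursive copy also grows the call stack.)
append1(xs::Nil, v) = Cons(v, Nil())
append1(xs::Cons, v) = Cons(xs.head, append1(xs.tail, v))

function build_quadratic(n)   # sum of O(i) steps: O(n^2) overall
    xs = Nil()
    for i in 1:n
        xs = append1(xs, i)
    end
    return xs
end

function build_linear(n)      # prepend in reverse order, O(1) per step
    xs = Nil()
    for i in n:-1:1
        xs = Cons(i, xs)
    end
    return xs
end
```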

Persistent datastructures are a marvel, but the requirements leading to them are pretty niche in practice. The thing I’m strenuously arguing against is using them as the default “just because”, or out of misguided “purity” considerations, when a mutating datastructure would do the job just as well.

A good example of well-considered use is Julia’s implementation of ScopedValue. We need O(1) snapshots because Task creation must be fast, so Julia now has a HAMT in Base. I already mentioned ZFS (the filesystem), which gains robustness, snapshots, performance on SSDs (most writes are sequential from the SSD controller’s perspective), and some notion of simplicity: for example, one should never end up with a broken/inconsistent filesystem after a power outage – instead one gets a consistent snapshot, plus ideally a ZIL to replay.
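For reference, a minimal usage sketch of that feature (assuming Julia ≥ 1.11, where `ScopedValue` lives in `Base.ScopedValues`; the `verbosity` value is a made-up example): each `with` block establishes a new dynamic scope backed by an O(1) persistent snapshot, so newly spawned Tasks inherit it cheaply.

```julia
using Base.ScopedValues   # Julia ≥ 1.11

const verbosity = ScopedValue(0)   # hypothetical scoped configuration value

log_level() = verbosity[]          # reads the value for the current dynamic scope

with(verbosity => 2) do
    @assert log_level() == 2
    # Tasks created here inherit the scope via an O(1) snapshot:
    @assert fetch(Threads.@spawn log_level()) == 2
end
@assert log_level() == 0           # the outer scope is untouched
```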
