Even if we assume that parallelization is unobstructed by realistic factors like memory pressure, the theoretical maximum speedup is sub-linear, because only a fraction of the program is accelerated (Amdahl’s law). It’s shockingly punishing: a 16x speedup of 95% of the original runtime yields an overall speedup of at most ~9.143x.
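The arithmetic is easy to check with a couple of lines (a minimal sketch; the function name is mine):

```c
/* Amdahl's law: overall speedup when a fraction p of the runtime
 * is accelerated by a factor s. The serial remainder (1 - p) is
 * untouched and dominates as s grows. */
double amdahl(double p, double s) {
    return 1.0 / ((1.0 - p) + p / s);
}
```

Here `amdahl(0.95, 16.0)` works out to 1 / (0.05 + 0.95/16) ≈ 9.143, and even an infinitely fast parallel portion caps the overall speedup at 1 / 0.05 = 20x.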
Isn’t there a library for arbitrary-precision integers that only allocates when the value exceeds a threshold close to the native Int range (I forget the terminology for this kind of data structure)? This wouldn’t matter if the values are usually larger, though.
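The representation being described can be sketched as a tagged union: small values live inline with no allocation, and only values past the threshold spill to the heap. This is a hypothetical sketch (the `HybridInt` type, field names, and 62-bit threshold are mine, not any particular library's API; real libraries typically also pack the tag into spare pointer bits and implement full limb arithmetic):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical hybrid integer: values within +/- 2^62 are stored
 * inline; larger magnitudes spill to a heap-allocated limb array.
 * Only the representation is sketched, not bignum arithmetic. */
typedef struct {
    int is_big;             /* 0: inline small value; 1: heap-allocated */
    union {
        int64_t  small;     /* inline storage, no allocation */
        uint64_t *limbs;    /* heap storage for big values */
    } v;
} HybridInt;

#define SMALL_MAX ((int64_t)1 << 62)

HybridInt hybrid_from_i64(int64_t x) {
    HybridInt h;
    if (x > -SMALL_MAX && x < SMALL_MAX) {
        h.is_big = 0;
        h.v.small = x;      /* common case: no malloc at all */
    } else {
        h.is_big = 1;       /* rare case: allocate limb storage */
        h.v.limbs = malloc(sizeof(uint64_t));
        h.v.limbs[0] = (uint64_t)x;  /* placeholder for real limb encoding */
    }
    return h;
}
```

As the comment says, the win disappears if the workload's values routinely exceed the inline threshold, since then every value takes the allocating path anyway.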