Emphatically agreed.
@Uranium238 Optimize single-threaded performance first, paying particular attention to reducing the number of allocations. Only then start thinking about multi-threading.
Not only are single-threaded optimizations likely to give the bigger speedup, but parallelization will also scale better as a result: Julia's multi-threading does not scale well for code that allocates heavily and leans on the GC.
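
To make that workflow concrete, here is a minimal sketch of what I mean. The function names (`sum_sq_alloc`, `sum_sq`, `sum_sq_threaded`) and the toy workload are made up for illustration and are not taken from your code; the point is only the order of operations: measure allocations, remove them, then parallelize.

```julia
# Step 1: a version that allocates a temporary array on every call.
function sum_sq_alloc(x)
    tmp = x .^ 2          # allocates a new array the size of x
    return sum(tmp)
end

# Step 2: the same computation without the temporary.
sum_sq(x) = sum(abs2, x)

# Step 3: only now split the allocation-free kernel across tasks.
# (Hypothetical helper; assumes x is non-empty.)
function sum_sq_threaded(x; ntasks = Threads.nthreads())
    chunks = Iterators.partition(eachindex(x), cld(length(x), ntasks))
    tasks = [Threads.@spawn sum(abs2, view(x, c)) for c in chunks]
    return sum(fetch, tasks)
end

x = rand(10_000)

# Call each function once first so compilation is not counted.
sum_sq_alloc(x); sum_sq(x); sum_sq_threaded(x)

bytes_alloc = @allocated sum_sq_alloc(x)   # includes the temporary array
bytes_lean  = @allocated sum_sq(x)         # should be far smaller

println("allocating version: ", bytes_alloc, " bytes")
println("lean version:       ", bytes_lean, " bytes")
```

In real code `@time`, or `@btime` from BenchmarkTools.jl, reports allocation counts alongside the timing, which is usually the quickest way to spot where the allocations come from. Start Julia with `-t auto` (or set `JULIA_NUM_THREADS`) if you want the threaded version to actually use more than one thread.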