You might have a point there. I have only 32 GB on my computer and generated 30,000 chunks rather than the original 10,000. The CPU utilization for the single-threaded and multithreaded runs is given below. We hit 100% CPU utilization briefly in the multithreaded run. The times are 10.666989 s and 2.904637 s for single-threaded and multithreaded, respectively.
I guess I’ll have to rent some cloud compute briefly to test this properly.
With the original 10,000 chunks we don’t even hit 100% in the multithreaded run. The green spike in the middle is the single-threaded run.
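
For what it's worth, the two timings from the 30,000-chunk run work out to a speedup of roughly 3.7x. Here is a quick Python sketch of that arithmetic. The helper names are just for illustration, and the thread count is not stated in this post, so `n_threads` in the efficiency helper is a hypothetical parameter you would have to fill in yourself.

```python
# Wall-clock times reported above for the 30,000-chunk run, in seconds.
T_SINGLE = 10.666989
T_MULTI = 2.904637


def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of single-threaded to multithreaded wall-clock time."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, n_threads: int) -> float:
    """Speedup divided by thread count; 1.0 would be perfect scaling.

    n_threads is an assumption: the post does not say how many threads ran.
    """
    return speedup(t_serial, t_parallel) / n_threads


if __name__ == "__main__":
    print(f"speedup: {speedup(T_SINGLE, T_MULTI):.2f}x")
```

If the run used, say, 8 threads, `efficiency(T_SINGLE, T_MULTI, 8)` would come out under 0.5, which would fit with the cores only briefly reaching 100%.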

