Parallel is very slow

I split it up because there is no reduction option like there is for the parallel loop, and I didn’t want to mess with atomics or worry about false sharing.
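To make the idea concrete, here is a minimal sketch of a hand-rolled reduction of this kind. It is not the actual code in question: it assumes the work is a plain sum over an array, uses `std::thread` in place of whatever parallel runtime is involved, assumes a 64-byte cache line, and needs C++17 so the over-aligned slots are honored by `std::vector`. The names `PaddedSum` and `chunked_sum` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-thread result slot. The 64-byte alignment is an assumed
// cache-line size; it keeps two threads from writing to the same line.
struct alignas(64) PaddedSum {
    double value = 0.0;
};

// Hypothetical helper: sums `data` by splitting it into one contiguous
// chunk per thread, then combining the partial sums serially.
double chunked_sum(const std::vector<double>& data, unsigned num_threads) {
    assert(num_threads > 0);

    const std::size_t n = data.size();
    const std::size_t chunk = (n + num_threads - 1) / num_threads;  // ceiling division

    std::vector<PaddedSum> partial(num_threads);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            // Accumulate in a local variable so the shared slot is
            // written exactly once per thread; no atomics are needed.
            double local = 0.0;
            for (std::size_t i = begin; i < end; ++i) {
                local += data[i];
            }
            partial[t].value = local;
        });
    }

    for (auto& w : workers) {
        w.join();
    }

    // Serial combine step: this stands in for the missing reduction clause.
    double total = 0.0;
    for (const auto& p : partial) {
        total += p.value;
    }
    return total;
}
```

Calling `chunked_sum(values, 4)` would sum `values` on four threads. Accumulating into `local` already keeps most of the traffic off the shared array, so the padding mainly guards the single final write per thread.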