Why is this numerical scheme so much faster when using `collect()` than a for loop?

I was able to create an example that (I think) is benchmarked correctly, and I posted it in a new thread: "Why is `collect()` faster than a for loop in this numerical scheme? (corrected!)"

I’d be very grateful if you could take a look.