Help Debugging GPU Performance Issue

I was looking at the second and third of your measurements, which have identical allocation counts. Actually, I can’t reproduce what you were seeing: I get identical counts for all three of your benchmarks. It shouldn’t matter whether you assign the array to a variable or not, and unsafe_free! shouldn’t allocate.