Performance improvements for simple molecular surface caclulation

x - c[i] allocates a temporary 3-component array in your inner loop, which is going to hurt performance a lot. As @lmiq suggested, 3-component coordinate vectors are a perfect case for SVector from the StaticArrays package.

Also, note that gaussian_kernel is a non-const global (as opposed to defining it in the usual way via function gaussian_kernel(...).

Finally, use @btime from the BenchmarkTools package for more accurate timing, and use $ to interpolate global variables when benchmarking.

The huge number of allocations here still indicates a problem. Possibly the type instability from the non-const definition of gaussian_kernel. Also, did you forget to use SVector for grid?

(If your innermost loops are allocating memory on the heap, you are leaving a lot of performance on the table. You wouldn’t call malloc in your innermost loop in a C program.)

6 Likes