Accelerate pairwise Lennard-Jones force computation

I’m afraid these small benchmarks can be misleading, and they still confuse me. How should this be benchmarked properly? Should the defaults change?
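To make the benchmarking question concrete, here is a minimal sketch of one common approach: time the kernel at several system sizes, repeat each measurement, and report the minimum rather than a single run. It is not the code from this change. The names `lj_forces`, `cubic_lattice`, and `best_time`, the reduced units (`epsilon = sigma = 1`), and the lattice spacing are all assumptions made for illustration, and the kernel is a naive pure-Python O(N^2) reference with no cutoff.

```python
import timeit


def lj_forces(positions, epsilon=1.0, sigma=1.0):
    """Naive O(N^2) Lennard-Jones forces (hypothetical reference kernel).

    positions is a list of (x, y, z) tuples; returns one [fx, fy, fz] per particle.
    """
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(i + 1, n):
            dx = xi - positions[j][0]
            dy = yi - positions[j][1]
            dz = zi - positions[j][2]
            r2 = dx * dx + dy * dy + dz * dz
            s6 = (sigma * sigma / r2) ** 3
            # F_i = 24 * eps * (2 * (sigma/r)^12 - (sigma/r)^6) / r^2 * (r_i - r_j)
            coef = 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r2
            for k, d in enumerate((dx, dy, dz)):
                forces[i][k] += coef * d
                forces[j][k] -= coef * d  # Newton's third law
    return forces


def cubic_lattice(n_side, spacing=1.5):
    """Build n_side**3 particles on a simple cubic lattice (assumed test system)."""
    return [
        (i * spacing, j * spacing, k * spacing)
        for i in range(n_side)
        for j in range(n_side)
        for k in range(n_side)
    ]


def best_time(func, arg, repeat=5, number=3):
    """Seconds per call, taking the minimum over several repeats to reduce noise."""
    times = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(times) / number


if __name__ == "__main__":
    # Scan several sizes: a result at one tiny size says little about scaling.
    for n_side in (2, 3, 4):
        system = cubic_lattice(n_side)
        seconds = best_time(lj_forces, system)
        print(f"N = {len(system):3d}: {seconds * 1e3:.3f} ms per call")
```

Reporting the minimum over repeats filters out one-off interference from the machine, and scanning sizes shows whether a speedup holds beyond the smallest case. Whether the defaults should change would still depend on results at the sizes users actually run.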