Doesn’t Zygote just do the automatic differentiation? In this example I’m not using any backward-pass functionality, so I wouldn’t have thought any gradient computation was taking place and therefore affecting the speedup. Are you saying that using Zygote should compile away bits of the forward pass if they’re redundant?