I noticed this behavior recently and had the same question. I suspect it comes from the optimizer evaluating the gradient of the loss function, which calls the loss again. You can get a better sense of what's going on under the hood by printing some additional information about the current value of x (or some subset of it) each time it is passed to your loss function; see the sketch below.
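For example, something along these lines (a toy objective; the function name and print format are just placeholders):

```julia
# Toy loss function instrumented to print every evaluation.
# Each call prints the current x, so you can see that the loss
# is evaluated more often than the optimizer reports iterations.
function loss(x)
    println("loss called with x = ", x)  # or x[1:2] for a subset
    return sum(abs2, x .- 1)             # minimum at x = ones(length(x))
end
```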
Yeah, it's because the loss function is called twice: once directly, and once during the gradient calculation. If you want to count only the iterations, put the counter in the callback function instead of the loss function; see the sketch below.
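A minimal sketch, assuming Optim.jl (the objective, starting point, and solver are placeholders): the counter in the callback advances once per iteration, while the counter in the loss advances on every evaluation.

```julia
using Optim

eval_count = Ref(0)  # bumped on every loss evaluation
iter_count = Ref(0)  # bumped once per optimizer iteration

function loss(x)
    eval_count[] += 1
    return sum(abs2, x .- 1)  # toy objective
end

# Optim calls the callback once per iteration with the current state;
# returning false tells it to keep optimizing.
function counter_callback(state)
    iter_count[] += 1
    return false
end

res = optimize(loss, zeros(3), BFGS(),
               Optim.Options(callback = counter_callback))

println("loss evaluations: ", eval_count[])  # larger than iterations
println("iterations:       ", iter_count[])
```

Since no gradient is supplied here, Optim falls back to finite differences, so the evaluation counter grows by well more than one per iteration, which is exactly the mismatch described above.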
What is the purpose of calling println in the gradient calculation? Does calling println affect the computed gradient?