Hi there!
After careful evaluation of the problem I could finally track the error to sums involving the P norm of a complex array c and also its derivative with respect to another parameter. The expressions are something like
N = ( sum_i (conj(c_i) a_i c_i)^(P/2) )^(1/P)
and
D = sum_i (conj(c_i) a_i c_i)^(P/2 - 1)
( a is a real vector, but its expression is not important here). For large P and for some large differences in magnitude of the real and the complex parts of c, there is a large influence of floating point associativity. If we change
conj(c_i)*c_i for abs2(c_i) = real(c_i)*real(c_i) + imag(c_i)*imag(c_i)
the result changes in the 4th decimal place in some situations!
An interesting fact is that abs(c::Complex) uses hypot, which takes care of overflow and underflow.
The definition is
abs(z::Complex) = hypot(real(z), imag(z))
but
abs2(c_i) = real(c_i) real(c_i) + imag(c_i) imag(c_i)
with no special treatment for large differences with respect to its real and imaginary parts (I really don’t known if its needed, just thinking out loud)
The funny thing is the different behavior of Julia 1.4.2 and 1.5.2 and also the influence of the manufacturer (AMD/Intel).
So I guess it is how different LLVM versions are arranging things (order of operations) in different processors AND a clear tendency of our computations to lead to this unfortunate situation, of course.
Thus, the use of MCS, the large number of iterations in the optimization AND this “particularity” is the reason for the problem. We are working in some modifications of the original formulation to avoid such differences in magnitude and/or proper scaling. Alongside, I am writing a short version of this code to track the @code_llvm or @code_native versions of the code, so I can learn more about this sort of situation.
I would like to thank you all for the feedback @Palli, @ctkelley, @ffevotte, @tbenst, @Oscar_Smith, @lostella, @mbauman, @PetrKryslUCSD .