I fear I have to start debugging. But: perhaps someone has had this experience already and there is a fix?
Does 1.11.2 fix it?
Any hints at all as to where the difference might lie? What kind of code is it?
'fraid not.
Perhaps 1.11.1 fixed some incorrect stuff which revealed something to fix in your code. This can be a blessing in disguise.
I’ve seen numerical values change on upgrades, but it has always traced back to @simd (or fastmath and the like), the RNG, or changes in packages that were upgraded at the same time.
No clue where or what. The first four tests run fine; the last one is broken (pace.jl, the sequential version; the MPI implementation in pace_mpi.jl has the same problem): the iteration stagnates. Everything works fine with 1.10.5.
Can you provide a minimum working example?
So you are only checking the number of iterations some solver takes in these scripts, right? And these changed? Did you also compare the solutions and do they differ? If not then it might not be a “correctness” bug after all.
If I could, I would have found the error by now.
I do check that the solution is correct, not only that the residual drops down to zero. But not in the scripts used in timing runs.
Without a reasonably small example, I’m not sure someone else will be able to find the problem in your stead.
I thought it was clear that I was not looking for someone to “find the error in my stead”?
What has changed?
??? Not sure I follow…
Am I missing something?
The title of this post says 1.11 has broken the correctness (presumably semantic correctness) of your code.
But no details were provided.
Without debugging it is difficult to envision how you will find a solution.
Amen.
Just in case you’re interested, I found the first difference:
The same array comes out with different bits under the two versions (compare the first entry of the second row).
Julia 1.10.5:
fens.xyz[nl, :] = [10.0 1.0 0.2857142857142857; 10.0 1.0 0.42857142857142855; 10.0 1.0 0.5714285714285714; 10.0 1.0 0.7142857142857143; 10.0 1.0 0.8571428571428571; 10.0 1.0 1.0]
Julia 1.11.2:
fens.xyz[nl, :] = [10.0 1.0 0.2857142857142857; 9.999999999999998 1.0 0.42857142857142855; 10.0 1.0 0.5714285714285714; 10.0 1.0 0.7142857142857143; 10.0 1.0 0.8571428571428571; 10.0 1.0 1.0]
This, in combination with a typo that eliminated the use of a non-zero tolerance, produced diverging results.
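To illustrate how a zero tolerance turns a one-bit difference into a different result, here is a minimal sketch. It is written in Python only because it is easy to run; the arithmetic is the same IEEE 754 double precision that Julia uses. The helper `select_on_plane` and the tolerance `1e-9` are hypothetical and not taken from pace.jl; the x-coordinates are the first column of the two arrays above.

```python
import struct


def select_on_plane(xs, x0, tol):
    """Return indices of coordinates within tol of the plane x = x0 (hypothetical helper)."""
    return [i for i, x in enumerate(xs) if abs(x - x0) <= tol]


def bits(x):
    """Reinterpret a double as its 64-bit integer pattern."""
    return struct.unpack(">q", struct.pack(">d", x))[0]


# First column of fens.xyz[nl, :] as printed under each Julia version.
x_1_10 = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
x_1_11 = [10.0, 9.999999999999998, 10.0, 10.0, 10.0, 10.0]

# With a zero tolerance the test degenerates to exact equality.
exact_old = select_on_plane(x_1_10, 10.0, 0.0)
exact_new = select_on_plane(x_1_11, 10.0, 0.0)

# With a small non-zero tolerance (value assumed for illustration) both agree.
tolerant_new = select_on_plane(x_1_11, 10.0, 1e-9)

# Distance between the two values, counted in representable doubles.
ulp_gap = bits(10.0) - bits(9.999999999999998)

print("exact, 1.10.5 data:", exact_old)
print("exact, 1.11.2 data:", exact_new)
print("tolerant, 1.11.2 data:", tolerant_new)
print("gap in ULPs:", ulp_gap)
```

In Julia the corresponding comparison would be something like `isapprox(x, x0; atol=tol)`, which also reduces to exact equality when both `atol` and `rtol` are zero.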
What changed so that Julia does not produce bit identical patterns?
It’s still hard to say; see my list above. It’s likely to be some arithmetic or BLAS routine that internally allows re-association and doesn’t guarantee correctly rounded results for all inputs. Or it’s code you wrote that allows the same.
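As a small illustration of why re-association alone can change the last bit of a result, consider the sketch below (Python again, same IEEE 754 doubles). The values are chosen for the illustration and have nothing to do with the code in question; the point is only that a compiler or library that is allowed to regroup a sum may legitimately return a different double.

```python
import math

a, b, c = 0.1, 0.2, 0.3

# The same three numbers, summed with two different groupings.
left = (a + b) + c
right = a + (b + c)

# Exact comparison sees two different doubles ...
same_bits = left == right

# ... while a comparison with a relative tolerance treats them as equal.
close = math.isclose(left, right, rel_tol=1e-12)

print(repr(left), repr(right), same_bits, close)
```

Any code path whose evaluation order is not pinned down (vectorized reductions, fused multiply-add, threaded BLAS) can show this kind of last-bit drift between versions, which is why downstream comparisons need a non-zero tolerance.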
Could it possibly be related to JuliaLang/julia issue #56312, “10.0^68 gives a different result on nightly vs 1.11 after llvm-muladd pass removal”? I’m not sure what would have changed in this regard from 1.11 to 1.11.1, though.