Bug? shuffle() breaks normalisation test

Hi all,

I think I found a Julia bug (version 1.1.0) while writing a function to check whether a tensor is normalized. I wrote tests for it and it seems (somewhat, sometimes) reproducible.

It comes down to this:
I generate some ‘random’ numbers that should be normalized, and they are, right up until the shuffle function is called; after that they no longer seem normalized according to (some of) my tests.

My code and tests can be found here:
https://gist.github.com/dietercastel/541f3228b18ccf1820c916a83ebc5aaf

If this is due to numerical errors (most likely, I think), how can I get around it? What’s a better, more Julia-friendly approach to writing these tests or this code?

I tried the same code in 3 different ways (original file, REPL, the minimal file linked): 2 of the 3 made the fourth test fail. And I’ve seen the second test fail as well (occasionally).

I haven’t tried with a seeded random number generator or a fixed tensor, but no time atm for that. I’ll come back to it later though! I’ll also give it a try on v1.2 soon.

Looking forward to your responses.

you just need this line for your isNorm instead:

isStrictPos(tensor) & (sum(tensor) ≈ 1)

doc:
https://docs.julialang.org/en/v1/base/math/#Base.isapprox
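
For reference, a minimal sketch of what that check could look like (the anonymous positivity check and the 4×4 tensor below are just stand-ins for your isStrictPos and your actual data):

julia> using Random

julia> isNorm(t) = all(x -> x > 0, t) && sum(t) ≈ 1
isNorm (generic function with 1 method)

julia> t = rand(4, 4); t ./= sum(t);   # positive entries, normalised to sum 1

julia> isNorm(t), isNorm(shuffle(vec(t)))
(true, true)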

Well that works indeed. Thanks!

Is that something that has to be done in general in Julia? Replace all == with ≈ ?
Are there some guidelines/reading material about when to use it and when not to? It feels a bit weird, but of course at least it’s explicitly possible in Julia. Do you do it only in tests, or always throughout your code?

Is there also an infix notation for isapprox() ?

my line up there uses infix already, no?

Well, my understanding is that usually you use == (or even === when possible, for better performance etc.). But if you’re testing with or against real-world data, or you have some fitted/normalized data, or you’re aggregating many floats and expecting a particular mathematical answer (for example a normalization constant, or zero), you use ≈. Usually at that step you’re returning the result though, so I’d say usage in a test is also common.
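
The classic illustration of the difference (any float arithmetic that rounds will do):

julia> 0.1 + 0.2 == 0.3   # exact bit-for-bit comparison
false

julia> 0.1 + 0.2 ≈ 0.3    # isapprox tolerates the rounding error
true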

I am not sure that is a good idea; those functions serve a different purpose. And of course ≈ is not transitive.

The underlying issue is floating point, so maybe this is helpful:

It’s important to understand what ≈ does: by default it checks to see if two floating point values are equal for the first half of their significant digits. That’s pretty lenient but also quite standard. To be used with caution, but also essential when checking results that can depend on numerical round-off. I’m afraid the only answer here is to have some grasp of numerical analysis and to use judgement. Replacing all equality checks with ≈ is definitely not a good idea. Nor would it even be sufficient: 0.0 is never approximately equal to any non-zero value, since it has no scale so you can’t know which bits should be considered significant or not. As the docs for isapprox say:
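
In effect, isapprox(x, 0) with the default atol of 0 reduces to x == 0, so a comparison against zero needs an explicit absolute tolerance, roughly like this:

julia> 1e-20 ≈ 0.0                           # false: zero has no scale, so no digits count as "close"
false

julia> isapprox(1e-20, 0.0; atol = 1e-12)    # supply an absolute tolerance instead
true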

In particular, summation, even though it seems like an innocuous operation, is sensitive to data ordering. In fact, you can sum the same set of numbers in different orders and get basically any possible result:
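
For instance, forcing a plain left-to-right reduction with foldl makes the order dependence easy to see:

julia> xs = [1.0, 1e16, -1e16];

julia> foldl(+, xs)            # (1.0 + 1e16) - 1e16: the 1.0 is absorbed
0.0

julia> foldl(+, reverse(xs))   # (-1e16 + 1e16) + 1.0
1.0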

We don’t use naive summation by default in Julia, so things aren’t quite that bad, and if you’re only adding up positive values it’s not possible to have such a pathological situation, but keep this in mind.
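
To see the difference the default pairwise algorithm makes, here is a quick sketch with hypothetical Float32 data (10^7 copies of 0.1f0, whose true sum is about 1.0f6):

julia> xs = fill(0.1f0, 10_000_000);

julia> foldl(+, xs) ≈ 1.0f6   # plain left-to-right accumulation drifts badly
false

julia> sum(xs) ≈ 1.0f6        # Base.sum uses pairwise summation for arrays
true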

Except for generators…

julia> sum(x for x in ones(Float32, 100_000_000))   # sequential fallback: stalls at 2^24
1.6777216f7

True.

Ah yes, but I meant another infix notation. That wasn’t clear indeed, sorry! The thing is, I don’t like how subtle the visual difference is between == and the expanded \approx.

I’ve been looking around a bit and custom infix notations don’t seem to be a thing, or did I overlook something? I was trying to ‘fix’ it with a macro, but macros seem to break with UTF-8 chars (I’ll report in more depth later).
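
One workaround, if you just want a visually distinct spelling: bind isapprox to another Unicode symbol that Julia’s parser already treats as an infix comparison operator, e.g. ≅ (typed as \cong<tab>); as far as I know this needs no macro:

julia> const ≅ = isapprox;   # ≅ already parses as a binary comparison operator

julia> 0.1 + 0.2 ≅ 0.3
true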

Anyway, thanks for the kind help all! :slight_smile: