Possible bug in unique/Set


#1

I’ve been using unique as part of code that does partial fraction decomposition. Because of the numerical nature of the problem, I sometimes get arrays like:

4-element Array{Complex{Float64},1}:
 1.0-0.0im
 1.0+0.0im
 1.0+0.0im
 1.0+0.0im

We can manually test to see whether == considers -0im the same as +0im.

julia> 1.0-0im == 1.0+0im
true

Clearly then, == considers these all to be identical values, even if they are represented differently. Yet if we call the original 4-element array x, Set and unique (unique is built on Set rather than on ==, per https://github.com/JuliaLang/julia/issues/19311) return the following results.

julia> unique(x)
2-element Array{Complex{Float64},1}:
 1.0-0.0im
 1.0+0.0im

julia> Set(x)
Set(Complex{Float64}[1.0-0.0im, 1.0+0.0im])

To further muddy the waters, if we instead make the complex values based on ints, we get different behavior.

julia> x = [1-0im; 1+0im; 1+0im; 1+0im]
4-element Array{Complex{Int64},1}:
 1+0im
 1+0im
 1+0im
 1+0im

I believe this is due to the fact that ints don’t distinguish between -0 and 0 on the bit level whereas floats do. Still, it’s inconsistent behavior for the user.
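This can be checked at the bit level (my quick sketch; `bitstring` is the current name of the function, older Julia versions called it `bits`):

```julia
# Floats carry an explicit sign bit, even for zero; ints do not.
println(bitstring(0.0)[1])    # '0' – sign bit of 0.0
println(bitstring(-0.0)[1])   # '1' – sign bit of -0.0

# For integers, negating zero gives back the very same bit pattern:
println(-0 === 0)             # true: identical values, identical bits
println(-0.0 === 0.0)         # false: distinct bit patterns
```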

In summary, I can’t find an issue on GitHub specifically addressing this. Does anyone know whether this is a known issue, or whether it truly even is one? Are there ways to work around it that aren’t super hacky? Thanks!


#2

This is indeed problematic behavior, but it might serve as a warning about the many pitfalls of floating-point arithmetic. Consider using the FixedPointNumbers package.

There are still bugs, like the one which popped up when I was writing this reply. Here is some sample code:

julia> using FixedPointNumbers

julia> Base.:^(z::Complex{<:Real}, n::Integer) = n>=0 ? Base.power_by_squaring(z,n) : Base.power_by_squaring(inv(z),-n)
WARNING: Method definition ^(Base.Complex{#s1} where #s1<:Real, Integer) in module Base at complex.jl:730 overwritten in module Main at REPL[2]:1.

julia> vals = Complex{Fixed{Int,20}}(im).^rand(0:3,10)
10-element Array{Complex{FixedPointNumbers.Fixed{Int64,20}},1}:
 -1.0Q43f20+0.0Q43f20*im
 -1.0Q43f20+0.0Q43f20*im
  0.0Q43f20+1.0Q43f20*im
  0.0Q43f20-1.0Q43f20*im
 -1.0Q43f20+0.0Q43f20*im
 -1.0Q43f20+0.0Q43f20*im
 -1.0Q43f20+0.0Q43f20*im
  0.0Q43f20+1.0Q43f20*im
  1.0Q43f20+0.0Q43f20*im
  1.0Q43f20+0.0Q43f20*im

julia> unique(vals)
4-element Array{Complex{FixedPointNumbers.Fixed{Int64,20}},1}:
 -1.0Q43f20+0.0Q43f20*im
  0.0Q43f20+1.0Q43f20*im
  0.0Q43f20-1.0Q43f20*im
  1.0Q43f20+0.0Q43f20*im

The bug which popped up was losing the specific Complex type information and giving a Complex{Int} result. But the redefinition fixed this up.

Note that floating-point arithmetic makes it even harder to stop the creep of numerical inaccuracies (addition is non-associative, etc.), and if you proceed with floating point you should use approximate comparisons to take these into account.


#3

https://github.com/JuliaLang/julia/issues/9381


#4

(This is also true with fixed-point arithmetic, and in fact roundoff error growth is typically even worse with fixed point. There is no panacea when you are dealing with a finite subset of the real numbers. The only thing that is improved here is that you are getting rid of floating point’s signed zero.)


#5

True. There is no getting around the limits of finite bit representation. IMO the fixed-point types give less of a false sense of security and are easier to reason about. Perhaps even contrary to intuition, the big numerical errors which appear with fixed point draw the programmer’s attention to problematic points in a calculation, instead of the slower loss of precision with floats (hence the use of “creeping”). But both representations have their respective place.


#6

I feel like most of the supposed “easier reasoning” of fixed point is simply because it is decimal, so it doesn’t round human inputs. If you use decimal floating point, you get this advantage too, along with the nicer roundoff characteristics (due to increased dynamic range and fixed relative precision, rather than fixed absolute precision) of floating point.


#7

@Tamas_Papp, thanks for the link and the thorough, if lengthy, discussion on why == is different from isequal. I think this best falls into the category of unusual, but intended, behavior.

I guess that leaves how to get around this behavior (if desired) with the least kludgey code possible within reason. Right now, my simple quick fix looks like:

# Quick fix: replace any signed zero in the imaginary part with +0.0
for root_idx in 1:length(p)
    if imag(p[root_idx]) == 0.0   # true for both 0.0 and -0.0, since -0.0 == 0.0
        p[root_idx] = real(p[root_idx]) + 0im
    end
end

In writing this, I realized it could be simplified as:

neg_zero_idx = isequal.(imag(p), -0.0)   # exactly the entries with imaginary part -0.0
p[neg_zero_idx] = real(p[neg_zero_idx]) + 0im

Can anything be done further? Or am I overthinking it at this point?


#8

I am probably missing some context here for your code so I can’t comment on your solution. In general, you can just map members of an equivalence class to a well-defined element and work with that. Eg

eqmap(x) = x
eqmap(x::AbstractFloat) = iszero(x) && signbit(x) ? -x : x

Then form sets & dicts using the above.
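Applied to the complex arrays from this thread, that might look like the following sketch (the `Complex` method is my addition, extending the idea componentwise):

```julia
eqmap(x) = x
eqmap(x::AbstractFloat) = iszero(x) && signbit(x) ? -x : x   # map -0.0 to 0.0
eqmap(z::Complex) = Complex(eqmap(real(z)), eqmap(imag(z)))  # normalize both parts

x = [1.0 - 0.0im, 1.0 + 0.0im, 1.0 + 0.0im, 1.0 + 0.0im]
unique(eqmap.(x))   # a single element, 1.0 + 0.0im
```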


#9

I should perhaps make a few points here. First of all, this has nothing to do with floating-point precision or rounding. It has everything to do with signed zeros being special and weird in the IEEE floating-point standards.

There are three standard equality relations in Julia:

  • === – object identity / programmatic distinguishability (aka “egal”);
  • == / < – “intuitive” / IEEE equality and comparison;
  • isequal / isless – hashing & sorting.

The === relation is not user-definable and is guaranteed to be a true equivalence relation.

The == relation is not an equivalence relation because IEEE dictates this wacky nonsense where NaN != NaN. It also has -0.0 == 0.0 and !(-0.0 < 0.0) (thank goodness – if these disagreed it would be a true disaster). The == and < functions are user-definable, so people do whatever they want, but == should generally be an equivalence relation apart from NaN-like values, and it should be compatible with <, which should be a partial pre-order.

The isequal relation is not generally meant to be used directly, but it is used for hashing and sorting. The isequal predicate is an equivalence relation (transitive, reflexive), and it must be compatible with isless. In addition to the usual non-reflexive == relation, IEEE also defines a total ordering, whose induced equality actually is an equivalence relation, and it orders -0.0 strictly before 0.0 (i.e. not equal). For floating-point values, the isless and isequal relations generally follow this IEEE total ordering. Here’s where it gets “interesting”:

  • Since we use isless for sorting and IEEE dictates that -0.0 sorts before 0.0 we must have isless(-0.0, 0.0), otherwise -0.0 and 0.0 would end up randomly scrambled in the middle of sorted vectors.

  • Because isequal is compatible with isless – i.e. for all x and y of the same type, isless(x, y) or isequal(x, y) or isless(y, x) but not more than one – we must therefore have !isequal(-0.0, 0.0).
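Both bullets are easy to verify at the REPL (a quick check of my own, not from the original post):

```julia
@assert isless(-0.0, 0.0)          # IEEE total order: -0.0 sorts strictly first
@assert !isequal(-0.0, 0.0)        # forced by compatibility with isless
@assert hash(-0.0) != hash(0.0)    # hence the two hash differently
@assert -0.0 == 0.0                # while "intuitive" equality still holds
```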

Thus the IEEE standard seems to require that -0.0 and 0.0 hash differently. There are some obvious questions…

Q: Why not just use different relations for hashing and sorting?

A: Hashing and sorting are often used as implementation details behind similar data structures. For example, a Dict uses a hash table – i.e. hash and isequal – but a SortedDict probably uses a tree, which uses the isless comparison predicate. If we used different relations for these data structures, then switching from a Dict to a SortedDict would not only change the key ordering (the obvious change), but it would also subtly change the bucketing behavior. Say, for example, we sorted -0.0 before 0.0 but considered them equal for hashing. Then they would be in the same bucket in a Dict but in different buckets in a SortedDict. That sort of subtle behavior change would be far worse than the current consistent separation of -0.0 and 0.0.

Q: Why not use something besides hashing for sets? Isn’t this letting the implementation detail that Set is implemented using Dict leak through?

A: We could use a different structure, but however we do it, equivalence has to be decided by a reflexive and transitive relation. Of the three equality-like relations in the language, only === and isequal are true equivalence relations. == is not because of NaN (at least). The ObjectIdDict type uses === but this is generally not what you want since, e.g. 1 !== 1.0, [1,2] !== [1,2]. === is not a very user-friendly relation, despite its fundamental correctness. That leaves isequal. The fact that a Dict is a fast, efficient way to bucket things with respect to the isequal relation makes Dict a sensible choice for implementing sets. But we could just as well use a vector and just linearly scan through to see if any value in the vector is already isequal to a new element – that would be slow but it would work similarly with respect to -0.0 and 0.0. So using isequal for bucketing isn’t really an implementation detail of Dict at all – it’s the only viable choice.
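To illustrate that last point, here is a toy linear-scan version of `unique` (my sketch, deliberately naive – not how `Set` is implemented) that buckets by `isequal` and therefore separates the two zeros just like the real thing:

```julia
function slow_unique(xs)
    out = eltype(xs)[]
    for x in xs
        # keep x only if no already-collected value is isequal to it
        any(y -> isequal(y, x), out) || push!(out, x)
    end
    return out
end

slow_unique([-0.0, 0.0, 0.0])   # 2-element result, just like unique
```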

Q: Why not just ignore IEEE and have isequal(-0.0, 0.0) and !isless(-0.0, 0.0)? Let -0.0 and 0.0 be scrambled in the middle of sorted arrays – does anyone really care? Or hack in a pass at the beginning of sorting floating-point arrays that separates values by sign first, so they’re generally ordered correctly even though isless doesn’t dictate that they must be.

A: We’ve very seriously considered this. IEEE was very clearly not designed with the consequences for programming language semantics in mind. The fact that the standard equality relation is not reflexive is pretty terrible. The fact that -0.0 and 0.0 are equal – but not really! – is, in some ways, even worse (as you’re seeing here). So we could just say that -0.0 and 0.0 sort and hash the same in Julia, to hell with IEEE.

That would leave sorting in a bit of a strange situation, however. We would like sorting a Vector{Float64} and sorting a Vector{Any} that happens to contain only Float64 values (or values isequal to them) to produce indistinguishable orders. However, the former uses quicksort for speed – since you don’t need a stable sorting algorithm for floating-point values. Since NaNs are all isequal, we want them to end up in the same order they were originally in (to have the same effect as a stable sort). To accomplish this, we stably move the NaN values to the end of the Vector{Float64} as a first pass before sorting the rest of the numbers. It’s unclear what one would do about signed zeros if they didn’t compare differently. If we did nothing, then the signed zeros in the middle of the sorted vector would be the only thing leaking the details of the unstable sorting algorithm. We could try to do something like we do for NaNs, but it’s unclear how to do that efficiently – unlike NaNs, you don’t know in advance where in an array the zeros are supposed to end up. We could also sort signed zeros by their sign as a post-processing pass, but then we give up the equivalence of sorting a Vector{Float64} and a Vector{Any} with the same values in it. Perhaps that doesn’t matter, however.

This is a somewhat unlikely end point for the seemingly innocuous questions “Why do sets consider -0.0 and 0.0 to be different? Shouldn’t they be considered the same?” Despite concerns about deviating from IEEE and the sorting of signed zeros, we could very possibly make isequal(-0.0, 0.0). But this is not a small decision and it’s unclear that such a change wouldn’t lead to just as many problems and confusions – just different ones.

See https://github.com/JuliaLang/julia/issues/18485 where this option was decided against in a previous release cycle. That doesn’t mean this decision is final, but I’m not sure it’s a can of worms we want to reopen.


#10

To put it much more succinctly, there are three options:

  1. isless(-0.0, 0.0), !isequal(-0.0, 0.0), length(unique([-0.0, 0.0])) == 2
  2. isequal(-0.0, 0.0), !isless(-0.0, 0.0), length(unique([-0.0, 0.0])) == 1
    a. sort!([0.0, -0.0]) is [0.0, -0.0] (don’t know how to implement this efficiently)
    b. sort!([0.0, -0.0]) is [-0.0, 0.0] but sort!(Any[0.0, -0.0]) is [0.0, -0.0]

Our current choice is 1. The other choices only seem to move the confusion around, not reduce it.


#11

How about introducing new floating point types, which behave like Float64 et al most of the time, but which hash and sort -0.0 differently? Then the user could select the kind of confusion that he prefers.

(I’m not saying the Julia devs should do this. But maybe someone who needs it anyway could put it in a package…)


#12

Could be an interesting experiment. My major concern would be promotion (eg the result type of +(::Float64, ::DamnIEEEFloat64)). Propagation in either direction could lead to very subtle bugs.


#13

Of course, a simple solution is this:

julia> x = [1-0im; 1+0im; 1+0im; 1+0im];

julia> unique(y->y+zero(y), x)
1-element Array{Complex{Int64},1}:
 1+0im

#14

That is a simpler solution, I might just point out that ints don’t have the issue where 0 and -0 are two (bitwise) different numbers. For instance:

julia> -0
0

julia> -0.0
-0.0

And, if we’re looking for absolute minimum characters, I think your example can be further shortened to:

unique(x+zero.(x))

This also makes the sign of the zero imaginary part predictably positive, instead of whatever happens to be in the first index position. Not a major difference for me, but something that might simplify code for some other usage. I still need to work on not coding in a Matlab accent :wink:


#15

Not sure what you mean. x+zero(x) gives predictably positive zeros. And for integers, the compiler will simplify to the identity function.
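The predictability comes from IEEE itself: in the default rounding mode, adding zeros of opposite sign yields +0.0, so the sum can never carry a negative zero (my quick check):

```julia
@assert (-0.0 + 0.0) === 0.0              # opposite-signed zeros sum to +0.0
z = 1.0 - 0.0im
@assert (z + zero(z)) === (1.0 + 0.0im)   # the imaginary -0.0 flips to +0.0
```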

In this case, your Matlab accent works in your favor. The expression unique(x+zero.(x)) should be slower than unique(y->y+zero(y), x) because it must create a new array before calling unique. However, the unique method that it calls is better optimized, so your code is faster than mine.

(unique(f::Callable, C) uses a Set{ANY}, which is really slow, but still seems to assume that f returns eltype(C), so unique(Float64, [1 2 3]) gives a surprising result.)


#16

Sure, for integers this isn’t a problem, but for floats it could help. For instance, taking your original x vector and converting it to float, the same command produces:

julia> x = [1.0-0.0im; 1.0+0.0im; 1.0+0.0im; 1.0+0.0im];

julia> unique(y->y+zero(y), x)
1-element Array{Complex{Float64},1}:
 1.0-0.0im

I apologize if this is explaining an effect you’re familiar with, but unique(y->y+zero(y), x) does create an array of identical values, but since these are only the values unique uses to find which indices represent the unique values, the first member of the original vector is what gets returned. So, for instance, if we reverse the order of x, we instead get:

julia> unique(y->y+zero(y), x[end:-1:1])
1-element Array{Complex{Float64},1}:
 1.0+0.0im

This is inconsequential for anything that involves a == comparison, but anything that involves a === or is comparison could get tripped up. Conversely, the result of unique(x+zero.(x)) is identical regardless of the order of the elements of x, because unique operates directly on the adjusted values instead of using them only to find the unique indices of the original x.

julia> unique(x+zero.(x)).===unique(x[end:-1:1]+zero.(x[end:-1:1]))
1-element BitArray{1}:
 true

I apologize if that explanation was a bit obtuse, but I hope it helped explain what I meant by “predictably positive zeros”.

EDIT: Just to show that unique(x+zero.(x)) works as promised:

julia> unique(x+zero.(x))
1-element Array{Complex{Float64},1}:
 1.0+0.0im

#17

I see a lot of explanations of why isequal(-0.0, 0.0) is false, but not why isequal(-0, 0) is true. As a casual user, it seems like a gotcha that these would differ. I’m thinking about data files being interpreted as floats or ints and getting different results depending on how the data are formatted in the file. Because of these real-world scenarios, I would want floats and integers to behave the same in these tests.

Not that Julia needs or wants to behave like Matlab, but it uses isequal(0.0, -0.0) == true and I haven’t seen any worms from that particular can in the 15 years I’ve used it.

#18485 didn’t really clarify much for me. For example, I don’t follow any of the arguments based on -0.0 being different than 0.0, unless you are talking about representing numbers less than 0 and greater than -eps(0)/2. But then I don’t see why we would fret more about errors crossing 0 than about those occurring between every other pair of adjacent floating-point representations.


#18

I see a lot of explanations of why isequal(-0.0, 0.0) is false, but not why isequal(-0,0) is true.

The introduction of -0.0 occurred with adoption of the original IEEE Standard for Floating-Point Arithmetic in 1985. There has never been a Standard for Integer Arithmetic that mentions -0 as a value (other than the fact that the negative sign is absorbed, so -0 === 0).

see John Cook’s note on signed zero


#19

Did you read Possible bug in unique/Set?


#20

Just leaving this here:

julia> begin
       @show 0.0 == -0.0
       @show 1.0/0.0 == 1.0/-0.0
       end;
0.0 == -0.0 = true
1.0 / 0.0 == 1.0 / -0.0 = false
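The second comparison fails because the sign of the zero survives into the sign of the resulting infinity (my quick check):

```julia
@assert 1.0 / 0.0 == Inf      # positive zero gives +Inf
@assert 1.0 / -0.0 == -Inf    # negative zero gives -Inf, so the quotients differ
```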

Ultimately, I’m not sure I like completely language-defined floating point semantics. That is, I’d like something that has hardware support more than something that makes sense; julia and hence all julia software sits downstream of IEEE / Intel / AMD / ARM / $device_vendor.

For example, sqrt: We have a perfectly fine CPU instruction for that, which happens to return NaN on negative numbers; why on earth do we check for positivity? Until we can support the positivity check via FPU traps, the default sqrt should imho not raise an exception (maybe offer “well defined” operators somewhere, but give me architecture-dependent “kinda correct but definitely fast” defaults that compile to the obvious instructions). Or take the minimum of two floating-point numbers – why on earth do we get this:

julia> fmin(x,y)= ifelse(x<y, x, y);
julia> @code_native fmin(1.0,2.0)
	.text
; Function fmin {
; Location: REPL[43]:1
	vminsd	%xmm1, %xmm0, %xmm0
	retq
	nopw	%cs:(%rax,%rax)
;}

julia> @code_native min(1.0,2.0)
	.text
; Function min {
	vmovq	%xmm1, %rcx
	vmovq	%xmm0, %rax

	vcmpordsd	%xmm0, %xmm0, %xmm2
	vblendvpd	%xmm2, %xmm1, %xmm0, %xmm2
	vcmpordsd	%xmm1, %xmm1, %xmm3
	vblendvpd	%xmm3, %xmm0, %xmm1, %xmm3
	vmovapd	%xmm2, %xmm4

	testq	%rcx, %rcx
	jns	L58

	testq	%rax, %rax
	js	L67
L46:
	vcmpltsd	%xmm0, %xmm1, %xmm0
	vblendvpd	%xmm0, %xmm2, %xmm4, %xmm0
	retq
L58:
	vmovapd	%xmm3, %xmm4
	testq	%rax, %rax
	jns	L46
L67:
	vmovapd	%xmm3, %xmm4
	vcmpltsd	%xmm0, %xmm1, %xmm0
	vblendvpd	%xmm0, %xmm2, %xmm4, %xmm0
	retq
	nopw	%cs:(%rax,%rax)
;}

edit: The “why on earth” is not so much an honest question; I’m sure there are reasons of mathematical consistency, but at some point the silicon we’re running on is ground truth, not a math textbook.
edit2: Ok, FPU traps for sqrt are a stupid idea. Still, 99% of the code is perfectly fine with NaN square roots, and the rest should use some Base.safe_sqrt, and the same for floating point comparisons.