Negative zeros and sorting

Gus_Hart · August 3, 2022, 5:25pm

Negative zeros are equivalent to positive zeros, and yet in sorting the negatives come before positives. This behavior caused a very subtle bug in a function I wrote to compare clusters of points. I’m wondering if there is a way to disable this? Or if someone has an idea for a workaround that’s less invasive than checking every single zero entry in my 2D arrays and absing them every time I apply a rotation?

julia> 0.0==-0.0
true

julia> sort([0.0,-0.0])
2-element Vector{Float64}:
 -0.0
  0.0

Oscar_Smith · August 3, 2022, 5:27pm

Note that Julia’s default sorting algorithm isn’t stable, so even if they were considered equal, there is no guarantee that they won’t be swapped (unless you specify a stable sorting algorithm).

Gus_Hart · August 3, 2022, 5:33pm

I’m OK with that. I’m sorting columns of matrices and this behavior messed me up:


julia> sortslices(rots[22]*r1[6],dims=2)
3×3 Matrix{Float64}:
 -1.0  -0.0  0.0
  0.0   0.5  0.0
  0.0   0.5  0.0

Note that columns 2 and 3 are backwards, whereas they would be correct if the sorting algorithm treated both zeros in row 1 the same.

This caused my equivalence checker not to be reflexive:

julia> isMatrixEqvBySymmetry(r1[5],r1[6],rots)
true

julia> isMatrixEqvBySymmetry(r1[6],r1[5],rots)
false

because in one instance the negative zero wasn’t generated in the matrix multiplication but it was in the other.

As a temporary work around, I changed t = sortslices(iRot*iFig,dims=2) to

t = iRot*iFig
t[findall(t.==0.0)] .= 0.0
t = sortslices(t,dims=2)

I’m not sure how much the findall will slow me down for small matrices, but I hit this code block many, many times in a typical function call.

lmiq · August 3, 2022, 5:58pm

That will be slow, use a loop

for i in eachindex(t)
    @inbounds t[i] = t[i] == 0.0 ? 0.0 : t[i]
end

(although if your code critically depends on that comparison, I would be afraid to rely on that).

Oscar_Smith · August 3, 2022, 6:03pm

You can also use t .+= zero(eltype(t)) which will be faster.

Gus_Hart · August 3, 2022, 6:04pm

@lmiq Thanks Leandro. Your solution was about 10% faster in my use case:

julia> @benchmark isMatrixEqvBySymmetry($r1[6],$r1[5],$rots)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  127.375 μs … 199.084 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     129.583 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   130.142 μs ±   2.339 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

          ▅▄█▅▆▁▂                                                
  ▁▁▁▂▃▄▇████████▆▅▃▃▂▂▂▃▂▃▂▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  127 μs           Histogram: frequency by time          139 μs <

 Memory estimate: 91.25 KiB, allocs estimate: 1312.

┌ Warning: Replacing docs for `Main.clusters.isMatrixEqvBySymmetry :: Tuple{Any, Any, Any}` in module `Main.clusters`
└ @ Base.Docs docs/Docs.jl:240
Main.clusters.isMatrixEqvBySymmetry

julia> isMatrixEqvBySymmetry(r1[6],r1[5],rots)
false

julia> @benchmark isMatrixEqvBySymmetry($r1[6],$r1[5],$rots)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  115.667 μs … 173.458 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     117.542 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   118.143 μs ±   2.202 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▁▂▇▅█▄▅▁                                                  
  ▁▁▂▃▄████████▅▄▃▂▂▂▂▂▂▃▃▃▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  116 μs           Histogram: frequency by time          127 μs <

 Memory estimate: 78.50 KiB, allocs estimate: 1168.

JeffreySarnoff · August 3, 2022, 6:04pm

sortslices(mat, dims=d, lt=(x,y)->f(x,y))
use d=1 to sort the rows, d=2 to sort the columns
isless(0.0, -0.0) == false
isless(-0.0, 0.0) == true


julia> m
3×3 Matrix{Float64}:
 -1.0  -0.0  0.0
  0.0   0.5  0.0
  0.0   0.5  0.0

julia> sortslices(m, dims=2, lt=(a,b)->isless(a,b))
3×3 Matrix{Float64}:
 -1.0  -0.0  0.0
  0.0   0.5  0.0
  0.0   0.5  0.0

julia> sortslices(m, dims=1, lt=(a,b)->isless(a,b))
3×3 Matrix{Float64}:
 -1.0  -0.0  0.0
  0.0   0.5  0.0
  0.0   0.5  0.0

julia> sortslices(m, dims=2, lt=(a,b)->isless(b,a))
3×3 Matrix{Float64}:
 0.0  -0.0  -1.0
 0.0   0.5   0.0
 0.0   0.5   0.0

julia> sortslices(m, dims=1, lt=(a,b)->isless(b,a))
3×3 Matrix{Float64}:
  0.0   0.5  0.0
  0.0   0.5  0.0
 -1.0  -0.0  0.0

you can also use lt=(a,b)->isless(-b,-a)

mikmoore · August 3, 2022, 6:06pm

You can use the lt (“less than”) argument to redefine the inequalities used for sorting. As other suggested, you also may want to use a stable sorting algorithm.

sort(x; alg=Base.Sort.DEFAULT_STABLE, lt=<).

Note that using < instead of the default isless will also change the way NaN is handled and likely cause you trouble if you have any. But, assuming a NaN-free input, should not switch -0.0 and +0.0.

EDIT:
I think I misunderstood what you were after and would rather just replace all -0.0 with +0.0.

julia> replace!([0.0,-0.0,1.0,-1.0],-0.0=>0.0)
4-element Vector{Float64}:
  0.0
  0.0
  1.0
 -1.0

Gus_Hart · August 3, 2022, 6:07pm

Oh my! That is clever. Just as fast as @lmiq 's solution (according to my benchmark, not posted) and so concise.

Gus_Hart · August 3, 2022, 6:15pm

The lt=< was what I initially was hoping for when I posted my question. That works. Thank you. And I don’t have to change any negative zeros. And using lt=< in the call to sort is just as fast (according my benchmark of my use case) as any of the other solutions that focused on replacing the negative zeros.

EDIT:
I think I misunderstood what you were after and would rather just replace all -0.0 with +0.0.
julia> replace!([0.0,-0.0,1.0,-1.0],-0.0=>0.0)
4-element Vector{Float64}:
  0.0
  0.0
  1.0
 -1.0

That’s also a nice solution to getting rid of the negative zeros.

lmiq · August 3, 2022, 6:18pm

Can any behavior regarding the difference between 0.0 and -0.0 considered part of an API? Is there a standard for that?

Oscar_Smith · August 3, 2022, 6:23pm

yes. Julia follows IEEE 754 for it’s handling of floating point numbers.

Gus_Hart · August 3, 2022, 6:40pm

Hmm…I thought it was working but it isn’t. The lt=< gives the same behavior:

julia> t
3×3 Matrix{Float64}:
 -0.5  0.0  -0.0
 -0.5  0.0   0.0
  0.0  0.0   1.0

julia> sortslices(t,dims=2)
3×3 Matrix{Float64}:
 -0.5  -0.0  0.0
 -0.5   0.0  0.0
  0.0   1.0  0.0

julia> sortslices(t,dims=2;lt=<)
3×3 Matrix{Float64}:
 -0.5  -0.0  0.0
 -0.5   0.0  0.0
  0.0   1.0  0.0

This is confusing because

julia> 0.0 > -0.0
false

julia> 0.0 < -0.0
false

Sorting with the < should not swap a negative zero with a positive one. Maybe sortslices is a bit different than sort?

Apparently so:

julia> sort([0.0,-0.0])
2-element Vector{Float64}:
 -0.0
  0.0

julia> sort([0.0,-0.0];lt=<)
2-element Vector{Float64}:
  0.0
 -0.0

mikmoore · August 3, 2022, 8:04pm

julia> [1,2,3] < [4,5,6]
true

julia> [1,2,3] < [0,5,6]
false

The lt=< suggestion I made applied to scalars. For vectors, it appears that < defers to isless which uses a lexicographic order, which recursively appears to use isless on the elements. Thus, for sortslices, < is no different than the default isless. You’ll need to write your own function that determines whether a vector is less than another vector (using < for the element comparisons rather than isless), then pass that function as lt to sortslices.

Topic		Replies	Views
Descending countingSort in Julia New to Julia	1	453	January 11, 2021
Sort and dims=1 for vector input New to Julia question	8	294	March 30, 2021
Sorting instability issue General Usage question	11	153	December 11, 2024
Julia 1.7 sort issue? General Usage	5	160	July 28, 2024
Sorting a Matrix by column1 then column2 General Usage sort , arrays	2	1320	February 25, 2021

Negative zeros and sorting

Related topics