Median of complex numbers different from Matlab output

Hi there everyone! This is my first ever post in the Julia community and since I’m unfamiliar with the various tags and categories on here, I just put this question in General.

I’m currently in the process of porting a PIV Matlab script over to Julia for a professor at my university. However, neither Julia nor Matlab are in my CS curriculum so it’s slow going finding the ins and outs of both languages!

The Issue I’m Having
The following code is part of a local median filter being iteratively used on small windows of a larger matrix composed of complex numbers and NaN values. I’ve created this little function which just filters out the NaN’s, then takes the median of the real and imaginary parts separately, and recombines them for the return value.

Julia code

using Statistics

function im_median(collection)
    i = filter(x -> !isnan(x), collection)

    if length(i) > 0
        real_part = median(real(i))
        im_part = median(imag(i))
        return real_part + im_part * im
    end

    # If all NaN
    return NaN
end

For your reference, here is the equivalent Matlab code I’m translating. Also, full transparency since this is a port, this function was created as part of the MatPIV package by John Peter Acklam:

function y = mnanmedian(x)
i = ~isnan(x);
if any(i)
  y = median(x(i));
else
  y = NaN;
end

end

I defined this matrix to try to pinpoint the differing outputs. A 3x3 matrix mimics the size of the filter window, basically a point in the grid and all of its neighbors:

nan_matrix = [
    0.5488135 + 0.71518937im  NaN 0.42365480 + 0.64589411im;
    0.43758721 + 0.89177300im  0.96366276 + 0.38344152im  NaN;
    0.56804456 + 0.92559664im  NaN 0.02021840 + 0.83261985im
]

Which yields the following output:

# Julia output:
0.493200355 + 0.77390461im

# Matlab output:
0.4932 + 0.8035i

As you can see, the output for the imaginary end of things differs by enough to cause problems for the PIV algorithm down the road.

Is there something I’m missing with my own implementation of the complex median function?

how does MATLAB compute medians of complex numbers? I see you are doing this

takes the median of the real and imaginary parts separately,

maybe MATLAB is taking the median via the absolute magnitudes of the numbers rather than separating the components?

3 Likes

Matlab understands the input as

>> mnanmedian

nan_matrix =

   0.5488 + 0.7152i      NaN + 0.0000i   0.4237 + 0.6459i
   0.4376 + 0.8918i   0.9637 + 0.3834i      NaN + 0.0000i
   0.5680 + 0.9256i      NaN + 0.0000i   0.0202 + 0.8326i
2 Likes

Hi,

just sharing my thoughts:

julia> sort(collect(imag.(nan_matrix))[:])
6-element Vector{Float64}:
 0.38344152
 0.64589411
 0.71518937
 0.83261985
 0.891773
 0.92559664

So the median clearly is 0.77390461im (as the mean of 0.71518937 and 0.83261985)

Edit: As surmised by @adienes, above.

According to this reply from The Mathworks, when the Matlab median function is given a list of complex numbers, they are sorted according to magnitude, and the median of this sorted list is calculated. So the following Julia method for complex arrays should be equivalent to the Matlab function:

function im_median(collection::AbstractArray{Complex{T}}) where {T}
    i = filter(x -> !isnan(x), collection)
    isempty(i) && return NaN
    sort!(i; by=abs)
    n = length(i)
    no2 = n ÷ 2
    isodd(n) && return i[no2+1]
    return (i[no2] + i[no2+1]) / 2
end

It produces the following result on your sample matrix:

julia> im_median(nan_matrix)
0.493200355 + 0.8034811850000001im

which appears to agree with the Matlab result.

3 Likes

Thanks a ton everyone! This really smoothed out the results from program. Also, I had no idea you could use && or || in the <cond> <bool op> <statement> format like that. I spend a lot of time while I’m working on this program wondering how true Julia writers might use the language differently.

You also don’t need to sort the entire array — you just need to make sure that the median element(s) are in the right place. That’s precisely the reason for partialsort!. There’s surely more ways you could continue to speed this up, but doing less sorting work can be a quick (and huge!) win:

function im_median_faster(collection::AbstractArray{Complex{T}}) where {T}
    i = filter(x -> !isnan(x), collection)
    isempty(i) && return NaN
    n = length(i)
    v = partialsort!(i, div(n+1, 2, RoundDown):div(n+1, 2, RoundUp); by=abs)
    return sum(v)/length(v)
end
julia> A = rand(Complex{Float64}, 10000);

julia> @btime im_median($A)
  1.717 ms (4 allocations: 312.59 KiB)
0.4918520015608561 + 0.5308805009885738im

julia> @btime im_median_faster($A)
  198.083 μs (4 allocations: 312.59 KiB)
0.4918520015608561 + 0.5308805009885738im
4 Likes

One easy way (that I should have used in my first post) is to use abs2 rather than abs. This also gives a big speedup:

julia> function im_median_faster_still(collection::AbstractArray{Complex{T}}) where {T}
           i = filter(x -> !isnan(x), collection)
           isempty(i) && return NaN
           n = length(i)
           v = partialsort!(i, div(n+1, 2, RoundDown):div(n+1, 2, RoundUp); by=abs2)
           return sum(v)/length(v)
       end
im_median_faster_still (generic function with 1 method)

julia> using BenchmarkTools

julia> A = rand(ComplexF64, 10_000);

julia> @btime im_median($A)
  2.339 ms (4 allocations: 312.59 KiB)
0.49722004577323964 + 0.6006781196428463im

julia> @btime im_median_faster($A)
  345.400 μs (4 allocations: 312.59 KiB)
0.49722004577323964 + 0.6006781196428463im

julia> @btime im_median_faster_still($A)
  94.400 μs (4 allocations: 312.59 KiB)
0.49722004577323964 + 0.6006781196428463im
3 Likes

You probably already found this, but short-circuit evaluation is discussed in this section of the Julia documentation.

Just wanted to report back in with my final solution! Sorting by=abs2 worked great on almost all cases. In the end though there were a bunch of cases where Matlab was sorting things in a really non-sensical manner and so my median output wasn’t matching. Luckily a Matlab buddy of mine pointed out that Matlab sorts first by absolute magnitude and then by phase angle!

So in the end this worked like a charm:
by=x->(abs2(x), angle(x))

8 Likes

This may have some issues across the angle() function discontinuity at +/- pi?

Code example
function im_median(collection::AbstractArray{Complex{T}}) where {T}
    i = filter(x -> !isnan(x), collection)
    isempty(i) && return NaN
    n = length(i)
    v = partialsort!(i, div(n+1, 2, RoundDown):div(n+1, 2, RoundUp); by=x->(abs2(x), angle(x)))
    return sum(v)/length(v)
end

Random.seed!(0)
n = 99
C = (1 .+ rand(n))  .* exp.(im .* (π .+ (rand(n) .- 0.5)/100))
Cm = im_median(C)

using Plots
scatter(C, c=:blues)
scatter!((Cm.re, Cm.im), c=:red, ms=6)
1 Like

The basic issue is that there isn’t a good total ordering of the complex numbers that respects “common-sense” algebraic identities.

If your goal is “do whatever Matlab does”, then it indeed documents that sorting (hence median) uses phase angle (in [-\pi, \pi]):

By default, the sort function sorts complex values by their magnitude, and breaks ties using phase angles.

which should match what @shindelr implemented in Julia despite being mathematically arbitrary.

4 Likes

It’s so arbitrary that I have a hard time imagining any application where it makes sense at all.

3 Likes

It doesn’t even match what you’d get if all the complex values happen to be real since it orders by absolute value. Truly whacky way of ordering complex numbers. Ordering lexicologically by (re, im) seems much better. In particular it would give real(median(v)) == median(real.(v)) which is something at least.

1 Like

(Brief History) After writing the code and testing I, the same as others in this thread, realized Matlab gives nonsense SORT ordering for REAL functions.
So I searched Julia discourse again (4th time is a charm), found this thread, and so whipped up the code after Keying off StefanKarpinski comment / NOTE Here:
StefanKarpinski Stefan_Karpinski (Viking helmet man to the Rescue !!)
Steward
Aug 2nd (2024)
It doesn’t even match what you’d get if all the complex values happen to be real since it orders by absolute value.
Truly WHACKY way of ordering complex numbers.
Ordering LEXICOlogically by (re, im) SEEMS much BETTER.

function complex_min(x,y)
			  ## treat NaN as being GREATER than any other value and so are sorted to the end of the list.
			   ## So isnan IMMEDIATELY DECIDES which variable to return as LESSER
			  if isnan(x)
				  return y
			  end 
			  if isnan(y)
				  return x
			  end 

			  if real(y) < real(x)	## NOTE Strictly less than
				return y	## x is NOT LT y
			  end 

			  if real(y) > real(x)	## NOTE Strictly greater than
				return x 
              end

			  ## Here IFF real(y) = real(x), so sort by imag
			  if imag(y) < imag(x)	## NOTE Strictly less than
			    return y	## x is NOT LT y
			  else
				return x 	## x IS LT y
			  end
		end ## end function

!! TDD UNIT TEST - YAY PASSED !!

tx=NaN + 0.0im; ty=1.0 + 0.0im
julia> complex_min(ty,tx)
1.0 + 0.0im

Reverse order SB same answer.

julia> complex_min(tx,ty)
1.0 + 0.0im

		function complex_isless(x,y)
			    ## treat NaN as being GREATER than any other value and so are sorted to the end of the list.
		         ## So isnan IMMEDIATELY DECIDES which variable to return as LESSER
		        if isnan(x)
					return false
		        end 
		        if isnan(y)
					return true
		        end 

			  if real(y) < real(x)	## NOTE Strictly less than
				return false	## x is NOT LT y
			  end 

			  if real(y) > real(x)	## NOTE Strictly greater than
				return true 
              end

			  ## Here IFF real(y) = real(x), so sort by imag
			  if imag(y) < imag(x)	## NOTE Strictly less than
			    return false	## x is NOT LT y
			  else
				return true 	## x IS LT y
			  end
		end ## end function

Marc Cox TDD QA TEST My Example code above Passes some TDD QA TESTS for Sort order of a 1Dim Vector of Complex numbers.

vec_1 = [-3, NaN, NaN,1+im, -3, 1, 1-im, -3-im, -3+im]
sort(vec_1, lt=complex_isless)

julia> vec_1 = [-3, NaN, NaN,1+im, -3, 1, 1-im, -3-im, -3+im]
9-element Vector{ComplexF64}:
-3.0 + 0.0im
NaN + 0.0im
NaN + 0.0im
1.0 + 1.0im
-3.0 + 0.0im
1.0 + 0.0im
1.0 - 1.0im
-3.0 - 1.0im
-3.0 + 1.0im

julia> sort(vec_1, lt=complex_isless)
9-element Vector{ComplexF64}:
-3.0 - 1.0im
-3.0 + 0.0im
-3.0 + 0.0im
-3.0 + 1.0im
1.0 - 1.0im
1.0 + 0.0im
1.0 + 1.0im
NaN + 0.0im
NaN + 0.0im

RE: Stefan Karpinski TDD Test: for a working example test case TRY vec_2 WARNING try WITHOUT NaN

vec_2 = [-3, 0.0, 0.5,1+im, -3, 1, 1-im, -3-im, -3+im]
julia> real(MEDIAN_2(vec_2)) == MEDIAN_2(real.(vec_2))
true

Marc Cox WARNING NOTE there are cases, e.g. NaN in the list,where it is critical that median always uses a SORTED list, NOT a middle FUNCTION

julia> real(MEDIAN_2(sort(vec_2, lt=complex_isless))) == MEDIAN_2(real.(sort(vec_2, lt=complex_isless)))
true

More TDD TESTS to do, to reverify intuition and correct operation of SORT ORDER in ALL CASES suggest try Visualizing Complex function SORT ordering with Julia DomainColoring.jl: Smooth Complex Plotting
at Home · DomainColoring.jl