I want to calculate the mutual information of two PMFs according to Eq. (1) on MathWorld, which I would then normalize to obtain the redundancy, R, as defined on Wikipedia. A particular feature of the problem is that the samples "behind" the PMFs are not available. It turned out that two PMFs, PX and PY, produced a negative mutual information, which looks suspicious to me.
My suspicious MWE:

```julia
# bincenters = (1:1:length(PX))'  # bin-centers are simply integers
PX = [0.010656 0.0130878 0.0291189 0.0262176 0.0350016 0.0604959 0.0620084 0.0695799 0.0813843 0.0831733 0.0807244 0.0763465 0.0981173 0.0632269 0.0648809 0.0661584 0.0316574 0.0189747 0.0194769 0.00971281];  # PMF of x
PY = [0.0172311 0.0162825 0.0379995 0.0257677 0.0390367 0.0609703 0.0624129 0.0723287 0.0810149 0.0842886 0.0910396 0.0840303 0.0704959 0.0664023 0.0486554 0.0366271 0.0321917 0.0234354 0.0151906 0.0345987];  # PMF of y

#=
import Pkg
Pkg.add("Plots")
using Plots
plot(PX; line=:stem, marker=:circle, color=:black, legend=:none)
plot!(PY; line=:stem, marker=:circle, color=:gray)
=#

PXY = PX .* PY'                             # joint probability if PX and PY are independent?
MI  = sum(PXY .* log2.(PXY ./ (PX .* PY)))  # mutual information
HX  = -sum(PX .* log2.(PX))                 # entropy of X
HY  = -sum(PY .* log2.(PY))                 # entropy of Y
R   = MI / (HX + HY)                        # normalized mutual information
```
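To make sure this is not something Julia-specific, I also ported the same computation to NumPy (the port and array layout are my own); it reproduces a small negative MI as well:

```python
# NumPy port of the Julia MWE above (port and variable names are mine),
# checking that the negative value is not a Julia quirk.
import numpy as np

PX = np.array([0.010656, 0.0130878, 0.0291189, 0.0262176, 0.0350016,
               0.0604959, 0.0620084, 0.0695799, 0.0813843, 0.0831733,
               0.0807244, 0.0763465, 0.0981173, 0.0632269, 0.0648809,
               0.0661584, 0.0316574, 0.0189747, 0.0194769, 0.00971281])
PY = np.array([0.0172311, 0.0162825, 0.0379995, 0.0257677, 0.0390367,
               0.0609703, 0.0624129, 0.0723287, 0.0810149, 0.0842886,
               0.0910396, 0.0840303, 0.0704959, 0.0664023, 0.0486554,
               0.0366271, 0.0321917, 0.0234354, 0.0151906, 0.0345987])

# PXY[i, j] = PY[i] * PX[j], matching PX .* PY' in the Julia code
PXY = PY[:, None] * PX[None, :]

# same denominator as in the MWE: the elementwise product PX .* PY,
# broadcast over the rows of PXY
MI = np.sum(PXY * np.log2(PXY / (PX * PY)))
HX = -np.sum(PX * np.log2(PX))
HY = -np.sum(PY * np.log2(PY))
R = MI / (HX + HY)
print(MI)  # a small negative number here as well
```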
I have a hard time figuring out why MI < 0 in this case, and the following points concern me:
- If PX and PY are independent, can the joint PXY be computed as I did (via the outer product PX .* PY')?
- If PX and PY are dependent, how should I calculate the conditional probabilities?
- Should I use samplers to produce the underlying realizations of x and y? (I have tried to get acquainted with samplers, but that is a new field of study for me, so I find it hard to get familiar with the Turing.jl documentation, despite its merits.)
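Regarding the first point, my (possibly naive) sanity check in NumPy: if the joint is built as the outer product of the marginals and the same outer product appears in the denominator, every ratio is exactly 1, so every log2 term vanishes and MI is exactly zero by construction. That is what makes the negative value so confusing to me:

```python
# Sanity check (my own sketch): for an independence-built joint, the ratio
# PXY / (PX (x) PY) is identically 1, so each log2 term is 0 and MI == 0.
import numpy as np

PX = np.array([0.2, 0.3, 0.5])   # toy PMFs, values are mine
PY = np.array([0.1, 0.6, 0.3])

PXY = np.outer(PY, PX)           # joint under the independence assumption
ratio = PXY / np.outer(PY, PX)   # identical arrays -> ratio is exactly 1.0
MI = np.sum(PXY * np.log2(ratio))
print(MI)  # exactly 0.0
```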
Ultimately, how can it happen that MI is negative? It is possible that I am missing some concepts from probability, so I am grateful for any suggestions.
Thank you in advance; in the meantime I will keep studying the docs.