How can it happen that mutual information is negative?

ladislaus · May 14, 2022, 2:15pm

Dear Community,

I am about to calculate the mutual information of two PMFs according to Eq. (1) on MathWorld, which I would then normalize to get the redundancy, R, as in the Wiki article. A particular feature of the problem is that the samples “behind” the PMFs are not available. It turned out that two PMFs, PX and PY, yield a negative mutual information, which looks suspicious to me.

# My suspicious MWE: 
# bincenters = (1:1:length(PX))' # bin-centers are simply integers

PX = [0.010656 0.0130878 0.0291189 0.0262176 0.0350016 0.0604959 0.0620084 0.0695799 0.0813843 0.0831733 0.0807244 0.0763465 0.0981173 0.0632269 0.0648809 0.0661584 0.0316574 0.0189747 0.0194769 0.00971281]; # PMF of x

PY = [0.0172311 0.0162825 0.0379995 0.0257677 0.0390367 0.0609703 0.0624129 0.0723287 0.0810149 0.0842886 0.0910396 0.0840303 0.0704959 0.0664023 0.0486554 0.0366271 0.0321917 0.0234354 0.0151906 0.0345987]; # PMF of y
	
#=
import Pkg
Pkg.add("Plots")
using Plots 
plot(PX; line=:stem, marker=:circle, color=:black, legend =:none)
plot!(PY; line=:stem, marker=:circle, color=:gray)
=#

PXY = PX.*PY' 							# joint probability if PX and PY are independent ?
MI  = sum(PXY.*log2.(PXY./(PX.*PY))) 	# mutual information
HX  = -sum(PX.*log2.(PX)) 				# entropy X
HY  = -sum(PY.*log2.(PY)) 				# entropy Y
R   = MI/(HX+HY) 						# normalized mutual information

I am having a hard time figuring out why MI < 0 in this case, and the following questions concern me:

If PX and PY are independent, can PXY be used as I did?
If PX and PY are dependent, how should I calculate the conditional probabilities?
Should I use samplers to produce realizations of x and y? (I tried to make friends with samplers, but that is a new field of study for me, so I find it hard to get familiar with the documentation of Turing.jl, despite its merits.)

Ultimately, how can it happen that MI is negative? It is possible that I am missing some concepts from probability, so I am grateful for your suggestions.
Thank you in advance; in the meantime I will keep studying the docs.

Kind regards,
ladislaus

mcabbott · May 14, 2022, 4:40pm

Yes. But in this case, mutual information is zero.
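
For reference, the standard definition makes the zero explicit: when the joint PMF factorizes into its marginals, every logarithm vanishes. The symbols p_{XY}, p_X and p_Y are generic notation introduced here, not names from the post.

```latex
I(X;Y) = \sum_{x,y} p_{XY}(x,y)\,\log_2 \frac{p_{XY}(x,y)}{p_X(x)\,p_Y(y)},
\qquad
p_{XY}(x,y) = p_X(x)\,p_Y(y)
\;\Rightarrow\;
I(X;Y) = \sum_{x,y} p_X(x)\,p_Y(y)\,\log_2 1 = 0 .
```

Since I(X;Y) is a Kullback-Leibler divergence between the joint PMF and the product of its marginals, it cannot be negative for a valid joint PMF, so a negative value points to a mistake in the computation.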

Your formula is missing a transpose, though; it should be sum(PXY.*log2.(PXY./(PX.*PY'))).
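
A minimal sketch of the shape issue, using short hypothetical PMFs rather than the data from the post. With 1×n row vectors, `PX .* PY'` broadcasts to an n×n matrix while `PX .* PY` stays 1×n, so the original ratio divides each joint entry by the wrong denominator. The names `mi_buggy`, `mi_fixed`, `cross` and `P_dep`, and all numeric values, are assumptions made for illustration.

```julia
# Hypothetical PMFs as 1×3 row vectors (space-separated literals, as in the post).
PX = [0.1 0.8 0.1]
PY = [0.2 0.6 0.2]

# Joint PMF under independence: 3×3, with PXY[i, j] = PY[i] * PX[j].
PXY = PX .* PY'

# Original expression: the denominator PX .* PY is only 1×3 and gets broadcast
# down the rows, so the ratio becomes PY[i] / PY[j] instead of 1.
mi_buggy = sum(PXY .* log2.(PXY ./ (PX .* PY)))

# Corrected expression: the denominator has the same 3×3 layout as PXY.
mi_fixed = sum(PXY .* log2.(PXY ./ (PX .* PY')))

# The buggy value reduces to a cross-entropy term minus the entropy of Y.
# That quantity is not a mutual information and can take either sign.
HY    = -sum(PY .* log2.(PY))
cross = -sum(PX .* log2.(PY))
println("buggy = ", mi_buggy, "   cross - HY = ", cross - HY)
println("fixed = ", mi_fixed)

# A dependent case needs the joint PMF itself (hypothetical values here, with
# rows indexing X and columns indexing Y); the marginals alone do not fix it.
P_dep = [0.30 0.05; 0.10 0.55]
px = sum(P_dep; dims=2)   # 2×1 marginal of X
py = sum(P_dep; dims=1)   # 1×2 marginal of Y
mi_dep = sum(P_dep .* log2.(P_dep ./ (px .* py)))
println("dependent = ", mi_dep)
```

In this sketch the corrected expression gives zero for the product joint, the original expression gives a negative number, and only a joint PMF that does not factorize gives a positive mutual information. If a joint PMF contains zero entries, those cells would need to be skipped before taking the logarithm.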
