# Mutual information with Python

Mutual information (MI) is a non-negative quantity that measures the mutual dependence between two random variables: it quantifies how much information observing one variable provides about the other.
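To make the definition concrete, here is a minimal plug-in estimate of MI for two discrete variables, computed directly from the empirical joint and marginal frequencies (the helper name `mutual_information` is my own, not from any library):

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of MI (in nats) between two discrete sequences."""
    x = np.asarray(x)
    y = np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            p_xy = np.mean((x == xv) & (y == yv))  # joint frequency
            p_x = np.mean(x == xv)                 # marginal frequencies
            p_y = np.mean(y == yv)
            if p_xy > 0:                           # 0 * log 0 is taken as 0
                mi += p_xy * np.log(p_xy / (p_x * p_y))
    return mi

# A variable shares maximal information with itself:
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # log(2) ≈ 0.693

# Independent variables share no information:
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

When the two arguments are identical, the MI equals the entropy of the variable; for the independent pair, every joint frequency factorizes into the product of marginals, so each log term vanishes.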

Mutual information is a useful alternative to Pearson’s correlation coefficient because it can detect any type of relationship between variables, not just linear associations. In addition, unlike Pearson’s correlation coefficient, it is suitable for both continuous and discrete variables.
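A quick sketch of this point: for a symmetric quadratic relationship, Pearson’s correlation is near zero while a plug-in MI estimate is clearly positive (the estimate below simply sums empirical joint frequencies against the product of marginals):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(-2, 3, size=1000)  # values in {-2, ..., 2}, symmetric around 0
y = x ** 2                          # deterministic but non-linear dependence

# Pearson's correlation misses the relationship (cov(x, x^2) ≈ 0 by symmetry).
pearson = np.corrcoef(x, y)[0, 1]
print(pearson)

# A plug-in MI estimate from the empirical joint distribution detects it.
mi = 0.0
for xv in np.unique(x):
    for yv in np.unique(y):
        p_xy = np.mean((x == xv) & (y == yv))
        if p_xy > 0:
            mi += p_xy * np.log(p_xy / (np.mean(x == xv) * np.mean(y == yv)))
print(mi)
```

Because `y` is a deterministic function of `x`, the MI here equals the entropy of `y`, which is strictly positive, while the linear correlation stays close to zero.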

MI is closely related to the concept of entropy. I will therefore first introduce entropy and show how to compute it for a discrete variable. Next, I will show how to compute the MI between discrete variables and then extend the definition to continuous variables. Finally, I will present a Python implementation of feature selection based on MI.

In summary, in the following paragraphs we will discuss:

- Entropy.
- Relative entropy.
- Mutual information of discrete variables.
- Mutual information of continuous variables.
- Feature selection based on MI with Python.