Variable Discretization in Machine Learning
A common step before training machine learning algorithms is the discretization of continuous variables. But why do we discretize continuous variables, and how do we sort continuous data into discrete values?
These are the questions that we will address throughout this article.
Feature engineering for machine learning
This article is the fifth in a series of articles on feature engineering for tabular data. You can learn more about how data scientists preprocess their data at the following links:
- Feature engineering for machine learning
- Missing data imputation
- Encoding of categorical features
- Variable transformation
- Discretization (you are here)
- Feature scaling
- Feature creation from existing features
- Python libraries for feature engineering
- Resources for learning about feature engineering
Let’s now focus on discretization.
What is discretization?
In discretization, we convert continuous variables into discrete features by producing a collection of contiguous intervals that…
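To make the idea of contiguous intervals concrete, here is a minimal sketch in plain Python. The ages and the cut points are made up for illustration and are not taken from any dataset in this series. The intervals are assumed to be closed on the right, which is one common convention but not the only one.

```python
from bisect import bisect_left

# Hypothetical continuous variable: ages in years (illustrative values only).
ages = [3, 17, 25, 42, 68, 90]

# Hypothetical cut points. They define four contiguous intervals:
# (-inf, 18], (18, 40], (40, 65], (65, +inf)
cut_points = [18, 40, 65]

# bisect_left returns the position of the first cut point >= value,
# which is the index of the right-closed interval containing the value.
age_bins = [bisect_left(cut_points, age) for age in ages]

print(age_bins)
```

Each observation is replaced by the index of the interval it falls into, so the continuous variable becomes a discrete one with four possible values.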