Variable Discretization in Machine Learning

Sole from Train in Data
5 min read · Jul 5, 2022
Discretization of continuous variables in machine learning

A common step before training machine learning algorithms is the discretization of continuous variables. But why do we discretize continuous variables, and how do we sort continuous data into discrete values?

These are the questions that we will address throughout this article.

Feature engineering for machine learning

This article is the fifth in a series of articles on feature engineering for tabular data. You can learn more about how data scientists preprocess their data at the following links:

  1. Feature engineering for machine learning
  2. Missing data imputation
  3. Encoding of categorical features
  4. Variable transformation
  5. Discretization (you are here)
  6. Feature scaling
  7. Feature creation from existing features
  8. Python libraries for feature engineering
  9. Resources for learning about feature engineering

Let’s now focus on discretization.

What is discretization?

In discretization, we convert continuous variables into discrete features by producing a collection of contiguous intervals that…
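To make the idea of contiguous intervals concrete, here is a minimal sketch in plain Python. It assumes equal-width binning, which is only one of several possible ways to choose the interval limits and is not necessarily the method this article goes on to describe. The `ages` data and the helper names `equal_width_edges` and `discretize` are made up for illustration.

```python
from bisect import bisect_right


def equal_width_edges(values, n_bins):
    """Return n_bins + 1 edges that split [min, max] into equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def discretize(values, edges):
    """Map each value to the index of the interval that contains it.

    Intervals are closed on the left and open on the right, except the
    last one, which also includes the maximum value.
    """
    inner = edges[1:-1]  # interior cut points only
    return [bisect_right(inner, v) for v in values]


# Hypothetical continuous variable (illustrative data, not from the article)
ages = [18, 22, 25, 31, 38, 44, 52, 60, 67, 78]

edges = equal_width_edges(ages, n_bins=3)
bins = discretize(ages, edges)

print(edges)  # [18.0, 38.0, 58.0, 78.0]
print(bins)   # [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Each original value is replaced by the index of its interval, so the ten ages collapse into three discrete categories that together cover the whole observed range without gaps or overlaps.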


Sole from Train in Data

Data scientist, book author, online instructor (www.trainindata.com) and Python open-source developer. Get our regular updates: http://eepurl.com/hdzffv