Feature selection with Lasso in Python

Sole from Train in Data
Aug 16, 2022

Lasso is a regularization technique that adds a penalty term to the objective function of linear models to prevent the model from overfitting the training data. The name Lasso stands for Least Absolute Shrinkage and Selection Operator.
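
To make the penalty concrete, here is the standard formulation of the Lasso objective (the usual textbook form, not quoted from this article). Writing the model prediction for observation i as ŷ_i and the coefficients as β_j, Lasso adds the sum of the absolute coefficient values, scaled by a penalty strength λ, to the squared-error loss:

$$\min_{\beta} \; \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$$

The larger λ is, the stronger the shrinkage, and the more coefficients end up at exactly zero.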

It turns out that the Lasso penalty has the ability to shrink some coefficients to exactly zero. This means that Lasso can be used for variable selection in machine learning: if the coefficient that multiplies a feature is zero, we can safely remove that feature from the data. The features with non-zero coefficients are the ones retained as important.
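
As a quick illustration of the idea, here is a minimal sketch using scikit-learn. It is not code from this article: the diabetes dataset, the alpha value and the pipeline layout are assumptions chosen for the example. We fit a Lasso inside SelectFromModel, which keeps only the features whose coefficients were not shrunk to zero.

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example regression dataset with 10 features (assumption for this sketch).
X, y = load_diabetes(return_X_y=True, as_frame=True)

# Lasso is sensitive to feature scale, so standardise before fitting.
# alpha controls the penalty strength: the larger it is, the more
# coefficients are shrunk to exactly zero.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("selector", SelectFromModel(Lasso(alpha=0.1))),
])

pipeline.fit(X, y)

# Keep only the features whose Lasso coefficients are non-zero.
selected_features = X.columns[pipeline.named_steps["selector"].get_support()]
print(list(selected_features))
```

Increasing alpha removes more features; decreasing it keeps more. In practice you would tune alpha, for example with cross-validation, rather than fixing it as done here.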

Lasso was designed to improve the interpretability of machine learning models by reducing the number of features they use. Ridge regression does not share this property: it shrinks coefficients towards zero but does not set them exactly to zero. Elastic net, which combines the L1 and L2 penalties, can also zero out coefficients thanks to its Lasso component.

Lasso regularization for feature selection

Let’s do a short recap on linear models and regularization.

For tutorials on feature selection check out our course Feature Selection for Machine Learning or our book Feature Selection in Machine Learning with Python.

Linear models

Linear regression models aim to predict the outcome based on a linear combination of the input features, where each feature is multiplied by a coefficient estimated from the training data.
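
In standard notation (an assumption about how the truncated paragraph continues), a linear regression model with p features x_1, …, x_p takes the form:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$$

where β_0 is the intercept and the coefficients β_j are the values that the Lasso penalty can shrink to zero.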
