Best Resources to Learn Feature Engineering

Sole from Train in Data
13 min read · Aug 11, 2020
Figure: Feature engineering for machine learning. Image from the author.

Raw data can rarely be used directly to train machine learning models. Instead, data scientists spend a large share of their time transforming the data and building features suitable for machine learning.

The process of transforming variables and creating new features is called feature engineering, and it is typically the stage of a machine learning project where data scientists invest most of their effort.

As Pedro Domingos said in the article “A few useful things to know about machine learning”:

“At the end of the day, some machine learning projects succeed and some fail. What makes the difference? Easily the most important factor is the features used”.

Feature engineering and data pre-processing are also, for many of us, the most interesting parts of a data science project, where we can combine creativity and intuition with domain knowledge to create meaningful features.

Some aspects of feature engineering are domain-specific: we need to understand the data and the business area to derive useful features. But a large part of feature engineering is also repetitive and can be automated.
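To make the distinction concrete, here is a minimal sketch using pandas. The toy data, the column names (`income`, `debt`, `city`), and the helper `engineer_features` are all hypothetical and chosen only for illustration. The debt-to-income ratio stands in for a feature that requires domain knowledge, while median imputation and one-hot encoding stand in for the repetitive steps that can be applied mechanically to any dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
raw = pd.DataFrame(
    {
        "income": [3000.0, None, 5200.0, 4100.0],
        "debt": [600.0, 1500.0, 1300.0, 0.0],
        "city": ["London", "Leeds", "London", "York"],
    }
)


def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper that mixes generic and domain-specific steps."""
    out = df.copy()

    # Repetitive step: fill missing numeric values with each column's median.
    num_cols = out.select_dtypes(include="number").columns
    out[num_cols] = out[num_cols].fillna(out[num_cols].median())

    # Repetitive step: identify the non-numeric columns to encode later.
    cat_cols = list(out.select_dtypes(exclude="number").columns)

    # Domain-specific step: a ratio that only makes sense if we know
    # what the two columns mean in the business context.
    out["debt_to_income"] = out["debt"] / out["income"]

    # Repetitive step: one-hot encode the categorical columns.
    return pd.get_dummies(out, columns=cat_cols, dtype=int)


features = engineer_features(raw)
print(features)
```

In a real project the median would be learned from the training set only and then reused on new data; the sketch skips that split to stay short.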

Written by Sole from Train in Data

Data scientist, book author, online instructor (www.trainindata.com) and Python open-source developer. Get our regular updates: http://eepurl.com/hdzffv
