Sole from Train in Data

1.2K Followers

Pinned

Feature engineering for machine learning: What is it?

State-of-the-art feature engineering methods and Python libraries used in data science. Feature engineering is the process of transforming features, extracting features, and creating new variables from the original data, to train machine learning models. Data in its original format can almost never be used straightaway to train classification or regression models. …

Machine Learning

10 min read

Pinned

The 4 movies every data scientist should watch and how to use our skills responsibly.

Did you come here expecting to find Moneyball, The Imitation Game, or Minority Report? This is going to be a bit different. …

Data Science

8 min read

Pinned

How To Build And Deploy A Reproducible Machine Learning Pipeline

As companies and researchers rush to implement more and more machine learning practices in their organizations, they occasionally sacrifice an understanding of the complexities of statistical practice in order to achieve results faster. …

Machine Learning

12 min read

Nov 9, 2022

Data science and machine learning books

Did you come here expecting to find the “Hundred-page machine learning book” or “Elements of statistical learning”? This article is going to be a bit different. In this article, I want to highlight the 5 books that expose the controversial policies and business models, as well as the surveillance abuses…

Data Science

4 min read

Aug 16, 2022

Feature selection with Lasso in Python

Lasso is a regularization constraint introduced to the objective function of linear models in order to prevent overfitting of the predictive model to the data. The name Lasso stands for Least Absolute Shrinkage and Selection Operator. It turns out that the Lasso regularization has the ability to set some coefficients…

Machine Learning

6 min read
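
The teaser above notes that Lasso regularization can set some coefficients to zero, which is what makes it usable for feature selection. A minimal sketch of that idea with scikit-learn follows. The synthetic dataset, the `alpha=1.0` penalty strength, and the variable names are assumptions made for illustration and are not taken from the article.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 10 features, only 3 of which drive the target.
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)

# Lasso penalizes coefficient magnitude, so features should share a scale.
X_scaled = StandardScaler().fit_transform(X)

# alpha controls the strength of the L1 penalty; 1.0 is an illustrative value.
selector = SelectFromModel(Lasso(alpha=1.0, random_state=0))
selector.fit(X_scaled, y)

# Boolean mask of the features whose coefficients were not shrunk to zero.
selected_mask = selector.get_support()
n_selected = int(selected_mask.sum())
print("Selected feature indices:", np.flatnonzero(selected_mask))
```

In practice the penalty strength would usually be tuned, for example with cross-validation, rather than fixed by hand.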

Aug 16, 2022

Recursive feature elimination with Python

Recursive feature elimination (RFE) is the process of selecting features sequentially, in which features are removed one at a time, or a few at a time, iteration after iteration. Given a machine learning model, the goal of recursive feature elimination is to select features by recursively considering smaller and smaller…

Feature Selection

10 min read
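
The elimination loop described in the teaser above can be sketched with scikit-learn's `RFE` class. The synthetic dataset, the choice of logistic regression as the model, and the target of 4 features are assumptions made for illustration, not the article's own setup.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic data (assumption): 10 features, 4 of them informative.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, n_redundant=0, random_state=0
)

# Any estimator exposing coef_ or feature_importances_ can drive the ranking.
estimator = LogisticRegression(max_iter=1000)

# step=1 removes one feature per iteration until 4 remain.
rfe = RFE(estimator=estimator, n_features_to_select=4, step=1)
rfe.fit(X, y)

kept = rfe.support_      # boolean mask of the retained features
ranking = rfe.ranking_   # 1 = retained; larger values were eliminated earlier
print("Retained features:", kept.nonzero()[0])
print("Ranking:", ranking)
```

Setting `step` to a larger integer removes several features per iteration, which trades ranking resolution for speed.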

Aug 12, 2022

Mutual information with Python

Mutual information (MI) is a non-negative value that measures the mutual dependence between two random variables: the amount of information we can obtain about one variable by observing the values of the other. Mutual information is a good alternative to Pearson’s correlation coefficient, because…

Data Science

10 min read
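
As a rough illustration of the comparison with Pearson’s correlation mentioned above, the sketch below estimates mutual information with scikit-learn's `mutual_info_regression` on synthetic data. The quadratic relationship, the sample size, and the variable names are assumptions chosen for the example. A purely nonlinear, symmetric dependence is a case where Pearson’s coefficient stays close to zero while the estimated mutual information does not.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
n = 1000

# One feature determines the target through a quadratic; the other is noise.
x_dependent = rng.uniform(-1, 1, n)
x_noise = rng.uniform(-1, 1, n)
y = x_dependent**2 + rng.normal(0, 0.01, n)

X = np.column_stack([x_dependent, x_noise])

# Nearest-neighbor based estimate of MI between each column of X and y.
mi = mutual_info_regression(X, y, random_state=0)

# Pearson's coefficient only captures linear association.
pearson = np.corrcoef(x_dependent, y)[0, 1]

print("MI per feature:", mi)
print("Pearson correlation of x_dependent with y:", pearson)
```

For feature selection, these scores could then be passed to a selector such as `SelectKBest` to keep the highest-scoring features.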

Jul 5, 2022

Variable Discretization in Machine Learning

A common step before training machine learning algorithms is the discretization of continuous variables. But why do we discretize continuous variables, and how do we sort continuous data into discrete values? These are the questions that we will address throughout this article. Feature engineering for machine learning: This article is the fifth in a series…

Machine Learning

5 min read

Jun 2, 2022

Variance stabilizing transformations in machine learning

You’ve probably heard that before training machine learning models, data scientists transform random variables to change their distribution into something closer to the normal distribution. But, why do we do this? Which variables should we transform? Which transformations should we use? …

Machine Learning

12 min read

Apr 25, 2022

Alternative Feature Selection Methods in Machine Learning

You’ve probably done your online searches on “Feature Selection”, and you’ve probably found tons of articles describing the three umbrella terms that group selection methodologies, i.e., “Filter Methods”, “Wrapper Methods” and “Embedded Methods”. Under the “Filter Methods”, we find statistical tests that select features based on their distributions. These methods…

Feature Selection

12 min read

Data scientist, book author, online instructor (www.trainindata.com) and Python open-source developer. Get our regular updates: http://eepurl.com/hdzffv
