Suppose you need to put a nail in a wall. What do you need to complete the job? A nail and a hammer. Choosing only the items the problem actually requires is the same idea as feature selection in machine learning. How far a problem can be simplified depends on which variables are included in the process, so feature selection plays an essential role in building a model of a problem.
This article covers the following topics:
- Techniques of Feature Selection
- Advantages and uses of feature selection
Introduction to Feature Selection
In machine learning, feature selection is the process of choosing the essential and useful variables for a particular model. It largely determines both the complexity and the performance of that model: if the features are not selected appropriately, the model becomes more complicated, slower, and bulkier, and its performance drops. Feature selection compares the candidate features, keeps those most relevant to the problem, and removes those that add little compared with the more useful ones. Afterwards only the essential, related variables remain, so the model is simpler to understand and often more accurate.
Figure 1: Feature selection
Techniques of Feature Selection
There are three techniques of feature selection, and all of them share one goal: to find the most relevant and useful features for the problem at hand. Each technique has its own way of searching for the best features and selecting the ones that are properly correlated with the problem.
The three feature selection techniques are:
- Filter Methods
- Wrapper Methods
- Embedded Methods
Filter Methods of Feature Selection
Filter methods do not generate new subsets of features. They only compare the available features, give each one a relevance score, and select the best-scoring ones. The more closely a feature is related to the problem, the higher its score; the less related it is, the lower its score. In the end, the highest-scoring features are selected for the model.
For example, return to putting a nail in the wall. Suppose we have a toolbox full of tools and need only the essential ones to hammer in the nail. We pick the right tools by filtering them: if the toolbox contains a screwdriver and a hammer, the hammer gets the higher score because it drives a nail into the wall easily, so the hammer is selected.
- The filter method only compares features, so it is fast.
- It selects the most relevant features for the problem.
- It does not generate feature subsets, and sometimes the available features are not sufficient, so the model has to work with only the features it is given.
- It does not check the model's performance after selecting the features.
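The scoring idea behind filter methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say which score to use, so the sketch assumes the absolute Pearson correlation with the target, and the names `pearson` and `filter_select` as well as the toy data are made up for this example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in xs))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no information
    return cov / (spread_x * spread_y)

def filter_select(X, y, k):
    """Score every feature on its own and keep the k highest-scoring ones."""
    n_features = len(X[0])
    # Assumed scoring rule: absolute correlation with the target.
    scores = [abs(pearson([row[f] for row in X], y)) for f in range(n_features)]
    ranked = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    return ranked[:k], scores

# Toy data (hypothetical): column 0 tracks the target closely, the others less so.
X = [
    [1.0, 5.0, 0.3],
    [2.0, 3.0, 0.1],
    [3.0, 6.0, 0.4],
    [4.0, 2.0, 0.2],
    [5.0, 4.0, 0.5],
]
y = [1.1, 1.9, 3.2, 3.9, 5.1]

selected, scores = filter_select(X, y, k=2)
print("selected feature indices:", selected)
print("scores:", [round(s, 3) for s in scores])
```

Note that no model is trained anywhere in this sketch, which is why the approach is fast and also why it cannot tell how the chosen features will perform in the final model.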
Wrapper Methods of Feature Selection
Wrapper methods work iteratively: they keep going until a suitable set of features is found for the particular problem. A wrapper method is entirely different from a filter method. It depends on how the model itself performs, and is therefore usually more accurate. It selects some features, trains the model on them, and, if the result is not good enough, keeps trying with other features or newly generated subsets, depending on the mode of the method. A wrapper method can work in three modes:
- Forward Selection
- Backward Elimination
- Recursive Feature Elimination
- Forward Selection: The model starts with no features. The available features are added one at a time, each addition is tested on the model, and a feature is kept if it improves the model.
- Backward Elimination: This is the opposite of forward selection. All the variables are used at first, and the unwanted ones are then removed cycle by cycle.
We can relate this to our example. Given a toolbox, we would try to hammer the nail with different tools. We first pick the screwdriver, and it fails, so we keep trying other tools until we find the hammer or another tool that can drive the nail.
- Recursive Feature Elimination: In this mode, the process repeatedly builds models on different sets of features, ranks the features by how well they perform, sets the worst-performing ones aside in each round, and keeps going until only the best-performing features are left.
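Forward selection and backward elimination can be sketched as two small greedy loops around a model. The sketch below rests on assumptions the article does not state: the "model" is a 1-nearest-neighbour classifier scored by leave-one-out accuracy, the stopping rules are one reasonable choice among several, and the function names and toy data are hypothetical.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given feature columns."""
    correct = 0
    for i, row in enumerate(X):
        others = [j for j in range(len(X)) if j != i]
        nearest = min(
            others,
            key=lambda j: sum((row[f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

def forward_selection(X, y):
    """Start with no features; add the best one while the score improves."""
    remaining = list(range(len(X[0])))
    selected, best_score = [], float("-inf")
    while remaining:
        score, feature = max(
            (loo_accuracy(X, y, selected + [f]), f) for f in remaining
        )
        if score <= best_score:  # no candidate improves the model: stop
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score

def backward_elimination(X, y):
    """Start with all features; drop one while the score does not get worse."""
    selected = list(range(len(X[0])))
    best_score = loo_accuracy(X, y, selected)
    while len(selected) > 1:
        score, feature = max(
            (loo_accuracy(X, y, [g for g in selected if g != f]), f)
            for f in selected
        )
        if score < best_score:  # every removal hurts the model: stop
            break
        selected.remove(feature)
        best_score = score
    return selected, best_score

# Toy data (hypothetical): column 0 separates the classes,
# columns 1 and 2 are noise, and column 2 is on a much larger scale.
X = [
    [0.0, 3.0, 10.0], [0.5, 1.0, 50.0], [1.0, 4.0, 30.0], [0.2, 2.0, 70.0],
    [5.0, 3.5, 12.0], [5.5, 1.5, 52.0], [6.0, 4.5, 28.0], [5.2, 2.5, 68.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

print("forward selection:   ", forward_selection(X, y))
print("backward elimination:", backward_elimination(X, y))
```

Both loops retrain and re-score the model for every candidate subset, which is what makes wrapper methods slower than filter methods but ties the choice of features directly to model performance.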
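Embedded Methods of Feature Selection
Embedded methods combine the qualities of filter and wrapper methods. They are implemented by algorithms that have feature selection built in. The best-known examples are regularized regressions such as LASSO and Ridge, which have built-in penalty terms to reduce overfitting. LASSO's L1 penalty can shrink coefficients to exactly zero and so drops features from the model, while Ridge's L2 penalty only shrinks coefficients toward zero without removing any feature.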
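To make the embedded idea concrete, here is a minimal sketch of LASSO fitted by coordinate descent, where the L1 penalty sets the coefficients of weak features to exactly zero during training. It assumes centred data with no intercept and minimises (1/(2n))·||y − Xw||² + alpha·||w||₁; the function names, the value of `alpha`, and the toy data are assumptions made for this example, not something taken from a particular library.

```python
def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha; anything within [-alpha, alpha] becomes 0."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iters=100):
    """Fit LASSO weights by cycling through the features one at a time."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iters):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / z
    return w

# Toy data (hypothetical): centred columns; y = 2 * column0 + 0.1 * column1.
X = [
    [-1.5,  1.0, -0.5],
    [-0.5, -1.0,  1.5],
    [ 0.5, -1.0, -1.5],
    [ 1.5,  1.0,  0.5],
]
y = [-2.9, -1.1, 0.9, 3.1]

weights = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [j for j, w_j in enumerate(weights) if w_j != 0.0]
print("weights:", [round(w_j, 3) for w_j in weights])
print("selected feature indices:", selected)
```

The selection happens as a side effect of fitting the model: features whose weight ends at zero are simply not used. A Ridge penalty in the same loop would shrink the weights but leave all of them non-zero.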
Wikipedia, “Feature selection”, https://en.wikipedia.org/wiki/Feature_selection
Satish Kaushik, “Introduction to Feature Selection methods with an example (or how to select the right variables?)”, https://www.analyticsvidhya.com/blog/2016/12/introduction-to-feature-selection-methods-with-an-example-or-how-to-select-the-right-variables/
Sebastian Raschka, “Machine Learning FAQ”, https://sebastianraschka.com/faq/docs/feature_sele_categories.html