Answer» When working with a dataset in data science or machine learning, not all the variables are necessary and useful for building a model. Smarter feature selection methods are required to avoid redundant features and to increase the efficiency of the model. The three main categories of feature selection methods are:
- Filter Methods:
- These methods select features based on the intrinsic properties of the features, measured via univariate statistics rather than cross-validated model performance. They are straightforward and are generally faster and less computationally expensive than wrapper methods.
- Common filter methods include the Chi-Square test, Fisher’s Score, Correlation Coefficient, Variance Threshold, Mean Absolute Difference (MAD), and Dispersion Ratio; a minimal example is sketched just below.
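For illustration, here is a minimal sketch of a filter method using scikit-learn's SelectKBest with the chi-square test; the dataset and the choice of k=2 are assumptions made for the example:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)  # iris features are non-negative, as chi2 requires

# Score each feature independently against the target (a univariate statistic)
# and keep the k highest-scoring features; no model is trained at this stage.
selector = SelectKBest(score_func=chi2, k=2)
X_selected = selector.fit_transform(X, y)

print(selector.scores_)        # chi-square score per feature
print(selector.get_support())  # boolean mask of the retained features
```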
- Wrapper Methods:
- These methods require a search strategy (often greedy) over the possible feature subsets, assessing the quality of each subset by training and evaluating a classifier on it.
- The selection technique is built around the machine learning algorithm that will ultimately be fitted on the given dataset.
- There are three common types of wrapper methods:
- Forward Selection: Here, features are added one at a time, keeping each new feature only if it improves the model, until a good fit is obtained.
- Backward Selection: Here, the model starts with all the features, and the least useful ones are eliminated one by one while checking whether the model performs better without them.
- Recursive Feature Elimination: Here, the model is fitted repeatedly; after each fit, the least important features are removed and the remaining ones are re-evaluated.
- These methods are generally computationally intensive and require high-end resources for the analysis, but they usually lead to better predictive models with higher accuracy than filter methods. A sketch of Recursive Feature Elimination is shown just below.
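For illustration, here is a minimal sketch of a wrapper method using scikit-learn's RFE (Recursive Feature Elimination) around a logistic regression classifier; the dataset, the choice of estimator, and the target of 10 features are assumptions made for the example:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)  # scaling helps the solver converge

# RFE repeatedly fits the estimator, drops the weakest features
# (by coefficient magnitude), and refits until 10 features remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000),
          n_features_to_select=10, step=1)
rfe.fit(X, y)

print(rfe.support_)   # mask of the selected features
print(rfe.ranking_)   # rank 1 = selected; larger ranks were eliminated earlier
```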
- Embedded Methods:
- Embedded methods combine the advantages of both filter and wrapper methods by taking feature interactions into account while maintaining reasonable computational costs.
- These methods are iterative: at each iteration of model training, they evaluate the features and retain those contributing most to the training in that iteration.
- Examples of embedded methods: LASSO regularization (L1) and Random Forest feature importance; a LASSO-based sketch is shown just below.
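For illustration, here is a minimal sketch of an embedded method using scikit-learn's SelectFromModel with LASSO; the dataset and the alpha value are assumptions made for the example:

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X = StandardScaler().fit_transform(X)

# The L1 penalty shrinks uninformative coefficients to exactly zero,
# so feature selection happens as a side effect of fitting the model.
selector = SelectFromModel(Lasso(alpha=0.1)).fit(X, y)

print(selector.estimator_.coef_)  # zeroed coefficients mark dropped features
print(selector.get_support())     # mask of features with non-zero coefficients
```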