Overview
- Feature selection on Wikipedia
- scikit-learn doc: 1.13. Feature selection
- The feature selection chapter of the book Ensemble Machine Learning by Ankit Dixit: the workflow chart is excellent.
Implementation
Recursive feature elimination with cross-validation
Keywords: Python, RFECV
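As a rough illustration of what RFECV does, here is a minimal sketch on synthetic data. The dataset, the logistic regression estimator, and the 5-fold setup are assumptions chosen for the example and are not taken from the linked page.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic data (an assumption for illustration): 10 features, 3 informative.
X_rfe, y_rfe = make_classification(
    n_samples=300, n_features=10, n_informative=3,
    n_redundant=2, random_state=0,
)

# RFECV removes the weakest feature one at a time (ranked here by the
# magnitude of the logistic regression coefficients) and uses
# cross-validation to choose how many features to keep.
rfecv = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=1,
    cv=StratifiedKFold(n_splits=5),
    scoring="accuracy",
)
rfecv.fit(X_rfe, y_rfe)

print("Optimal number of features:", rfecv.n_features_)
print("Selected feature mask:", rfecv.support_)
print("Feature ranking (1 = selected):", rfecv.ranking_)
```

Any estimator that exposes `coef_` or `feature_importances_` after fitting can be swapped in for the logistic regression.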
Extracting Features with Transformers
Keywords: Python, SelectKBest, chi2
Data source: Adult
Chapter 5 of "Learning Data Mining with Python" by Robert Layton
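# Assumes `adult` is a pandas DataFrame holding the Adult dataset (loading not shown here).
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn.tree import DecisionTreeClassifier

# Five non-negative numeric columns (chi2 requires non-negative feature values).
X = adult[["Age", "Education-Num", "Capital-gain", "Capital-loss", "Hours-per-week"]].values
# The comparison string keeps the leading space used in the source.
y = (adult["Earnings-Raw"] == ' >50K').values

# Keep the three features with the highest chi-squared scores.
transformer = SelectKBest(score_func=chi2, k=3)
Xt_chi2 = transformer.fit_transform(X, y)

# Evaluate a decision tree on the reduced feature set with cross-validation.
clf = DecisionTreeClassifier(random_state=14)
scores_chi2 = cross_val_score(clf, Xt_chi2, y, scoring='accuracy')
print("Chi2 performance: {0:.3f}".format(scores_chi2.mean()))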
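To see which columns SelectKBest actually keeps, the fitted transformer exposes `scores_` and `get_support()`. The sketch below uses a small synthetic stand-in rather than the Adult data; the column names and the Poisson-distributed values are hypothetical and exist only to make the example self-contained.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.RandomState(14)
n_rows = 500
y_demo = rng.randint(0, 2, size=n_rows)

# Hypothetical non-negative count features: the first depends on the class,
# the other two are pure noise.
X_demo = np.column_stack([
    rng.poisson(lam=5 + 10 * y_demo),
    rng.poisson(lam=5, size=n_rows),
    rng.poisson(lam=5, size=n_rows),
])
demo_names = ["informative", "noise_a", "noise_b"]

selector_demo = SelectKBest(score_func=chi2, k=1).fit(X_demo, y_demo)

# scores_ holds one chi-squared score per column; get_support() marks the kept ones.
for name, score, kept in zip(demo_names, selector_demo.scores_,
                             selector_demo.get_support()):
    print("{0}: score={1:.1f}, kept={2}".format(name, score, kept))
```

Applied to the `transformer` fitted above, the same two attributes would show which three of the five Adult columns were selected.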
Feature selection for machine learning
The "Feature selection for machine learning" chapter of the book Ensemble Machine Learning.
Keywords: Python
Data source: Pima Indians Diabetes dataset
Note: the article "Feature Selection For Machine Learning in Python" covers almost the same material as this chapter.
Feature Selection in Python with Scikit-Learn
Data source: Iris
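One possible way to select features on Iris with scikit-learn is to rank them by tree-based importance. This is a sketch only: the choice of ExtraTreesClassifier, the "mean" threshold, and the variable names are assumptions, and the linked article may take a different approach.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

iris = load_iris()

# Fit an ensemble of randomized trees and keep the features whose
# importance is at least the mean importance (an illustrative threshold).
selector_iris = SelectFromModel(
    ExtraTreesClassifier(n_estimators=100, random_state=0),
    threshold="mean",
).fit(iris.data, iris.target)

importances = selector_iris.estimator_.feature_importances_
for name, importance, kept in zip(iris.feature_names, importances,
                                  selector_iris.get_support()):
    print("{0}: importance={1:.3f}, kept={2}".format(name, importance, kept))

X_iris_selected = selector_iris.transform(iris.data)
print("Reduced shape:", X_iris_selected.shape)
```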
Feature Selection with the Caret R Package
Keywords: R
Data source: Pima Indians Diabetes dataset
Logistic Regression
Section 13.2 in "R in Action".
Keywords: R, logistic
Data source: Affairs