Add wrapper methods random and forward feature selection #30
Conversation
Example `FeatureSelectionRandom` + `TerminatorEvaluations`:

```r
# Specify the task
task = mlr_tasks$get("boston_housing")

# Define the learner
learner = mlr_learners$get("regr.rpart")

# Choose resampling strategy
resampling = mlr_resamplings$get("cv", param_vals = list(folds = 5L))

# Specify performance evaluator
pe = PerformanceEvaluator$new(task = task,
                              learner = learner,
                              resampling = resampling)

# Specify terminator
tm = TerminatorEvaluations$new(max_evaluations = 10)

# Specify wrapper method
fs = FeatureSelectionRandom$new(pe = pe,
                                tm = tm,
                                batch_size = 10,
                                max_features = 8)

# Run feature selection
fs$calculate()

# Get best selection
fs$get_result()
```
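As a rough illustration of what random state generation with a `batch_size` and `max_features` constraint involves (a minimal Python sketch of the idea only — the function name and the rejection-sampling approach are mine, not the package's actual implementation):

```python
import random

def generate_random_states(n_features, batch_size, max_features):
    """Draw `batch_size` random 0-1 states (feature masks), each
    selecting between 1 and `max_features` features, by rejection
    sampling over uniform random masks."""
    states = []
    while len(states) < batch_size:
        state = [random.randint(0, 1) for _ in range(n_features)]
        # keep only masks that select at least one and at most
        # max_features features
        if 1 <= sum(state) <= max_features:
            states.append(state)
    return states

# boston_housing has 13 predictors; mirror the settings from the
# example above
states = generate_random_states(n_features=13, batch_size=10, max_features=8)
print(len(states))  # 10
```

Each state is then decoded into a feature subset and resampled; the terminator caps the total number of evaluations.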
`FeatureSelectionForward` + `TerminatorPerformanceStep`:

```r
# Specify the task
task = mlr_tasks$get("pima")

# Change measure
measures = mlr_measures$mget(c("classif.acc"))
task$measures = measures

# Define the learner
learner = mlr_learners$get("classif.rpart")

# Choose resampling strategy
resampling = mlr_resamplings$get("cv", param_vals = list(folds = 5L))

# Specify performance evaluator
pe = PerformanceEvaluator$new(task = task,
                              learner = learner,
                              resampling = resampling)

# Specify terminator
tm = TerminatorPerformanceStep$new(threshold = 0.01)

# Specify wrapper method
fs = FeatureSelectionForward$new(pe = pe, tm = tm)

# Run feature selection
fs$calculate()

# Get best selection
fs$get_result()

# Get optimization path
fs$get_optimization_path()
```
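The forward step and the performance-step termination can be sketched in a few lines of Python (names and signatures are illustrative only, not the package's API): each step enumerates the states that add exactly one unused feature, and the run stops when the improvement of the newly chosen state over the previous one drops below the threshold.

```python
def generate_forward_states(current):
    """All states reachable from `current` by setting exactly one
    0-bit to 1, i.e. adding one not-yet-selected feature."""
    return [current[:i] + [1] + current[i + 1:]
            for i, bit in enumerate(current) if bit == 0]

def performance_step_terminates(prev_score, new_score, threshold):
    """Compare the last two chosen states (higher score = better);
    terminate when the improvement falls below the threshold."""
    return (new_score - prev_score) < threshold

print(generate_forward_states([1, 0, 0]))  # [[1, 1, 0], [1, 0, 1]]
print(performance_step_terminates(0.80, 0.805, threshold=0.01))  # True
```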
Sounds reasonable. I think all kinds of termination should go into the terminator. And setting

Do these two incorporate all the functionality of Misc
Moved to #35
fixes #24

This is a basic implementation of mlr's `makeFeatSelControlRandom` and `makeFeatSelControlSequential`. The overall design is similar to `mlr3tuning`. I used many descriptions and some parts of the code from this package.

Classes

`FeatureSelection*`
- Each class has a `generate_states` method that generates different feature combinations (states) in a 0-1 encoding.
- `FeatureSelectionRandom`: n combinations are generated, depending on the `batch_size`.
- `FeatureSelectionForward`: all combinations of one step are generated.

`PerformanceEvaluator`
- Has an `evaluate_states` method that takes the states as an argument. For each state, the task with all features is cloned and a selection is applied based on the encoding of the state. All states are evaluated with `mlr3::benchmark`.
- In `evaluate_states`, the `states` are stored in a list entry in `self$states`.
- In `evaluate_states`, the `benchmark` object is stored in a list entry in `self$bmr`. This way, `FeatureSelectionForward` is able to generate the path of the stepwise selection.

Terminator
- Similar to the `Terminator` class in `mlr3tuning`.
- `TerminatorPerformanceStep` is specially designed to work with `FeatureSelectionForward`. It compares the last two chosen states and terminates if the performance improvement is under a certain threshold.

Discussion
- `get_result` and `get_optimization_path` print out the best feature combination or the steps of the feature selection.
- Should `binary_to_features`, which converts the 0-1 encoding to feature names, be a private method in `FeatureSelection`?
- `max_features` is not implemented for `FeatureSelectionForward` because it is something the `TerminatorPerformanceStep` object needs to know. I need to come up with an elegant way to do this. Maybe we have to make `max_features` an argument for `TerminatorPerformanceStep` and remove it as a setting for `FeatureSelectionForward`.
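The `binary_to_features` conversion mentioned above is just a mask lookup. A minimal Python sketch of the idea (the signature is assumed for illustration; the package's actual method may differ):

```python
def binary_to_features(state, feature_names):
    """Map a 0-1 state back to the names of the selected features."""
    return [name for bit, name in zip(state, feature_names) if bit == 1]

print(binary_to_features([1, 0, 1, 0], ["age", "glucose", "mass", "pedigree"]))
# ['age', 'mass']
```

Keeping this private to `FeatureSelection` would mean callers only ever see feature names, never the raw encoding.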