add chapter on validation and internal tuning #829

Open · wants to merge 29 commits into main
Conversation

@sebffischer commented Aug 16, 2024

TODOs:

@sumny commented Aug 18, 2024

Minor wording suggestions (these can also be left out):

  • "where we would fit again and again with different iterations numbers." -> where we would fit the model again and again with different iterations numbers.
  • lightgbm -> LightGBM
  • catboost -> CatBoost
  • "test" to use the test set as validation data, which only works in combination with resampling and tuning." this sounds as if we would leak test data but it is just the test split from the resampling (so the validation split during HPO, correct?)
  • "we can no access this through the $model slit." no -> now; slit -> slot
  • "for training to end" -> for training to stop early
  • "By using early stopping, we were able to already terminate training 38 rounds. " -> terminate training after 38 rounds; (also training might have been performed actually longer, due to the patience?)
  • "We see that after a logloss plateaus." -> We can see that the logloss plateaus after 38 rounds.
  • "as it allows to the internal tuning of a Learner with (non-internal) hyperparameter" -> allows to perform internal tuning ...
  • "In such scenarios, what one" -> In such scenarios, one
  • "We also have to say" -> We also have to specify
  • "You can find out which ones support this feature by checking the corresponding documentation." Maybe also give one example
  • "as we specified validate =“test”. By visualizing the results we can see an inverse relationship between the two tuning parameters: a larger step size (eta) requires more boosting iterations (nrounds`)." Formatting is weird
  • "We can also prediction objects" Verb missing
  • "we its predict_sets field. " Verb missing
  • "Here can only select from those predict sets that we configured the Learner to predict on." Verb missing
  • "Because the penguins task" -> As
  • "select an evaluation metric to classification error" -> set the
  • "Then, show" -> Then, visualize (or print?)
  • "lightgbm" -> LightGBM
  • "xgboost" -> XGBoost
  • "why the code above errs" -> why the code above errrors
  • "Don’t tune any other parameters than the learning rate, which is possible by using tnr("internal")" somewhat unclear. Tune nrounds internally and then only tune the learning rate?

otherwise, great job!

@jemus42 commented Sep 5, 2024

I'm trying to add early stopping to the XGBoost learner in my benchmark based on this chapter, and I'm not sure whether I'm just misunderstanding a few things or whether the chapter could be extended in that regard.

My problem is that I'm using an AutoTuner with a given search space for tuning, but I would like to also internally use early stopping and thereby tune nrounds.

One of my naive attempts is below:

library(mlr3)
library(mlr3tuning)
library(mlr3pipelines)
library(mlr3proba)
library(mlr3extralearners)

task = tsk("lung")
xgb_base = lrn("surv.xgboost.cox", 
               early_stopping_rounds = 10,
               nrounds = to_tune(upper = 1000, internal = TRUE),
               tree_method = "hist", booster = "gbtree")

xgb_glearn = po("fixfactors") %>>%
  po("imputesample", affect_columns = selector_type("factor")) %>>%
  po("encode", method = "treatment") %>>%
  po("removeconstants") %>>%
  xgb_base |>
  as_learner()

set_validate(xgb_glearn, "test")

xgb_autotuner = auto_tuner(
  learner = xgb_glearn,
  search_space = ps(
    surv.xgboost.cox.eta = p_dbl(0.001, 1, logscale = TRUE),
    surv.xgboost.cox.max_depth = p_int(1, 20),
    surv.xgboost.cox.subsample = p_dbl(0, 1),
    surv.xgboost.cox.colsample_bytree = p_dbl(0, 1),
    surv.xgboost.cox.grow_policy = p_fct(c("depthwise", "lossguide"))
  ),
  resampling = rsmp("cv", folds = 3),
  measure = msr("surv.cindex"),
  terminator = trm("evals", n_evals = 20, k = 0),
  tuner = tnr("random_search")
)

This results in the not-unexpected error:

Error in .__AutoTuner__initialize(self = self, private = private, super = super,  : 
  If the values of the ParamSet of the Learner contain TuneTokens you cannot supply a search_space.

I'm not sure how to indicate to my AutoTuner that I would like to both

  1. tune using the supplied search space with a given metric (not XGBoost's internal one)
  2. have XGBoost use early stopping for nrounds under the hood

@sebffischer (author) commented Sep 5, 2024

My previous answers were bad; thanks for making me aware that this is not documented yet.

library(mlr3)
library(mlr3tuning)
#> Loading required package: paradox
library(mlr3pipelines)
library(mlr3proba)
library(mlr3extralearners)

task = tsk("lung")
xgb_base = lrn("surv.xgboost.cox",
               early_stopping_rounds = 10,
               tree_method = "hist", booster = "gbtree")

xgb_glearn = po("fixfactors") %>>%
  po("imputesample", affect_columns = selector_type("factor")) %>>%
  po("encode", method = "treatment") %>>%
  po("removeconstants") %>>%
  xgb_base |>
  as_learner()

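# use each resampling iteration's test split as validation data for early stopping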
set_validate(xgb_glearn, "test")

xgb_autotuner = auto_tuner(
  learner = xgb_glearn,
  search_space = ps(
    surv.xgboost.cox.eta = p_dbl(0.001, 1, logscale = TRUE),
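    # tag nrounds for internal tuning via early stopping; `aggr` combines the
    # early-stopped values found across the CV folds into a single value
    # (here: the rounded mean) that is used when refitting on the full data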
    surv.xgboost.cox.nrounds = p_int(upper = 1000, tags = "internal_tuning", aggr = function(x) as.integer(mean(unlist(x)))),
    surv.xgboost.cox.max_depth = p_int(1, 20),
    surv.xgboost.cox.subsample = p_dbl(0, 1),
    surv.xgboost.cox.colsample_bytree = p_dbl(0, 1),
    surv.xgboost.cox.grow_policy = p_fct(c("depthwise", "lossguide"))
  ),
  resampling = rsmp("cv", folds = 3),
  measure = msr("surv.cindex"),
  terminator = trm("evals", n_evals = 20, k = 0),
  tuner = tnr("random_search")
)

xgb_autotuner$train(task)

Created on 2024-09-05 with reprex v2.1.1

@sebffischer
Maybe we should include the internal tune tokens in the tuning spaces @be-marc?

@jemus42 commented Sep 6, 2024

Great, thanks!

Is there something I can do to keep the evaluation_log around, though?
Consider a previous experiment without the AutoTuner:

library(mlr3)
library(mlr3pipelines)
library(mlr3proba)
library(mlr3extralearners)

task = tsk("lung")
xgb_base = lrn("surv.xgboost.cox",
               early_stopping_rounds = 100,
               max_depth = 3, eta = .01,
               tree_method = "hist", booster = "gbtree")

xgb_glearn = po("fixfactors") %>>%
  po("imputesample", affect_columns = selector_type("factor")) %>>%
  po("encode", method = "treatment") %>>%
  po("removeconstants") %>>%
  xgb_base |>
  as_learner()

set_validate(xgb_glearn, "test")

rr = resample(
  task = task,
  learner = xgb_glearn,
  resampling = rsmp("cv", folds = 3),
  store_models = TRUE
)
#> INFO  [11:45:40.575] [mlr3] Applying learner 'fixfactors.imputesample.encode.removeconstants.surv.xgboost.cox' on task 'lung' (iter 1/3)
#> INFO  [11:45:40.786] [mlr3] Applying learner 'fixfactors.imputesample.encode.removeconstants.surv.xgboost.cox' on task 'lung' (iter 2/3)
#> INFO  [11:45:40.921] [mlr3] Applying learner 'fixfactors.imputesample.encode.removeconstants.surv.xgboost.cox' on task 'lung' (iter 3/3)

rr$learners[[1]]$model$surv.xgboost.cox$model$model$evaluation_log |>
  ggplot2::ggplot(ggplot2::aes(x = iter, y = test_cox_nloglik)) +
  ggplot2::geom_line() +
  ggplot2::theme_minimal()

Created on 2024-09-06 with reprex v2.1.1

I was hoping to sanity-check the internal tuning using the evaluation log when using the AutoTuner, but I can't seem to find it.
Using the same reprex as in your post, but with

xgb_autotuner = auto_tuner(
  ...,
  store_models = TRUE,
  store_benchmark_result = TRUE,
  store_tuning_instance = TRUE
)

...I found that xgb_autotuner$model$learner$model$surv.xgboost.cox$model$model$evaluation_log is empty.
The tuning archive also didn't contain anything related to nrounds, but wasn't there some talk about storing internally tuned parameters in a list-column, or am I misremembering?

@sebffischer
You are accessing the final model fit, but there is no early stopping in the final model fit.
This is because we want to use all of the data for the final fit, so nrounds is simply set to the optimal value found during tuning. To access the evaluation log, you need to store the models that are created during tuning. If that is not possible due to memory constraints, you can write a callback, but @be-marc should help you with that.
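
For reference, here is a rough, untested sketch of how you could get at the evaluation logs of the models stored during tuning, assuming the auto_tuner() call from above with store_benchmark_result = TRUE and store_models = TRUE (field names from memory, so treat this as a sketch rather than a guarantee):

# the models fitted during tuning live in the archive's benchmark result,
# not in the final model fit
rr_inner = xgb_autotuner$archive$benchmark_result$resample_result(1)
rr_inner$learners[[1]]$model$surv.xgboost.cox$model$model$evaluation_log

# if I remember correctly, the values found by internal tuning also land
# in the archive as a list column
xgb_autotuner$archive$data$internal_tuned_values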

@sebffischer (author) commented Sep 6, 2024

Also, are you aware that xgboost will use the optimal model during prediction and NOT the final model?

--> You should be less worried about setting the patience parameter too high (except for the increased runtime, I guess).
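
If you want to double-check that on one of the stored models, here is a quick, untested sketch reusing the path from your resample() reprex above (best_iteration/best_score are the fields xgboost sets when early stopping was active, if I remember correctly):

booster = rr$learners[[1]]$model$surv.xgboost.cox$model$model
booster$best_iteration  # the early-stopped iteration that xgboost uses for prediction
booster$best_score      # validation score at that iteration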

@jemus42 commented Sep 6, 2024

you are accessing the final model fit but in the final model fit there is no early stopping.

Ah right, of course, makes sense 😅
I don't think I strictly need to access those; for now I'm just trying to get a feeling for how the early stopping works and behaves.
I also found xgb_autotuner$tuning_result$internal_tuned_values[[1]]$surv.xgboost.cox.nrounds by now, so that's helpful 👍🏻

Also, are you aware that xgboost will use the optimal model during prediction and NOT the final model?

I was banking on that -- my main concern is to avoid overfitting in the benchmark, and saving some compute would be a bonus but not a must.

Thanks for the clarifications!
