add time constraints gallery post (#149)
* add time constraints gallery post

* draft

* render post

* update

* update

* title
be-marc authored Dec 23, 2023
1 parent 4ed37f6 commit e33866b
Showing 5 changed files with 165 additions and 0 deletions.
6 changes: 6 additions & 0 deletions mlr-org/faq.qmd
@@ -9,6 +9,7 @@ toc: false

* [What is the purpose of the `OMP_THREAD_LIMIT` environment variable?](#omp-thread-limit)
* [Why is tuning slow despite quick model fitting?](#tuning-slow)
* [How can I use time constraints in tuning?](#time-limit-constraints)

## What is the purpose of the `OMP_THREAD_LIMIT` environment variable? {#omp-thread-limit}

@@ -43,3 +44,8 @@ Refer to the [OpenMP Thread Limit](#omp-thread-limit) section in this FAQ for guidance.
5. **Nested Resampling Strategies:** When employing nested resampling, choosing an effective parallelization strategy is crucial.
The wrong strategy can lead to inefficiencies.
For a deeper understanding, refer to the [nested resampling section](https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html#sec-nested-resampling-parallelization) in our book.

## How can I use time constraints in tuning? {#time-limit-constraints}

The mlr3 ecosystem provides several mechanisms for setting time constraints for individual learners, tuning processes, and nested resampling.
The [gallery post](gallery/technical/2023-12-21-time-constraints/) on time limit constraints provides a detailed overview of these mechanisms.
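
In short, a time limit on an individual learner and a run time terminator for the tuning might look like this (a minimal sketch, mirroring the gallery post):

```{r}
#| eval: false
library(mlr3verse)

# limit the training of a single learner to 10 seconds
learner = lrn("classif.svm")
learner$timeout = c(train = 10, predict = Inf)
learner$encapsulate = c(train = "callr", predict = "callr")
learner$fallback = lrn("classif.featureless")

# stop the tuning after 60 seconds
terminator = trm("run_time", secs = 60)
```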
4 changes: 4 additions & 0 deletions mlr-org/gallery-top-technical.yml
@@ -1,3 +1,7 @@
- title: Time constraints in the mlr3 ecosystem
  href: gallery/technical/2023-12-21-time-constraints
  description: |
    Set time limits for learners, tuning and nested resampling.
- title: Production Example Using Plumber and Docker
  href: gallery/technical/2020-08-13-a-production-example-using-plumber-and-docker
  description: |
141 changes: 141 additions & 0 deletions mlr-org/gallery/technical/2023-12-21-time-constraints/index.qmd
@@ -0,0 +1,141 @@
---
title: "Time constraints in the mlr3 ecosystem"
description: |
  Set time limits for learners, tuning and nested resampling.
author:
  - name: Marc Becker
    orcid: 0000-0002-8115-0400
    url: https://github.com/be-marc
date: 2023-12-21
bibliography: ../../bibliography.bib
image: cover.jpg
---

{{< include ../../_setup.qmd >}}

# Scope

Setting time limits is an important consideration when tuning unreliable or unstable learning algorithms and when working on shared computing resources.
The mlr3 ecosystem provides several mechanisms for setting time constraints for individual learners, tuning processes, and nested resampling.

# Learner

This section demonstrates how to impose time constraints using a support vector machine (SVM) as an illustrative example.

```{r 2023-12-21-time-constraints-001}
#| message: false
library(mlr3verse)
learner = lrn("classif.svm")
```

Applying timeouts to the `$train()` and `$predict()` methods is essential for managing learners that may run indefinitely.
These time constraints are set independently for both the training and prediction stages.
Generally, training a learner consumes more time than prediction.
Certain learners, like k-nearest neighbors, lack a distinct training phase and require a timeout only during prediction.
For the SVM's training, we set a 10-second limit.

```{r 2023-12-21-time-constraints-002}
learner$timeout = c(train = 10, predict = Inf)
```
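
For a learner without a real training phase, such as k-nearest neighbors, only the prediction step needs a limit. A minimal sketch, assuming the kknn package is installed:

```{r}
#| eval: false
# k-nearest neighbors stores the training data and does all the work at prediction time
knn = lrn("classif.kknn")
knn$timeout = c(train = Inf, predict = 10)
```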

To reliably terminate a learner that exceeds its timeout, training and prediction must run in a separate R process.
The `r ref_pkg("callr")` package is recommended for this encapsulation, as it tends to be more reliable than the `r ref_pkg("evaluate")` package, especially for terminating externally compiled code.

```{r 2023-12-21-time-constraints-003}
learner$encapsulate = c(train = "callr", predict = "callr")
```

Note that using `callr` increases the runtime due to the overhead of starting an R process.
Additionally, it's advisable to specify a fallback learner, such as `"classif.featureless"`, to provide baseline predictions in case the primary learner is terminated.

```{r 2023-12-21-time-constraints-004}
learner$fallback = lrn("classif.featureless")
```
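
As a quick check, the configured learner can be resampled directly; on folds where the SVM exceeds its 10-second training limit, the featureless fallback provides the predictions. A minimal sketch:

```{r}
#| eval: false
# the timeout, encapsulation and fallback are applied on every resampling iteration
rr = resample(tsk("sonar"), learner, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.ce"))
```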

These time constraints are now integrated into the training, resampling, and benchmarking processes.
For more information on encapsulation and fallback learners, see the [mlr3book](https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html#sec-error-handling).
The next section will focus on setting time limits for the entire tuning process.

# Tuning

When working with high-performance computing clusters, jobs are often bound by strict time constraints.
Exceeding these limits results in the job being terminated and the loss of any results generated.
Therefore, it's important to ensure that the tuning process is designed to adhere to these time constraints.

The `trm("run_time")` terminator controls the duration of the tuning process.
Keep in mind that the terminator can only check whether the time limit has been reached between batches.
The limit must therefore be set lower than the runtime of the job.
How much lower depends on the runtime or time limit of the individual learners: the last batch must be able to finish before the cluster's time limit is reached.

```{r 2023-12-21-time-constraints-005}
terminator = trm("run_time", secs = 60)

instance = ti(
  task = tsk("sonar"),
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = terminator
)
```
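
Because the terminator is only checked between batches, a smaller `batch_size` on the tuner makes these checks more frequent. A minimal sketch using random search:

```{r}
#| eval: false
# the run time terminator is evaluated after each batch of 10 configurations
tuner = tnr("random_search", batch_size = 10)
tuner$optimize(instance)
```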

With these settings, our tuning operation is configured to run for 60 seconds, while individual learners are set to terminate after 10 seconds.
This approach ensures the tuning process is efficient and adheres to the constraints imposed by the high-performance computing cluster.

# Nested Resampling

When using [nested resampling](https://mlr3book.mlr-org.com/chapters/chapter4/hyperparameter_optimization.html#sec-nested-resampling), time constraints become more complex as they are applied across various levels.
As before, the time limit for an individual learner during tuning is set with `$timeout`.
The time limit for the tuning process within each auto tuner is controlled with `trm("run_time")`.
It's important to note that once the auto tuner enters the final phase of fitting the model and making predictions on the outer test set, the time limit governed by the terminator no longer applies.
Additionally, the time limit previously set on the learner is temporarily deactivated, allowing the auto tuner to complete its task uninterrupted.
However, a separate time limit can be assigned to each auto tuner using `$timeout`.
This limit encompasses not only the tuning phase but also the time required for fitting the final model and predictions on the outer test set.

The best way to show this is with an example.
We set the time limit for an individual learner to 10 seconds.

```{r 2023-12-21-time-constraints-006}
learner$timeout = c(train = 10, predict = Inf)
learner$encapsulate = c(train = "callr", predict = "callr")
learner$fallback = lrn("classif.featureless")
```

Next, we give each auto tuner 60 seconds to finish the tuning process.

```{r 2023-12-21-time-constraints-007}
terminator = trm("run_time", secs = 60)
```

Furthermore, we impose a 120-second limit on resampling each auto tuner: a 100-second training timeout, which covers both the tuning and the final model fit, and a 20-second prediction timeout for the outer test set.
This effectively divides the time allocation, with around 60 seconds for tuning and another 60 seconds for fitting the final model and making predictions on the outer test set.

```{r 2023-12-21-time-constraints-008}
at = auto_tuner(
  tuner = tnr("random_search"),
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.ce"),
  terminator = trm("run_time", secs = 60)
)

at$timeout = c(train = 100, predict = 20)
at$encapsulate = c(train = "callr", predict = "callr")
at$fallback = lrn("classif.featureless")
```

In total, the entire nested resampling process is designed to be completed within 10 minutes (120 seconds multiplied by 5 folds).

```{r 2023-12-21-time-constraints-009}
#| eval: false
rr = resample(tsk("sonar"), at, rsmp("cv", folds = 5))
```
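
Once the resampling has run, the logged conditions show whether any learner was stopped by its timeout and replaced by the fallback. A minimal sketch, assuming the resampling above was executed:

```{r}
#| eval: false
# errors recorded during resampling, e.g. learners terminated by a timeout
rr$errors
rr$aggregate(msr("classif.ce"))
```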

# Conclusion

In this post, we looked at setting time constraints at different levels of the mlr3 ecosystem.
From individual learners to nested resampling, managing time limits carefully makes machine learning workflows more efficient and reliable.
By using `trm("run_time")` for tuning processes and `$timeout` for individual learners and auto tuners, we ensure that our machine learning tasks are not only effective but also respect the practical time constraints of shared computational resources.
For more information, see also the error handling section in the [mlr3book](https://mlr3book.mlr-org.com/chapters/chapter5/advanced_tuning_methods_and_black_box_optimization.html#sec-encapsulation-fallback).
