fix: remove srcref after leanification #89
Good find, I hadn't thought about the problem that srcref can take up lots of memory.
I still don't understand why the measured object size depends on the packages that are loaded, do you have an idea why? E.g. when saving a learner state (when having installed mlr3 with --with-keep.source) the result returned by
Consider:
library(mlr3verse)
#> Loading required package: mlr3
library(mlr3)
task = tsk("iris")
learner = lrn("classif.rpart")
learner$train(task)
pth = tempfile(fileext = ".rds")
saveRDS(learner$state, pth)
x = readRDS(pth)
pryr::object_size(x)
#> 19.49 MB
vs
library(mlr3)
task = tsk("iris")
learner = lrn("classif.rpart")
learner$train(task)
pth = tempfile(fileext = ".rds")
saveRDS(learner$state, pth)
x = readRDS(pth)
pryr::object_size(x)
#> 4.00 MB
It gets worse:
library("mlr3")
task = tsk("iris")
learner = lrn("classif.rpart")
learner$train(task)
pth = tempfile(fileext = ".rds")
saveRDS(learner$state, pth)
x = readRDS(pth)
pryr::object_size(x)
#> 4.00 MB
x$train_task$help
#> function() {
#> open_help(self$man)
#> }
#> <environment: 0x563addc97780>
pryr::object_size(x)
#> 1.08 MB
Probably some kind of promise being evaluated.
The
substitute(lines, attr(attr(x$train_task$help, "srcref"), "srcfile")$original)
#> lazyLoadDBfetch(c(344L, 114431L), datafile, compressed, envhook)
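For anyone unfamiliar with the trick above: `substitute()` returns the expression stored in a promise without forcing it. A minimal sketch with base R only; the environment `e`, the `state` flag and the toy value are made up for illustration and are not part of mlr3 or the actual srcfile object:

```r
# Hypothetical toy setup: `e` stands in for the srcfile environment,
# `state` only records whether the promise has been forced.
e <- new.env()
state <- new.env()
state$forced <- FALSE

# Bind `lines` in `e` to a promise (lazy, not yet evaluated).
delayedAssign("lines", {
  state$forced <- TRUE
  c("first line", "second line")
}, assign.env = e)

# substitute() extracts the promise's expression without evaluating it.
expr <- substitute(lines, e)
forced_after_substitute <- state$forced

# Reading the binding forces the promise.
value <- get("lines", envir = e)
forced_after_get <- state$forced
```

In the real case the promise's expression is the `lazyLoadDBfetch(...)` call shown above, so whatever that promise holds on to until it is forced can end up in the rds file.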
Thanks! So the promise ensures that some object (whose size depends on the loaded packages) is part of the rds file, and once the promise is evaluated this data is freed and the object size changes, correct?
What is happening is that the
The offender here is the
(Don't know how to inspect the promise's environment with base R, and even
prominfo <- evalq(pi(lines), list(pi = pryr::promise_info), attr(attr(x$train_task$help, "srcref"), "srcfile")$original)
prom_env <- prominfo$env
names(prom_env$envenv)
#> [1] "env::150" "env::151" "env::152" "env::10" "env::157" "env::13"
#> .......
It appears to contain lots of environments. Maybe they are all environments that can be accessed from within
Interestingly, printing a single function from within
x = readRDS(pth)
y = readRDS(pth)
pryr::object_size(x)
#> 4.00 MB
pryr::object_size(y)
#> 4.00 MB
x$train_task$help
#> function() {
#> open_help(self$man)
#> }
#> <environment: 0x563addc97780>
pryr::object_size(x)
#> 1.08 MB
pryr::object_size(y)
#> 4.00 MB
It is also possible to trigger the
library(mlr3verse)
#> Loading required package: mlr3
library(mlr3)
task = tsk("iris")
learner = lrn("classif.rpart")
learner$train(task)
pth = tempfile(fileext = ".rds")
learner$state$train_task$help
#> function() {
#> open_help(self$man)
#> }
#> <environment: 0x563addc97780>
saveRDS(learner$state, pth)
x = readRDS(pth)
pryr::object_size(x)
#> 1.08 MB
When compiling R with --with-keep.source, serialized objects were gigantic (and dependent on the loaded packages), see this issue: #88
I tested that when installing mlr3 with --with-keep.source with this version of mlr3misc, the problem disappears.
This also caused the failed workflows in the mlr3book.
@berndbischl @mllg @mb706
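A rough sketch of what dropping srcrefs does to the serialized size of a single function, using only base R and utils. This is not the code from this PR; the toy function `f` is made up, and `parse(keep.source = TRUE)` only mimics what installing with --with-keep.source does for package code:

```r
# Build a function that carries a srcref, as package functions do
# when installed with --with-keep.source (toy function, for illustration).
f <- eval(parse(
  text = "function(x) {\n  # add one\n  x + 1\n}",
  keep.source = TRUE
))
has_srcref_before <- !is.null(attr(f, "srcref"))

# utils::removeSource() returns a copy without the srcref attribute.
g <- utils::removeSource(f)
has_srcref_after <- !is.null(attr(g, "srcref"))

# Compare the size of the serialized representation, in bytes.
size_with <- length(serialize(f, NULL))
size_without <- length(serialize(g, NULL))
```

The copy behaves the same when called; only the attached source information (and the srcfile environment it points to) is gone, which is why it serializes smaller.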