Extra pool in Code objects prevents S4 objects from collection #1288

fikovnik · 2024-06-20T08:10:48Z

Consider the following code (simplified version of the R methods package test):

A <- setRefClass("A")
A$methods(f = function(i) { print(i) })

bug <- TRUE

# class B inherits from A
B <- setRefClass("B", contains = "A")
B$methods(
  # destructor that sets the global flag `bug` to FALSE
  finalize = function() { bug <<- FALSE },

  # method that calls `f` from the superclass
  g = function() {
    # R at its best - first pull out the method `f` from super class into the current environment
    usingMethods("f")
    # call it using lapply - the match.fun in lapply will resolve the string to a closure
    lapply(1, "f")
})

# new instance of class B
b <- B()

# call B$g which in turn calls A$f printing 1
b$g()

# removes b
rm(b)

# since b is not referenced from anywhere, it should be collected
# and have its finalizer called
gc()

# check that the finalized has run
stopifnot(!bug)

In vanilla R it works well.
On master it works well as long as the number of type feedback slots for the observed callees is less than 13.

Here is what happens:

The lapply function gets compiled to RIR.
It contains a call to .Internal(lapply, xs, FUN) which we special-case (inline) in the RIR compiler turning it into a loop with one call to FUN(xs[[i]]). This call will create a type feedback slot for the observed callees.
Once we call it from the B$g the interpreter will record the callee, i.e. the closure A$f into the type feedback.
The closure is S4 method, in its environment, it contains reference to .self which in turn will prevent it from collection.

The way the observed callees recording works is that first the closure gets stored in the extra pool of the corresponding caller code object, i.e. the 'lapply' function, and then the index of this pool entry gets stored in the type feedback.
Normally, we keep just 3 callees per call site. Since lapply is used for a whole lot of things before we exhaust the available slots way before we call the b$g(). However, when we increase the number of slots, the bug manifests. It even manifests when PIR is disabled (PIR_ENABLE=off)

This happened when @stepam38 added the support for the contextualized type feedback.

IMHO: It is a fairly weird test: relying on a call to gc() to collect an object is wrong. Having finalized together with gc also wrong.

The text was updated successfully, but these errors were encountered:

fikovnik added the bug label Jun 20, 2024

fikovnik mentioned this issue Jun 20, 2024

Calling XLENGTH on S4 objects #1289

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extra pool in Code objects prevents S4 objects from collection #1288

Extra pool in Code objects prevents S4 objects from collection #1288

fikovnik commented Jun 20, 2024 •

edited by skrynski

Loading

Extra pool in Code objects prevents S4 objects from collection #1288

Extra pool in Code objects prevents S4 objects from collection #1288

Comments

fikovnik commented Jun 20, 2024 • edited by skrynski Loading

fikovnik commented Jun 20, 2024 •

edited by skrynski

Loading