Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using Rcongas+ only in the scATAC modality #35

Open
ccruizm opened this issue Jan 30, 2024 · 5 comments
Open

Using Rcongas+ only in the scATAC modality #35

ccruizm opened this issue Jan 30, 2024 · 5 comments
Assignees
Labels
help wanted Extra attention is needed

Comments

@ccruizm
Copy link

ccruizm commented Jan 30, 2024

Good day!

I have read the documentation in https://caravagnalab.github.io/rcongas/index.html, and it provides valuable information on how to run Rcongas on multimodal data, but it is not clear to me whether it can work only using the scATAC modality. If so, what adjustments would I need to make to run it correctly?

Also, after running the pipeline, do the results also include the classification of a cell to be either diploid or aneuploid?

Thanks in advance!

@caravagn
Copy link
Collaborator

Hi, I think it should work in that case.

@lucreziaPatruno can you provide some basic steps?

@caravagn caravagn added the help wanted Extra attention is needed label Jan 30, 2024
@lucreziaPatruno
Copy link
Contributor

Hi @ccruizm,
thanks for your interest in CONGAS+. In order to use only the ATAC modality you can follow the steps presented in the tutorial for building the ATAC count tibble, and then in order to create the CONGAS+ object you can use this code:

x = init(
  rna = NULL,
  atac = atac,
  segmentation = segments, 
  rna_normalisation_factors = NULL,
  atac_normalisation_factors = norm_atac,
  atac_likelihood = 'NB',
  description = '')

Next, you can follow the tutorial for fitting a model, setting the lambda parameter of fit_congas to 0:

fit_filt <- Rcongas:::(filt,
                                 K = k, 
                                 lambdas = 0, 
                                 learning_rate = lr, 
                                 steps = steps,
                                 model_parameters = hyperparams_filt, 
                                 model_selection = model,
                                 latent_variables = "G",
                                 CUDA = FALSE,
                                 temperature = temperature, 
                                 same_mixing = TRUE, 
                                 threshold = 0.001)

Hope this helps, please let me know if you have further questions.

BW
Lucrezia

@ccruizm
Copy link
Author

ccruizm commented Feb 14, 2024

Thanks @lucreziaPatruno for the code!

I am running into some issues:

  1. When running segments_selector_congas I get the following error:
 ▣ / [~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~] 100% [ETA  0s] ▶ 00:00:19
                                                                                          

── (R)CONGAS+ hyperparameters auto-config ──────────────────────────────────────────────────────────────────────────────────────────────────────────────



── ATAC modality ──



→ Negative Binomial likelihood, estimating Gamma shape and rate



── Estimating segment factors 

→ 1: chrY:10400000:57227415 theta_shape = 2.11866186334306, theta_rate = 0.0837168665702484



──  (R)CONGAS+  Variational Inference ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────



── Fit with k = 1 and lambda = 0.5. 



── Fit with k = 2 and lambda = 0.5. 



── Fit with k = 3 and lambda = 0.5. 

[easypar] 3/3 computations returned errors and will be removed.





── (R)CONGAS+ fits completed in 39ms. ──





Error in `f()`:
! Argument 1 must be a data frame or a named atomic vector.
Traceback:

1. segments_selector_congas(filt)
2. Reduce(bind_rows, report)
3. f(init, x[[i]])
4. abort(glue("Argument {i} must be a data frame or a named atomic vector."))
5. signal_abort(cnd, .file)

Then I skipped this one and ran fit_congas using the code you provided (auto_config_run did not give any error), but got an error:

✖ Warning ATAC 0-counts cells. 5795 cells have no data in any of 45 segments, top 5 with missing data are:

✖ Cell atac_T19-91014_CTCACCACATGGTAAA-1 with 5 0-segments (11%)

✖ Cell atac_T19-91014_GAACCGCCAATGACTC-1 with 5 0-segments (11%)

✖ Cell atac_T19-91014_GACCTTCCATACAACC-1 with 5 0-segments (11%)

✖ Cell atac_T19-91014_ACCATCCAGTACCCAT-1 with 4 0-segments (9%)

✖ Cell atac_T19-91014_ACCGGGTTCATCGCTC-1 with 4 0-segments (9%)




──  (R)CONGAS+  Variational Inference ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────



── Fit with k = 1 and lambda = 0. 



── Fit with k = 2 and lambda = 0. 



── Fit with k = 3 and lambda = 0. 



── Fit with k = 4 and lambda = 0. 

[easypar] 4/4 computations returned errors and will be removed.





── (R)CONGAS+ fits completed in 48ms. ──





Error in `pull()`:
! Can't extract columns that don't exist.
✖ Column `BIC` doesn't exist.
Traceback:

1. Rcongas:::fit_congas(filt, K = k, lambdas = 0, learning_rate = lr, 
 .     steps = steps, model_parameters = hyperparams_filt, model_selection = model, 
 .     latent_variables = "G", CUDA = TRUE, temperature = temperature, 
 .     same_mixing = TRUE, threshold = 0.001)
2. order(model_selection_df %>% pull(!!model_selection))
3. model_selection_df %>% pull(!!model_selection)
4. pull(., !!model_selection)
5. pull.data.frame(., !!model_selection)
6. tidyselect::vars_pull(names(.data), !!enquo(var))
7. pull_as_location2(loc, n, vars, error_arg = error_arg, error_call = error_call)
8. with_subscript_errors(type = "pull", {
 .     i <- vctrs::vec_as_subscript2(i, logical = "error", arg = error_arg, 
 .         call = error_call)
 .     if (length(i) != 1) {
 .         cli::cli_abort("{.arg {error_arg}} must select exactly one column.", 
 .             call = error_call)
 .     }
 .     if (is.numeric(i)) {
 .         vctrs::num_as_location2(i, n = n, negative = "ignore", 
 .             arg = error_arg, call = error_call)
 .     }
 .     else {
 .         vctrs::vec_as_location2(i, n = n, names = names, arg = error_arg, 
 .             call = error_call, )
 .     }
 . })
9. try_fetch(expr, vctrs_error_subscript = function(cnd) {
 .     cnd$subscript_action <- subscript_action(type)
 .     cnd$subscript_elt <- "column"
 .     cnd_signal(cnd)
 . })
10. withCallingHandlers(expr, condition = function(cnd) {
  .     {
  .         .__handler_frame__. <- TRUE
  .         .__setup_frame__. <- frame
  .         if (inherits(cnd, "message")) {
  .             except <- c("warning", "error")
  .         }
  .         else if (inherits(cnd, "warning")) {
  .             except <- "error"
  .         }
  .         else {
  .             except <- ""
  .         }
  .     }
  .     while (!is_null(cnd)) {
  .         if (inherits(cnd, "vctrs_error_subscript")) {
  .             out <- handlers[[1L]](cnd)
  .             if (!inherits(out, "rlang_zap")) 
  .                 throw(out)
  .         }
  .         inherit <- .subset2(.subset2(cnd, "rlang"), "inherit")
  .         if (is_false(inherit)) {
  .             return()
  .         }
  .         cnd <- .subset2(cnd, "parent")
  .     }
  . })
11. vctrs::vec_as_location2(i, n = n, names = names, arg = error_arg, 
  .     call = error_call, )
12. result_get(vec_as_location2_result(i, n = n, names = names, negative = "error", 
  .     missing = missing, arg = arg, call = call))
13. cnd_signal(x$err)
14. signal_abort(cnd)
15. signalCondition(cnd)
16. (function (cnd) 
  . {
  .     {
  .         .__handler_frame__. <- TRUE
  .         .__setup_frame__. <- frame
  .         if (inherits(cnd, "message")) {
  .             except <- c("warning", "error")
  .         }
  .         else if (inherits(cnd, "warning")) {
  .             except <- "error"
  .         }
  .         else {
  .             except <- ""
  .         }
  .     }
  .     while (!is_null(cnd)) {
  .         if (inherits(cnd, "vctrs_error_subscript")) {
  .             out <- handlers[[1L]](cnd)
  .             if (!inherits(out, "rlang_zap")) 
  .                 throw(out)
  .         }
  .         inherit <- .subset2(.subset2(cnd, "rlang"), "inherit")
  .         if (is_false(inherit)) {
  .             return()
  .         }
  .         cnd <- .subset2(cnd, "parent")
  .     }
  . })(structure(list(message = "", trace = structure(list(call = list(
  .     IRkernel::main(), kernel$run(), handle_shell(), executor$execute(msg), 
  .     tryCatch(evaluate(request$content$code, envir = .GlobalEnv, 
  .         output_handler = oh, stop_on_error = 1L), interrupt = function(cond) {
  .         log_debug("Interrupt during execution")
  .         interrupted <<- TRUE
  .     }, error = .self$handle_error), tryCatchList(expr, classes, 
  .         parentenv, handlers), tryCatchOne(tryCatchList(expr, 
  .         names[-nh], parentenv, handlers[-nh]), names[nh], parentenv, 
  .         handlers[[nh]]), doTryCatch(return(expr), name, parentenv, 
  .         handler), tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), 
  .     tryCatchOne(expr, names, parentenv, handlers[[1L]]), doTryCatch(return(expr), 
  .         name, parentenv, handler), evaluate(request$content$code, 
  .         envir = .GlobalEnv, output_handler = oh, stop_on_error = 1L), 
  .     evaluate_call(expr, parsed$src[[i]], envir = envir, enclos = enclos, 
  .         debug = debug, last = i == length(out), use_try = stop_on_error != 
  .             2L, keep_warning = keep_warning, keep_message = keep_message, 
  .         log_echo = log_echo, log_warning = log_warning, output_handler = output_handler, 
  .         include_timing = include_timing), timing_fn(handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr, 
  .         envir, enclos, user_handlers)), warning = wHandler, error = eHandler, 
  .         message = mHandler))), handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr, 
  .         envir, enclos, user_handlers)), warning = wHandler, error = eHandler, 
  .         message = mHandler)), try(f, silent = TRUE), tryCatch(expr, 
  .         error = function(e) {
  .             call <- conditionCall(e)
  .             if (!is.null(call)) {
  .                 if (identical(call[[1L]], quote(doTryCatch))) 
  .                   call <- sys.call(-4L)
  .                 dcall <- deparse(call, nlines = 1L)
  .                 prefix <- paste("Error in", dcall, ": ")
  .                 LONG <- 75L
  .                 sm <- strsplit(conditionMessage(e), "\n")[[1L]]
  .                 w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], 
  .                   type = "w")
  .                 if (is.na(w)) 
  .                   w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], 
  .                     type = "b")
  .                 if (w > LONG) 
  .                   prefix <- paste0(prefix, "\n  ")
  .             }
  .             else prefix <- "Error : "
  .             msg <- paste0(prefix, conditionMessage(e), "\n")
  .             .Internal(seterrmessage(msg[1L]))
  .             if (!silent && isTRUE(getOption("show.error.messages"))) {
  .                 cat(msg, file = outFile)
  .                 .Internal(printDeferredWarnings())
  .             }
  .             invisible(structure(msg, class = "try-error", condition = e))
  .         }), tryCatchList(expr, classes, parentenv, handlers), 
  .     tryCatchOne(expr, names, parentenv, handlers[[1L]]), doTryCatch(return(expr), 
  .         name, parentenv, handler), withCallingHandlers(withVisible(eval_with_user_handlers(expr, 
  .         envir, enclos, user_handlers)), warning = wHandler, error = eHandler, 
  .         message = mHandler), withVisible(eval_with_user_handlers(expr, 
  .         envir, enclos, user_handlers)), eval_with_user_handlers(expr, 
  .         envir, enclos, user_handlers), eval(expr, envir, enclos), 
  .     eval(expr, envir, enclos), Rcongas:::fit_congas(filt, K = k, 
  .         lambdas = 0, learning_rate = lr, steps = steps, model_parameters = hyperparams_filt, 
  .         model_selection = model, latent_variables = "G", CUDA = TRUE, 
  .         temperature = temperature, same_mixing = TRUE, threshold = 0.001), 
  .     order(model_selection_df %>% pull(!!model_selection)), model_selection_df %>% 
  .         pull(!!model_selection), pull(., !!model_selection), 
  .     pull.data.frame(., !!model_selection), tidyselect::vars_pull(names(.data), 
  .         !!enquo(var)), pull_as_location2(loc, n, vars, error_arg = error_arg, 
  .         error_call = error_call), with_subscript_errors(type = "pull", 
  .         {
  .             i <- vctrs::vec_as_subscript2(i, logical = "error", 
  .                 arg = error_arg, call = error_call)
  .             if (length(i) != 1) {
  .                 cli::cli_abort("{.arg {error_arg}} must select exactly one column.", 
  .                   call = error_call)
  .             }
  .             if (is.numeric(i)) {
  .                 vctrs::num_as_location2(i, n = n, negative = "ignore", 
  .                   arg = error_arg, call = error_call)
  .             }
  .             else {
  .                 vctrs::vec_as_location2(i, n = n, names = names, 
  .                   arg = error_arg, call = error_call, )
  .             }
  .         }), try_fetch(expr, vctrs_error_subscript = function(cnd) {
  .         cnd$subscript_action <- subscript_action(type)
  .         cnd$subscript_elt <- "column"
  .         cnd_signal(cnd)
  .     }), withCallingHandlers(expr, condition = function(cnd) {
  .         {
  .             .__handler_frame__. <- TRUE
  .             .__setup_frame__. <- frame
  .             if (inherits(cnd, "message")) {
  .                 except <- c("warning", "error")
  .             }
  .             else if (inherits(cnd, "warning")) {
  .                 except <- "error"
  .             }
  .             else {
  .                 except <- ""
  .             }
  .         }
  .         while (!is_null(cnd)) {
  .             if (inherits(cnd, "vctrs_error_subscript")) {
  .                 out <- handlers[[1L]](cnd)
  .                 if (!inherits(out, "rlang_zap")) 
  .                   throw(out)
  .             }
  .             inherit <- .subset2(.subset2(cnd, "rlang"), "inherit")
  .             if (is_false(inherit)) {
  .                 return()
  .             }
  .             cnd <- .subset2(cnd, "parent")
  .         }
  .     }), vctrs::vec_as_location2(i, n = n, names = names, arg = error_arg, 
  .         call = error_call, ), result_get(vec_as_location2_result(i, 
  .         n = n, names = names, negative = "error", missing = missing, 
  .         arg = arg, call = call)), vec_as_location2_result(i, 
  .         n = n, names = names, negative = "error", missing = missing, 
  .         arg = arg, call = call), tryCatch(vec_as_location(i, 
  .         n, names = names, arg = arg, call = call), vctrs_error_subscript = function(err) {
  .         err[["subscript_scalar"]] <- TRUE
  .         err <<- err
  .         i
  .     }), tryCatchList(expr, classes, parentenv, handlers), tryCatchOne(expr, 
  .         names, parentenv, handlers[[1L]]), doTryCatch(return(expr), 
  .         name, parentenv, handler), vec_as_location(i, n, names = names, 
  .         arg = arg, call = call), `<fn>`(), stop_subscript_oob(i = i, 
  .         subscript_type = subscript_type, names = names, subscript_action = subscript_action, 
  .         subscript_arg = subscript_arg, call = call), stop_subscript(class = "vctrs_error_subscript_oob", 
  .         i = i, subscript_type = subscript_type, ..., call = call), 
  .     abort(class = c(class, "vctrs_error_subscript"), i = i, ..., 
  .         call = call)), parent = c(0L, 1L, 2L, 3L, 4L, 5L, 6L, 
  . 7L, 6L, 9L, 10L, 4L, 12L, 13L, 13L, 15L, 16L, 17L, 18L, 19L, 
  . 13L, 13L, 13L, 23L, 24L, 0L, 26L, 26L, 0L, 0L, 30L, 31L, 32L, 
  . 33L, 34L, 32L, 36L, 36L, 38L, 39L, 40L, 41L, 38L, 0L, 44L, 45L, 
  . 46L), visible = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, 
  . TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, 
  . TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, 
  . FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
  . FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE), namespace = c("IRkernel", 
  . NA, "IRkernel", NA, "base", "base", "base", "base", "base", "base", 
  . "base", "evaluate", "evaluate", "evaluate", "evaluate", "base", 
  . "base", "base", "base", "base", "base", "base", "evaluate", "base", 
  . "base", "Rcongas", "base", NA, "dplyr", "dplyr", "tidyselect", 
  . "tidyselect", "tidyselect", "rlang", "base", "vctrs", "vctrs", 
  . "vctrs", "base", "base", "base", "base", "vctrs", "vctrs", "vctrs", 
  . "vctrs", "rlang"), scope = c("::", NA, "local", NA, "::", "local", 
  . "local", "local", "local", "local", "local", "::", ":::", "local", 
  . "local", "::", "::", "local", "local", "local", "::", "::", ":::", 
  . "::", "::", ":::", "::", NA, "::", ":::", "::", ":::", ":::", 
  . "::", "::", "::", ":::", ":::", "::", "local", "local", "local", 
  . "::", "local", ":::", ":::", "::"), error_frame = c(FALSE, FALSE, 
  . FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
  . FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
  . FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
  . TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
  . FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE
  . )), row.names = c(NA, -47L), version = 2L, class = c("rlang_trace", 
  . "rlib_trace", "tbl", "data.frame")), parent = NULL, i = "BIC", 
  .     subscript_type = "character", names = c("hyperparameter_K", 
  .     "K", "K_rna", "K_atac", "lambda"), subscript_action = NULL, 
  .     subscript_arg = "!!enquo(var)", rlang = list(inherit = TRUE), 
  .     call = pull(., !!model_selection), subscript_scalar = TRUE), class = c("vctrs_error_subscript_oob", 
  . "vctrs_error_subscript", "rlang_error", "error", "condition")))
17. handlers[[1L]](cnd)
18. cnd_signal(cnd)
19. signal_abort(cnd)

Where should I add the missing column?

Thanks for the help!

@Militeee
Copy link
Collaborator

Hi @ccruizm,

Unfortunately, it is very hard for us to debug internal errors with the reticulate interface we have right now.
Can you by any chance provide a minimal reproducing example for the error? I'll try to work on it ASAP.

Regarding your question: no we don't automatically classify aneuploid vs diploid, but generally unless it is very small the diploid cluster is quite obvious.

Cheers,
S

@lucreziaPatruno
Copy link
Contributor

Hi @ccruizm,
It's also possible that something went wrong in the reticulate setup and CONGAS+ inference doesn't start, hence why you get that error.
Can you please try to load congas through reticulate like this:

library(reticulate)
cg = reticulate::import("congas")

and please let me know whether you have any issue or if you are able to load it without any error.

BW
Lucrezia

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants