Added seed argument to diablo_tune()
oliviaAB committed Jul 25, 2024
1 parent b39a115 commit dc07f76
Showing 3 changed files with 76 additions and 39 deletions.
4 changes: 3 additions & 1 deletion NEWS.md
@@ -8,6 +8,8 @@

- Fixed typo in samples metadata file, samples with no value for "rnaseq_batch" variable now have `NA` rather than `"BNA"` values.

-- `perf_splsda()` and `run_splsda()` now have a `seed` argument (hopefully self-explanatory :)). Accordingly, `feature_preselection_splsda_factory` now has arguments `seed_perf` and `seed_run` to pass on seeds to `perf_splsda()` and `run_splsda`.
+- `perf_splsda()` and `run_splsda()` now have a `seed` argument (hopefully self-explanatory :)). Accordingly, `feature_preselection_splsda_factory` now has arguments `seed_perf` and `seed_run` to pass on seeds to `perf_splsda()` and `run_splsda()`.

- `create_multiomics_set()` now returns an error if some feature IDs are used across different omics sets. This is to prevent errors further down the line when visualising or subsetting the multi-omics data.

+- `diablo_tune()` now has a `seed` argument.
61 changes: 39 additions & 22 deletions R/diablo.R
@@ -365,29 +365,44 @@ diablo_get_optim_ncomp <- function(perf_res, measure = "Overall.BER", distance =

#' Tunes keepX arg for DIABLO
#'
-#' Performs cross-validation to estimate the optimal number of features to retain from each dataset for a DIABLO run.
-#'
-#' The \code{design_matrix} argument can either be a custom design matrix (for example as constructed via the
-#' \code{\link{diablo_generate_design_matrix}} function); or a character indicating the type of design matrix
-#' to generate. Possible values include:
-#' \itemize{
-#' \item \code{'null'}: Off-diagonal elements of the design matrix are set to 0;
-#' \item \code{'weighted_full'}: Off-diagonal elements of the design matrix are set to 0.1;
-#' \item \code{'full'}: Off-diagonal elements of the design matrix are set to 1.
-#' }
-#'
-#' @param mixomics_data A \code{mixOmics} input object created with \code{\link{get_input_mixomics_supervised}}.
-#' @param design_matrix Either numeric matrix created through \code{\link{diablo_generate_design_matrix}}, or character
-#' (accepted values are \code{'null'}, \code{'weighted_full'}, \code{'full'}). See Details.
-#' @param keepX_list Named list, gives for each omics dataset in the mixOmics input (i.e. excluding the response Y) a vector of values
-#' to test (i.e. number of features to return from this dataset). If \code{NULL} (default), a standard grid will
-#' be applied for each dataset and latent component, testing values: \code{seq(5, 30, 5)}.
-#' @param cpus Integer, the number of CPUs to use when running the code in parallel. For advanced users,
-#' see the \code{BPPARAM} argument of \code{\link[mixOmics]{tune.block.splsda}}.
-#' @param ... Arguments to be passed to the \code{\link[mixOmics]{tune.block.splsda}} function.
-#' @return A list, see \code{\link[mixOmics]{tune.block.splsda}}.
+#' Performs cross-validation to estimate the optimal number of features to
+#' retain from each dataset for a DIABLO run.
+#'
+#' The `design_matrix` argument can either be a custom design matrix (for
+#' example as constructed via the `diablo_generate_design_matrix` function); or
+#' a character indicating the type of design matrix to generate. Possible values
+#' include:
+#' * `'null'`: Off-diagonal elements of the design matrix are set to 0;
+#' * `'weighted_full'`: Off-diagonal elements of the design matrix are set to
+#' 0.1;
+#' * `'full'`: Off-diagonal elements of the design matrix are set to 1.
+#'
+#' @param mixomics_data A `mixOmics` input object created with
+#' `get_input_mixomics_supervised()`.
+#' @param design_matrix Either numeric matrix created through
+#' `diablo_generate_design_matrix`, or character (accepted values are
+#' `'null'`, `'weighted_full'`, `'full'`). See Details.
+#' @param keepX_list Named list, gives for each omics dataset in the mixOmics
+#' input (i.e. excluding the response Y) a vector of values to test (i.e.
+#' number of features to return from this dataset). If `NULL` (default), a
+#' standard grid will be applied for each dataset and latent component,
+#' testing values: `seq(5, 30, 5)`.
+#' @param cpus Integer, the number of CPUs to use when running the code in
+#' parallel. For advanced users, see the \code{BPPARAM} argument of
+#' [mixOmics::tune.block.splsda()].
+#' @param seed Integer, seed to use. Default is `NULL`, i.e. no seed is set
+#' inside the function.
+#' @param ... Arguments to be passed to the [mixOmics::tune.block.splsda()]
+#' function.
+#' @returns A list, see [mixOmics::tune.block.splsda()].
#' @export
-diablo_tune <- function(mixomics_data, design_matrix, keepX_list = NULL, cpus = NULL, ...) {
+diablo_tune <- function(mixomics_data,
+                        design_matrix,
+                        keepX_list = NULL,
+                        cpus = NULL,
+                        seed = NULL,
+                        ...) {
+
  ## Take care of the design matrix
  datasets_name <- setdiff(names(mixomics_data), "Y")

@@ -419,6 +434,8 @@ diablo_tune <- function(mixomics_data, design_matrix, keepX_list = NULL, cpus =

  BPPARAM <- .mixomics_cpus_to_bparam(cpus)

+  if (!is.null(seed)) set.seed(seed)
+
  mixOmics::tune.block.splsda(
    X = mixomics_data[datasets_name],
    Y = mixomics_data[["Y"]],
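The optional-seed guard added in this hunk can be illustrated in isolation. The sketch below is not the package code: `tune_with_seed()` and its `n_folds` argument are hypothetical stand-ins, and a random fold assignment takes the place of the cross-validation that `mixOmics::tune.block.splsda()` performs. It only shows that calling `set.seed()` when `seed` is non-`NULL` makes repeated calls return the same result, while `seed = NULL` leaves the RNG state untouched.

```r
# Hypothetical stand-in for a function with a stochastic step; this is NOT
# diablo_tune() and it does not call mixOmics.
tune_with_seed <- function(n_folds = 5, seed = NULL) {
  # Same guard as in the diff: only set the RNG state when a seed is supplied.
  if (!is.null(seed)) set.seed(seed)

  # Random assignment of 20 samples to folds, standing in for the
  # cross-validation split.
  sample(rep(seq_len(n_folds), length.out = 20))
}

folds_a <- tune_with_seed(seed = 42)
folds_b <- tune_with_seed(seed = 42)

# Same seed, same fold assignment.
identical(folds_a, folds_b)
```

With the default `seed = NULL` the function draws from whatever state the session's RNG is in, so existing callers are unaffected. Whether a seed set this way also fixes results when `cpus` is greater than 1 likely depends on how the parallel backend seeds its workers, which this sketch does not cover.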
50 changes: 34 additions & 16 deletions man/diablo_tune.Rd
