feat: add tutorial guidelines for survival analysis with omics data (#…
be-marc authored Aug 25, 2023
1 parent eb8873f commit 341d281
Showing 2 changed files with 20 additions and 4 deletions.
23 changes: 19 additions & 4 deletions mlr-org/publications.bib
@@ -71,13 +71,13 @@ @article{pargent2023tutorial
year = {2023},
doi = {10.1177/25152459231162559},

URL = {
https://doi.org/10.1177/25152459231162559
},
eprint = {
https://doi.org/10.1177/25152459231162559
}
,
abstract = { Supervised machine learning (ML) is becoming an influential analytical method in psychology and other social sciences. However, theoretical ML concepts and predictive-modeling techniques are not yet widely taught in psychology programs. This tutorial is intended to provide an intuitive but thorough primer and introduction to supervised ML for psychologists in four consecutive modules. After introducing the basic terminology and mindset of supervised ML, in Module 1, we cover how to use resampling methods to evaluate the performance of ML models (bias-variance trade-off, performance measures, k-fold cross-validation). In Module 2, we introduce the nonlinear random forest, a type of ML model that is particularly user-friendly and well suited to predicting psychological outcomes. Module 3 is about performing empirical benchmark experiments (comparing the performance of several ML models on multiple data sets). Finally, in Module 4, we discuss the interpretation of ML models, including permutation variable importance measures, effect plots (partial-dependence plots, individual conditional-expectation profiles), and the concept of model fairness. Throughout the tutorial, intuitive descriptions of theoretical concepts are provided, with as few mathematical formulas as possible, and followed by code examples using the mlr3 and companion packages in R. Key practical-analysis steps are demonstrated on the publicly available PhoneStudy data set (N = 624), which includes more than 1,800 variables from smartphone sensing to predict Big Five personality trait scores. The article contains a checklist to be used as a reminder of important elements when performing, reporting, or reviewing ML analyses in psychology. Additional examples and more advanced concepts are demonstrated in online materials (https://osf.io/9273g/). }
@@ -95,3 +95,18 @@ @article{bischl_hyperparameter_2021
month = jul,
year = {2021}
}

@misc{Zhao2023,
abstract = {Identification of genomic, molecular and clinical markers predictive of patient survival is important for developing personalized disease prevention, diagnostic and treatment approaches. Modern omics technologies have made it possible to investigate the prognostic impact of markers at multiple molecular levels, including genotype, DNA methylation, transcriptomics, proteomics and metabolomics, and how these risk factors complement clinical characterization of patients for prognostic prediction. However, the massive omics data pose challenges for studying relationships between the molecular information and patients' survival outcomes. We demonstrate a general workflow of survival analysis, with emphasis on dealing with high-dimensional omics features, using both univariate and multivariate approaches. In particular, we describe commonly used Cox-type penalized regressions and Bayesian models for feature selection in survival analysis with multi-omics data, where caution is needed to account for the underlying structure both within and between omics data sets. A step-by-step R tutorial using TCGA survival and omics data for the execution and evaluation of survival models has been made available at https://ocbe-uio.github.io/survomics/survomics.html.},
archivePrefix = {arXiv},
arxivId = {2302.12542},
author = {Zhao, Zhi and Zobolas, John and Zucknick, Manuela and Aittokallio, Tero},
doi = {10.48550/arxiv.2302.12542},
eprint = {2302.12542},
keywords = {Time-to-event data,feature selection,machine learning,model calibration,multi-omics,penalized regressions,sparse Bayesian models,survival prediction},
month = {feb},
pages = {1--13},
title = {{Tutorial on survival modelling with omics data}},
url = {https://arxiv.org/abs/2302.12542v2},
year = {2023}
}
1 change: 1 addition & 0 deletions mlr-org/resources.qmd
@@ -102,3 +102,4 @@ A more scientific view on our packages and the packages we depend on.
## Tutorial Papers

* @pargent2023tutorial: An Introduction to Machine Learning for Psychologists in R
* @Zhao2023: Tutorial Guidelines for Survival Analysis with Omics Data. [Website](https://ocbe-uio.github.io/survomics/survomics.html).
