Limma mixed models feature #6753

Open
wants to merge 7 commits into
base: master
Changes from 2 commits
11 changes: 8 additions & 3 deletions modules/nf-core/limma/differential/environment.yml
@@ -1,5 +1,10 @@
name: limma_differential
KamilMaliszArdigen marked this conversation as resolved.
channels:
- conda-forge
- bioconda
- conda-forge
- bioconda
dependencies:
- bioconda::bioconductor-limma=3.54.0
- bioconda::bioconductor-edger=4.0.16
Member

Multi-package environments still create difficulties right now, which is why I wrote this module to depend on a single Biocontainer.

- bioconda::bioconductor-ihw=1.28.0
- bioconda::bioconductor-limma=3.58.1
Member

Limma is a dependency of edgeR, so you don't need to include it here.

Contributor Author

That is true; however, without this we won't be able to control which limma version is used.

Member

We only pin the primary dependency in the modules repo, unless you have a really compelling reason to do otherwise. We will at some point have lock files that pin all dependencies.

Member

(as in other comments, I'm not sure we need edgeR here, so we'd only pin Limma)

Contributor Author

We noticed that results differ slightly between limma versions, so we decided to pin it for reproducibility. I will apply your comment in the standalone module.

Member

@pinin4fjords, Oct 13, 2024

Actually, I see your point on this one, apologies. Limma is the primary dependency so it's pinned, but we need edgeR.
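
For reference, a minimal sketch of the dependency list this thread converges on, assuming edgeR is kept alongside the pinned Limma. The versions are copied from the diff above. The other packages in the diff (IHW, dplyr, readr) are omitted here because the comments above do not settle whether they stay.

```yaml
# Sketch only, not the merged file: Limma is pinned as the primary
# dependency and edgeR is kept, with versions taken from the diff above.
channels:
  - conda-forge
  - bioconda
dependencies:
  - bioconda::bioconductor-edger=4.0.16
  - bioconda::bioconductor-limma=3.58.1
```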

- conda-forge::r-dplyr=1.1.4
- conda-forge::r-readr=2.1.5
13 changes: 10 additions & 3 deletions modules/nf-core/limma/differential/main.nf
@@ -4,17 +4,19 @@ process LIMMA_DIFFERENTIAL {

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/bioconductor-limma:3.54.0--r42hc0cfd56_0' :
'biocontainers/bioconductor-limma:3.54.0--r42hc0cfd56_0' }"
'oras://community.wave.seqera.io/library/bioconductor-edger_bioconductor-ihw_bioconductor-limma_r-dplyr_r-readr:7fc48564d286c1c6' :
Member

@maxulysse, Oct 8, 2024

We shouldn't have any oras protocol in the modules; it doesn't work with NXF_SINGULARITY_CACHEDIR.
Can you switch to https?

Member

These oras links don't work well with nf-core tooling right now, unfortunately

'community.wave.seqera.io/library/bioconductor-edger_bioconductor-ihw_bioconductor-limma_r-dplyr_r-readr:edea0f9fbaeba3a0' }"
Member

Suggested change
'community.wave.seqera.io/library/bioconductor-edger_bioconductor-ihw_bioconductor-limma_r-dplyr_r-readr:edea0f9fbaeba3a0' }"
'nf-core/bioconductor-edger_bioconductor-ihw_bioconductor-limma_r-dplyr_r-readr:edea0f9fbaeba3a0' }"

I pulled the image and pushed it to quay.io.

That way we can change the registry for all containers in a simpler way.
We'll be using community.wave.seqera.io as the registry when we do the switch.

Member

I'm not keen on this @maxulysse, there's no way for someone without privilege to do this and maintain the module.


input:
tuple val(meta), val(contrast_variable), val(reference), val(target)
tuple val(meta2), path(samplesheet), path(intensities)
val type

output:
tuple val(meta), path("*.limma.results.tsv") , emit: results
tuple val(meta), path("*.limma.mean_difference.png") , emit: md_plot
tuple val(meta), path("*.MArrayLM.limma.rds") , emit: rdata
tuple val(meta), path("*.normalised_counts.tsv") , emit: normalised_counts, optional: true
tuple val(meta), path("*.limma.model.txt") , emit: model
tuple val(meta), path("*.R_sessionInfo.log") , emit: session_info
path "versions.yml" , emit: versions
@@ -23,5 +25,10 @@ process LIMMA_DIFFERENTIAL {
task.ext.when == null || task.ext.when

script:
template 'limma_de.R'
if (type == 'rnaseq') {
template 'limma_de_rnaseq.R'
Member

I'm not massively keen on the parallel script.

Either the new thing should be a separate module, or they should be properly integrated.

Contributor Author

How do you envision "proper" integration? As a single template? I will refactor if needed.

Member

I mean, if the two scripts share enough logic, they should be one with a simple conditional. If the logic is quite divergent such that doing that adds too much complexity, these should be separate modules.

Contributor Author

I had similar doubts. On the one hand, this is still limma and differential abundance analysis; on the other hand, this one is combined with voom and there are a lot of differences. So maybe we should create a new module named limma/differential-voom, or simply limma/voom? What are your thoughts on the module naming?

Member

I think this should be limma/differential-voom.

As in the comments below, I also think this should be, as much as possible, a thin wrapper around Limma functions only. Otherwise it's just your custom script, rather than a 'standard' Limma module.

Member

Updating this older comment after newer ones below: I don't believe the additional logic required for Voom merits its own module. We can add the Voom part in a conditional, and the other changes (e.g. duplicateCorrelation) apply equally to non-Voom Limma.

} else {
template 'limma_de_micro_array.R'
}

}
141 changes: 59 additions & 82 deletions modules/nf-core/limma/differential/meta.yml
@@ -13,96 +13,73 @@ tools:
tool_dev_url: https://github.com/cran/limma""
doi: "10.18129/B9.bioc.limma"
licence: ["LGPL >=3"]
identifier: biotools:limma
input:
- - meta:
type: map
description: |
Groovy Map containing contrast information. This can be used at the
workflow level to pass optional parameters to the module, e.g.
[ id:'contrast1', blocking:'patient' ] passed in as ext.args like:
'--blocking_variable $meta.blocking'.
- contrast_variable:
type: string
description: |
The column in the sample sheet that should be used to define groups for
comparison
- reference:
type: string
description: |
The value within the contrast_variable column of the sample sheet that
should be used to derive the reference samples
- target:
type: string
description: |
The value within the contrast_variable column of the sample sheet that
should be used to derive the target samples
- - meta2:
type: map
description: |
Groovy map containing study-wide metadata related to the sample sheet
and matrix
- samplesheet:
type: file
description: Sample sheet file
- intensities:
type: file
description: |
Raw TSV or CSV format expression matrix with probes by row and samples
by column
- meta:
type: map
description: |
Groovy Map containing contrast information. This can be used at the
workflow level to pass optional parameters to the module, e.g.
[ id:'contrast1', blocking:'patient' ] passed in as ext.args like:
'--blocking_variable $meta.blocking'.
- contrast_variable:
type: string
description: |
The column in the sample sheet that should be used to define groups for
comparison
- reference:
type: string
description: |
The value within the contrast_variable column of the sample sheet that
should be used to derive the reference samples
- target:
type: string
description: |
The value within the contrast_variable column of the sample sheet that
should be used to derive the target samples
- meta2:
type: map
description: |
Groovy map containing study-wide metadata related to the sample sheet
and matrix
- samplesheet:
type: file
description: |
CSV or TSV format sample sheet with sample metadata
- intensities:
type: file
description: |
Raw TSV or CSV format expression matrix with probes by row and samples
by column
- type:
type: string
description: |
Analysis type to perform; determines which template is used ('rnaseq' selects limma_de_rnaseq.R, any other value selects limma_de_micro_array.R)

output:
- results:
- meta:
type: file
description: TSV-format table of differential expression information as output
by Limma
pattern: "*.limma.results.tsv"
- "*.limma.results.tsv":
type: file
description: TSV-format table of differential expression information as output
by Limma
pattern: "*.limma.results.tsv"
type: file
description: TSV-format table of differential expression information as output by Limma
pattern: "*.limma.results.tsv"
- md_plot:
- meta:
type: file
description: Limma mean difference plot
pattern: "*.mean_difference.png"
- "*.limma.mean_difference.png":
type: file
description: Limma mean difference plot
pattern: "*.mean_difference.png"
type: file
description: Limma mean difference plot
pattern: "*.mean_difference.png"
- rdata:
- meta:
type: file
description: Serialised MArrayLM object
pattern: "*.MArrayLM.limma.rds"
- "*.MArrayLM.limma.rds":
type: file
description: Serialised MArrayLM object
pattern: "*.MArrayLM.limma.rds"
type: file
description: Serialised MArrayLM object
pattern: "*.MArrayLM.limma.rds"
- model:
- meta:
type: file
description: TXT-format limma model
pattern: "*.limma.model.txt"
- "*.limma.model.txt":
type: file
description: TXT-format limma model
pattern: "*.limma.model.txt"
type: file
description: TXT-format limma model
pattern: "*.limma.model.txt"
- session_info:
- meta:
type: file
description: dump of R SessionInfo
pattern: "*.log"
- "*.R_sessionInfo.log":
type: file
description: dump of R SessionInfo
pattern: "*.log"
type: file
description: dump of R SessionInfo
pattern: "*.log"
- versions:
- versions.yml:
type: file
description: File containing software versions
pattern: "versions.yml"
type: file
description: File containing software versions
pattern: "versions.yml"
authors:
- "@pinin4fjords"
maintainers: