Purecn/run #3140

Merged
merged 36 commits into from
Jul 14, 2023
Changes from 28 commits
563a8ed
Set up nf-core template for purecn/run module
aldosr Mar 1, 2023
1cbba6c
Add run script and I/O
aldosr Mar 1, 2023
24dfc84
Clean TODOs and set main script
aldosr Mar 27, 2023
148be9f
Merge branch 'master' into purecn/run
aldosr Mar 27, 2023
fe68198
Fix duplicate entry
aldosr Mar 27, 2023
1188f63
Merge branch 'master' into purecn/run
aldosr May 30, 2023
7efedb6
Set up main script
aldosr May 30, 2023
69b4b5f
[ci skip] Format with prettier
aldosr May 30, 2023
b9abf03
Merge branch 'master' into purecn/run
aldosr Jun 6, 2023
8054685
Merge branch 'master' into purecn/run
aldosr Jul 4, 2023
218986d
[ci skip] Address review question and set up stub
aldosr Jul 4, 2023
9e85346
Merge remote-tracking branch 'origin/purecn/run' into purecn/run
aldosr Jul 4, 2023
4db4cea
Merge branch 'master' into purecn/run
aldosr Jul 4, 2023
8515a53
[ci skip] Set stub for testing
aldosr Jul 5, 2023
d21dda4
Merge branch 'master' into purecn/run
aldosr Jul 5, 2023
1f1ad8a
Reformat with prettier
aldosr Jul 5, 2023
397ff7b
Merge remote-tracking branch 'origin/purecn/run' into purecn/run
aldosr Jul 5, 2023
56b9f9d
[CI skip] Adjust some typos
aldosr Jul 6, 2023
0024556
[CI skip] Set up test script using stub
aldosr Jul 6, 2023
4fcf9d9
Set up test yml
aldosr Jul 6, 2023
08e7352
Merge branch 'master' into purecn/run
aldosr Jul 6, 2023
205abd4
Reformat with prettier
aldosr Jul 6, 2023
7665ef3
Merge remote-tracking branch 'origin/purecn/run' into purecn/run
aldosr Jul 6, 2023
c77c9ec
Set up meta.yml file and fix typos
aldosr Jul 6, 2023
de57416
Reformat with prettier
aldosr Jul 6, 2023
2247914
Fix typo
aldosr Jul 6, 2023
ceace12
Remove quay.io from container string
aldosr Jul 6, 2023
41c8602
Merge branch 'master' into purecn/run
aldosr Jul 6, 2023
ee8f831
Merge branch 'master' into purecn/run
aldosr Jul 12, 2023
0aa72d7
Merge branch 'master' into purecn/run
aldosr Jul 12, 2023
1210b43
Remove optional input argument
aldosr Jul 14, 2023
a3acfbd
Add optional outputs and clean non-mandatory parameters
aldosr Jul 14, 2023
aa961d0
Clean non-mandatory outputs
aldosr Jul 14, 2023
a4e7d59
Reformat outputs
aldosr Jul 14, 2023
707633f
Fix minor and address reviews
aldosr Jul 14, 2023
69e226b
Merge branch 'master' into purecn/run
aldosr Jul 14, 2023
111 changes: 111 additions & 0 deletions modules/nf-core/purecn/run/main.nf
@@ -0,0 +1,111 @@
process PURECN_RUN {
    tag "$meta.id"
    label 'process_medium'

    // WARN: Version information not provided by tool on CLI. Please update version string below when bumping container versions.
    conda "bioconda::bioconductor-purecn=2.4.0 bioconda::bioconductor-txdb.hsapiens.ucsc.hg38.knowngene=3.16.0 bioconda::bioconductor-txdb.hsapiens.ucsc.hg19.knowngene=3.2.2 bioconda::bioconductor-org.hs.eg.db=3.16.0"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/mulled-v2-582ac26068889091d5e798347c637f8208d77a71:a29c64a63498b1ee8b192521fdf6ed3c65506994-0':
        'biocontainers/mulled-v2-582ac26068889091d5e798347c637f8208d77a71:a29c64a63498b1ee8b192521fdf6ed3c65506994-0' }"

    input:
    tuple val(meta), path(intervals), path(coverage), path(vcf)
    path normal_db
    val genome

    output:
    tuple val(meta), path("*.csv")                         , emit: csv
    tuple val(meta), path("*_variants.csv")                , emit: variants_csv
    tuple val(meta), path("*.pdf")                         , emit: pdf
    tuple val(meta), path("*.rds")                         , emit: rds
    tuple val(meta), path("*_amplification_pvalues.csv")   , emit: amplification_pvalues_csv
    tuple val(meta), path("*_chromosomes.pdf")             , emit: chr_pdf
    tuple val(meta), path("*_dnacopy.seg")                 , emit: seg
    tuple val(meta), path("*_genes.csv")                   , emit: genes_csv
    tuple val(meta), path("*_local_optima.pdf")            , emit: local_optima_pdf
    tuple val(meta), path("*.log")                         , emit: log
    tuple val(meta), path("*_loh.vcf")                     , emit: loh_vcf
    tuple val(meta), path("*_loh.vcf.gz")                  , emit: loh_vcf_gz
    tuple val(meta), path("*_loh.vcf.gz.tbi")              , emit: loh_vcf_tbi
    tuple val(meta), path("*_loh.csv")                     , emit: loh_csv
    tuple val(meta), path("*_loh-effects-stats.csv")       , emit: loh_stats_csv
    tuple val(meta), path("*_loh-effects-stats.genes.txt") , emit: loh_effects_txt
    tuple val(meta), path("*_loh-effects-stats.html")      , emit: loh_effects_html
    tuple val(meta), path("*_loh-effects.vcf.gz")          , emit: loh_effects_vcf
    tuple val(meta), path("*_loh-effects.vcf.gz.tbi")      , emit: loh_effects_tbi
    tuple val(meta), path("*_loh-summary.yaml")            , emit: loh_summary_yaml
    tuple val(meta), path("*_loh-effects.csv")             , emit: loh_effects_csv
    tuple val(meta), path("*_segmentation.pdf")            , emit: segmentation_pdf
    tuple val(meta), path("*_sort_coverage_loess.png")     , emit: sort_coverage_loess_png
    tuple val(meta), path("*_sort_coverage_loess_qc.txt")  , emit: sort_coverage_loess_qc_txt
    tuple val(meta), path("*_sort_coverage_loess.txt.gz")  , emit: sort_coverage_loess_txt_gz
    tuple val(meta), path("*_sort_coverage.txt.gz")        , emit: sort_coverage_txt_gz
    path "versions.yml"                                    , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    def VERSION = '2.4.0' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.

    """
    library_path=\$(Rscript -e 'cat(.libPaths(), sep = "\\n")')
    Rscript "\$library_path"/PureCN/extdata/PureCN.R \\
        --out ./ \\
        --tumor ${coverage} \\
        --sampleid ${prefix} \\
        --vcf ${vcf} \\
        --normaldb ${normal_db} \\
        --intervals ${intervals} \\
        --genome ${genome} \\
        --parallel \\
        --cores ${task.cpus} \\
        --stats-file ${prefix}_stats.txt \\
        ${args}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        purecn: ${VERSION}
    END_VERSIONS
    """

    stub:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    def VERSION = '2.4.0' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.

    """
    touch ${prefix}.csv
    touch ${prefix}_variants.csv
    touch ${prefix}.pdf
    touch ${prefix}.rds
    touch ${prefix}_amplification_pvalues.csv
    touch ${prefix}_chromosomes.pdf
    touch ${prefix}_dnacopy.seg
    touch ${prefix}_genes.csv
    touch ${prefix}_local_optima.pdf
    touch ${prefix}.log
    touch ${prefix}_loh.vcf
    touch ${prefix}_loh.vcf.gz
    touch ${prefix}_loh.vcf.gz.tbi
    touch ${prefix}_loh.csv
    touch ${prefix}_loh-effects-stats.csv
    touch ${prefix}_loh-effects-stats.genes.txt
    touch ${prefix}_loh-effects-stats.html
    touch ${prefix}_loh-effects.vcf.gz
    touch ${prefix}_loh-effects.vcf.gz.tbi
    touch ${prefix}_loh-summary.yaml
    touch ${prefix}_loh-effects.csv
    touch ${prefix}_segmentation.pdf
    touch ${prefix}_sort_coverage_loess.png
    touch ${prefix}_sort_coverage_loess_qc.txt
    touch ${prefix}_sort_coverage_loess.txt.gz
    touch ${prefix}_sort_coverage.txt.gz

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        purecn: ${VERSION}
    END_VERSIONS
    """
}
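Note on the script block: `.libPaths()` can return more than one library directory, in which case `"$library_path"/PureCN/extdata/PureCN.R` would no longer be a single valid path. The container presumably exposes only one library, so the module works as written; the following is only a sketch of a more defensive lookup. The function name `find_purecn_script` and the simulated library directories are illustrative assumptions, not part of the module.

```shell
#!/usr/bin/env bash
set -eu

# Print the first PureCN.R found under a newline-separated list of R
# library directories; return non-zero if none of them contains it.
find_purecn_script() {
    local lib_paths="$1" lib candidate
    while IFS= read -r lib; do
        candidate="$lib/PureCN/extdata/PureCN.R"
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done <<EOF
$lib_paths
EOF
    return 1
}

# Simulated library trees standing in for the output of .libPaths()
# (hypothetical layout: only the second library contains PureCN).
demo_root="$(mktemp -d)"
trap 'rm -rf "$demo_root"' EXIT
mkdir -p "$demo_root/user_lib" "$demo_root/site_lib/PureCN/extdata"
touch "$demo_root/site_lib/PureCN/extdata/PureCN.R"

lib_paths="$(printf '%s\n%s' "$demo_root/user_lib" "$demo_root/site_lib")"
purecn_script="$(find_purecn_script "$lib_paths")"
echo "$purecn_script"
```

In the module, the function would receive the existing `library_path` value, with the `$` signs escaped as elsewhere in the Nextflow script string.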
185 changes: 185 additions & 0 deletions modules/nf-core/purecn/run/meta.yml
@@ -0,0 +1,185 @@
name: "purecn_run"
description: Run PureCN workflow to normalize, segment and determine purity and ploidy
keywords:
  - copy number alteration calling
  - hybrid capture sequencing
  - targeted sequencing
  - DNA sequencing
tools:
  - "purecn":
      description: "Copy number calling and SNV classification using targeted short read sequencing"
      homepage: "https://bioconductor.org/packages/release/bioc/html/PureCN.html"
      documentation: "https://bioconductor.org/packages/release/bioc/html/PureCN.html"
      tool_dev_url: "https://github.com/lima1/PureCN"
      doi: "10.1186/s13029-016-0060-z"
      licence: "Artistic-2.0"
      args_id: "$args"

input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test' ]
  - intervals:
      type: file
      description: |
        BED file of target intervals, generated from IntervalFile.R
      pattern: "{*.bed,*.txt}"
  - coverage:
      type: file
      description: Coverage file generated from Coverage.R
      pattern: "*.txt"
  - vcf:
      type: file
      description: |
        VCF file with variant calls for the tumor sample, passed to PureCN.R via --vcf
      pattern: "*.vcf.gz"
  - normal_db:
      type: file
      description: |
        Normal panel in RDS format, generated from NormalDB.R
      pattern: "*.rds"
  - genome:
      type: string
      description: Genome build

output:
  - csv:
      type: file
      description: |
        CSV file containing copy number calls
      pattern: "*.csv"
  - variants_csv:
      type: file
      description: |
        CSV file containing SNV calls
      pattern: "*_variants.csv"
  - pdf:
      type: file
      description: |
        PDF file containing copy number plots
      pattern: "*.pdf"
  - rds:
      type: file
      description: |
        RDS file containing copy number calls
      pattern: "*.rds"
  - amplification_pvalues_csv:
      type: file
      description: |
        CSV file containing amplification p-values
      pattern: "*_amplification_pvalues.csv"
  - chr_pdf:
      type: file
      description: |
        PDF file containing chromosome plots
      pattern: "*_chromosomes.pdf"
  - seg:
      type: file
      description: |
        Segmentation file generated from DNAcopy.R
      pattern: "*_dnacopy.seg"
  - genes_csv:
      type: file
      description: |
        CSV file containing gene copy number calls
      pattern: "*_genes.csv"
  - local_optima_pdf:
      type: file
      description: |
        PDF file containing local optima plots
      pattern: "*_local_optima.pdf"
  - log:
      type: file
      description: |
        Log file
      pattern: "*.log"
  - loh_vcf:
      type: file
      description: |
        VCF file containing LOH calls
      pattern: "*_loh.vcf"
  - loh_vcf_gz:
      type: file
      description: |
        GZipped VCF file containing LOH calls
      pattern: "*_loh.vcf.gz"
  - loh_vcf_tbi:
      type: file
      description: |
        Tabix index file for LOH VCF
      pattern: "*_loh.vcf.gz.tbi"
  - loh_csv:
      type: file
      description: |
        CSV file containing LOH calls
      pattern: "*_loh.csv"
  - loh_stats_csv:
      type: file
      description: |
        CSV file containing LOH statistics
      pattern: "*_loh-effects-stats.csv"
  - loh_effects_txt:
      type: file
      description: |
        TXT file containing LOH effects
      pattern: "*_loh-effects-stats.genes.txt"
  - loh_effects_html:
      type: file
      description: |
        HTML file containing LOH effects
      pattern: "*_loh-effects-stats.html"
  - loh_effects_vcf:
      type: file
      description: |
        VCF file containing LOH effects
      pattern: "*_loh-effects.vcf.gz"
  - loh_effects_tbi:
      type: file
      description: |
        Tabix index file for LOH effects VCF
      pattern: "*_loh-effects.vcf.gz.tbi"
  - loh_summary_yaml:
      type: file
      description: |
        YAML file containing LOH summary
      pattern: "*_loh-summary.yaml"
  - loh_effects_csv:
      type: file
      description: |
        CSV file containing LOH effects
      pattern: "*_loh-effects.csv"
  - segmentation_pdf:
      type: file
      description: |
        PDF file containing segmentation plots
      pattern: "*_segmentation.pdf"
  - sort_coverage_loess_png:
      type: file
      description: |
        PNG file containing the LOESS-normalized sorted coverage plot
      pattern: "*_sort_coverage_loess.png"
  - sort_coverage_loess_qc_txt:
      type: file
      description: |
        TXT file containing QC metrics for the LOESS-normalized sorted coverage
      pattern: "*_sort_coverage_loess_qc.txt"
  - sort_coverage_loess_txt_gz:
      type: file
      description: |
        GZipped TXT file containing the LOESS-normalized sorted coverage
      pattern: "*_sort_coverage_loess.txt.gz"
  - sort_coverage_txt_gz:
      type: file
      description: |
        GZipped TXT file containing the sorted coverage
      pattern: "*_sort_coverage.txt.gz"
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"

authors:
  - "@aldosr"
  - "@lbeltrame"
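Note on the output patterns: the broad globs (`*.csv`, `*.pdf`) also match the files claimed by the more specific patterns (`*_variants.csv`, `*_chromosomes.pdf`, and so on), so the `csv` and `pdf` channels would presumably carry several files per sample. A minimal sketch, using a few of the stub filenames in a scratch directory, of how to count the matches per glob. The helper `count_matches` and the prefix `test` are illustrative assumptions.

```shell
#!/usr/bin/env bash
set -eu

# Count how many files in directory $1 match the glob pattern $2.
count_matches() {
    local dir="$1" pattern="$2" n=0 f
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for f in "$dir"/$pattern; do
        # An unmatched glob stays literal, so only count paths that exist.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Recreate a subset of the stub outputs for a sample with prefix "test".
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT
prefix="test"
touch "$workdir/${prefix}.csv" \
      "$workdir/${prefix}_variants.csv" \
      "$workdir/${prefix}_genes.csv" \
      "$workdir/${prefix}_loh.csv" \
      "$workdir/${prefix}.pdf" \
      "$workdir/${prefix}_chromosomes.pdf"

csv_broad="$(count_matches "$workdir" '*.csv')"
csv_variants="$(count_matches "$workdir" '*_variants.csv')"
pdf_broad="$(count_matches "$workdir" '*.pdf')"

echo "*.csv matches:          $csv_broad"
echo "*_variants.csv matches: $csv_variants"
echo "*.pdf matches:          $pdf_broad"
```

If one file per channel is intended, a prefix-anchored pattern such as `${prefix}.csv` might be one way to narrow the broad globs; that is a suggestion, not something this diff does.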
4 changes: 4 additions & 0 deletions tests/config/pytest_modules.yml
@@ -2867,6 +2867,10 @@ purecn/normaldb:
  - modules/nf-core/purecn/normaldb/**
  - tests/modules/nf-core/purecn/normaldb/**

purecn/run:
  - modules/nf-core/purecn/run/**
  - tests/modules/nf-core/purecn/run/**

purgedups/calcuts:
  - modules/nf-core/purgedups/calcuts/**
  - tests/modules/nf-core/purgedups/calcuts/**
38 changes: 38 additions & 0 deletions tests/modules/nf-core/purecn/run/main.nf
@@ -0,0 +1,38 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { PURECN_RUN } from '../../../../../modules/nf-core/purecn/run/main.nf'

process STUB_PURECN_RUN {
    output:
    path("*.txt")    , emit: intervals
    path("*.txt")    , emit: coverage
    path("*.vcf.gz") , emit: vcf
    path("*.rds")    , emit: normal_db

    stub:
    """
    touch interval_file.txt
    touch coverage.txt
    touch test.vcf.gz
    touch normal_db.rds
    """
}

workflow test_purecn_run {

    STUB_PURECN_RUN()

    input = [
        [ id:'test' ],
        file("interval_file.txt"),
        file("coverage.txt"),
        file("test.vcf.gz")
    ]

    normal_db = file("normal_db.rds")
    genome = "hg38"

    PURECN_RUN ( input, normal_db, genome )
}
5 changes: 5 additions & 0 deletions tests/modules/nf-core/purecn/run/nextflow.config
@@ -0,0 +1,5 @@
process {

    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }

}
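Note on the `publishDir` closure: it keeps the text after the last `:` of the fully qualified process name, then the text before the first `_`, and lowercases it. The following is a rough shell approximation of that string handling, not the Groovy code itself: `tokenize` also skips empty tokens, which this sketch ignores. The function name `publish_subdir` is an illustrative assumption, and the process names are built from the workflow and process names in the test script.

```shell
#!/usr/bin/env bash
set -eu

# Approximate tokenize(':')[-1].tokenize('_')[0].toLowerCase() for a
# fully qualified Nextflow process name.
publish_subdir() {
    local process="$1" name
    name="${process##*:}"   # keep the text after the last ':'
    name="${name%%_*}"      # keep the text before the first '_'
    printf '%s\n' "$name" | tr '[:upper:]' '[:lower:]'
}

subdir="$(publish_subdir 'test_purecn_run:PURECN_RUN')"
stub_subdir="$(publish_subdir 'test_purecn_run:STUB_PURECN_RUN')"

echo "PURECN_RUN publishes under:      $subdir"
echo "STUB_PURECN_RUN publishes under: $stub_subdir"
```

This would explain why the test file list expects its outputs in a `purecn` directory, and it suggests the helper stub process publishes into a separate `stub` directory.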
33 changes: 33 additions & 0 deletions tests/modules/nf-core/purecn/run/test.yml
@@ -0,0 +1,33 @@
- name: purecn run
  command: nextflow run ./tests/modules/nf-core/purecn/run -entry test_purecn_run -c ./tests/config/nextflow.config -c ./tests/modules/nf-core/purecn/run/nextflow.config -stub-run
  tags:
    - purecn
    - purecn/run
  files:
    - path: output/purecn/test.csv
    - path: output/purecn/test_variants.csv
    - path: output/purecn/test.pdf
    - path: output/purecn/test.rds
    - path: output/purecn/test_amplification_pvalues.csv
    - path: output/purecn/test_chromosomes.pdf
    - path: output/purecn/test_dnacopy.seg
    - path: output/purecn/test_genes.csv
    - path: output/purecn/test_local_optima.pdf
    - path: output/purecn/test.log
    - path: output/purecn/test_loh.vcf
    - path: output/purecn/test_loh.vcf.gz
    - path: output/purecn/test_loh.vcf.gz.tbi
    - path: output/purecn/test_loh.csv
    - path: output/purecn/test_loh-effects-stats.csv
    - path: output/purecn/test_loh-effects-stats.genes.txt
    - path: output/purecn/test_loh-effects-stats.html
    - path: output/purecn/test_loh-effects.vcf.gz
    - path: output/purecn/test_loh-effects.vcf.gz.tbi
    - path: output/purecn/test_loh-summary.yaml
    - path: output/purecn/test_loh-effects.csv
    - path: output/purecn/test_segmentation.pdf
    - path: output/purecn/test_sort_coverage_loess.png
    - path: output/purecn/test_sort_coverage_loess_qc.txt
    - path: output/purecn/test_sort_coverage_loess.txt.gz
    - path: output/purecn/test_sort_coverage.txt.gz
    - path: output/purecn/versions.yml