Commit
* Move DeepVariant into a subcommand module rundeepvariant, preparing for split modules. The test snapshot is updated because the process name in the version file changed.
* Add a split DeepVariant workflow with individual processes for each step
* Remove hash unique ID and fix input structure issue
* Fixes for call_variants outputting sharded file
* Fix test
* Remove --channels insert_size, which is only applicable for short read data. The channels should be specified in the pipeline config
* Replace the model type value input with ext.args config
* Fix tests: should run twice for two samples in input channel
* Fix linting issues and input channel description
* Fix formatting of md files
  Co-authored-by: Felix Lenner <[email protected]>
* Corrections / improvements from @fellen31 review
* Check tfrecord file names
* Updating conda skipping options, because the paths have changed
* Add deprecation warning for top-level process and test for the deprecated process
* Also skip conda for the new deprecated module

---------

Co-authored-by: Felix Lenner <[email protected]>
Co-authored-by: Maxime U Garcia <[email protected]>
1 parent 29110dd, commit a004c86
Showing 38 changed files with 2,241 additions and 200 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
|
||
process DEEPVARIANT_CALLVARIANTS { | ||
tag "$meta.id" | ||
label 'process_high' | ||
|
||
//Conda is not supported at the moment | ||
container "nf-core/deepvariant:1.6.1" | ||
|
||
input: | ||
tuple val(meta), path(make_examples_tfrecords) | ||
|
||
output: | ||
tuple val(meta), path("${prefix}.call-*-of-*.tfrecord.gz"), emit: call_variants_tfrecords | ||
path "versions.yml", emit: versions | ||
|
||
when: | ||
task.ext.when == null || task.ext.when | ||
|
||
script: | ||
// Exit if running this module with -profile conda / -profile mamba | ||
if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) { | ||
error "DEEPVARIANT module does not support Conda. Please use Docker / Singularity / Podman instead." | ||
} | ||
def args = task.ext.args ?: '' | ||
prefix = task.ext.prefix ?: "${meta.id}" | ||
|
||
def matcher = make_examples_tfrecords[0].baseName =~ /^(.+)-\d{5}-of-(\d{5})$/ | ||
if (!matcher.matches()) { | ||
throw new IllegalArgumentException("tfrecord baseName '" + make_examples_tfrecords[0].baseName + "' doesn't match the expected pattern") | ||
} | ||
def examples_tfrecord_name = matcher[0][1] | ||
def shardCount = matcher[0][2] | ||
// Reconstruct the logical name - ${tfrecord_name}.examples.tfrecord@${task.cpus}.gz | ||
def examples_tfrecords_logical_name = "${examples_tfrecord_name}@${shardCount}.gz" | ||
|
||
""" | ||
/opt/deepvariant/bin/call_variants \\ | ||
${args} \\ | ||
--outfile "${prefix}.call.tfrecord.gz" \\ | ||
--examples "${examples_tfrecords_logical_name}" | ||
cat <<-END_VERSIONS > versions.yml | ||
"${task.process}": | ||
deepvariant_callvariants: \$(echo \$(/opt/deepvariant/bin/run_deepvariant --version) | sed 's/^.*version //; s/ .*\$//' ) | ||
END_VERSIONS | ||
""" | ||
|
||
stub: | ||
prefix = task.ext.prefix ?: "${meta.id}" | ||
""" | ||
echo "" | gzip > ${prefix}.call-00000-of-00001.tfrecord.gz | ||
cat <<-END_VERSIONS > versions.yml | ||
"${task.process}": | ||
deepvariant_callvariants: \$(echo \$(/opt/deepvariant/bin/run_deepvariant --version) | sed 's/^.*version //; s/ .*\$//' ) | ||
END_VERSIONS | ||
""" | ||
} |
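The shard-name handling in the `script:` block above can be tried outside Nextflow. The following is a minimal Python sketch, not the module's own code: it reuses the regular expression from the diff and assumes that Nextflow's `baseName` strips only the final extension (here `.gz`). The function name `logical_examples_name` and the sample file name are hypothetical, chosen for illustration only.

```python
import re
from pathlib import PurePosixPath

# Same pattern as in the process: "<name>-NNNNN-of-NNNNN" with 5-digit fields.
SHARD_PATTERN = re.compile(r"^(.+)-\d{5}-of-(\d{5})$")


def logical_examples_name(first_shard: str) -> str:
    """Rebuild the sharded '<name>@<count>.gz' spec from one shard file name.

    Hypothetical helper mirroring the Groovy logic in the diff.
    """
    # Drop directories and the final ".gz", similar to baseName in the process.
    base = PurePosixPath(first_shard).stem
    match = SHARD_PATTERN.fullmatch(base)
    if match is None:
        raise ValueError(
            f"tfrecord baseName '{base}' doesn't match the expected pattern"
        )
    name, shard_count = match.group(1), match.group(2)
    # The shard count keeps its zero padding, as in the process script.
    return f"{name}@{shard_count}.gz"


if __name__ == "__main__":
    # Hypothetical shard name, used only to show the transformation.
    print(logical_examples_name("test.examples.tfrecord-00000-of-00002.gz"))
```

Only the first shard is inspected, as in the process: the total shard count is encoded in every shard's file name, so one name is enough to rebuild the value passed to `--examples`.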
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
name: deepvariant_callvariants | ||
description: Call variants from the examples produced by make_examples | ||
keywords: | ||
- variant calling | ||
- machine learning | ||
- neural network | ||
tools: | ||
- deepvariant: | ||
description: DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data | ||
homepage: https://github.com/google/deepvariant | ||
documentation: https://github.com/google/deepvariant | ||
tool_dev_url: https://github.com/google/deepvariant | ||
doi: "10.1038/nbt.4235" | ||
licence: ["BSD-3-clause"] | ||
input: | ||
- meta: | ||
type: map | ||
description: | | ||
Groovy Map containing sample information | ||
e.g. [ id:'test', single_end:false ] | ||
- make_examples_tfrecords: | ||
type: file | ||
description: The actual sharded input files, from DEEPVARIANT_MAKEEXAMPLES process | ||
pattern: "*.gz" | ||
output: | ||
- call_variants_tfrecords: | ||
type: list | ||
description: | | ||
Each output contains: meta, sharded tfrecord file(s) with variant calls. | ||
- versions: | ||
type: file | ||
description: File containing software version | ||
pattern: "versions.yml" | ||
authors: | ||
- "@abhi18av" | ||
- "@ramprasadn" | ||
- "@fa2k" | ||
maintainers: | ||
- "@abhi18av" | ||
- "@ramprasadn" |
85 changes: 85 additions & 0 deletions
modules/nf-core/deepvariant/callvariants/tests/main.nf.test
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
nextflow_process { | ||
|
||
name "Test Process DEEPVARIANT_CALLVARIANTS" | ||
script "../main.nf" | ||
config "./nextflow.config" | ||
process "DEEPVARIANT_CALLVARIANTS" | ||
|
||
tag "deepvariant/makeexamples" | ||
tag "deepvariant/callvariants" | ||
tag "deepvariant" | ||
tag "modules" | ||
tag "modules_nfcore" | ||
|
||
test("homo_sapiens - wgs") { | ||
setup { | ||
run("DEEPVARIANT_MAKEEXAMPLES") { | ||
script "../../makeexamples/main.nf" | ||
process { | ||
""" | ||
input[0] = [ | ||
[ id:'test', single_end:false ], // meta map | ||
file(params.modules_testdata_base_path + '/genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), | ||
file(params.modules_testdata_base_path + '/genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), | ||
[] | ||
] | ||
input[1] = [ | ||
[ id:'genome'], | ||
file(params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) | ||
] | ||
input[2] = [ | ||
[ id:'genome'], | ||
file(params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) | ||
] | ||
input[3] = [ | ||
[],[] | ||
] | ||
input[4] = [ | ||
[],[] | ||
] | ||
""" | ||
} | ||
} | ||
} | ||
when { | ||
process { | ||
""" | ||
input[0] = DEEPVARIANT_MAKEEXAMPLES.out.examples | ||
""" | ||
} | ||
} | ||
|
||
then { | ||
assertAll( | ||
{ assert process.success }, | ||
{ assert process.out.call_variants_tfrecords.get(0).get(0) == [ id:'test', single_end:false ] }, | ||
// The tfrecord binary representation is not stable, but we check the name of the output. | ||
{ assert snapshot(file(process.out.call_variants_tfrecords.get(0).get(1)).name).match("homo_sapiens-wgs-call_variants_tfrecords-filenames")}, | ||
{ assert snapshot(process.out.versions).match("versions") }, | ||
) | ||
} | ||
} | ||
|
||
test("homo_sapiens - wgs - stub") { | ||
options "-stub" | ||
|
||
when { | ||
process { | ||
""" | ||
input[0] = [ | ||
[ id:'test', single_end:false ], // meta | ||
[] // No input paths are needed in stub mode | ||
] | ||
""" | ||
} | ||
} | ||
|
||
then { | ||
assertAll( | ||
{ assert process.success }, | ||
{ assert snapshot(process.out).match() } | ||
) | ||
} | ||
} | ||
|
||
} |
59 changes: 59 additions & 0 deletions
modules/nf-core/deepvariant/callvariants/tests/main.nf.test.snap
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
{ | ||
"versions": { | ||
"content": [ | ||
[ | ||
"versions.yml:md5,5ff99ffba1e56e4e919d3dfc2d0f3cbb" | ||
] | ||
], | ||
"meta": { | ||
"nf-test": "0.9.0", | ||
"nextflow": "24.04.4" | ||
}, | ||
"timestamp": "2024-08-09T16:38:47.927241" | ||
}, | ||
"homo_sapiens-wgs-call_variants_tfrecords-filenames": { | ||
"content": [ | ||
"test.call-00000-of-00001.tfrecord.gz" | ||
], | ||
"meta": { | ||
"nf-test": "0.9.0", | ||
"nextflow": "24.04.4" | ||
}, | ||
"timestamp": "2024-09-04T17:04:33.276938" | ||
}, | ||
"homo_sapiens - wgs - stub": { | ||
"content": [ | ||
{ | ||
"0": [ | ||
[ | ||
{ | ||
"id": "test", | ||
"single_end": false | ||
}, | ||
"test.call-00000-of-00001.tfrecord.gz:md5,68b329da9893e34099c7d8ad5cb9c940" | ||
] | ||
], | ||
"1": [ | ||
"versions.yml:md5,5ff99ffba1e56e4e919d3dfc2d0f3cbb" | ||
], | ||
"call_variants_tfrecords": [ | ||
[ | ||
{ | ||
"id": "test", | ||
"single_end": false | ||
}, | ||
"test.call-00000-of-00001.tfrecord.gz:md5,68b329da9893e34099c7d8ad5cb9c940" | ||
] | ||
], | ||
"versions": [ | ||
"versions.yml:md5,5ff99ffba1e56e4e919d3dfc2d0f3cbb" | ||
] | ||
} | ||
], | ||
"meta": { | ||
"nf-test": "0.9.0", | ||
"nextflow": "24.04.4" | ||
}, | ||
"timestamp": "2024-08-13T21:07:17.335788301" | ||
} | ||
} |
11 changes: 11 additions & 0 deletions
modules/nf-core/deepvariant/callvariants/tests/nextflow.config
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
process { | ||
withName: "DEEPVARIANT_CALLVARIANTS" { | ||
ext.args = '--checkpoint "/opt/models/wgs"' | ||
cpus = 2 // Keep CPUs fixed so the number of output files is reproducible | ||
} | ||
} | ||
process { | ||
withName: "DEEPVARIANT_MAKEEXAMPLES" { | ||
ext.args = '--channels "insert_size"' | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
deepvariant/callvariants: | ||
- modules/nf-core/deepvariant/callvariants/** |