Add compression to MSA modules #4754
Changes from 60 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,3 +5,4 @@ channels: | |
- defaults | ||
dependencies: | ||
- bioconda::clustalo=1.2.4 | ||
- conda-forge::pigz=2.8 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,45 +4,54 @@ process CLUSTALO_ALIGN { | |
|
||
conda "${moduleDir}/environment.yml" | ||
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? | ||
'https://depot.galaxyproject.org/singularity/clustalo:1.2.4--h87f3376_5': | ||
'biocontainers/clustalo:1.2.4--h87f3376_5' }" | ||
'https://depot.galaxyproject.org/singularity/mulled-v2-4cefc38542f86c17596c29b35a059de10387c6a7:adbe4fbad680f9beb083956d79128039a727e7b3-0': | ||
'biocontainers/mulled-v2-4cefc38542f86c17596c29b35a059de10387c6a7:adbe4fbad680f9beb083956d79128039a727e7b3-0' }" | ||
|
||
input: | ||
tuple val(meta), path(fasta) | ||
tuple val(meta) , path(fasta) | ||
tuple val(meta2), path(tree) | ||
val(compress) | ||
|
||
output: | ||
tuple val(meta), path("*.aln"), emit: alignment | ||
path "versions.yml" , emit: versions | ||
tuple val(meta), path("*.aln{.gz,}"), emit: alignment | ||
path "versions.yml" , emit: versions | ||
|
||
when: | ||
task.ext.when == null || task.ext.when | ||
|
||
script: | ||
def args = task.ext.args ?: '' | ||
def prefix = task.ext.prefix ?: "${meta.id}" | ||
def write_output = compress ? "--force -o >(pigz -cp ${task.cpus} > ${prefix}.aln.gz)" : "> ${prefix}.aln" | ||
// using >() is necessary to preserve the return value, | ||
// so nextflow knows to display an error when it failed | ||
// the --force -o is necessary, as clustalo expands the commandline input, | ||
// causing it to treat the pipe as a parameter and fail | ||
// this way, the command expands to /dev/fd/<id>, and --force allows writing output to an already existing file | ||
""" | ||
clustalo \\ | ||
-i ${fasta} \\ | ||
--threads=${task.cpus} \\ | ||
$args \\ | ||
-o ${prefix}.aln | ||
clustalo \ | ||
-i ${fasta} \ | ||
--threads=${task.cpus} \ | ||
$args \ | ||
$write_output | ||
|
||
cat <<-END_VERSIONS > versions.yml | ||
"${task.process}": | ||
clustalo: \$( clustalo --version ) | ||
pigz: \$(echo \$(pigz --version 2>&1) | sed 's/^.*pigz\\w*//' ) | ||
END_VERSIONS | ||
""" | ||
|
||
stub: | ||
def args = task.ext.args ?: '' | ||
def prefix = task.ext.prefix ?: "${meta.id}" | ||
""" | ||
touch ${prefix}.aln | ||
touch ${prefix}.aln.gz | ||
Review comment: This should change based on the
Reply: Ah, good catch. I didn't change that after introducing the compress input channel. It might also affect some of the other modules; I'll have a look.
||
|
||
cat <<-END_VERSIONS > versions.yml | ||
"${task.process}": | ||
clustalo: \$( clustalo --version ) | ||
pigz: \$(echo \$(pigz --version 2>&1) | sed 's/^.*pigz\\w*//' ) | ||
END_VERSIONS | ||
""" | ||
} |
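The script comments in the CLUSTALO_ALIGN diff above say that `>()` is needed so that a failing `clustalo` is still reported. The following is a minimal bash sketch of that behaviour, not the module's code: `fake_aligner` is a hypothetical stand-in for `clustalo`, `gzip` stands in for `pigz`, and it assumes `set -o pipefail` is not active.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for an aligner: writes one line to the path in $1
# (stdout if no path is given) and then fails with exit status 3.
fake_aligner() {
    local out="${1:-/dev/stdout}"
    echo ">seq1" > "$out"
    return 3
}

workdir=$(mktemp -d)

# Plain pipe: without pipefail, the pipeline reports the status of the
# last command (gzip), so the aligner's failure is hidden.
fake_aligner | gzip -c > "$workdir/piped.aln.gz"
pipe_status=$?

# Process substitution: >(...) expands to a path such as /dev/fd/63 that is
# passed as an ordinary argument, so the status is the aligner's own.
fake_aligner >(gzip -c > "$workdir/procsub.aln.gz")
procsub_status=$?

echo "pipe: ${pipe_status}, process substitution: ${procsub_status}"
```

Run with bash, this should print `pipe: 0, process substitution: 3`. Because the substituted path already exists when the tool opens it, a tool that refuses to overwrite existing output needs an override, which is the role the diff's comments give to `--force`.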
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,26 +10,29 @@ process FAMSA_ALIGN { | |
'biocontainers/famsa:2.2.2--h9f5acd7_0' }" | ||
|
||
input: | ||
tuple val(meta), path(fasta) | ||
tuple val(meta) , path(fasta) | ||
tuple val(meta2), path(tree) | ||
val(compress) | ||
|
||
output: | ||
tuple val(meta), path("*.aln"), emit: alignment | ||
path "versions.yml" , emit: versions | ||
tuple val(meta), path("*.aln{.gz,}"), emit: alignment | ||
path "versions.yml" , emit: versions | ||
|
||
when: | ||
task.ext.when == null || task.ext.when | ||
|
||
script: | ||
def args = task.ext.args ?: '' | ||
def compress_args = compress ? '-gz' : '' | ||
def prefix = task.ext.prefix ?: "${meta.id}" | ||
def options_tree = tree ? "-gt import $tree" : "" | ||
""" | ||
famsa $options_tree \\ | ||
$compress_args \\ | ||
$args \\ | ||
-t ${task.cpus} \\ | ||
${fasta} \\ | ||
${prefix}.aln | ||
${prefix}.aln${compress ? '.gz':''} | ||
Review comment: Hmm, I'm coming around to this idea a bit more; it seems to be cleaner and harder to mess up.
Reply: Yes, I think its main advantage is that it is the cleanest and most straightforward to understand and document (edit: compared to the other options we came up with).
||
|
||
cat <<-END_VERSIONS > versions.yml | ||
"${task.process}": | ||
|
@@ -40,7 +43,7 @@ process FAMSA_ALIGN { | |
stub: | ||
def prefix = task.ext.prefix ?: "${meta.id}" | ||
""" | ||
touch ${prefix}.aln | ||
touch ${prefix}.aln.gz | ||
|
||
cat <<-END_VERSIONS > versions.yml | ||
"${task.process}": | ||
|
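Both stubs in this diff create `${prefix}.aln.gz` unconditionally, which is what the review comment on the CLUSTALO_ALIGN stub and its reply about the compress input channel refer to. The following is a hedged bash sketch of one way a stub could follow that flag. The variable values are hypothetical placeholders for what Nextflow would interpolate, and whether the maintainers adopted this form is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical values; in a module these would come from Nextflow
# interpolation of task.ext.prefix / meta.id and the compress input.
prefix="sample"
compress="true"

workdir=$(mktemp -d)
cd "$workdir" || exit 1

if [ "$compress" = "true" ]; then
    # A zero-byte file is not a valid gzip stream, so compress an empty line
    # instead of using touch; downstream tools can then decompress the stub.
    echo "" | gzip > "${prefix}.aln.gz"
    stub_file="${prefix}.aln.gz"
else
    touch "${prefix}.aln"
    stub_file="${prefix}.aln"
fi

echo "created ${stub_file}"
```

Either branch produces a file name that the `*.aln{.gz,}` output pattern in the diff would match.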
Review comment: I'm not a huge fan of this, but I guess most pipeline developers will leave it on `true` and forget about it, so why not?