Add Maker annotation workflow #47

Closed
25 commits
15bea80
Add maker annotation workflow
gallardoalba Jul 11, 2021
6f792e3
Update workflow name in yml file
gallardoalba Jul 11, 2021
41a2064
Update title and dockstore file
gallardoalba Jul 11, 2021
1118a0c
Modify dockstore file
gallardoalba Jul 11, 2021
bcef099
Fix tests
gallardoalba Jul 11, 2021
b81d7e6
Fix tests
gallardoalba Jul 11, 2021
4cb151b
Update datasets
gallardoalba Jul 13, 2021
97acc4a
Use planemo from master branch for tests
mvdbeek Jul 13, 2021
b0d0d47
Include tar in augustus conda package
gallardoalba Jul 14, 2021
6ca5807
Update workflow Augustus 3.4
gallardoalba Jul 18, 2021
4d5d81a
Update tests include files
gallardoalba Jul 20, 2021
af8899c
Remove unnecesary file
gallardoalba Jul 20, 2021
2cac706
Fix tests replace size by value
gallardoalba Jul 20, 2021
9d43494
Update tests
gallardoalba Jul 22, 2021
3daee38
Revert "Use planemo from master branch for tests"
mvdbeek Jul 23, 2021
d535a55
Move maker-annotation-eukaryote into genome-annotation
mvdbeek Jul 23, 2021
dc2e56a
Also drop remaining exact comparisons
mvdbeek Jul 23, 2021
1356592
More test fixes
mvdbeek Jul 23, 2021
1ee8509
Update last tests
gallardoalba Jul 23, 2021
dce532d
Update README.md
gallardoalba Sep 6, 2021
1e08ee3
Update CHANGELOG.md
gallardoalba Sep 20, 2021
75c735b
Update workflows/genome-annotation/maker-annotation-eukaryote/README.md
gallardoalba Sep 20, 2021
e72d7ff
Update README.md: include licencing
gallardoalba Oct 7, 2021
eda05ed
Update CHANGELOG.md
gallardoalba Oct 7, 2021
6d106b2
Update workflows/genome-annotation/maker-annotation-eukaryote/README.md
gallardoalba Apr 11, 2022
@@ -0,0 +1,7 @@
version: 1.2
workflows:
- name: "main"
  primaryDescriptorPath: /maker-annotation-eukaryote.ga
  subclass: Galaxy
  testParameterFiles:
  - /maker-annotation-eukaryote-tests.yml
@@ -0,0 +1,14 @@
# Changelog

## [0.1] 2021-09-20

### Added

- Initial version of the "Annotation of eukaryotic genomes with MAKER" workflow.

## [0.2] 2021-10-07

### Include use rights

- Updated README.md and the workflow description to include licensing information.

15 changes: 15 additions & 0 deletions workflows/genome-annotation/maker-annotation-eukaryote/README.md
@@ -0,0 +1,15 @@
# Annotation of eukaryotic genomes with MAKER

This workflow covers the steps needed to annotate eukaryotic genomes with MAKER. The original version appears in the Galaxy Training Network (GTN) under the title [Genome annotation with Maker](https://training.galaxyproject.org/training-material/topics/genome-annotation/tutorials/annotation-with-maker/tutorial.html).

### Workflow evaluation

How best to compare annotation workflows and decide which one performs better remains an open question.

One way to assess whether a change to the workflow improves it is to compare the resulting annotation against a reference annotation with the [ParseVal](https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/iuc/aegean_parseval/aegean_parseval/0.16.0) tool.

If you only want to check whether an annotation looks reasonable for the current test data, you can count the genes in the output GFF and/or compare the total length of the genes.
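
The quick check described above can be sketched in a few lines of Python. This is only an illustration, not part of the workflow: the function name `summarize_genes` and the example records are hypothetical, and the sketch assumes a tab-separated GFF3 file in which gene features carry the type `gene` in the third column.

```python
from typing import Iterable, Tuple


def summarize_genes(gff_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (gene_count, total_gene_length) for the given GFF3 lines."""
    count = 0
    total_length = 0
    for line in gff_lines:
        line = line.rstrip("\n")
        # An embedded FASTA section, if present, ends the feature table.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Only well-formed records whose feature type is "gene" are counted.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        start, end = int(fields[3]), int(fields[4])
        count += 1
        # GFF3 coordinates are 1-based and inclusive.
        total_length += end - start + 1
    return count, total_length


# Hypothetical records, used only to show the expected shape of the input.
example = [
    "##gff-version 3",
    "chr1\tmaker\tgene\t100\t500\t.\t+\t.\tID=gene1",
    "chr1\tmaker\tmRNA\t100\t500\t.\t+\t.\tID=gene1-RA;Parent=gene1",
    "chr1\tmaker\tgene\t1000\t1299\t.\t-\t.\tID=gene2",
]
print(summarize_genes(example))  # (2, 701)
```

For a real output file, an open file handle can be passed in place of the list. Running the function on the annotations produced before and after a workflow change gives two pairs of numbers to compare.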

### Licensing

This workflow includes [MAKER](https://www.yandell-lab.org/software/maker.html), a tool that may not be used commercially without a license. Those wishing to license MAKER for commercial use should contact Aaron Duffy at the University of Utah TVC to discuss their needs.
@@ -0,0 +1,122 @@
- doc: Test outline for maker-annotation-eukaryote.ga
  job:
    EST and/or cDNA:
      class: File
      location: https://zenodo.org/record/5095188/files/cds.fasta?download=1
    Genome sequence:
      class: File
      location: https://zenodo.org/record/5095188/files/genome.fasta?download=1
    Protein sequences:
      class: File
      location: https://zenodo.org/record/5095188/files/proteins.fasta?download=1
  outputs:
    BUSCO_full_table_01:
      class: File
      file: test-data/BUSCO_full_table_01.tabular
      compare: diff
      lines_diff: 4
    BUSCO_missing_orthologs_01:
      class: File
      file: test-data/BUSCO_missing_orthologs_01.tabular
      compare: diff
      lines_diff: 4
    BUSCO_short_summary_01:
      asserts:
        has_text:
          text: "Complete and single-copy BUSCOs"
        has_text:
          text: "Complete and duplicated BUSCOs"
    BUSCO_short_summary_02:
      asserts:
        has_text:
          text: "Complete and single-copy BUSCOs"
        has_text:
          text: "Complete and duplicated BUSCOs"
    BUSCO_short_summary_03:
      asserts:
        has_text:
          text: "Complete and single-copy BUSCOs"
        has_text:
          text: "Complete and duplicated BUSCOs"
    BUSCO_short_summary_04:
      asserts:
        has_text:
          text: "Complete and single-copy BUSCOs"
        has_text:
          text: "Complete and duplicated BUSCOs"
    SNAP_trained_model:
      asserts:
        has_size:
          value: 46205
          delta: 1000
    fasta_statistics:
      asserts:
        has_size:
          value: 206
          delta: 10
    genome_annotation_statistics_graphs_01:
      asserts:
        has_size:
          value: 18850
          delta: 300
    genome_annotation_statistics_graphs_02:
      asserts:
        has_size:
          value: 18857
          delta: 300
    genome_annotation_statistics_graphs_03:
      asserts:
        has_size:
          value: 18944
          delta: 300
    genome_annotation_statistics_summary_01:
      asserts:
        has_text:
          text: "Mean gene locus size (first to last exon)"
        has_text:
          text: "Number of genes with alternative transcript variants"
    genome_annotation_statistics_summary_02:
      asserts:
        has_text:
          text: "Mean gene locus size (first to last exon)"
        has_text:
          text: "Number of genes with alternative transcript variants"
    genome_annotation_statistics_summary_03:
      asserts:
        has_text:
          text: "Mean gene locus size (first to last exon)"
        has_text:
          text: "Number of genes with alternative transcript variants"
    gffread_cds:
      asserts:
        has_text:
          text: ">TEST000001-RA gene=TEST000001"
        has_text:
          text: ">TEST000100-RA gene=TEST000100"
        has_text:
          text: ">TEST000200-RA gene=TEST000200"
    gffread_exons:
      asserts:
        has_text:
          text: ">TEST000001-RA gene=TEST000001 CDS"
        has_text:
          text: ">TEST000100-RA gene=TEST000100 CDS"
        has_text:
          text: ">TEST000200-RA gene=TEST000200 CDS"
    gffread_translated_cds:
      asserts:
        has_text:
          text: ">TEST000001-RA gene=TEST000001"
        has_text:
          text: ">TEST000100-RA gene=TEST000100"
        has_text:
          text: ">TEST000200-RA gene=TEST000200"
    map_annotation_ids:
      asserts:
        has_text:
          text: "NC_003421.2"
    augus_trained_model:
      asserts:
        has_size:
          value: 151220
          delta: 18000