Allow local databases to be used for kraken2, centrifuge, and busco #504
@@ -481,12 +481,12 @@
     "centrifuge_db": {
         "type": "string",
         "description": "Database for taxonomic binning with centrifuge.",
-        "help_text": "E.g. ftp://ftp.ccb.jhu.edu/pub/infphilo/centrifuge/data/p_compressed+h+v.tar.gz."
+        "help_text": "Local directory containing `*.cf` files or path to download compressed tar archive. E.g. ftp://ftp.ccb.jhu.edu/pub/infphilo/centrifuge/data/p_compressed+h+v.tar.gz."
gregorysprenger marked this conversation as resolved.
     },
     "kraken2_db": {
         "type": "string",
         "description": "Database for taxonomic binning with kraken2.",
-        "help_text": "The database file must be a compressed tar archive that contains at least the three files `hash.k2d`, `opts.k2d` and `taxo.k2d`. E.g. ftp://ftp.ccb.jhu.edu/pub/data/kraken2_dbs/minikraken_8GB_202003.tgz."
+        "help_text": "Local directory or compressed tar archive that contains at least the three files `hash.k2d`, `opts.k2d` and `taxo.k2d`. E.g. ftp://ftp.ccb.jhu.edu/pub/data/kraken2_dbs/minikraken_8GB_202003.tgz."
gregorysprenger marked this conversation as resolved.
     },
     "krona_db": {
         "type": "string",
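The two help texts in this hunk describe what a local database directory is expected to contain: `*.cf` files for centrifuge, and at least `hash.k2d`, `opts.k2d` and `taxo.k2d` for kraken2. A minimal sketch of how such a value might be checked is shown here in Python rather than in the pipeline's own code. The function names are hypothetical and not taken from this PR, and the rule "anything that is not an existing local directory is treated as an archive" is an assumption made for illustration only.

```python
from pathlib import Path

# The three files the kraken2_db help text says must be present.
KRAKEN2_REQUIRED = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_kraken2_files(db_dir):
    """Return the required Kraken2 files that are absent from db_dir."""
    root = Path(db_dir)
    return [name for name in KRAKEN2_REQUIRED if not (root / name).is_file()]


def has_centrifuge_index(db_dir):
    """Return True if db_dir contains at least one *.cf index file."""
    return any(Path(db_dir).glob("*.cf"))


def classify_db_path(value):
    """Hypothetical helper: label a database parameter as 'directory' or 'archive'.

    Assumption: a value that is not an existing local directory (for example
    an FTP URL or a local .tar.gz file) is treated as a compressed archive.
    """
    if Path(value).is_dir():
        return "directory"
    return "archive"
```

A caller could, for example, stop early with a clear message when `missing_kraken2_files(path)` returns a non-empty list for a value classified as a directory.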
@@ -757,23 +757,18 @@
         "description": "Specify which tool for bin quality-control validation to use.",
         "enum": ["busco", "checkm"]
     },
-    "busco_reference": {
+    "busco_db": {
         "type": "string",
-        "description": "Download path for BUSCO lineage dataset, instead of using automated lineage selection.",
-        "help_text": "E.g. https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2020-03-06.tar.gz. Available databases are listed here: https://busco-data.ezlab.org/v5/data/lineages/."
-    },
-    "busco_download_path": {
-        "type": "string",
-        "description": "Path to local folder containing already downloaded and unpacked lineage datasets.",
-        "help_text": "If provided, BUSCO analysis will be run in offline mode. Data can be downloaded from https://busco-data.ezlab.org/v5/data/ (files still need to be unpacked manually). Run in combination with automated lineage selection."
+        "description": "Download path for BUSCO lineage dataset or path to local directory containing already downloaded and unpacked lineage datasets.",
gregorysprenger marked this conversation as resolved.
+        "help_text": "E.g. https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2020-03-06.tar.gz or '/path/to/buscodb' (files still need to be unpacked manually). Available databases are listed here: https://busco-data.ezlab.org/v5/data/lineages/."
     },
     "busco_auto_lineage_prok": {
         "type": "boolean",
         "description": "Run BUSCO with automated lineage selection, but ignoring eukaryotes (saves runtime)."
     },
-    "save_busco_reference": {
+    "save_busco_db": {
I'm wondering if this should keep its name (or if it should be renamed to "save_busco_references")? As far as I see it, the use case for this parameter is to save the lineage files downloaded when using online auto lineage detection. On their own, these don't actually comprise a BUSCO database, as there are other index files you need to download in order to pass a directory to
         "type": "boolean",
-        "description": "Save the used BUSCO lineage datasets provided via --busco_reference or downloaded when not using --busco_reference or --busco_download_path.",
+        "description": "Save the used BUSCO lineage datasets provided via `--busco_db`.",
         "help_text": "Useful to allow reproducibility, as BUSCO datasets are frequently updated and old versions do not always remain accessible."
     },
     "busco_clean": {
Outwith the scope of this PR for sure, but I think cases like this where you skip a step but specify a database should probably just print a warning rather than quitting. Makes debugging a little easier if you just need to quickly turn something off.
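
The suggestion in this comment (warn instead of quitting when a database is supplied for a skipped step) could look roughly like the sketch below. It is written in Python for illustration only and is not the pipeline's code; `check_skipped_step`, its arguments and the `strict` switch are all hypothetical names.

```python
import warnings


def check_skipped_step(step_skipped, db_path, param_name, strict=False):
    """Report a database that was supplied for a step that will not run.

    Returns True when the mismatch was found, False otherwise.
    With strict=True the mismatch raises instead of warning.
    """
    if not (step_skipped and db_path):
        return False
    message = (
        f"--{param_name} was given but the step that uses it is skipped; "
        "the database will be ignored."
    )
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    return True
```

With the default `strict=False`, a run that turns a step off while still passing its database would continue and only emit the warning, which is the behaviour the comment argues makes debugging easier.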