Add new command nf-core rocrate
to create a Research Object (RO) crate for a pipeline
#2680
base: dev
@nf-core-bot changelog: Add new command nf-core rocrate to create a Research Object (RO) crate for a pipeline
Super nice! 👏🏻
A couple of minor comments, and I haven't tried running it myself, but from a quick run-through of the code I think it looks great 👍🏻
nf_core/rocrate.py
Outdated
self.add_main_authors(wf_file)
wf_file.append_to("programmingLanguage", {"@id": "#nextflow"})
# get keywords from nf-core website
remote_workflows = requests.get("https://nf-co.re/pipelines.json").json()["remote_workflows"]
Nice 👍🏻
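For reference, a minimal sketch of how the keyword lookup on the fetched JSON could be made tolerant of missing entries. It assumes each item in `remote_workflows` carries a `name` and a `topics` list; the helper name `keywords_for` and the sample data are hypothetical and not taken from this PR. In the real command the dict would come from the `requests.get(...)` call quoted above, ideally with a timeout.

```python
from typing import Any


def keywords_for(pipeline_name: str, pipelines_json: dict[str, Any]) -> list[str]:
    """Return the topics listed for pipeline_name, or [] if the pipeline is absent.

    Assumption: entries look like {"name": "...", "topics": [...]}.
    """
    for wf in pipelines_json.get("remote_workflows", []):
        if wf.get("name") == pipeline_name:
            # Copy the list so callers cannot mutate the parsed JSON.
            return list(wf.get("topics", []))
    return []


# Hypothetical sample standing in for the parsed pipelines.json response.
sample = {
    "remote_workflows": [
        {"name": "rnaseq", "topics": ["rna", "rna-seq"]},
        {"name": "sarek", "topics": ["variant-calling"]},
    ]
}

print(keywords_for("rnaseq", sample))
print(keywords_for("not-a-pipeline", sample))
```

Returning an empty list for an unknown pipeline keeps crate creation working for pipelines that are not (yet) listed on the website.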
TODO:
Throws an error if downloaded using
"Data entities representing workflows (@type: ComputationalWorkflow) SHOULD comply with the Bioschemas ComputationalWorkflow profile, where possible." - https://www.researchobject.org/ro-crate/1.1/workflows.html#complying-with-bioschemas-computational-workflow-profile
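To make the quoted requirement concrete, here is a rough sketch of a main workflow data entity as a plain JSON-LD dict, plus a check for properties that are still missing. The `@type` triple and the `#nextflow` language reference follow the RO-Crate 1.1 workflow conventions; the helper names and the list of checked properties are an illustrative subset chosen for this sketch, not the profile's authoritative list.

```python
import json


def main_workflow_entity(path: str, name: str) -> dict:
    """Build a minimal data entity for the main workflow file (hypothetical helper)."""
    return {
        "@id": path,
        "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
        "name": name,
        "programmingLanguage": {"@id": "#nextflow"},
    }


# Illustrative subset only; consult the Bioschemas profile for the real list.
CHECKED_PROPERTIES = ("name", "programmingLanguage", "creator", "license", "version")


def missing_properties(entity: dict) -> list[str]:
    """List the checked properties that the entity does not define yet."""
    return [prop for prop in CHECKED_PROPERTIES if prop not in entity]


entity = main_workflow_entity("main.nf", "nf-core/rnaseq")
print(json.dumps(entity, indent=2))
print(missing_properties(entity))
```

A check like this could run as part of crate creation and log a warning per missing property rather than failing.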
We could add subworkflow / module information to the RO-Crate to increase machine readability. The overhead is of course the metadata size. The versioning situation is unclear: how do we identify RO-Crates of the same workflow but of different versions (esp. if the changes are not yet committed to git)?
For the structure of the workflow and nested workflows, see https://gitlab.liris.cnrs.fr/sharefair/bioflow-insight which can parse the DSL2 definitions and generate an RO-Crate.
Looks interesting, but unfortunately it is a bit too slow for my taste (3 minutes to run through nf-core/RNA-seq, 2 for nf-core/sarek even with
Done for now. Waiting on ResearchObject/ro-crate-py#185 to find a better way to write all files which are part of an nf-core repo.
@mashehu - is creator now email address?
Example was last updated 3 weeks ago, maybe needs a refresh.
nf_core/pipelines/rocrate.py
Outdated
if fn.endswith(".png"):
    log.debug(f"Adding workflow image file: {fn}")
    self.crate.add_jsonld({"@id": fn, "@type": ["File", "ImageObject"]})
    if re.search(r"(metro|tube)_?(map)?", fn) and self.crate.mainEntity is not None:
Should we also check the filename? In case someone makes metro_diagram.json?
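A small sketch of the concern: taken on its own, the pattern from the hunk matches any name containing `metro` or `tube`, because both `_` and `map` are optional. The stricter variant below (requiring `map` and a `.png` suffix) is only one possible tightening; the function name and the example paths are hypothetical.

```python
import re
from pathlib import Path

# Pattern as quoted in the diff: "_" and "map" are both optional.
LOOSE = re.compile(r"(metro|tube)_?(map)?")
# Hypothetical stricter variant: "map" is required.
STRICT = re.compile(r"(metro|tube)_?map", re.IGNORECASE)


def is_metro_map_image(fn: str) -> bool:
    """True only for PNG files whose file name contains metro/tube followed by map."""
    path = Path(fn)
    return path.suffix.lower() == ".png" and STRICT.search(path.name) is not None


# The loose pattern alone accepts the file name from the comment above.
print(bool(LOOSE.search("metro_diagram.json")))
print(is_metro_map_image("metro_diagram.json"))
print(is_metro_map_image("docs/images/nf-core-rnaseq_metro_map_grey.png"))
```

Matching against `path.name` rather than the full path also avoids false positives from directory names.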
Hi, the repo2rocrate library generates a Workflow Testing RO-Crate from workflow repositories that follow community guidelines, and includes support for nf-core pipelines. Workflow Testing RO-Crate is like Workflow RO-Crate, but with additional metadata related to the testing of the workflow. This format can be read not only by WorkflowHub, but also by LifeMonitor, which uses the extra metadata to track the workflow's test status over time. WorkflowHub integrates with LifeMonitor by adding a link pointing to the workflow's status. For instance, https://workflowhub.eu/workflows/109 has a "Tests Passing" button that points to https://app.lifemonitor.eu/workflow;uuid=9647f1e0-6566-0139-90bb-005056ab5db4. There is overlap between the RO-Crate generated by repo2rocrate and the one generated by the
Some remarks on this PR:
create: add shortcut to toggle all switches
Template: Do not assume pipeline name is url
…s datasets with descriptions
…eate # Conflicts: # CHANGELOG.md # nf_core/commands_pipelines.py # nf_core/pipelines/lint_utils.py # nf_core/pipelines/rocrate.py # nf_core/utils.py
Example crate from the rnaseq pipeline:
ro-crate-metadata.json