Define logic for suites with tools that have different bio.tools entries #67

Open
paulzierep opened this issue Mar 7, 2024 · 5 comments

Comments

@paulzierep
Collaborator

paulzierep commented Mar 7, 2024

Currently, we assume that all tools in a suite (i.e. a GitHub folder) have the same bio.tools ID.
However, in some suites the contained tools have different bio.tools IDs, for example: https://github.com/galaxyproject/tools-iuc/tree/main/tools/spades

As a consequence, our tool only parses one bio.tools entry, which is not correct for all tools.
We need to brainstorm how to handle these cases.

A) Aggregate the bio.tools entries and EDAM Terms (similar to the Galaxy tool ids row)
B) One row for each Galaxy tool ids (this would probably mean a good amount of restructuring, i.e. creating a tool list and an aggregated list)

Any votes for either option?
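
To make the two options concrete, here is a minimal sketch of the output shapes they would produce. The input triples, the column names, and both function names are made up for illustration and are not taken from the actual extractor code; the IDs are placeholders.

```python
from collections import defaultdict

# Hypothetical input: one (suite, galaxy_tool_id, biotools_id) triple per wrapper.
# All names below are placeholders, not real entries.
TOOLS = [
    ("suite_x", "tool_a", "biotools_a"),
    ("suite_x", "tool_b", "biotools_b"),
    ("suite_y", "tool_c", "biotools_c"),
]


def aggregate_by_suite(tools):
    """Option A: one row per suite, with all IDs joined into a single cell."""
    grouped = defaultdict(lambda: {"galaxy_ids": [], "biotools_ids": []})
    for suite, galaxy_id, biotools_id in tools:
        row = grouped[suite]
        row["galaxy_ids"].append(galaxy_id)
        # Keep each bio.tools ID once, in first-seen order.
        if biotools_id and biotools_id not in row["biotools_ids"]:
            row["biotools_ids"].append(biotools_id)
    return [
        {
            "suite": suite,
            "galaxy_ids": ", ".join(row["galaxy_ids"]),
            "biotools_ids": ", ".join(row["biotools_ids"]),
        }
        for suite, row in grouped.items()
    ]


def one_row_per_tool(tools):
    """Option B: one row per Galaxy tool, each with its own bio.tools ID."""
    return [
        {"suite": suite, "galaxy_id": galaxy_id, "biotools_id": biotools_id}
        for suite, galaxy_id, biotools_id in tools
    ]
```

Option A keeps the current one-row-per-suite table and only changes a cell from a single value to a list; option B changes the row granularity, which is why it would need the restructuring mentioned above.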

@matuskalas
Contributor

matuskalas commented Mar 8, 2024

Oh no, this is a nightmare I was expecting. Such cases are basically a wrong use of Bio.tools.

spades is a good example, with Bio.tools records generated from Galaxy@Pasteur, separate records for subtools, with identical homepage, citations, version, etc. The correct record is https://bio.tools/spades (and I'll be happy to delete the rest🐱‍👤)

What needs to be done is to add checks to the Bio.tools linter for identical homepage, repo & other links (download, documentation, ...), primary citation, ... 😨
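
A rough sketch of what such a duplicate check could look like, assuming records are available as plain dicts. The `id` and `homepage` keys and the function name are assumptions for illustration; this is not the real Bio.tools schema or linter API.

```python
from collections import defaultdict


def find_shared_values(records, field):
    """Return {value: [record ids]} for values of `field` shared by several records.

    `records` is assumed to be a list of dicts with an "id" key (hypothetical layout).
    """
    seen = defaultdict(list)
    for record in records:
        value = record.get(field)
        if value:
            # Light normalisation so a trailing slash or different case still matches.
            key = value.strip().rstrip("/").lower()
            seen[key].append(record["id"])
    return {value: ids for value, ids in seen.items() if len(ids) > 1}
```

The same function could be run once per field (homepage, repository, primary citation, ...) and the results merged into a single report of suspicious record groups.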

On the side of Tool Extractor, I see 3 options, ordered from IMHO least favourable to most:
A) Aggregate, but this is working around a situation that probably shouldn't exist.
C) Fail and prompt to fix the mess (Should ideally be part of linting the tool wrappers)
D) As C, but just throw a warning and ignore the Bio.tools IDs
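
Options C and D differ only in how strict the check is, so they could share one code path. A minimal sketch, assuming the IDs for a suite have already been collected; the function, the exception class, and the `strict` flag are hypothetical names, not part of the extractor.

```python
import warnings


class MultipleBiotoolsIdsError(ValueError):
    """Raised in strict mode when a suite maps to more than one bio.tools ID."""


def resolve_suite_biotools_id(suite, biotools_ids, strict=False):
    """Return the single bio.tools ID of a suite, or None if it is missing or ambiguous.

    strict=True  -> option C: fail so that the wrappers get fixed.
    strict=False -> option D: warn and ignore the conflicting IDs.
    """
    unique = sorted({biotools_id for biotools_id in biotools_ids if biotools_id})
    if len(unique) <= 1:
        return unique[0] if unique else None
    message = f"Suite {suite!r} has multiple bio.tools IDs: {', '.join(unique)}"
    if strict:
        raise MultipleBiotoolsIdsError(message)
    warnings.warn(message)
    return None
```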

@matuskalas
Contributor

Could you output all the wrappers with this problem? Billion thanks! 🙏🏽🙏🏽🙇🏽‍♂️

@paulzierep
Collaborator Author

We have no functionality yet to automatically detect which tools have multiple entries.
But I will collect the cases here:

https://github.com/galaxyproject/tools-iuc/tree/main/tools/bbtools

@paulzierep
Collaborator Author

I will update the parser to collect all IDs if there are multiple.
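
A minimal sketch of what collecting all IDs could look like, assuming the wrapper XML declares its references as `<xref type="bio.tools">` elements and is available as a string. The function names are hypothetical, and macro expansion (xrefs defined in a separate macros file) is not handled here.

```python
import xml.etree.ElementTree as ET


def biotools_ids_from_wrapper(xml_text):
    """Collect every bio.tools xref found in one wrapper XML string."""
    root = ET.fromstring(xml_text)
    ids = []
    for xref in root.iter("xref"):
        if xref.get("type") == "bio.tools" and xref.text:
            value = xref.text.strip()
            # Keep each ID once, in document order.
            if value and value not in ids:
                ids.append(value)
    return ids


def biotools_ids_for_suite(wrapper_xmls):
    """Merge the IDs of all wrappers in a suite, keeping first-seen order."""
    suite_ids = []
    for xml_text in wrapper_xmls:
        for biotools_id in biotools_ids_from_wrapper(xml_text):
            if biotools_id not in suite_ids:
                suite_ids.append(biotools_id)
    return suite_ids
```

A suite whose merged list has more than one element is exactly the kind of case this issue is about, so the same function would also answer the request above to list all affected wrappers.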

@paulzierep
Collaborator Author

We now produce a column with bio.tools IDs (#72).
Overall, there are only about 4 or 5 cases, so this is a very minor issue! It can be fixed by using the tool suite as the xref for all tools in a suite.
