Use new support for prefixes in OBO format #119

Draft pull request: 3 commits to be merged into base branch master

@balhoff
Member

balhoff commented Sep 25, 2024

The latest ROBOT includes new OWL API support to make use of prefix definitions in OBO format. This will allow us to get rid of the perl regex hack. This should be a fix for geneontology/noctua#902. Also relates to #84 and #17.

These changes haven't been tested on a full build yet, so the prefix definitions may need tweaking. Also, the latest ROBOT is required for this to work.

@balhoff
Member Author

balhoff commented Sep 25, 2024

I think these changes fix a possibly unnoticed problem: IDs like http://purl.obolibrary.org/obo/go/noctua/neo#DDB_G0277037 in the current release should be http://identifiers.org/dictybase.gene/DDB_G0277037

Noting some problems:

  • TAIR:locus:1005794076 in the generated OBO file needs to become http://identifiers.org/tair.locus/1005203273
    • this can't be done with a simple prefix expansion; can the generated OBO be changed?
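
To illustrate why the TAIR case above resists plain prefix expansion, here is a minimal Python sketch. It is not ROBOT or OWL API code: the `PREFIXES` map, the `dictyBase` prefix name, and the `expand` helper are assumptions made for illustration only. It assumes that expansion splits a CURIE on its first colon and appends the rest to the namespace.

```python
# Hypothetical prefix map for illustration; the real definitions live in the PR.
PREFIXES = {
    "dictyBase": "http://identifiers.org/dictybase.gene/",
    "TAIR": "http://identifiers.org/tair.locus/",
}


def expand(curie: str) -> str:
    """Expand a CURIE by splitting on the first colon only."""
    prefix, local_id = curie.split(":", 1)
    return PREFIXES[prefix] + local_id


# A single-colon ID expands cleanly.
print(expand("dictyBase:DDB_G0277037"))
# The extra "locus:" segment stays in the local part, so the IRI is wrong.
print(expand("TAIR:locus:1005794076"))
```

Under these assumptions the second call yields an IRI ending in `tair.locus/locus:1005794076`, which is why the ID itself would have to change in the generated OBO.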

@kltm
Member

kltm commented Sep 25, 2024

@balhoff Originally, there was a sed command in there doing some filtering, do you mean something like that?

@balhoff
Member Author

balhoff commented Sep 25, 2024

@kltm we used https://github.com/geneontology/neo/blob/master/bin/fix-obo-uris.pl to hack in all the right expansions at the end, and this is supposed to now do it "right". But for TAIR:locus in particular I wonder if we can change the OBO generated from the GPI or GAF to not require a regex replacement that is needed by the transformation above.

@sierra-moxon
Member

sierra-moxon commented Sep 25, 2024

> @kltm we used https://github.com/geneontology/neo/blob/master/bin/fix-obo-uris.pl to hack in all the right expansions at the end, and this is supposed to now do it "right". But for TAIR:locus in particular I wonder if we can change the OBO generated from the GPI or GAF to not require a regex replacement that is needed by the transformation above.

e.g. given this GAF line in the TAIR upstream src file:

AGI_LocusCode   AT5G01310       APTX    involved_in     GO:0000012      TAIR:Communication:501741973    IBA     PANTHER:PTN000281062|UniProtKB:Q7Z2E3|MGI:MGI:1913658   P       AT5G01310       AT5G01310|APTX|APRATAXIN-like|T10O8.20|T10O8_20 protein taxon:3702      20240215        GOC             TAIR:locus:2179122

Could the ontobio code replace that TAIR:locus:2179122 with tair.locus:2179122 in the resulting GO-generated GPI? (I was just working intensively in that code, so it seems reasonable to do that, if I am understanding correctly. Are these two the only one-offs, or is there a list of things we want to rewrite to conform to bioregistry/prefixmaps/etc.?)
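
A minimal Python sketch of the rewrite suggested here. `normalize_tair_locus` is a hypothetical helper, not an existing ontobio function, and the sketch assumes locus IDs are purely numeric.

```python
import re

# Matches a TAIR locus ID written with the two-part "TAIR:locus:" prefix.
# Assumption: the local part is all digits.
TAIR_LOCUS = re.compile(r"^TAIR:locus:(\d+)$")


def normalize_tair_locus(identifier: str) -> str:
    """Rewrite TAIR:locus:NNN to tair.locus:NNN; leave other IDs unchanged."""
    return TAIR_LOCUS.sub(r"tair.locus:\1", identifier)


print(normalize_tair_locus("TAIR:locus:2179122"))
print(normalize_tair_locus("TAIR:Communication:501741973"))
```

Anchoring the pattern at both ends keeps other TAIR identifiers, such as the `TAIR:Communication:` reference in the GAF line above, untouched.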

@balhoff
Member Author

balhoff commented Sep 25, 2024

@sierra-moxon I think the NEO build is getting the GAF straight from TAIR, so maybe that won't work. Unless we're okay with obtaining this from GO instead of TAIR.

wget --no-check-certificate https://www.arabidopsis.org/download_files/GO_and_PO_Annotations/Gene_Ontology_Annotations/gene_association.tair.gz -O [email protected] && mv [email protected] $@

Thank you though!

@sierra-moxon
Member

Ah OK, makes sense. I checked go-site/metadata/datasets/tair.yaml, but it didn't have a GPI, so I assumed that you were somehow grabbing the created GPI. This makes more sense!

3 participants