Load GPI file for UniProt reviewed entries for Virus and Bacteria in NEO #49

Closed
pgaudet opened this issue Feb 10, 2020 · 11 comments
@pgaudet

pgaudet commented Feb 10, 2020

Hello,

We'd like to load the UniProt reviewed entries for Virus and Bacteria in NEO.

@alexsign generated a GPI file.
@kltm Where do we make it available?

Thanks, Pascale

@kltm
Member

kltm commented Feb 10, 2020

@pgaudet Currently, changes to the NEO build are handled over here: https://github.com/geneontology/neo/issues
Once in NEO, we run the build here: https://build.geneontology.org/job/geneontology/job/pipeline/job/issue-35-neo-test/

@pgaudet
Author

pgaudet commented Feb 11, 2020

OK - you said go-site on the software call. I'll move the ticket.

Can we assign someone to this?

pgaudet transferred this issue from geneontology/go-site Feb 11, 2020
@kltm
Member

kltm commented Feb 12, 2020

@pgaudet Where is this file?

@pgaudet
Author

pgaudet commented Feb 12, 2020

In my inbox.

My question was

Where do we make it available?

:)

Thanks, Pascale

@kltm
Member

kltm commented Feb 12, 2020

Just to follow up on the ticket with conversations held elsewhere:

  • we will be provided a URL for a GPI
  • we will add this to our metadata
    this is terrible--why do we do this? https://build.berkeleybop.org/job/publish-go-site-datasets-json/
    ideally, we're just getting and working with the YAML product from a pipeline
  • we will regenerate the datasets.json used by NEO
    this should self-update, see above
  • we will rerun the temporary NEO pipeline
  • we will deploy this new NEO

@pgaudet
Author

pgaudet commented Apr 9, 2020

This is the URL: ftp://ftp.ebi.ac.uk/pub/contrib/goa/uniprot_reviewed_virus_bacteria.gpi

@pgaudet
Author

pgaudet commented Apr 9, 2020

@kltm What do we need to do next? Can we add this to the GOA YAML?

@kltm
Member

kltm commented Apr 9, 2020

@pgaudet That would be the older file at this point. The way the system currently works, the SARS-CoV-2 (Scov2) stuff is bypassing the main system.
We're currently using: https://github.com/geneontology/neo/blob/master/Makefile#L43

@cmungall
Member

We can go ahead and add this to NEO easily, but there are some things we need to discuss first:

Currently we assume each GPI file is for one species, and we assign a species code. This disambiguates labels, which is necessary to prevent autocomplete confusion. For example, there are 574 dnaA genes here. I suggest we get around this by appending the taxon ID to the symbol. It's not perfect, but it's simple to implement.

Second, in discussion with @pgaudet, we decided that we would come up with a GO reference species list and use this instead. Stay tuned.

Third, we will be loading SARS-CoV-2 twice (it is configured to be loaded from its own GPI; see geneontology/go-site#1431). I think this is what you are referring to, Seth?
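
For reference, the taxon-suffix idea from the first point could be sketched roughly as below. This is a minimal illustration, not the code NEO uses. It assumes a tab-separated GPI 1.x-style layout with the symbol in column 3 and the taxon in column 7. The function names, the space separator between symbol and taxon ID, and the sample rows (including the placeholder accessions) are all hypothetical.

```python
from collections import Counter

# Assumed 0-based column positions for a GPI 1.x-style file;
# adjust these if the actual file uses a different layout.
SYMBOL_COL = 2  # DB_Object_Symbol
TAXON_COL = 6   # Taxon, e.g. "taxon:83333"


def count_symbols(lines):
    """Count how often each symbol occurs, skipping '!' header lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) > SYMBOL_COL:
            counts[cols[SYMBOL_COL]] += 1
    return counts


def disambiguate_symbols(lines):
    """Yield each row with the numeric taxon ID appended to its symbol."""
    for line in lines:
        line = line.rstrip("\n")
        # Headers, blank lines, and short rows pass through unchanged.
        if not line or line.startswith("!"):
            yield line
            continue
        cols = line.split("\t")
        if len(cols) <= TAXON_COL or not cols[TAXON_COL]:
            yield line
            continue
        # "taxon:83333" -> "83333"; use the first value if several are listed.
        taxon_id = cols[TAXON_COL].split("|")[0].split(":")[-1]
        cols[SYMBOL_COL] = f"{cols[SYMBOL_COL]} {taxon_id}"
        yield "\t".join(cols)


# Hypothetical rows: the accessions are placeholders, not real entries.
sample = [
    "!gpi-version: 1.2",
    "UniProtKB\tX00001\tdnaA\tplaceholder name\t\tprotein\ttaxon:83333\t\t\t",
    "UniProtKB\tX00002\tdnaA\tplaceholder name\t\tprotein\ttaxon:224308\t\t\t",
]

for row in disambiguate_symbols(sample):
    print(row)
```

Running `count_symbols` before and after the rewrite is one way to check how many symbol collisions remain. Two entries that share both a symbol and a taxon would still collide, which is the "not perfect" part.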

@kltm
Member

kltm commented May 21, 2020

@cmungall I'd have to get it all in my head again to be completely clear, but the current use case for the GPI is annotation in Noctua. To make annotation work, we need it in NEO; to get it into NEO, we have the somewhat hacked NEO Makefile, with SARS-CoV-2 handled as a special case. I was responding to @pgaudet's query about adding an older file to the main pipeline GOA dataset metadata, which would not support this use case.

@kltm
Member

kltm commented Feb 17, 2022

@pgaudet This is now a dupe of #77, correct? (Still with the understanding that we may just do #82 anyway.)

kltm closed this as completed Feb 17, 2022