Ecoli data is gone from NEO as the upstream source changed #114

Open
kltm opened this issue Mar 2, 2023 · 13 comments

@kltm
Member

kltm commented Mar 2, 2023

In exploring the NEO load, we discovered that there are no ecoli entries, likely due to the upstream file change.

To close this, update to the new (temp) file and reload.

kltm self-assigned this Mar 2, 2023
@kltm
Member Author

kltm commented Mar 2, 2023

Okay, I think that this may be cleared naturally as the datasets.json does seem to refresh to the correct value.

@kltm
Member Author

kltm commented Mar 2, 2023

Check next week on a test server.

@kltm
Member Author

kltm commented Mar 17, 2023

Data does seem to be in there now, although we may want to do more tweaks.

@vanaukenk

Just checking the autocomplete on the Noctua Landing Page, I can find E. coli entities, but the species/taxon for E. coli K12 is shown as ecocyc, rather than one of the abbreviations (e.g. Atal) or an NCBI taxon id. I'm not sure if this is what was intended.

image

@kltm
Member Author

kltm commented Mar 17, 2023

I think "intended" here is not quite right. Maybe. Initially, I thought this might be due to the ongoing deal with geneontology/go-site#1961, but it seems to stem (by several steps) from https://github.com/geneontology/go-site/blob/8b649d799b522af9ca28f560f71ec1c978076d99/metadata/datasets/ecocyc.yaml#L22 being null. It has been like that for years, so I'm not sure whether this was intentional for some reason or not. Easy enough to fix if that was an oversight somewhere along the way.

That said, this is apparently the way it was before the churn for geneontology/go-site#1961 started, so at least that much is correct.
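A minimal sketch of how a null species code like this could be flagged before it reaches a load. The record layout (a top-level group with a `datasets` list whose entries carry `id` and `species_code`) and the example values are assumptions made for illustration, not a confirmed description of the go-site metadata; `missing_species_codes` is a hypothetical helper.

```python
from typing import Any, Iterable, List, Mapping


def missing_species_codes(groups: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return the ids of dataset entries whose species_code is null or blank.

    Assumes each group is an already-parsed metadata record with an
    optional "datasets" list (hypothetical layout).
    """
    missing = []
    for group in groups:
        for entry in group.get("datasets") or []:
            code = entry.get("species_code")
            if code is None or not str(code).strip():
                missing.append(entry.get("id", "<no id>"))
    return missing


# Hypothetical records standing in for parsed metadata files.
example = [
    {"id": "ecocyc", "datasets": [{"id": "ecocyc.gaf", "species_code": None}]},
    {"id": "tair", "datasets": [{"id": "tair.gaf", "species_code": "Atal"}]},
]

print(missing_species_codes(example))
```

With the made-up records above, only the entry with the null code is reported.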

@suzialeksander

FWIW, geneontology/go-site#1994 added species_code: Ecol for a different ticket

@kltm
Member Author

kltm commented Apr 13, 2023

Recheck today after outage

@kltm
Member Author

kltm commented Apr 13, 2023

Ugh, issue is persisting.

@kltm
Member Author

kltm commented Apr 14, 2023

Okay, @pgaudet, I think I've tracked this back to an assumption in the "NEO Makefile builder" that every upstream file is compressed, which is not true for the ecoli/ecocyc data.

2023-04-13 17:46:32 (441 KB/s) - ‘mirror/18.E_coli_MG1655.goa.tmp’ saved [11407440/11407440]

gzip -dc mirror/18.E_coli_MG1655.goa | ./gaf2obo.pl -s Ecol -n ecocyc > target/neo-ecocyc.obo.tmp && mv target/neo-ecocyc.obo.tmp target/neo-ecocyc.obo

gzip: mirror/18.E_coli_MG1655.goa: not in gzip format
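The failure in the log above comes from decompressing a file that is not gzipped. As an illustration only (the actual fix in the builder is not shown in this thread), here is a minimal sketch of one way to avoid the assumption: check the two-byte gzip magic number and fall back to a plain read. `open_maybe_gzip` is a hypothetical helper name.

```python
import gzip
from typing import IO

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def open_maybe_gzip(path: str) -> IO[str]:
    """Open a text file for reading, decompressing only if it is gzipped."""
    # Peek at the first two bytes instead of trusting the file name.
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")


# Usage (path taken from the log above):
#   with open_maybe_gzip("mirror/18.E_coli_MG1655.goa") as handle:
#       for line in handle:
#           ...
```

On the shell side, GNU gzip documents that `-f` together with `-c` copies unrecognized input through unchanged, so `gzip -dcf` may be an alternative where GNU gzip is guaranteed; whether that suits this pipeline is an assumption.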

@kltm
Member Author

kltm commented Apr 14, 2023

Testing; build in pipeline.

@pgaudet @vanaukenk If this works (and I believe it will), we probably want to get this out before the next outage in a month.

@kltm
Member Author

kltm commented Apr 15, 2023

@pgaudet @vanaukenk I believe this is (finally) working now.

@pgaudet

pgaudet commented Apr 17, 2023

Maybe there are multiple problems - in this model http://noctua.geneontology.org/workbench/noctua-visual-pathway-editor/?model_id=gomodel:62f58d8800001680
the gene label still doesn't show.

Thanks, Pascale

@kltm
Member Author

kltm commented Apr 17, 2023

@pgaudet I think that is a different issue: we can confirm that Ecoli data is now present in NEO. As this fix went in after the outage, the missing label may be related to that, or to another issue. That said, the data appears in autocomplete dropdowns now, so it is there (which is the scope of this ticket).

Also note: #111
