Implementing meta-information for annotations of objects #77
Comments
Hi Seb,
are we talking about the disease models or the queries that people are sending to Genomiser or both? Can we please schedule a telcon about this, I do not feel I have enough information about the overall gameplan at this point.
…-Peter
Both. Frequency only makes sense for diseases. Confidence makes more sense for an individual's annotations. Skype sounds good.
|
OK, let's separate external I/O from internal modeling. For external I/O, all of this is handled by phenopackets. We have a PR for loading phenopackets #30. I'm open to also having smaller ad-hoc JSONs and TSVs for particular limited scenarios, but I think this could create confusion longer term. For internal modeling, OwlSim3 assumes everything is converted to OWL, and then it has its own simplified internal OWL storage. The subset historically supported was:
Of course other constructs are supported for the pre-processing reasoning step, but OwlSim operations were historically defined in terms of the above. For FrequencyAware NaiveBernoulliBayes, we extended the KB model to include getDirectWeightedTypes for any individual. In theory it should be possible to keep advancing the KB with additional methods like this, but your thoughts are welcome as to a larger refactor.
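To make the frequency-aware idea concrete, here is a minimal sketch of how per-term frequencies could enter a Bernoulli naive Bayes score. It is not OwlSim's implementation and does not use its KB API: the function name `log_likelihood`, the disease names, and the term IDs are all hypothetical placeholders, and the weighted types are assumed to be available as a plain mapping from term ID to frequency.

```python
import math


def log_likelihood(observed, term_frequencies, eps=1e-6):
    """Bernoulli log-likelihood of a set of observed terms given one disease.

    term_frequencies maps term ID -> fraction of cases showing that term.
    Observed terms that the disease model does not mention are ignored here.
    """
    total = 0.0
    for term, freq in term_frequencies.items():
        # Clamp so that frequencies of exactly 0 or 1 do not produce log(0).
        p = min(max(freq, eps), 1.0 - eps)
        total += math.log(p) if term in observed else math.log(1.0 - p)
    return total


# Hypothetical diseases and placeholder term IDs, for illustration only.
diseases = {
    "disease_A": {"HP:0001": 0.9, "HP:0002": 0.1},
    "disease_B": {"HP:0001": 0.2, "HP:0002": 0.8},
}
query = {"HP:0001"}
ranked = sorted(
    diseases, key=lambda d: log_likelihood(query, diseases[d]), reverse=True
)
```

In this toy case `disease_A` ranks first, because the query contains its frequent term and lacks its rare one.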
For temporal info, we can just use RO axioms and express things like
OWLTime axioms may come in useful to map quantitative temporal data to bins. @dosumis did a paper on classifying phenotypes based on RO temporal axioms: https://jbiomedsem.biomedcentral.com/articles/10.1186/2041-1480-4-30 The simplest approach is to make the required groupings ahead of time and then feed them to the algorithm like any other ontology. But we should probably be thinking more algorithm-first, representation-later. There are a lot of different approaches here, and it depends on what data is available: temporal bins, quantitative progression data, etc.
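As a rough illustration of making the groupings ahead of time, the sketch below maps a quantitative age of onset to a coarse bin and groups terms by bin. The bin names (congenital, juvenile, adult) are the ones mentioned in the original ticket; the 16-year cut-off, the function names, and the term IDs are assumptions made up for this example and are not taken from RO, OWL-Time, or HPO.

```python
from collections import defaultdict

# Assumed cut-off between "juvenile" and "adult"; purely illustrative.
ADULT_FROM_YEARS = 16.0


def onset_bin(age_years):
    """Map an age of onset in years to a coarse temporal bin."""
    if age_years < 0:
        raise ValueError("onset age must be non-negative")
    if age_years == 0:
        return "congenital"
    if age_years < ADULT_FROM_YEARS:
        return "juvenile"
    return "adult"


def group_by_onset(onsets):
    """Group term IDs by onset bin; onsets maps term ID -> age in years."""
    groups = defaultdict(list)
    for term, age in onsets.items():
        groups[onset_bin(age)].append(term)
    return dict(groups)


# Placeholder term IDs, for illustration only.
grouped = group_by_onset({"HP:0001": 0.0, "HP:0002": 7.5, "HP:0003": 40.0})
```

Each resulting group could then be handed to the algorithm as a precomputed class, which is the "groupings ahead of time" route described above.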
Thanks.
I was kind of expecting that and understand your motivation. In my opinion this feels a bit bulky compared to something as simple as
But how can I try out algorithmic ideas when I can't get the information into owlsim? Maybe we can discuss this on one of the future calls.
I agree with your points, I don't know what the best approach is other
than to be pragmatic: if we want to try something new, let's just get the
information in there, some of the harmonization can come later.
Hi all,
after a long Skype-session with Jules, we came up with the following owlsim3-relevant ticket:
We will need more meta-data associated with object-to-ontology term associations. At the moment it seems to be only possible to add negation (although this seems to be non-trivial to provide). To get an idea what I am referring to here, see my comment at this PR
This ticket is about two things: how do we handle meta-information internally, and how do we provide this information to owlsim3 (syntax-wise)?
Handling
I would like to know how complicated this will be from a software perspective. Must-haves are negation (this seems to be working somehow already) and frequency (for diseases/mouse populations). Data we will need in the not-so-distant future are time points (e.g. congenital, juvenile, adult etc.). Here we might have to look at the recent W3C time ontology. Information that I think will be important later includes confidence (e.g. how sure is the user that the distance between the eyes is abnormal) and relevance (i.e. does the user think this is an important feature for the diagnosis). The rationale behind the idea of providing confidence and relevance (and maybe more, like expected agreement with other physicians' opinions) comes from discussions such as in this article or this recent Nature paper (esp. Study 3), where it is often found useful to let users indicate additional information/priors for their choice/query (I know the latter article argues against confidence as a parameter, but the point is that we should prepare for additional user-provided data). Although we currently lack the ontology-based algorithms to make use of such information, I think it is very important to discuss now how to collect it. Any thoughts welcome!
Syntax
How should such information be provided by the user to our tools? One idea from @julesjacobsen was a JSON representation like {HP:0001, freq=0.99, rel=0.9, NOT}. How should such information be handled internally? It would be great to hear your thoughts on this. @julesjacobsen please add to this ticket if I missed important points/ideas.
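One way the JSON idea above could be made concrete is sketched below. This is only an assumption about what such a record might look like, not an agreed format and not part of owlsim3 or phenopackets: the key names (`term`, `negated`, `frequency`, `relevance`, `confidence`), the `Annotation` class, and `parse_annotation` are all hypothetical, and `HP:0001` is the placeholder ID from the example above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """One object-to-term association plus optional meta-information."""
    term: str                           # ontology term ID
    negated: bool = False               # True if the term is explicitly absent
    frequency: Optional[float] = None   # fraction of cases showing the term
    relevance: Optional[float] = None   # user-assigned importance
    confidence: Optional[float] = None  # user's certainty in the observation


def parse_annotation(text):
    """Parse one JSON record into an Annotation, validating numeric fields."""
    record = json.loads(text)
    annotation = Annotation(
        term=record["term"],
        negated=bool(record.get("negated", False)),
        frequency=record.get("frequency"),
        relevance=record.get("relevance"),
        confidence=record.get("confidence"),
    )
    # All numeric meta-information is assumed to lie in [0, 1].
    for name in ("frequency", "relevance", "confidence"):
        value = getattr(annotation, name)
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return annotation


example = parse_annotation(
    '{"term": "HP:0001", "frequency": 0.99, "relevance": 0.9, "negated": true}'
)
```

Keeping every meta-information field optional means that records carrying only a term ID, or only a term ID plus negation, stay valid while further fields such as time points are still under discussion.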
Everybody is invited to give thoughts, especially @pnrobinson, @cmungall, @mellybelly, @julesjacobsen, @jnguyenx, etc.