-
Are you suggesting fine-tuning an LLM? Which architecture would you use, and on what hardware would we do the training? For the knowledge graphs, I would remark that BioChatter works natively with BioCypher KGs by explaining the graph structure to the LLM; using a KG that is not built with BioCypher would mean not being able to use that integration. See the responsible module at https://github.com/biocypher/biochatter/blob/main/biochatter/prompts.py, the docs description of the feature, and this demo repository: https://github.com/biocypher/pole.
-
I updated the discussion. I'm not sure we necessarily need to train the model. I suggest using the native explanations provided for BioCypher KGs to explain in natural language what a user sees in a graph. In other words, once the query has been executed (as a result of the docs description of the feature), can the LLM interpret the result of the query in a biomedical context, and potentially make sense of it?
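To make the question concrete, here is a minimal sketch of that last step: taking the records returned by an already-executed graph query and wrapping them in a prompt that asks an LLM for a biomedical interpretation. The record structure and the prompt wording are my own assumptions, not part of BioChatter's API; the actual model call is left out.

```python
# Hypothetical sketch: format knowledge-graph query results as context
# for a follow-up LLM call that interprets them biomedically.
# The source/relation/target record shape is an assumption.

def build_interpretation_prompt(question, records):
    """Serialize query results and ask the model to interpret them."""
    lines = [f"{r['source']} -[{r['relation']}]-> {r['target']}" for r in records]
    return (
        "You are a biomedical assistant. The user asked: "
        f"{question!r}\n"
        "A knowledge-graph query returned these relationships:\n"
        + "\n".join(lines)
        + "\nExplain what these results mean in a biomedical context."
    )

records = [
    {"source": "metformin", "relation": "INDICATED_FOR", "target": "type 2 diabetes"},
]
prompt = build_interpretation_prompt("What does metformin treat?", records)
```

The resulting string would then be sent to whichever LLM backend is in use; the interesting open question is how reliably the model grounds its explanation in the returned edges rather than its parametric knowledge.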
-
Project Objective
Knowledge graphs in biological research often contain large and heterogeneous data sets, making it challenging to interpret this information in a way that is useful for formulating hypotheses and aiding research. A robust model that can effectively analyze and interpret this data would be invaluable, particularly for clinicians involved in the drug repurposing process, as it would enhance their understanding of the complex associations within the data.
Implementation Strategy
First Steps:
Data Acquisition: Acquire a comprehensive knowledge graph containing detailed information about drugs, diseases, genes, proteins, and their interrelationships. We recommend using NeDRex v1 or DISNET as a starting point.
Model Application: Apply a language model, such as Llama 2, to the annotated knowledge graph. We will provide guidance on interpreting the relationships represented in the graph.
Graph Interpretation: Apply the trained model to a specific graph, derived from entities in the knowledge graph, to enable the model to contextualize and explain the displayed network within a medical framework.
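The Graph Interpretation step above can be sketched as a small serializer: a subgraph of entities and typed edges is rendered into plain text that a model such as Llama 2 could then contextualize within a medical framework. The node and edge fields here are illustrative assumptions, and no model is actually invoked.

```python
# Minimal sketch of the Graph Interpretation step: turn a displayed
# subgraph into natural-language sentences suitable for an LLM prompt.
# The node/edge dictionary shapes are hypothetical, not a fixed schema.

def describe_subgraph(nodes, edges):
    """Render nodes and typed edges as one sentence per element."""
    parts = [f"{n['name']} is a {n['type']}." for n in nodes]
    parts += [
        f"{e['source']} {e['relation'].replace('_', ' ').lower()} {e['target']}."
        for e in edges
    ]
    return " ".join(parts)

nodes = [
    {"name": "TP53", "type": "gene"},
    {"name": "Li-Fraumeni syndrome", "type": "disease"},
]
edges = [
    {"source": "TP53", "relation": "ASSOCIATED_WITH",
     "target": "Li-Fraumeni syndrome"},
]
text = describe_subgraph(nodes, edges)
```

The serialized text would then be embedded in the model prompt, alongside an instruction to explain the network in a clinical context.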
Some Ideas:
Encyclopedic Entity Descriptions: Utilize the trained language model to generate detailed descriptions of entities within the knowledge graph. This will enrich the graph with informative context for each entity.
Narrative Relationship Descriptions: Leverage the model to create engaging narrative descriptions that elucidate the relationships between different entities in the knowledge graph, adding a layer of interpretability and insight.
Knowledge Graph Path Summarization: Employ the model to succinctly summarize paths through the knowledge graph. This approach aims to provide a clear and concise overview of the connections and interactions between various entities.
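The path-summarization idea above could start from something as simple as flattening a path into a compact prompt that asks the model for a concise overview. The example path and the prompt wording are hypothetical; the function only builds the prompt and does not call a model.

```python
# Sketch of Knowledge Graph Path Summarization: flatten an alternating
# node/relation path into a prompt requesting a short biomedical summary.

def path_to_prompt(path):
    """path: alternating sequence [node, relation, node, relation, node, ...]."""
    chain = " -> ".join(path)
    return (
        "Summarize the following knowledge-graph path in two sentences, "
        "focusing on its biomedical meaning:\n" + chain
    )

prompt = path_to_prompt(
    ["imatinib", "TARGETS", "ABL1", "ASSOCIATED_WITH", "chronic myeloid leukemia"]
)
```

Longer paths, or bundles of parallel paths between the same endpoints, could be batched into one prompt to give the model enough context to identify the mechanistic story connecting the endpoints.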