-
Are you suggesting fine-tuning an LLM? Which architecture would you use, and on what hardware would we do the training? For the knowledge graphs, I would remark that BioChatter works natively with BioCypher KGs by explaining the graph structure to the LLM; using a KG that is not built with BioCypher would mean not being able to use that integration. See the responsible module at https://github.com/biocypher/biochatter/blob/main/biochatter/prompts.py, the docs description of the feature, and this demo repository: https://github.com/biocypher/pole.
-
I updated the discussion. I'm not sure we necessarily need to train the model. I suggest using the native explanations provided for BioCypher KGs to explain in natural language what a user sees in a graph. In other words, once the query has been executed (as a result of the docs description of the feature), can the LLM interpret the result of the query in a biomedical context, and potentially make sense of it?
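To make the question concrete, here is a minimal sketch of that last step: taking the records returned by an already-executed graph query and wrapping them in a prompt that asks an LLM for a biomedical interpretation. The record structure and the prompt wording are my own assumptions, not part of BioChatter's API; the actual model call is left out.

```python
# Hypothetical sketch: format knowledge-graph query results as context
# for a follow-up LLM call that interprets them biomedically.
# The source/relation/target record shape is an assumption.

def build_interpretation_prompt(question, records):
    """Serialize query results and ask the model to interpret them."""
    lines = [f"{r['source']} -[{r['relation']}]-> {r['target']}" for r in records]
    return (
        "You are a biomedical assistant. The user asked: "
        f"{question!r}\n"
        "A knowledge-graph query returned these relationships:\n"
        + "\n".join(lines)
        + "\nExplain what these results mean in a biomedical context."
    )

records = [
    {"source": "metformin", "relation": "INDICATED_FOR", "target": "type 2 diabetes"},
]
prompt = build_interpretation_prompt("What does metformin treat?", records)
```

The resulting string would then be sent to whichever LLM backend is in use; the interesting open question is how reliably the model grounds its explanation in the returned edges rather than its parametric knowledge.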
-
Project Objective
Knowledge graphs in biological research often contain large and heterogeneous data sets, making it challenging to interpret this information in a way that is useful for formulating hypotheses and aiding research. A robust model that can effectively analyze and interpret this data would be invaluable, particularly for clinicians involved in the drug repurposing process, as it would enhance their understanding of the complex associations within the data.
Implementation Strategy
First Steps:
Data Acquisition: Acquire a comprehensive knowledge graph containing detailed information about drugs, diseases, genes, proteins, and their interrelationships. We recommend using NeDRex v1 or DISNET as a starting point.
Model Application: Apply a language model, such as Llama 2, to the annotated knowledge graph. We will provide guidance on interpreting the relationships represented in the graph.
Graph Interpretation: Apply the trained model to a specific graph, derived from entities in the knowledge graph, to enable the model to contextualize and explain the displayed network within a medical framework.
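The Graph Interpretation step above can be sketched as a small serializer: a subgraph of entities and typed edges is rendered into plain text that a model such as Llama 2 could then contextualize within a medical framework. The node and edge fields here are illustrative assumptions, and no model is actually invoked.

```python
# Minimal sketch of the Graph Interpretation step: turn a displayed
# subgraph into natural-language sentences suitable for an LLM prompt.
# The node/edge dictionary shapes are hypothetical, not a fixed schema.

def describe_subgraph(nodes, edges):
    """Render nodes and typed edges as one sentence per element."""
    parts = [f"{n['name']} is a {n['type']}." for n in nodes]
    parts += [
        f"{e['source']} {e['relation'].replace('_', ' ').lower()} {e['target']}."
        for e in edges
    ]
    return " ".join(parts)

nodes = [
    {"name": "TP53", "type": "gene"},
    {"name": "Li-Fraumeni syndrome", "type": "disease"},
]
edges = [
    {"source": "TP53", "relation": "ASSOCIATED_WITH",
     "target": "Li-Fraumeni syndrome"},
]
text = describe_subgraph(nodes, edges)
```

The serialized text would then be embedded in the model prompt, alongside an instruction to explain the network in a clinical context.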
Some Ideas:
Encyclopedic Entity Descriptions: Utilize the trained language model to generate detailed descriptions of entities within the knowledge graph. This will enrich the graph with informative context for each entity.
Narrative Relationship Descriptions: Leverage the model to create engaging narrative descriptions that elucidate the relationships between different entities in the knowledge graph, adding a layer of interpretability and insight.
Knowledge Graph Path Summarization: Employ the model to succinctly summarize paths through the knowledge graph. This approach aims to provide a clear and concise overview of the connections and interactions between various entities.
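The path-summarization idea above could start from something as simple as flattening a path into a compact prompt that asks the model for a concise overview. The example path and the prompt wording are hypothetical; the function only builds the prompt and does not call a model.

```python
# Sketch of Knowledge Graph Path Summarization: flatten an alternating
# node/relation path into a prompt requesting a short biomedical summary.

def path_to_prompt(path):
    """path: alternating sequence [node, relation, node, relation, node, ...]."""
    chain = " -> ".join(path)
    return (
        "Summarize the following knowledge-graph path in two sentences, "
        "focusing on its biomedical meaning:\n" + chain
    )

prompt = path_to_prompt(
    ["imatinib", "TARGETS", "ABL1", "ASSOCIATED_WITH", "chronic myeloid leukemia"]
)
```

Longer paths, or bundles of parallel paths between the same endpoints, could be batched into one prompt to give the model enough context to identify the mechanistic story connecting the endpoints.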