An example application that demonstrates how to use LangChain's graph_vectorstores and CassandraGraphVectorStore to add structured data to RAG (Retrieval-Augmented Generation) applications. The app scrapes content from specified URLs, processes the content, and performs vector similarity and graph traversal searches.
  ____                 _     ____      _    ____
 / ___|_ __ __ _ _ __ | |__ |  _ \    / \  / ___|
| |  _| '__/ _` | '_ \| '_ \| |_) |  / _ \| |  _
| |_| | | | (_| | |_) | | | |  _ <  / ___ \ |_| |
 \____|_|  \__,_| .__/|_| |_|_| \_\/_/   \_\____|
                |_|
*no graph database needed!!!
- Clone the repository:
  git clone https://github.com/datastaxdevs/graph-rag-example.git
  cd graph-rag-example
- Create and activate a virtual environment:
  python3 -m venv venv
  source venv/bin/activate
- Install the required dependencies:
  pip install -r requirements.txt
- Set up the environment variables:
  - Copy the .env.example file to .env:
    cp .env.example .env
  - Fill in the required environment variables in the .env file.
Once your .env file is ready, create a DataStax Astra vector database if you don't already have one. Copy the database ID, API endpoint, and an application token from the database overview page, which lists everything you need.
You also need an OpenAI API key for the LLM that generates the responses.
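Before running the scripts, it can help to confirm that nothing in .env was left blank. The following is a minimal sketch using only the standard library, not part of the repository. The variable names in REQUIRED_VARS are assumptions made for illustration; replace them with the names that actually appear in .env.example.

```python
from pathlib import Path

# Hypothetical variable names: use the names listed in the project's .env.example.
REQUIRED_VARS = [
    "ASTRA_DB_ID",
    "ASTRA_DB_API_ENDPOINT",
    "ASTRA_DB_APPLICATION_TOKEN",
    "OPENAI_API_KEY",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env, required=REQUIRED_VARS):
    """Return the required names that are absent or empty in the mapping `env`."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    path = Path(".env")
    env = parse_env(path.read_text()) if path.exists() else {}
    missing = missing_vars(env)
    if missing:
        print("Missing or empty in .env:", ", ".join(missing))
    else:
        print("All required variables are set.")
```

Run it from the project root after editing .env. The parser only handles plain KEY=VALUE lines, so it is a sanity check rather than a full .env implementation.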
- Run the data loading script:
  python load_data.py
load_data.py pulls data from www.themoviedb.org and extracts page content and metadata used in the graph.
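To illustrate the kind of extraction described above, here is a minimal standard-library sketch that pulls the visible text and outgoing hyperlinks from an HTML string and keeps the links as metadata. It is not the code in load_data.py: the PageParser class, the to_document helper, the output dictionary shape, and the sample HTML are all hypothetical. In a graph vector store, links like these are what later let a search move from one document to related ones.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collect visible text and href targets from an HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def to_document(url, html):
    """Return page content plus link metadata (hypothetical document shape)."""
    parser = PageParser(url)
    parser.feed(html)
    parser.close()
    return {
        "page_content": " ".join(parser.text_parts),
        "metadata": {"source": url, "links": sorted(set(parser.links))},
    }


# Made-up sample page; example.invalid is a placeholder host.
sample_html = """
<html><body>
  <h1>Example Movie</h1>
  <p>Directed by <a href="/person/1">Some Director</a>.</p>
  <a href="/movie/2">Related movie</a>
  <script>ignored()</script>
</body></html>
"""
doc = to_document("https://example.invalid/movie/1", sample_html)
print(doc["page_content"])
print(doc["metadata"]["links"])
```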
- Run the main script:
  python app.py
app.py displays a Dash-based UI that allows a real-time comparison between similarity-based and traversal-based searches using graph RAG.
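The difference between the two search modes can be sketched with a toy in-memory example. This is a simplified illustration, not the code in app.py and not how CassandraGraphVectorStore is implemented: the three-dimensional vectors, document ids, links, and function names below are invented, and a real store may rank or filter linked documents differently. Similarity search returns the k nearest documents; traversal search starts from those and also follows their links up to a given depth.

```python
import math

# Toy corpus: id -> (embedding, ids of linked documents). All values are invented.
DOCS = {
    "movie_a": ([1.0, 0.0, 0.0], ["director_x"]),
    "movie_b": ([0.9, 0.1, 0.0], []),
    "director_x": ([0.0, 1.0, 0.0], ["movie_c"]),
    "movie_c": ([0.0, 0.0, 1.0], []),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_search(query, k=2):
    """Return the ids of the k documents closest to the query vector."""
    ranked = sorted(DOCS, key=lambda d: cosine(query, DOCS[d][0]), reverse=True)
    return ranked[:k]


def traversal_search(query, k=1, depth=1):
    """Start from the top-k similar documents, then follow links up to `depth` hops."""
    found = similarity_search(query, k)
    frontier = list(found)
    for _ in range(depth):
        next_frontier = []
        for doc_id in frontier:
            for linked in DOCS[doc_id][1]:
                if linked not in found:
                    found.append(linked)
                    next_frontier.append(linked)
        frontier = next_frontier
    return found


query = [1.0, 0.05, 0.0]
similar = similarity_search(query, k=2)
traversed = traversal_search(query, k=1, depth=2)
print("similarity:", similar)
print("traversal: ", traversed)
```

In this toy data, the traversal result includes documents whose vectors are far from the query but that are reachable through links, which is the kind of contrast the UI is meant to show.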
This project is licensed under the MIT License.