From e1fb25df4383eb7fcd70a8fd133510cec44e7351 Mon Sep 17 00:00:00 2001
From: Ayush Thakur
Date: Tue, 23 Apr 2024 22:02:58 +0530
Subject: [PATCH] update readme with what's new

---
 README.md | 56 ++++++++++++++++++++++++-------------------------
 1 file changed, 24 insertions(+), 32 deletions(-)

diff --git a/README.md b/README.md
index 904a5fe..5d2fd2c 100644
--- a/README.md
+++ b/README.md
@@ -4,10 +4,33 @@ Wandbot is a question-answering bot designed specifically for Weights & Biases [
 Leveraging the power of [llama-index](https://gpt-index.readthedocs.io/en/stable/) and OpenAI's [gpt-4](https://openai.com/research/gpt-4), it provides precise and context-aware responses using a combination of [FAISS](https://github.com/facebookresearch/faiss) for RAG and OpenAI's [gpt-4](https://openai.com/research/gpt-4) for generating responses.
 
+## What's New
+
+### wandbot v1.3.0
+
+This release introduces a number of updates and improvements:
+
+- **Parallel LLM Calls**: Replaced llama-index with the LangChain Expression Language (LCEL), enabling parallel LLM calls for increased efficiency.
+- **ChromaDB Integration**: Transitioned from FAISS to ChromaDB to leverage metadata filtering and improve retrieval speed.
+- **Query Enhancer Optimization**: Improved the query enhancer to operate with a single LLM call.
+- **Modular RAG Pipeline**: Split the RAG pipeline into three distinct modules: query enhancement, retrieval, and response synthesis, for improved clarity and maintainability.
+- **Parent Document Retrieval**: Introduced parent document retrieval within the retrieval module to provide richer context.
+- **Sub-query Answering**: Added sub-query answering in the response synthesis module to handle complex queries more effectively.
+- **API Restructuring**: Redesigned the API into separate routers for retrieval, database, and chat operations.
+
+These updates are part of our ongoing commitment to improving performance and usability.
+
+## Evaluation
+
+| wandbot version | Comment | Response accuracy |
+|---|---|---|
+| 1.0.0 | our baseline wandbot | 53.78 % |
+| 1.1.0 | improvement over baseline; in production for the longest time | 72.45 % |
+| 1.3.0 | our new enhanced wandbot | 81.63 % |
 
 ## Features
 
-- Wandbot employs Retrieval Augmented Generation with a [FAISS](https://github.com/facebookresearch/faiss) backend, ensuring efficient and accurate responses to user queries by retrieving relevant documents.
+- Wandbot employs Retrieval Augmented Generation with a ChromaDB backend, ensuring efficient and accurate responses to user queries by retrieving relevant documents.
 - It features periodic data ingestion and report generation, contributing to the bot's continuous improvement. You can view the latest data ingestion report [here](https://wandb.ai/wandbot/wandbot-dev/reportlist).
 - The bot is integrated with Discord and Slack, facilitating seamless integration with these popular collaboration platforms.
 - Performance monitoring and continuous improvement are made possible through logging and analysis with Weights & Biases Tables. Visit the workspace for more details [here](https://wandb.ai/wandbot/wandbot_public).
@@ -78,37 +101,6 @@ For more detailed instructions on installing and running the bot, please refer t
 Executing these commands will launch the API, Slackbot, and Discord bot applications, enabling you to interact with the bot and ask questions related to the Weights & Biases documentation.
 
-## Evaluation
-
-We evaluated the performance of the Q&A bot manually and using auto eval strategies. The following W&B reports document the steps taken to evaluate the Q&A bot:
-
-- [How to evaluate an LLM Part 1: Building an Evaluation Dataset for our LLM System](http://wandb.me/wandbot-eval-part1): The report dives into the steps taken to build a gold-standard evaluation set.
-- [How to evaluate an LLM Part 2: Manual Evaluation of our LLM System](http://wandb.me/wandbot-eval-part2): The report talks about the thought process and steps taken to perform manual evaluation.
-- [How to evaluate an LLM Part 3: Auto-Evaluation; LLMs evaluating LLMs](http://wandb.me/wandbot-eval-part3): Various LLM auto-eval startegies are documented in this report.
-
-### Evaluation Results
-
-**Manual Evaluation**
-
-We manually evaluated the Q&A bot's responses to establish a basline score.
-
-| Evaluation Metric | Comment | Score |
-|---|---|---|
-| Accurary | measure the correctness of Q&A bot responses | 66.67 % |
-| URL Hallucination | measure the validity and relevancy of the links | 10.61 % |
-| Query Relevancy | measure if the query is relevant to W&B | 88.64 % |
-
-**Auto Evaluation (LLM evaluate LLM)**
-
-We employed a few auto evaluation strategies to speed up the iteration process of the bot's development
-
-| Evaluation Metric | Comment | Score |
-|---|---|---|
-| Faithfulness Accuracy | measures if the response from a RAG pipeline matches any retrieved chunk | 53.78 % |
-| Relevancy Accuracy | measures is the generated response is in-line with the context | 61.36 % |
-| Hit Rate | measures if the correct chunk is present in the retrieved chunks | 0.79 |
-| Mean Reciprocal Ranking (MRR) | measures the quality of the retriever | 0.74 |
-
 ## Overview of the Implementation
 
 1. Creating Document Embeddings with ChromaDB
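
The patch above describes splitting the RAG pipeline into three modules (query enhancement, retrieval, response synthesis) with parallel sub-query handling. A minimal, dependency-free sketch of that control flow is shown below; wandbot's real implementation uses LCEL and ChromaDB, so every function body and name here is a hypothetical stand-in, not the project's actual API.

```python
# Illustrative sketch only: wandbot composes these stages with LangChain's
# LCEL; plain Python stands in here for the three modules from the release
# notes. All names and bodies are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor

def enhance_query(query: str) -> dict:
    # Module 1: query enhancement in a single call, producing sub-queries.
    return {"query": query, "sub_queries": [query, f"docs for: {query}"]}

def retrieve(sub_query: str) -> list[str]:
    # Module 2: placeholder for vector-store retrieval (ChromaDB in wandbot,
    # where metadata filtering would narrow the search).
    return [f"chunk about {sub_query}"]

def synthesize(query: str, contexts: list[list[str]]) -> str:
    # Module 3: placeholder for response synthesis over sub-query answers.
    flat = [chunk for ctx in contexts for chunk in ctx]
    return f"Answer to {query!r} using {len(flat)} chunks"

def answer(query: str) -> str:
    enhanced = enhance_query(query)
    # Sub-queries are handled concurrently, mirroring the parallel calls
    # the release notes attribute to the LCEL rewrite.
    with ThreadPoolExecutor() as pool:
        contexts = list(pool.map(retrieve, enhanced["sub_queries"]))
    return synthesize(enhanced["query"], contexts)

print(answer("how do I log a wandb artifact?"))
```

The point of the modular split is visible in the seams: each stage takes and returns plain data, so any one module (e.g. the retriever backend) can be swapped without touching the other two.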