Feat v1.1 Updates to chat client, retrieval, and evaluations #54

parambharat · 2024-01-08T09:16:05Z

Overview

This PR introduces enhancements and fixes to the chat client, focusing on query handling, You.com retrieval, and various formatting and compatibility improvements.

Key Enhancements

Query Enhancements: Added a query enhancer to classify query intents, and identify keywords and sub-queries from a user query.
Chat History Retrieval: Implemented a feature for more efficient chat history retrieval.
Compatibility and Formatting Fixes: Addressed issues related to Pydantic v1 compatibility, JSON formatting errors, and Slack message formatting.

Additional Features

Improved chat prompting method.
Separated the query handler for better modularity.
Enhanced ingestion pipeline and evaluation processes.

src/wandbot/api/schemas.py

morganmcg1 · 2024-01-09T13:55:21Z

src/wandbot/api/schemas.py

+    query: str
+    language: str = "en"
+    initial_k: int = 50
+    top_k: int = 5


Maybe add tags here to do filtering on the app side, after retrieval but before the response

Added `include_tags` and `exclude_tags` keys for both inclusions and exclusions
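For readers following this thread, here is a minimal sketch of what app-side tag filtering could look like, applied after retrieval and before the response is built. The class name `RetrievalRequest`, the helper `filter_by_tags`, the `tags` key on each result, and the "match any included tag" semantics are all assumptions for illustration; only the field names `query`, `language`, `initial_k`, `top_k`, `include_tags` and `exclude_tags` come from the PR. A dataclass stands in for the actual schema class to keep the sketch dependency-free.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RetrievalRequest:
    """Hypothetical stand-in for the request schema in schemas.py."""

    query: str
    language: str = "en"
    initial_k: int = 50
    top_k: int = 5
    include_tags: List[str] = field(default_factory=list)
    exclude_tags: List[str] = field(default_factory=list)


def filter_by_tags(
    results: List[Dict], include_tags: List[str], exclude_tags: List[str]
) -> List[Dict]:
    """Keep results that carry at least one included tag and no excluded tag.

    An empty include list means "no inclusion constraint" (an assumption).
    """
    kept = []
    for result in results:
        tags = set(result.get("tags", []))
        if include_tags and not tags.intersection(include_tags):
            continue  # none of the requested tags are present
        if tags.intersection(exclude_tags):
            continue  # at least one excluded tag is present
        kept.append(result)
    return kept


request = RetrievalRequest(
    query="how do I log a table?",
    include_tags=["docs"],
    exclude_tags=["deprecated"],
)
retrieved = [
    {"id": 1, "tags": ["docs"]},
    {"id": 2, "tags": ["docs", "deprecated"]},
    {"id": 3, "tags": ["forum"]},
]
filtered = filter_by_tags(retrieved, request.include_tags, request.exclude_tags)
```

Filtering after retrieval keeps the index untouched, at the cost of possibly returning fewer than `top_k` results when many candidates are filtered out.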

src/wandbot/apps/slack/__main__.py

morganmcg1 · 2024-01-09T13:58:21Z

src/wandbot/apps/slack/formatter.py

+import regex as re
+
+
+class MrkdwnFormatter:


Not a typo, Mrkdwn is actually a formatting language used by Slack
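To illustrate how mrkdwn differs from standard Markdown: it uses single asterisks for bold, `<url|label>` for links, and has no header syntax, which is why a later commit in this PR replaces headers with bold text. The sketch below is not the PR's `MrkdwnFormatter`. The function name `to_mrkdwn` is hypothetical, it uses the standard-library `re` module rather than the `regex` package imported in the diff, and it handles only these three constructs.

```python
import re


def to_mrkdwn(text: str) -> str:
    """Convert a small subset of Markdown to Slack mrkdwn (illustrative only)."""
    # [label](url) -> <url|label>
    text = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"<\2|\1>", text)
    # **bold** -> *bold*
    text = re.sub(r"\*\*(.+?)\*\*", r"*\1*", text)
    # "# Header" lines -> bold text, since mrkdwn has no headers
    text = re.sub(r"^#{1,6}\s+(.+)$", r"*\1*", text, flags=re.MULTILINE)
    return text


converted = to_mrkdwn(
    "# Setup\nRead the **quickstart** in the [docs](https://docs.wandb.ai)"
)
```

A production formatter would also need to leave code blocks untouched and escape `&`, `<` and `>`, which this sketch does not attempt.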

src/wandbot/apps/slack/formatter.py

morganmcg1 · 2024-01-10T15:16:58Z

src/wandbot/chat/query_enhancer.py

+
+class MultiLabel(BaseModel):
+    label: Labels = Field(..., description="The label for the query")
+    reasoning: str = Field(


I think reasoning should come before label to allow the model to think first
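A sketch of the suggested reordering, under stated assumptions. In the PR `MultiLabel` is a Pydantic `BaseModel`; a dataclass is used here only to keep the example dependency-free (Pydantic models also keep fields in declaration order). The enum values and the helper `field_order` are hypothetical. The premise, taken from the comment above, is that a model filling in the structure field by field writes its reasoning before committing to a label.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import List


class Labels(str, Enum):
    # Member names appear in the PR; the string values are assumed.
    BEST_PRACTICES = "best_practices"
    COURSE_RELATED = "course_related"


@dataclass
class MultiLabel:
    # Declared first so the explanation is produced before the label.
    reasoning: str
    label: Labels


def field_order(cls) -> List[str]:
    """Return field names in declaration order."""
    return [f.name for f in fields(cls)]


order = field_order(MultiLabel)
example = MultiLabel(
    reasoning="The user asks how to structure sweeps across projects.",
    label=Labels.BEST_PRACTICES,
)
```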

morganmcg1 · 2024-01-10T15:20:04Z

src/wandbot/chat/query_enhancer.py

+    "support",
+    Labels.BEST_PRACTICES.value: "The query is related to best practices for using Weights & Biases. Answer the query "
+    "and provide guidance where necessary",
+    Labels.COURSE_RELATED.value: "The query is related to a Weight & Biases course and/or skill enhancement. Answer "


maybe specify that the W&B courses are "Machine Learning" and "AI" courses

morganmcg1 · 2024-01-10T15:24:23Z

src/wandbot/chat/query_enhancer.py

+                {
+                    "role": "system",
+                    "content": (
+                        "You are a Weights & Biases support manager. Your goal is to enhance the user query by adding "


"Your goal is to enhance...", maybe something a little more specific related to what the end goal is

"Your goal is to refine, clarify and augment the user query before it is passed to another AI support assistant. Please add the following information to the query: ...."

morganmcg1 · 2024-01-10T15:25:56Z

src/wandbot/chat/query_enhancer.py

+        return enhanced_query
+
+
+class QueryHandlerConfig(BaseSettings):


should this live in a separate config file?

morganmcg1 · 2024-01-10T15:27:47Z

src/wandbot/chat/retriever.py

+        return all_nodes
+
+
+class RetrieverConfig(BaseSettings):


should this be in a separate config?

morganmcg1 · 2024-01-10T15:42:28Z

src/wandbot/evaluation/eval/factfulness.py

+Your job is to judge the factful consistency of the generated answer with respect to the document.
+- An answer is considered factually consistent if it contents can be inferred solely from the provided documentation.
+- if an answer contains true information, if the information is not found in the document, then the answer is factually inconsistent.
+- The generated answer must provide only correct information according to the documentation.


"correct information" feels ambiguous, maybe clarify to something more like "provide only information found in the documentation"

morganmcg1 · 2024-01-10T16:00:11Z

src/wandbot/evaluation/eval/factfulness.py

+    safe_parse_eval_response,
+)
+
+SYSTEM_TEMPLATE = """You are a Weight & Biases support expert tasked with evaluating the factful consistency of answers to questions asked by users to a technical support chatbot.


in general, should the system templates be broken out into a separate file(s)?

morganmcg1 · 2024-01-10T16:00:37Z

src/wandbot/evaluation/eval/relevancy.py

+{{
+    "reason": <<Provide a brief explanation for your decision here>>,
+    "score": <<Provide a score as per the above guidelines>>,
+    "decision": <<Provide your final decision here, either 'relevant', or 'irrelevant'>>


same decision comment as above

morganmcg1 · 2024-01-10T16:01:58Z

src/wandbot/evaluation/eval/utils.py

+    except Exception as e:
+        print(e)
+        print(eval_response)
+        score = 0


maybe set to -1 just so it's extremely obvious this isn't a valid score returned by the LLM?
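A minimal sketch of the sentinel idea, assuming the evaluator returns a JSON object with an integer `score` key. The names `INVALID_SCORE`, `parse_eval_score` and `mean_valid_score` are hypothetical and are not the PR's `safe_parse_eval_response`. The point is that a sentinel of -1 can be excluded from aggregates, whereas a fallback of 0 is indistinguishable from a genuinely low score.

```python
import json
from statistics import mean
from typing import List, Optional

INVALID_SCORE = -1  # sentinel: parsing failed, not a score from the LLM


def parse_eval_score(eval_response: str) -> int:
    """Return the integer score, or INVALID_SCORE if the response is unusable."""
    try:
        return int(json.loads(eval_response)["score"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return INVALID_SCORE


def mean_valid_score(scores: List[int]) -> Optional[float]:
    """Average only the scores that were actually parsed."""
    valid = [s for s in scores if s != INVALID_SCORE]
    return mean(valid) if valid else None


scores = [
    parse_eval_score('{"score": 2, "decision": "relevant"}'),
    parse_eval_score("not json at all"),
    parse_eval_score('{"score": 3}'),
]
average = mean_valid_score(scores)
```

Catching specific exceptions instead of a bare `Exception` also keeps unrelated bugs from being silently recorded as failed parses.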

morganmcg1 · 2024-01-10T16:05:31Z

src/wandbot/ingestion/prepare_data.py

+            spec = row_dict["spec"]
+            content = json.loads(spec)
+            markdown_content = self.parse_content(content)
+            output["content"] = (


i think it'd be useful to put the title ("display_name") and description in as their own standalone kv pairs, it could be a useful bit of metadata to have
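One way this suggestion could look, as a sketch only. The function name `build_report_record`, the assumption that `display_name` and `description` are keys on `row_dict`, and the output key names are illustrative and not confirmed by the diff. The PR's `parse_content` step is replaced here by a plain JSON dump as a stand-in.

```python
import json
from typing import Any, Dict


def build_report_record(row_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Build an output record that keeps title and description as metadata."""
    content = json.loads(row_dict["spec"])
    return {
        # Stand-in for the markdown produced by the real parsing step.
        "content": json.dumps(content),
        # Standalone key-value pairs, usable later for filtering or display.
        "title": row_dict.get("display_name", ""),
        "description": row_dict.get("description", ""),
    }


record = build_report_record(
    {
        "spec": '{"blocks": []}',
        "display_name": "Fine-tuning report",
        "description": "Notes on a fine-tuning run",
    }
)
```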

morganmcg1 · 2024-02-01T12:25:29Z

lgtm!

Bharat Ramanathan and others added 22 commits December 15, 2023 09:55

feat: add initial evaluation code

3469d33

feat: add more updates and caching to eval process

5f740b3

feat: add additional few-shot result example for reasoning about scoring.

c3f84e3

feat: add retrieval endpoint, fix language filters and add caching in evals

277c79d

feat: add hybrid index retrieval with bm25 and you.com results

bf19259

feat: use litellm instead of openai to allow cohere and anthropic models

55b0c4f

feat: add reports dataloader

02aa6da

feat: add improve ingestion pipeline with better parsing and metadata

4ebfd25

feat: add new query handler and chat prompt

1c32d78

feat: separate out the query handler and retriever from the chat interface

fb0f06c

chore: run linters and formatters

c37ccc5

feat: update prompting method with a better templated prompt.

6b52990

fix: json formatting errors and issues

0c869fc

fix: json formatting errors and issues

94a0e5b

chore: run formatters and linters

c71f4d5

feat: fix chat prompt logging in streamtable

d114363

fix: slack message formatting to mrkdwn

60ded81

fix: manually implement partial formatting for placeholder

86d38a7

fix: pydanticv1 compatibility for ResolvedQuery with llama-index

16b83c1

feat: add condensed chat history retriever.

6a746fc

feat: add query enhancer to the chat client with keywords and sub queries

915f440

chore: run formatters and linters

870723f

parambharat requested review from kldarek, morganmcg1, ayulockin and ash0ts January 8, 2024 09:16

fix: timezone issues in timer util

ac5ce0f

morganmcg1 reviewed Jan 10, 2024

parambharat added 2 commits January 11, 2024 11:18

feat: improve fc reports ingestion pipeline

4c68876

feat: change chunk size of markdown and code documents

912576b

parambharat and others added 26 commits January 22, 2024 22:33

Merge branch 'feat/v1.1' of github.com:wandb/wandbot into feat/v1.1

e14fab1

fix: switch to token text splitter for larger chunks

a6ea55e

feat: add wandb edu code loader

28a9404

fix: errors in code parsing logic

c6a12ab

fix: improve retrieval speed by loading retrievers at startup.

34d87d5

add dev dependencies to pyproject.toml

f3965dd

update tags logic for FC Reports

1ebf69b

add fasttext model artifact and add fc tag logic

c9229f5

fix model_ naming to keep pydantic happy

5a30dc8

fix fasttext path

f4bc728

fix langdetect fasttext model loading

5f38fc6

exclude dev dependencies in build file

eac6f21

new build-dev.sh file to install all dependencies, including dev dependencies

30c9083

add no-result dummy node creation if needed

b403be4

update licence field

20070e9

copy pyproject that works in deployment

e3d733c

re-adds zendesk app to run.sh

ee325a0

fix: replace headers in slack formatter with bold text

386c576

refactor: move FastText language detection to common utils

b04c663

feat: switch from cohere lang detect to fasttext langdetect

1ea8cdb

chore: install fasttext using pip instead of poetry and update lock file

0b7983d

chore: add fasttext to dev install in addition to prod install

6b3ed14

feat: switch to new turbo models

0d1fa36

revert: switch to older models for v1.1 before testing for v1.2

acefd29

chore: run linting and formatting changes

fe5b6b5

Merge branch 'main' of github.com:wandb/wandbot into feat/v1.1

48140fb

parambharat requested a review from ArtsiomWB February 1, 2024 07:05

morganmcg1 approved these changes Feb 1, 2024

parambharat merged commit 7a2baf3 into main Feb 12, 2024
4 checks passed
