Pinecone integration #189

iuliadmtru · 2024-08-06T18:21:01Z

This is an attempt to integrate Pinecone into PromptingTools. The preparation and retrieval steps are not necessary with Pinecone. The result of the Pinecone query serves as context in the RAGResult.

Questions

The new types have a PT prefix such as to not cause confusion with Pinecone.jl (there's already a PineconeIndex there, which is a different thing). That is not great.
I added a NoChunker type such that PTPineconeIndexer can still have a chunker. Is that best? Probably not.
I think the dispatch of find_tags might not be necessary.
Is it necessary to have a new abstract type (AbstractPTPineconeIndex)? Probably not.
I also added a new template, JuliaRAGAssistant, and I put that under persona-task. I'm not sure that's the best place.

svilupp · 2024-08-06T20:58:12Z

Thanks for the PR! I'll review it properly later.

The first reaction was that we won't need a custom index (to avoid duplication), but you might be right that it's the easiest way.

Questions for you:

do you keep your document chunks or use the Pinecone to maintain them? (I think so)
do you embed your documents with OpenAI or use Pinecone?
do you rerank documents inside Pinecone (I think not?)

(some notes for me)
I'll check for the following functionality

upload documents + upload embeddings (index composition/some postprocessing of ChunkIndex?)
getting embeddings (get_embeddings) if we use the embedding by Pinecone
find closest docs (find_closest if we only return chunk positions, maybe we could hack documents retrieved in a special CandidateChunks)
getting document chunks (either chunks of subindex or maybe use getindex())

iuliadmtru · 2024-08-07T14:31:49Z

do you keep your document chunks or use the Pinecone to maintain them?

do you embed your documents with OpenAI or use Pinecone?

The chunks and embeddings get uploaded to Pinecone and then Pinecone takes care of sorting/clustering and all that when retrieving.

do you rerank documents inside Pinecone

Well, no, but it's possible I think: https://docs.pinecone.io/guides/inference/rerank

svilupp · 2024-08-08T07:38:47Z

templates/persona-task/JuliaRAGAssistant.json

+      "_type": "metadatamessage"
+    },
+    {
+      "content": "Act as a world-class Julia language programmer with access to the latest Julia-related knowledge via Context Information. \n\n**Instructions:**\n- Answer the question based only on the provided Context.\n- Be precise and answer only when you're confident in the high quality of your answer.\n- Be brief and concise.\n\n**Context Information:**\n---\n{{context}}\n---\n",


It might be further down the line but I'd recommend adding stronger wording about refusing to answer when the context is bad / wrong to avoid weak answer. Knowing when to refuse is a key for hallucination minimization

I think this should be removed from here altogether. It has nothing to do with Pinecone.

svilupp · 2024-08-08T07:39:13Z

Project.toml

@@ -7,15 +7,18 @@ version = "0.45.0"
 AbstractTrees = "1520ce14-60c1-5f80-bbc7-55ef81b5835c"
 Base64 = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
 Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
+DotEnv = "4dc1fcf4-5e3b-5448-94ab-0c38ec0385c1"


I know this is just a Draft, but these deps would have to go out and Pinecone would be an extension

src/Experimental/RAGTools/generation.jl

src/Experimental/RAGTools/preparation.jl

src/Experimental/RAGTools/retrieval.jl

src/Experimental/RAGTools/types.jl

svilupp · 2024-08-08T08:36:21Z

Apologies for scattering comments all over!

One thing, Pinecone should be relevant for "retrieve" step. It should need very little changed for generate! step (only build_context needs work because we build overlap/window extension).
I think it would be good to step back and decide on what do you want to be handled by PT.

Eg, If you say:

we want to chunk documents externally (not sure why)
we don't want to build index in PT (ie, chunk, embed, upload)
we DO want to manage embeddings by PT (risking inconsistency/bad results otherwise)
we DON't want to use any tag/tag filtering (predicate pushdown)
we DON't want reranking

Option 1) The the simplest path should be to have a lightweight PineconeIndex + method for retrieve (manually writing everything) + method for build_context (to avoid creating the overlaps which are not possible with 1:1 retrieval).

Option 2) (this also promotes more re-use and optional addition of other methods as the RAG paradigm progresses)
However, the retrieve would be so butchered that it might be better to:

define No-op for get_tags, find_tags (we have that so just extend), rerank
add methods for get_embeddings, find_closest (-> uses Pinecone.query), build_context

Option 3) (I would recommend this)
I'd keep the current paradigm: the index (stores chunks + embedding/ remotely) + candidatechunks (stores query results/or its reference).
The difference would be that we create a custom candidatechunks object that will hold the chunks, metadata, sources (not the "references" to index)
We create support for chunks(index,candidatechunks) or we continue the getindex with index[candidatechunks,:chunks].
This is slightly more work/thinking upfront, but then you can re-use all existing methods for embeddings, reranking, build_context, etc. the benefit is that to improve your RAG, you can just try different methods without having to "tweak" each one of them first.

The last option requires also some methods for find_closest (your "query"). And a few extensions to support the No-Op with new Index (eg, in find_tags).

What do you think?

iuliadmtru · 2024-08-08T11:51:10Z

Sorry for butchering your API! I didn't mean to actually hit click on creating this PR, it is definitely far from "ready"!

Thanks for all the comments. I would go for your recommendation (option 3), but I am still unsure how to go about that. @pankgeorg what do you say?

iuliadmtru · 2024-08-15T18:15:45Z

I integrated all feedback, I hope. There's still some TODOs left and some choices left to make, as we discussed. I'll try to include everything below:

Comments

The view should be properly implemented. Probably this will be done by adding a new struct, SubManagedIndex, similar to SubChunkIndex.
Names are still a bit weird, I just added them like you first suggested @svilupp cause I couldn't find something better (AbstractCandidateWithChunks).
Right now, build_index is just calling the PineconeIndex constructor. This could be an option for creating an Index (just calling the constructor); another option could be handling chunking and embedding as well, and adding them to Pinecone. Maybe both would be best?

svilupp · 2024-08-16T07:16:28Z

src/Experimental/RAGTools/types.jl

+indexid(cc::CandidateWithChunks) = cc.index_id
+positions(cc::CandidateWithChunks) = cc.positions
+scores(cc::CandidateWithChunks) = cc.scores
+chunks(cc::CandidateWithChunks) = cc.chunks


I don't think we will need these accessors but we can clean it up later

src/Experimental/RAGTools/preparation.jl

svilupp

All looks good so far! We can clean it up later.

As you pointed out, the build_index and the SubManagedIndex are index (for the view op).

It would be amazing to add a minimal example (you can grab some example documents from the docstrings of airag) to show how it works end-to-end

iuliadmtru · 2024-08-26T12:33:16Z

TODOs:

Add chunking, embedding etc in build_index
Add SubManagedIndex + view implementation
Write docs properly
Add examples in the docs
Clean up unnecessary methods
Clean up dependencies

iuliadmtru · 2024-09-03T17:30:30Z

I got to the cleaning up point. There are also a bunch of TODOs scattered around, minor aspects that need fixing. I will do the end-to-end example once those are finished.

@svilupp would you mind having another look at this, please?

svilupp

Looks good so far! A few minor questions around the Pinecone index and its fields.

Btw do you have like a minimal running example? I have a feeling this wouldnt work for airag

Project.toml

src/Experimental/RAGTools/generation.jl

svilupp · 2024-09-05T09:21:57Z

src/Experimental/RAGTools/generation.jl

+
+Dispatch for `AbstractManagedIndex`.
+"""
+function answer!(


do we need a separate method? I believe the index isnt used so maybe we just need to allow the new type in the definition?

I think there was an ambiguity issue with this, that's why I added a different function instead of providing the index type as a Union.

src/Experimental/RAGTools/preparation.jl

src/Experimental/RAGTools/retrieval.jl

svilupp · 2024-09-05T09:26:48Z

src/Experimental/RAGTools/types.jl

+end
+HasKeywords(::PineconeIndex) = false
+HasEmbeddings(::PineconeIndex) = true
+embeddings(index::PineconeIndex) = index.embeddings


should we have these accessors and fields if we know that the data isn’t there? we hold the data in the individual candidatechunk objects

If these are removed, then getting the embeddings in retrieve should be changed.

iuliadmtru · 2024-09-27T09:09:32Z

I integrated your feedback @svilupp. I hope I understood your comments well enough. I left some TODOs and also added a MWE for airag.

iuliadmtru force-pushed the pinecone branch 2 times, most recently from 27729fc to f6c01c8 Compare August 6, 2024 18:53

svilupp reviewed Aug 8, 2024

View reviewed changes

svilupp reviewed Aug 16, 2024

View reviewed changes

src/Experimental/RAGTools/preparation.jl Outdated Show resolved Hide resolved

svilupp reviewed Aug 16, 2024

View reviewed changes

src/Experimental/RAGTools/preparation.jl Show resolved Hide resolved

svilupp reviewed Aug 16, 2024

View reviewed changes

iuliadmtru added 4 commits August 27, 2024 10:16

Integrate Pinecone on all levels

5795913

Integrate feedback; change structs and general workflow

ef778d1

Fix ambiguities

87c56b3

Add chunking, embedding and tagging in build_index

1c4e0f5

iuliadmtru force-pushed the pinecone branch from 29b9335 to 1c4e0f5 Compare August 27, 2024 08:19

Add SubManagedIndex and view

611716e

iuliadmtru force-pushed the pinecone branch from cbd9fc1 to 611716e Compare September 1, 2024 13:55

Add documentation

b027d05

iuliadmtru force-pushed the pinecone branch from ac02b82 to b027d05 Compare September 3, 2024 17:27

svilupp reviewed Sep 5, 2024

View reviewed changes

Integrate feedback; add Pinecone example to airag

3334a48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pinecone integration #189

Pinecone integration #189

iuliadmtru commented Aug 6, 2024 •

edited

Loading

svilupp commented Aug 6, 2024

iuliadmtru commented Aug 7, 2024

svilupp Aug 8, 2024

iuliadmtru Sep 27, 2024

svilupp Aug 8, 2024

svilupp commented Aug 8, 2024

iuliadmtru commented Aug 8, 2024

iuliadmtru commented Aug 15, 2024 •

edited

Loading

svilupp Aug 16, 2024

svilupp left a comment

iuliadmtru commented Aug 26, 2024 •

edited

Loading

iuliadmtru commented Sep 3, 2024

svilupp left a comment

svilupp Sep 5, 2024

iuliadmtru Sep 27, 2024

svilupp Sep 5, 2024

iuliadmtru Sep 27, 2024

iuliadmtru commented Sep 27, 2024

Pinecone integration #189

Are you sure you want to change the base?

Pinecone integration #189

Conversation

iuliadmtru commented Aug 6, 2024 • edited Loading

svilupp commented Aug 6, 2024

iuliadmtru commented Aug 7, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

svilupp commented Aug 8, 2024

iuliadmtru commented Aug 8, 2024

iuliadmtru commented Aug 15, 2024 • edited Loading

Choose a reason for hiding this comment

svilupp left a comment

Choose a reason for hiding this comment

iuliadmtru commented Aug 26, 2024 • edited Loading

iuliadmtru commented Sep 3, 2024

svilupp left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

iuliadmtru commented Sep 27, 2024

iuliadmtru commented Aug 6, 2024 •

edited

Loading

iuliadmtru commented Aug 15, 2024 •

edited

Loading

iuliadmtru commented Aug 26, 2024 •

edited

Loading