
[Question] Restricting prior internal knowledge for RAG #32

Open
JasonIsaac opened this issue Apr 26, 2024 · 4 comments


Hello Everyone,

I am building a RAG application with the model Mistral-7B-Instruct-v0.2. It works well for questions related to the knowledge content, but I want to prevent the LLM from answering any out-of-scope questions. This is the current prompt I am using:

You are an assistant to users who have queries about {topic} .
Only USE the following pieces of contexts under **Contexts** to answer. Do not use any external knowledge or information. If the answer cannot be determined from the context, respond with "I don't have enough information to answer that"

**Contexts**:

{contexts}

What is an effective way to stop the LLM from answering out-of-scope questions?
Can we restrict it with a prompt like the one above, or is there a better way?

Thanks in Advance
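
For illustration, a minimal sketch (with hypothetical names) of how the template above could be filled in Python, assuming the retrieved context chunks arrive as a list of strings:

# Minimal sketch (hypothetical names): fill the restriction template above
# with the topic and the retrieved context chunks.
PROMPT_TEMPLATE = (
    "You are an assistant to users who have queries about {topic}.\n"
    "Only USE the following pieces of contexts under **Contexts** to answer. "
    "Do not use any external knowledge or information. If the answer cannot be "
    "determined from the context, respond with "
    '"I don\'t have enough information to answer that"\n\n'
    "**Contexts**:\n\n{contexts}"
)

def build_prompt(topic: str, context_chunks: list[str], question: str) -> str:
    # Join the retrieved chunks with blank lines, as in the example above.
    system = PROMPT_TEMPLATE.format(topic=topic, contexts="\n\n".join(context_chunks))
    return f"{system}\n\n{question}"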


rsoika commented May 6, 2024

Hi @JasonIsaac, I am also working with Mistral-7B-Instruct-v0.2 and also have a lot of questions about the prompt format. But I can't find answers - and sorry, I can't answer your question either.

But maybe we can discuss the topic a little more? Do you use the chat template?

The problem for me is that this link seems to be the only source of documentation about how to define a prompt, and the documentation is very poor. It only tells us that the prompt format is very important and that <s> marks the beginning and </s> the end of a string. But it does not explain the difference between such a string and "regular strings". This makes it - at least for me - impossible to construct complex prompts.

I also got an answer in another discussion that newlines are very important inside a prompt. Are you using the line breaks in your example deliberately?
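
For what it's worth, a small sketch of the chat-template route with the transformers library: Mistral-7B-Instruct-v0.2 ships a chat_template that inserts the <s>, </s> and [INST] ... [/INST] markers for you (note that it expects strictly alternating user/assistant roles and has no separate system role):

# Sketch: let the tokenizer's built-in chat template insert <s>, </s> and
# [INST] ... [/INST] instead of hand-crafting the prompt string.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

messages = [
    {"role": "user", "content": "Only use the contexts below ... What is X?"},
    {"role": "assistant", "content": "X is ..."},
    {"role": "user", "content": "And what about Y?"},
]

# tokenize=False returns the final prompt string, roughly:
# "<s>[INST] Only use ... [/INST]X is ...</s>[INST] And what about Y? [/INST]"
prompt = tokenizer.apply_chat_template(messages, tokenize=False)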


JasonIsaac commented May 8, 2024

Hi @rsoika ,

Thanks for your reply.

Yes, after posting this question I came to know from a different discussion that we can use the Llama 2 format, which uses the <s></s> and [INST][/INST] tags. Currently my prompt looks like this:

<s>[INST] <<SYS>>
You are an assistant to users who have queries about {topic} .
Only USE the following pieces of contexts under **Contexts** to answer. Do not use any external knowledge or information. If the answer cannot be determined from the context, respond with "I don't have enough information to answer that"

**Contexts**:
{contexts}
<</SYS>>

{user_message_1} [/INST] {model_answer_1}</s>

With the above prompt, the model is able to:

  • Generate responses from the contexts
  • Maintain the chat history

Restricting answers to out-of-scope questions is still an open problem.

Can you give an example of a complex prompt?

Yes, the line breaks are deliberate; I also read about that somewhere else.
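
For completeness, a sketch (with hypothetical names) of how this multi-turn format can be assembled so that the system block with the contexts appears only once, in the first turn:

# Sketch (hypothetical names): build the Llama-2-style multi-turn prompt.
# The <<SYS>> block with the contexts is only sent in the first turn;
# every completed turn is closed with </s> and a new <s>[INST] is opened.
def build_chat_prompt(system: str, history: list[tuple[str, str]], user_message: str) -> str:
    prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
    for past_user, past_answer in history:
        prompt += f"{past_user} [/INST] {past_answer}</s><s>[INST] "
    return prompt + f"{user_message} [/INST]"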


rsoika commented May 8, 2024

Hi @JasonIsaac, yes, your prompt template is very interesting too :-)

In the beginning I also started with the <<SYS>> tags, but then I removed them because I thought they were not supported by Mistral-7B...

Here is my current prompt, which I use to analyze/summarize the content of business documents (e.g. an invoice document):

<s>[INST] You are a clerk in a logistics company and your job is to check incoming invoices.[/INST]

<FILECONTEXT>^.+\.([pP][dD][fF])$</FILECONTEXT>

</s>
[INST] Briefly summarize this invoice document in the following 3 data blocks:

** Company Data **
List all relevant information about the company, like the company name, general tax information, the address and contact information.

** Payment Data **
List all the necessary information to be able to make the payment, like the invoice date and total amount, the bank information including IBAN, BIC or SWIFT code and also information about the payment terms.

** Invoice Items **
Create a table with the data of the invoice items or services listed in the invoice. This table can include the description, the quantity, price and tax information for each item. 

Don't calculate any amounts yourself, but only take amounts that actually appear in the invoice document!

[/INST]

The <FILECONTEXT> element is replaced by our application with the content of an OCR-parsed document - and this content is extremely unformatted, with lots of newlines, spaces and tabs. But this works very well now.
You can see that I close the <s> very early.
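
A sketch (with hypothetical names) of what such a substitution step could look like: the whole <FILECONTEXT>...</FILECONTEXT> element is swapped for the raw OCR text before the prompt is sent to the model.

# Sketch (hypothetical names): replace the <FILECONTEXT>...</FILECONTEXT>
# element with the raw OCR text before sending the prompt to the model.
import re

def inject_file_context(prompt_template: str, ocr_text: str) -> str:
    # Use a lambda so backslashes in the OCR text are not treated as
    # regex backreferences by re.sub.
    return re.sub(
        r"<FILECONTEXT>.*?</FILECONTEXT>",
        lambda _match: ocr_text,
        prompt_template,
        flags=re.DOTALL,
    )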

Have you seen this discussion? It brought up - at least for me - some new insights.


rsoika commented May 8, 2024

One more question: how long is your final prompt? In my case it can be more than 8 KB.
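
A sketch of a quick sanity check for prompts that long: count the tokens of the final string against the model's context window (Mistral-7B-Instruct-v0.2 advertises a 32k-token context), leaving some headroom for the answer.

# Sketch: count the tokens of the final prompt to make sure it still fits
# the model's context window, leaving headroom for the generated answer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

def fits_context(prompt: str, max_tokens: int = 32_768, reserve_for_answer: int = 1_024) -> bool:
    n_prompt_tokens = len(tokenizer.encode(prompt))
    return n_prompt_tokens + reserve_for_answer <= max_tokens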
