
AutoMemory Disables Prompt Caching #171

Open
alexey2baranov opened this issue Oct 2, 2024 · 0 comments
alexey2baranov commented Oct 2, 2024

Hi @frdel,

I found an interesting LLM feature, "Prompt Caching," which is supported by Anthropic and DeepSeek (and potentially other providers). You can find more details in the Prompt Caching Documentation.

Unfortunately, this helpful feature is effectively disabled by the current implementation of AutoMemory. AutoMemory injects a different set of memories into the automem block inside the System message on every call, so the prompt prefix changes each time and Prompt Caching never gets a hit, even when the rest of the (long) history is identical.
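For context, providers cache the longest exact prefix of a request, so any variation near the top of the prompt invalidates the cache for everything after it. Here is a minimal illustrative sketch of the current pattern (the function and variable names are mine, not the actual Agent Zero code):

```python
# Illustrative sketch only -- not the actual Agent Zero code.
# Because auto_memories differ between calls, the System message (the very
# first part of the prompt) changes every time, so the provider's prefix
# cache never matches.
def build_prompt_current(system_prompt: str,
                         auto_memories: list[str],
                         history: list[dict]) -> list[dict]:
    automem_block = "\n".join(auto_memories)  # different on every call
    system = f"{system_prompt}\n\n# Auto memories\n{automem_block}"
    return [{"role": "system", "content": system}, *history]
```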

Suggestion

I suggest moving the AutoMemory block out of the System message and appending it to the end of the message history as a separate message, for example:

System: 
User: 
Assistant: 
Tool output: 
Assistant: 
... 
... 
Tool output: 
Memory Suggestion: Here are relevant memories <><><>. Use memory_tool to recall the full memory content.
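A rough sketch of the proposed assembly (helper names are hypothetical, not existing Agent Zero code): the system prompt and the existing history form a byte-identical prefix across calls, and only the trailing memory-suggestion message changes.

```python
# Illustrative sketch only -- not the actual Agent Zero code.
def build_prompt_proposed(system_prompt: str,
                          auto_memories: list[str],
                          history: list[dict]) -> list[dict]:
    # Stable prefix: identical across calls, so it can be served from the
    # provider's prompt cache.
    messages = [{"role": "system", "content": system_prompt}, *history]

    # Only this final message varies between calls.
    memory_hint = (
        "Here are relevant memories: " + "; ".join(auto_memories)
        + ". Use memory_tool to recall the full memory content."
    )
    messages.append({"role": "user", "content": memory_hint})
    return messages
```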

Benefits

  • Cost reduction: with Prompt Caching enabled, the stable prefix (system prompt plus prior history) is billed at the much cheaper cached-token rate on every call (see the Anthropic sketch below).
  • Improved speed: cached prefixes are also processed faster, which reduces latency on long conversations.
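For Anthropic specifically, the stable prefix also has to be marked with an explicit cache breakpoint. A hedged sketch using the Messages API follows (parameter shapes are as described in the prompt caching docs; older SDK versions exposed this through a beta endpoint/header, so details may differ by version):

```python
import anthropic

client = anthropic.Anthropic()

static_system_prompt = "You are Agent Zero ..."   # identical across calls
history = [{"role": "user", "content": "..."}]     # prior conversation turns
memory_hint = {
    "role": "user",
    "content": "Here are relevant memories: ... Use memory_tool to recall the full memory content.",
}

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": static_system_prompt,
        # Marks the end of the cacheable prefix.
        "cache_control": {"type": "ephemeral"},
    }],
    messages=history + [memory_hint],
)
```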

Let me know if you'd like me to submit a PR with this implementation!
