Description
🚀 The feature
Current State:
mem0 currently only supports sending text-based messages as JSON. This limits the agent's "long-term memory" to conversation history.
Proposed Change:
Support document uploads (PDF, TXT, DOCX). This requires:
- Multipart/Form-Data Support: Switching from pure JSON payloads to MultipartBody. In Gemini requests, send blob content bytes instead of text content.
- Large File Handling: Given our recent experience with Nginx 413 errors, we must ensure the client can stream large files without loading them entirely into memory.
- Metadata Support: Ability to attach user_id, app_id, and custom tags to the uploaded file for filtered retrieval.
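A rough sketch of how the three points above could fit together on the client side, in Python using only the standard library. Everything named here is an assumption for illustration and not existing mem0 API: the `iter_multipart` helper, the `file` form field name, the 64 KiB chunk size, and the choice to send tags as a JSON-encoded field.

```python
import io
import json
import uuid
from typing import BinaryIO, Dict, Iterator

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune as needed


def iter_multipart(
    file_obj: BinaryIO,
    filename: str,
    content_type: str,
    fields: Dict[str, str],
    boundary: str,
    chunk_size: int = CHUNK_SIZE,
) -> Iterator[bytes]:
    """Yield a multipart/form-data body piece by piece (hypothetical helper).

    The file is read in fixed-size chunks, so it is never held in memory
    as a whole. Field names and values are not escaped in this sketch.
    """
    # Metadata goes first as plain form fields (e.g. user_id, app_id, tags).
    for name, value in fields.items():
        yield (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8")

    # Header of the file part; "file" is an assumed field name.
    yield (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")

    # Raw file bytes, one chunk at a time.
    while True:
        chunk = file_obj.read(chunk_size)
        if not chunk:
            break
        yield chunk

    # Closing boundary ends the body.
    yield f"\r\n--{boundary}--\r\n".encode("utf-8")


if __name__ == "__main__":
    boundary = uuid.uuid4().hex
    metadata = {
        "user_id": "alice",                      # example values only
        "app_id": "demo-app",
        "tags": json.dumps(["manual", "spec"]),  # tags as JSON: an assumption
    }
    fake_file = io.BytesIO(b"%PDF-1.7 example bytes")
    total = sum(
        len(part)
        for part in iter_multipart(
            fake_file, "manual.pdf", "application/pdf", metadata, boundary
        )
    )
    print(f"body size: {total} bytes")
```

A generator like this can be passed as `data=` to `requests.post` together with a `Content-Type: multipart/form-data; boundary=...` header, in which case the body is sent with chunked transfer encoding. Streaming on the client only keeps memory use flat; the Nginx limit behind the 413 (`client_max_body_size`) would still need to be raised on the server side.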
Motivation, pitch
It’s time to take our Agent’s intelligence to the next level. 🧠
Right now, we’ve successfully taught our agent to remember 'what was said.' That’s a great start, but to build a truly world-class AI, we need it to know 'what is written.' We are currently bottlenecked by small text snippets, like trying to teach someone physics through a series of text messages.
By implementing File Ingestion, we are unlocking the ability to 'feed' our system entire libraries of knowledge. We're talking about technical specs, full-scale manuals, and proprietary datasets that will transform our agent from a simple conversationalist into a high-level subject matter expert.
Yes, we’ve faced some '413 Request Entity Too Large' hurdles recently, but that’s just a sign that we’ve outgrown our current infrastructure. Switching to Multipart uploads isn't just a technical fix; it's our ticket to the big leagues of RAG and enterprise-grade AI. Let’s stop sending snippets and start uploading wisdom.
Let's build the brain this project deserves!