First of all... love this! Thanks so much
Is your feature request related to a problem? Please describe.
- My concern is how summarizer tokens are sent to the LLM when things are happening really fast, or during a conversation that should not be retained. I know exists for that part but concerns are noted.
- I admit I may just not know how it works under the hood, but it was hard to validate what it was actually doing. The logs may have had it, but if they don't log tokens in/out for summarizing etc., they definitely should!
- Thanks again for all the work on this!
Describe the solution you'd like
- For , I think we should instead allow toggling a flag that controls when new memory creation is handled. For example, I ran into a situation where I was generating messages so quickly, and they had so much context, that it cost me nearly $5 within 30 minutes or so overall.
- We should also be able to turn it off completely for a bit. In my case I had to uninstall claude-mem to stop it from summarizing nonstop, because the messages were repetitive and it was wasting tons of tokens.
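A rough sketch of the kind of switch I have in mind. This is only an illustration under my own assumptions: `MemoryGate`, `min_interval_s`, `pause`, and `should_summarize` are hypothetical names, not part of claude-mem's actual API, and I don't know how the summarizer is really triggered internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryGate:
    """Hypothetical on/off switch for memory creation (not claude-mem's real API)."""

    enabled: bool = True                  # global flag: False disables summarizing entirely
    min_interval_s: float = 0.0           # minimum seconds between two summaries
    paused_until: Optional[float] = None  # timestamp until which summarizing is paused
    _last_run: Optional[float] = None     # timestamp of the last allowed summary

    def pause(self, now: float, seconds: float) -> None:
        """Temporarily turn summarizing off for the given number of seconds."""
        self.paused_until = now + seconds

    def should_summarize(self, now: float) -> bool:
        """Return True only if a summary is allowed at time `now`."""
        if not self.enabled:
            return False
        if self.paused_until is not None and now < self.paused_until:
            return False
        if self._last_run is not None and now - self._last_run < self.min_interval_s:
            return False
        self._last_run = now  # record the run so the interval check applies next time
        return True
```

The idea would be to check something like `gate.should_summarize(time.time())` before each summarizer call, so rapid or repetitive messages don't each trigger a paid LLM request, and a pause could replace having to uninstall.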
Describe alternatives you've considered
- I actually had to uninstall claude-mem because once it's installed there's no way to temporarily turn it off, and I was worried about token usage.
Additional context
- It's worth noting that once I uninstalled it, the web server was still active and I had to find it and kill the process manually.