
[Core] Roadmap for handling context overflow #156

@yiranwu0

Any help is appreciated!

This task demands a considerable amount of effort. If you have insights, suggestions, or can contribute in any way, your help would be immensely valued.

## Problem Description

(Continued from #9) Current LLMs have limited context sizes / token limits (gpt-3.5-turbo: 4,096 tokens; gpt-4: 8,192; etc.). Although the current max_token limit from OpenAI is sufficient for many tasks, the limit will always be exceeded eventually as a conversation keeps running. `autogen.Completion` then raises an `InvalidRequestError` indicating that the context size is exceeded, since autogen has no mechanism for handling long contexts.
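For concreteness, here is a minimal sketch of how the overflow could be detected before a request is sent, using the `tiktoken` tokenizer. The per-message overhead constant is an approximation of OpenAI's chat formatting, not an exact figure:

```python
import tiktoken

def count_tokens(messages, model="gpt-3.5-turbo"):
    """Approximate the token count of a list of chat messages."""
    enc = tiktoken.encoding_for_model(model)
    total = 0
    for msg in messages:
        total += 4  # rough per-message framing overhead (an assumption)
        total += len(enc.encode(msg.get("content") or ""))
    return total

messages = [{"role": "user", "content": "hello " * 5000}]
if count_tokens(messages) > 4096:  # gpt-3.5-turbo context window
    print("This request would exceed the context window and fail.")
```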

## Potential Methods

  1. Compression: use an LLM to compress (summarize) previous messages and reduce the context size (see the first sketch after this list).
  2. Retrieval: retrieve the history messages most relevant to the latest message (see the second sketch).
  3. Truncation: a simple approach is to keep the most recent k messages and drop everything earlier. More targeted truncation is also possible, such as removing failed code executions (see the third sketch).
  4. A mixture of the methods above.
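
A minimal sketch of idea 1, where all but the most recent messages are replaced by an LLM-generated summary. The `summarize` callable is a placeholder for whatever model call ends up being used; nothing here is an existing autogen API:

```python
def compress_history(messages, summarize, keep_recent=4):
    """Replace older messages with a single summary message.

    `summarize` is a placeholder callable (e.g. a cheap LLM call)
    mapping a transcript string to a short summary string.
    """
    if len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in old)
    summary = {
        "role": "system",
        "content": "Summary of earlier conversation: " + summarize(transcript),
    }
    return [summary] + recent
```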
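A sketch of idea 2: score history messages by embedding similarity to the latest message and keep only the top-n, preserving chronological order. The `embed` callable is again a placeholder (e.g. a call to an embedding model), not an existing API:

```python
import numpy as np

def retrieve_relevant(messages, embed, top_n=5):
    """Keep the top_n history messages most similar to the latest one."""
    latest, history = messages[-1], messages[:-1]
    query = embed(latest["content"])

    def cosine(vec):
        return float(np.dot(query, vec) / (np.linalg.norm(query) * np.linalg.norm(vec)))

    scores = [cosine(embed(m["content"])) for m in history]
    # Indices of the top_n scores, restored to chronological order.
    keep = sorted(sorted(range(len(history)), key=scores.__getitem__, reverse=True)[:top_n])
    return [history[i] for i in keep] + [latest]
```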
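And a sketch of idea 3: keep the leading system message plus the last k messages, optionally dropping failed code executions first. The failure check here is a naive string match, purely illustrative:

```python
def truncate_history(messages, k=10, drop_failed_exec=True):
    """Keep the system message (if any) and the most recent k messages."""
    system = messages[:1] if messages and messages[0]["role"] == "system" else []
    rest = messages[len(system):]
    if drop_failed_exec:
        # Naive heuristic for spotting failed code-execution results.
        rest = [m for m in rest if "execution failed" not in (m["content"] or "").lower()]
    return system + rest[-k:]
```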

## Some References

### Compression & Truncation
- [ ] https://github.com/microsoft/autogen/pull/131
- [ ] https://github.com/microsoft/autogen/pull/421
- [ ] https://github.com/microsoft/autogen/pull/443
- [ ] Allow async compression
- [ ] https://github.com/microsoft/autogen/pull/497
- [ ] https://github.com/microsoft/autogen/issues/685
### Retrieval
- [ ] Explore the MemGPT agent
