#101
I have developed some scripts for this, but it may be worth integrating it into the eval pipeline.
Currently, we lack observability into agent behavior:
- Traces are long, and nobody bothers to read them.
- Traces alone are not enough; we also need the evaluation and fault-injection logic to understand a run.
- Often the agent did the right thing, but the benchmark itself has bugs.
For summary and analysis, we can feed the full context to the most capable LLM available (strong reasoning, large context window) at low cost, since it is a single call per run.
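A minimal sketch of what that single-call analysis step could look like. The function and section names here are hypothetical, not part of the existing scripts; the idea is just to bundle the trace with the evaluation and fault-injection logs into one prompt so the model can distinguish agent mistakes from benchmark bugs.

```python
def build_analysis_prompt(trace: str, eval_log: str, fault_log: str) -> str:
    """Assemble the full run context for a one-shot LLM analysis call.

    Hypothetical helper: argument names and section headers are
    illustrative assumptions, not an existing pipeline API.
    """
    sections = [
        ("Agent trace", trace),
        ("Evaluation log", eval_log),
        ("Fault-injection log", fault_log),
    ]
    body = "\n\n".join(f"## {title}\n{text}" for title, text in sections)
    instructions = (
        "Summarize the agent's behavior, judge whether any failure was "
        "caused by the agent or by a bug in the benchmark itself, and "
        "cite the relevant trace steps."
    )
    return f"{instructions}\n\n{body}"
```

The returned string would then be sent as a single message to whatever large-context model the pipeline settles on.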