[Feature]: Performance improvement #43

@AB-Law

Is your feature request related to a problem? Please describe.

I'm currently facing long response times from the LLM when generating story outputs. Each OpenAI API call is taking nearly a minute to complete, which heavily affects the storytelling flow and user experience. My suspicion is that multiple tool calls are being triggered during message generation, and it’s unclear whether all of them are working as intended. Debugging or tracking tool execution is also difficult right now.

Describe the solution you'd like

I’d like to optimize and monitor LLM calls to significantly reduce latency. Ideally, each LLM response should return within a few seconds, even when tools are involved. Possible solutions could include:

  • Implementing better tracing/logging to confirm which tool calls are actually being executed.
  • Introducing batching or concurrency for tool calls.
  • Reducing unnecessary calls or streamlining the overall flow.
  • Adding metrics or timing logs to visualize performance bottlenecks.
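
As a rough illustration of the tracing and timing-log ideas above, here is a minimal sketch of a decorator that records how long each tool call takes. It assumes the tools are async Python functions. `traced`, `TOOL_TIMINGS`, and `fetch_character` are hypothetical names made up for this example, not part of the project.

```python
import asyncio
import functools
import logging
import time

logger = logging.getLogger("tool_trace")

# Hypothetical in-memory sink of (tool_name, seconds, ok) records.
TOOL_TIMINGS: list[tuple[str, float, bool]] = []


def traced(func):
    """Wrap an async tool function so every call is logged with its duration."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        ok = False
        try:
            result = await func(*args, **kwargs)
            ok = True
            return result
        finally:
            # Runs on success and on failure, so failed calls are traced too.
            elapsed = time.perf_counter() - start
            TOOL_TIMINGS.append((func.__name__, elapsed, ok))
            logger.info("tool=%s ok=%s elapsed=%.3fs", func.__name__, ok, elapsed)

    return wrapper


@traced
async def fetch_character(name: str) -> dict:
    # Hypothetical tool; the sleep stands in for real I/O.
    await asyncio.sleep(0.05)
    return {"name": name}


async def _demo() -> dict:
    return await fetch_character("Ayla")


demo_result = asyncio.run(_demo())
```

With every tool wrapped this way, the log would show which tools actually ran during a single LLM response and where the time went, which might help confirm or rule out the tool calls as the main source of the delay.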

Describe alternatives you've considered

I’ve tried a few approaches to mitigate the delay:

  • Reducing the number of tool calls per LLM run, but the latency persists, suggesting underlying inefficiencies in how the calls are handled or awaited.
  • Switching to smaller or faster models temporarily, which slightly improved performance but didn’t resolve the delay when tools were used.

Additional context

The storyteller bot makes multiple tool calls per story segment (e.g., fetching character data, updating world state, or generating narrative branches). Each of these steps seems to contribute to cumulative delay.
I suspect some calls might be sequential instead of parallel, or waiting unnecessarily for previous responses to complete.
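
To make the sequential-versus-parallel suspicion concrete, here is a minimal sketch comparing the two with `asyncio.gather`. The three tool functions are hypothetical stand-ins (each just sleeps 0.1 s), and the sketch assumes the calls are independent of each other; calls that write state or depend on an earlier result would still need to be ordered.

```python
import asyncio
import time


async def fetch_character_data(character_id: str) -> dict:
    # Hypothetical tool; the sleep stands in for network latency.
    await asyncio.sleep(0.1)
    return {"id": character_id}


async def fetch_world_state(world_id: str) -> dict:
    # Hypothetical read-only tool.
    await asyncio.sleep(0.1)
    return {"world": world_id}


async def generate_branches(prompt: str) -> list[str]:
    # Hypothetical tool returning two narrative branches.
    await asyncio.sleep(0.1)
    return [f"{prompt}: branch A", f"{prompt}: branch B"]


async def run_sequential() -> float:
    """Await each tool in turn; total time is roughly the sum of the delays."""
    start = time.perf_counter()
    await fetch_character_data("hero")
    await fetch_world_state("w1")
    await generate_branches("intro")
    return time.perf_counter() - start


async def run_concurrent() -> tuple[float, list]:
    """Start all tools together; total time is roughly the slowest single delay."""
    start = time.perf_counter()
    results = await asyncio.gather(
        fetch_character_data("hero"),
        fetch_world_state("w1"),
        generate_branches("intro"),
    )
    return time.perf_counter() - start, results


sequential_s = asyncio.run(run_sequential())
concurrent_s, concurrent_results = asyncio.run(run_concurrent())
print(f"sequential={sequential_s:.2f}s concurrent={concurrent_s:.2f}s")
```

If the real tool calls turn out to be awaited one after another as in `run_sequential`, grouping the independent ones as in `run_concurrent` could remove much of the cumulative delay, though the model round-trips themselves would remain.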

Would you like to contribute this feature?

  • Yes, I would like to implement this feature

Metadata

Labels: enhancement (New feature or request)
