Skip to content

Conversation

@sdubov
Copy link
Contributor

@sdubov sdubov commented Sep 29, 2025

KG-148. Support streaming events in agent features

LLM Streaming is available now in the agent pipeline. However, the evens for feature message processors are missing. This PR add missing events to the agent feature processors scope.

Motivation and Context

Add missing feature events to be available in feature processors

Breaking Changes

No


Type of the changes

  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Tests improvement
  • Refactoring

Checklist

  • The pull request has a description of the proposed change
  • I read the Contributing Guidelines before opening the pull request
  • The pull request uses develop as the base branch
  • Tests for the changes have been added
  • All new and existing tests passed
Additional steps for pull requests adding a new feature
  • An issue describing the proposed change exists
  • The pull request includes a link to the issue
  • The change was discussed and approved in the issue
  • Docs have been added / updated

@sdubov sdubov requested review from Rizzen and kpavlov September 29, 2025 13:36
@sdubov sdubov self-assigned this Sep 29, 2025
@sdubov sdubov added documentation Improvements or additions to documentation enhancement New feature or request refactoring Code improvements that enhance structure, readability, and quality without changing functionality labels Sep 29, 2025
@sdubov sdubov requested a review from devcrocod September 29, 2025 13:37
@github-actions
Copy link

github-actions bot commented Sep 29, 2025

Qodana for JVM

780 new problems were found

Inspection name Severity Problems
Check Kotlin and Java source code coverage 🔶 Warning 752
Vulnerable imported dependency 🔶 Warning 17
Unused import directive 🔶 Warning 6
Missing KDoc for public API declaration 🔶 Warning 5
@@ Code coverage @@
+ 71% total lines covered
12784 lines analyzed, 9097 lines covered
# Calculated according to the filters of your coverage tool

☁️ View the detailed Qodana report

Contact Qodana team

Contact us at [email protected]

Copy link
Contributor

@kpavlov kpavlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @sdubov
In general looks good. I have a concern about exposing Prompt in events and naming of fields

public data class LLMStreamingStartingEvent(
val runId: String,
val prompt: Prompt,
val model: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's maybe rename it to modelName or modelId? Is it the value of LLModel.id?
Would it be convenient to introduce public typealias ModelId=String (or ModelName) and use it elsewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the KDoc for each parameter to clarify the name and the format that is used for it.

@Serializable
public data class LLMStreamingStartingEvent(
val runId: String,
val prompt: Prompt,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Prompt contains a message list that can be pretty huge, would it be appropriate to include it in the supposedly lightweight event object?

Copy link
Contributor Author

@sdubov sdubov Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we do not have any special requirements about "lightness" of the event objects. We neither faced with issues related to events object size. The only "real case" when it might be important that I can imagine is when you send events with Remote writer that transfer this payload via network. I think it worth solving by using a proper transport configuration (or create some special strategies to configure events in future when needed).

Copy link
Contributor

@kpavlov kpavlov Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had actually faced OOM in "heavy tests" recently after StreamFrame was introduced. We are effectively at the edge. More GC pressure might kill "heavy tests" again.
Let's run heavy tests on this branch before merging, just in case.
cc: @Ololoshechkin

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kpavlov, is it OOM problem related to events and Prompt object? Could you please send me the link to the issue? I would be interested to check the dump.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There have been test failures after Sep 13, e.g. https://github.com/JetBrains/koog/actions/runs/17695253217/job/50294003299
PR that probably improved memory utilization #838, but there are many moving parts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kpavlov, I do not see OOM errors in the failure that you refer here. Could you please point me to the error that you mean here?

Heavy tests on this branch passed - https://github.com/JetBrains/koog/actions/runs/18111506925

public data class LLMStreamingFrameReceivedEvent(
val runId: String,
val frame: StreamFrame,
override val eventId: String = LLMCallStartingEvent::class.simpleName!!,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typealiases instead of Strings would be more safe and readable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we have discussed this question privately already. Just to summarise my opinion here: for readability, I prefer to use primitive types as is. Special type for a parameter with a good name does not give any value from my vision and just add redundancy.

E.g. compare: eventId: EventId vs. eventId: String. Second option is self-explanatory from my vision, while for the first example, type EventId does not add any meaning to the parameter. In addition, I would need to go to the type definition to understand that it is well-known String primitive.

I'm open to your opinion here.

Copy link
Contributor

@kpavlov kpavlov Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, I respectfully disagree with the notion that String is inherently more readable or type-safe than Tiny-Types. In fact, I personally find the opposite to be true. It's unfortunate that you're not open to exploring it further, as I believe it has a lot of potential benefits.
e.g. this is the good example of using tiny-types pattern

}.use { agent ->
agent.run("")

val actualEvents = writer.messages.filter { event ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it an eventually-consistent test? Shouldn't we use awaitility?

Copy link
Contributor Author

@sdubov sdubov Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, could you please clarify what do you mean here? This line filter a list of collected evens by only ones related to LLM Streaming to not verify them all since agent execution might produce a lot of events.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean instead of assertEquals use await.until(predicate) More examples here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not wait here for events. All expected events are already collected at that point. Here we just filter events to very only ones that we want to test. I do not see any benefit of using external library for awaiting here. Please let me know if I miss something.

Comment on lines +87 to +117
- Agent events:
- AgentStartingEvent
- AgentCompletedEvent
- AgentExecutionFailedEvent
- AgentClosingEvent

- Strategy events:
- GraphStrategyStartingEvent
- FunctionalStrategyStartingEvent
- StrategyCompletedEvent

- Node execution events:
- NodeExecutionStartingEvent
- NodeExecutionCompletedEvent
- NodeExecutionFailedEvent

- LLM call events:
- LLMCallStartingEvent
- LLMCallCompletedEvent

- LLM streaming events:
- LLMStreamingStartingEvent
- LLMStreamingFrameReceivedEvent
- LLMStreamingFailedEvent
- LLMStreamingCompletedEvent

- Tool execution events:
- ToolExecutionStartingEvent
- ToolValidationFailedEvent
- ToolExecutionFailedEvent
- ToolExecutionCompletedEvent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 It would be nice to create a diagram with entities emitting and consuming particles events
🐈📦🏴‍☠️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list of events is pretty straightforward. They just inherited from the DefinedFeatureEvent base class. Thos are events defined inside the library. We can check with @markomitos if we need a diagram here, but from my vision, the only types are important here for an end user are final event classes produced by agent execution (listed here in the MD file).

@kpavlov kpavlov added the question Further information is requested label Sep 29, 2025
* @property eventId Unique identifier for this type of event, defaulting to the name of the starting event class.
* @property runId Unique identifier for the LLM run;
* @property prompt The input prompt provided for the LLM operation;
* @property model The description of the LLM model used during the call. Use the format: 'llm_provider:model_id';
Copy link
Contributor

@kpavlov kpavlov Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very cryptic to me and doesn't seem right. If extended model information is required, let's introduce something like ModelInfo(llmProvider: String, modelId: String). It might be done with #633 in mind

Copy link
Contributor Author

@sdubov sdubov Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a valid point. I've created a separate ticket to address this problem out of scope of this PR. Please see KG-437. It is a bit more work than just rename the parameter.

Copy link
Contributor

@kpavlov kpavlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @sdubov
In general PR is good enough, but the
model field, and, especially, eventId field semantics doesn't seem right to me

* @property prompt The input prompt provided for the LLM operation;
* @property model The description of the LLM model used during the call. Use the format: 'llm_provider:model_id';
* @property tools A list of associated tools or resources that are part of the operation;
* @property eventId A string representing the event type;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right:

  • eventId by common naming convention is always unique identifier. This change just breaks semantic without compiler warning.
  • Event type field is typically named eventType

cc: @Ololoshechkin

BTW: that wouldn't have been possible with tiny-types.

Copy link
Contributor Author

@sdubov sdubov Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I do not really see the relation to the pattern that you referring to here. This background for this eventId is the following: it was added to provide an optional id for a custom events. Imagine that you created a custom event and would like to split different places where you use this event, e.g. AgentDoHttpCall. The original idea was to add a general event id that can be used to split them. For now, I decided to delete the eventId field and review if we really need it.

@kpavlov kpavlov self-requested a review September 30, 2025 06:44
@kpavlov kpavlov dismissed their stale review September 30, 2025 06:45

if it unblock

@Ololoshechkin
Copy link
Collaborator

Discussed with @sdubov -- he'll drop the eventId completely because it's not yet used other than in logs.

@sdubov sdubov force-pushed the sdubov/KG-148 branch 3 times, most recently from ec0aa21 to d1244b0 Compare October 1, 2025 09:47
sdubov added 3 commits October 1, 2025 12:30
- Massive updates in the commit 3a315e4 miss several places. Updated all related namings in tests and inspections
- Add LLM Streaming events to be able to use these events with feature processors;
- Updated all relevant features, including EventHandler, Tracer, Debugger;
- Add or update tests for each feature to verify streaming events handling.
Copy link
Contributor

@devcrocod devcrocod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with many of @kpavlov comments regarding the code.
Overall, lgtm!

Let’s fix some things in separate prs, since this one has already gone beyond the scope of the task

*/
@Serializable
public data class LLMStreamingCompletedEvent(
val runId: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get all the necessary information about the model, tools by runId?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean, having the runId collect this info from an agent. No, we do not have this aggregated object that can provide this info right away. It is a separate set of parameters.

These events are produced by features such as Tracing and Debugger to enable logging, tracing, monitoring, and remote inspection.
The canonical description and definitions of Feature Events now live in the agents-core module. See:

- agents-core module docs: ../agents-core/Module.md (section "Standard Feature Events")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

../agents-core/Module.md
module.md is used for generating api reference docs. In my opinion, it’s better to use a link so that the user can navigate to the corresponding page on the website

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@sdubov sdubov merged commit f241286 into develop Oct 1, 2025
15 checks passed
@sdubov sdubov deleted the sdubov/KG-148 branch October 1, 2025 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation enhancement New feature or request question Further information is requested refactoring Code improvements that enhance structure, readability, and quality without changing functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants