-
Notifications
You must be signed in to change notification settings - Fork 98
Live data preview and node transition frequency on Flink minicluster #8047
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: staging
Are you sure you want to change the base?
Conversation
# Conflicts: # engine/flink/executor/src/main/scala/pl/touk/nussknacker/engine/process/compiler/StubbedFlinkProcessCompilerDataFactory.scala
# Conflicts: # components-api/src/main/scala/pl/touk/nussknacker/engine/ModelConfig.scala # engine/flink/executor/src/main/scala/pl/touk/nussknacker/engine/process/compiler/FlinkProcessCompilerDataFactory.scala
* @param id correlation id/trace id used for tracing (logs, error presentation) and for tests mechanism, it should be always defined | ||
* @param variables variables available in evaluation | ||
* @param parentContext context used for scopes handling, mainly for fragment invocation purpose | ||
*/ | ||
case class Context( | ||
initialId: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change will no longer be needed after the ContextId refactor. The initialId is basically ContextId without prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So let's add a TODO/FIXME
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the initialId
field. Instead of it, the initial context id is now recognized by regex. There is a todo added about it. After the ContextId refactor the regex won't be necessary, because we will be able co compare parts of the ContextId case class instead.
|
||
final case class LiveDataPreview( | ||
liveDataSamples: TestResults[Json], | ||
nodeTransitionFrequency: Map[NodeTransition, BigDecimal], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that using "throughput" word is more natural in this context. The "frequency" is used when we talk about some cyclic mechanism to show how often it is triggered.
|
||
def getLiveData( | ||
processIdWithName: ProcessIdWithName, | ||
): Future[Option[LiveDataPreview]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Option
is mysterious. It could be interpreted in many ways. Let's use Either
with explicit error instead. For deployed scenarios, we should always return defined, empty LiveDataPreview
. The error should be returned when someone ask for scenario that is not deployed or is finished/stopped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, either with error in Left. The error only says that the information is not available for the scenario. In that place we do not have detailed information as to why.
} | ||
|
||
case class ContextId(value: String) | ||
|
||
/** | ||
* Context is container for variables used in expression evaluation | ||
* | ||
* @param initialId the initial id of the context (id may change during the processing, but the initial id is preserved in this field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add one more sentence in the comment, why do we need this field with preserved id? Also, maybe we could replace case class with normal class to make changes of initialId
statically impossible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this field, details in the other comment.
Any | ||
] = | ||
baseNuApiEndpoint | ||
.summary("Perform test") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What test this endpoint perform? :P
.summary("Perform test") | ||
.tag("Testing") | ||
.get | ||
.in("scenarioTesting" / path[ProcessName]("scenarioName") / "liveData") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why endpoint is nested in scenarioTesing
resource? Let's create a new liveData
endpoint. I know that under the hood we reuse some features from the scenario testing mechanism, but we don't need to expose it in the interface.
* @param id correlation id/trace id used for tracing (logs, error presentation) and for tests mechanism, it should be always defined | ||
* @param variables variables available in evaluation | ||
* @param parentContext context used for scopes handling, mainly for fragment invocation purpose | ||
*/ | ||
case class Context( | ||
initialId: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So let's add a TODO/FIXME
Option(listenerStorages.getIfPresent(processName.value)).map(_.getLiveDataPreview) | ||
} | ||
|
||
private[livedata] def cleanResults(processName: ProcessName): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that you don't clean results after the job finish, because you want to make it still available for the user for some period of time? Maybe let's write a comment about that because from the first glance it looks like some memory leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, comment is now above this method and describes when we want to clean the data
val isNewKey = !results.containsKey(key) | ||
if (isNewKey) orderOfResults.add(key) | ||
results.put(key, newValue) | ||
while (orderOfResults.size() > maxNumberOfSamples) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
size
has O(n) complexity. Have you considered using the ready solution such as EvictingQueue
from guave? we do synchronisation, so I can't see too much added value from using thread-safe structures. Keeping two information in two data structures: results
and orderOfResults
is hard to analyze in code, e.g. as a reviewer I can't guratanee that it is correctly cleaned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately guava EvictingQueue
won't work here. It does not support modifying elements by key. But my solution was not good either.
I changed the solution. I created RingBuffer
util, with unit tests. It is based on the LinkedHashMap, with added synchronization.
|
||
private val results = new ConcurrentHashMap[K, V]() | ||
private val orderOfResults = new ConcurrentLinkedQueue[K]() | ||
private val transitionsByEpochSecond = new ConcurrentHashMap[Long, ConcurrentLinkedQueue[NodeTransition]]() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could just use a MultiMap
to make it simpler? Or a Map NodeTransition
->InstantRateMeter
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately there is no ready-to-use util to do exactly what is needed here. I asked LLM's about it too :) . (The main difficulty is that we have to drop the transitions, that are older than the treshold)
I created an util class SlidingWindowCounter
. It wraps the logic around it and I think it is quite simple and easy to understand. I also added unit tests of that util.
# Conflicts: # docs/Changelog.md
created: #8133 |
# Conflicts: # e2e-tests/src/test/scala/pl/touk/nussknacker/BatchDataGenerationSpec.scala
# Conflicts: # designer/server/src/main/scala/pl/touk/nussknacker/ui/api/ScenarioTestingApiHttpService.scala # designer/server/src/main/scala/pl/touk/nussknacker/ui/api/description/scenarioTesting/Dtos.scala # designer/server/src/main/scala/pl/touk/nussknacker/ui/api/description/scenarioTesting/ScenarioTestingApiEndpoints.scala # designer/server/src/test/scala/pl/touk/nussknacker/ui/api/testing/ScenarioTestingApiHttpServiceSpec.scala # designer/server/src/test/scala/pl/touk/nussknacker/ui/api/testing/SchemalessKafkaJsonTypeTests.scala # docs-internal/api/nu-designer-openapi.yaml # docs/Changelog.md
# Conflicts: # components/openapi/src/main/scala/pl/touk/nussknacker/openapi/enrichers/OpenAPIEnricherFactory.scala
@@ -2023,6 +2024,20 @@ lazy val deploymentManagerApi = (project in file("designer/deployment-manager-ap | |||
) | |||
.dependsOn(extensionsApi, testUtils % Test) | |||
|
|||
lazy val liveDataCollector = (project in file("designer/live-data-collector")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a new module, reusable by other DMs, instead of adding it in the Flink engine module.
Describe your changes
The data can be fetched using
/api/scenarioTesting/{scenario_name}/liveData
At the moment only for Flink minicluster, because it is run a part of the JVM and we can pass listener to the job.
Checklist before merge