-
Notifications
You must be signed in to change notification settings - Fork 3.6k
PIP-341: Pluggable client metrics tracker interface #22145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@KevinLiLu Please add the following content to your PR description and select a checkbox:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This the first pass.
In the meeting today we can discuss in high level.
|
||
Since we need to record events in both layers, we can create and store the metric trackers in the base classes `ProducerBase`/`ConsumerBase`, so that both `ProducerImpl`/`ConsumerImpl` and `PartitionedProducerImpl`/`MultiTopicsConsumerImpl` will have its own metric tracker. | ||
|
||
For a non-partitioned topic, there will only be one producer/consumer metrics tracker. For partitioned topics, there will be one producer/consumer tracker per partition (topic name `{partitionedTopicName}-partition-{partitionNum}`), and an additional tracker on the multi-topic/partitioned wrapper (topic name `{partitionedTopicName}`). It is up to the metric tracker implementation to decide on how to handle this (ex. aggregation at `partitionedTopicName` level). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might need to make it easy for implementors to prepare a partition number label if it is a partition. We can have another interface targeted at that or some other means. Let's avoid topic parsing :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add partitionIndex
as a parameter to the MetricsTrackerFactory#create
producer/consumer metrics tracker methods. Value will be -1
for non-partitioned topics. This is the same approach as Producer/ConsumerImpl creation
pulsar/pulsar-client/src/main/java/org/apache/pulsar/client/impl/PulsarClientImpl.java
Lines 436 to 453 in 74be3fd
* @param partitionIndex partition index of a partitioned topic. the value -1 is used for non-partitioned topics. | |
* @param conf producer configuration | |
* @param schema topic schema | |
* @param interceptors producer interceptors | |
* @param producerCreatedFuture future for signaling completion of async producer creation | |
* @param <T> message type class | |
* | |
* @return a producer instance | |
*/ | |
protected <T> ProducerImpl<T> newProducerImpl(String topic, int partitionIndex, | |
ProducerConfigurationData conf, | |
Schema<T> schema, | |
ProducerInterceptors interceptors, | |
CompletableFuture<Producer<T>> producerCreatedFuture, | |
Optional<String> overrideProducerName) { | |
return new ProducerImpl<>(PulsarClientImpl.this, topic, conf, producerCreatedFuture, partitionIndex, schema, | |
interceptors, overrideProducerName); | |
} |
|
||
void close(); | ||
|
||
ProducerStats getStats(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to force this new interface implementation to implement that?
try { | ||
this.metricsTracker.recordMessagesSent(numMsgs, totalMsgsSize); | ||
} catch (Throwable e) { | ||
log.warn("Error executing producer metrics tracker recordMessagesSent call ", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably throttled.
### LoggingMetricsTrackerFactory | ||
The default configured metrics tracker factory will be the `LoggingMetricsTrackerFactory` as it will replace the existing functionality. | ||
|
||
The logging producer/consumer metrics tracker will initialize and hold an instance of the underlying stats recorder, and forward all calls to the stats recorder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to do that in the infrastructure and not a concrete implementation.
and after 2 versions, remove it.
|
||
We will set the default metrics tracker factory as the `LoggingMetricsTrackerFactory` to maintain backwards compatibility, however we still need to do some checks: | ||
1. If a non-default non-null metrics tracker factory (ex. OTel one) is configured, then use it | ||
2. Else if `statsIntervalSeconds=0`, then the user has explicitly disabled client metrics so we must use the `NoOpMetricsTrackerFactory` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said before, I wouldn't couple those two.
2. Else if `statsIntervalSeconds=0`, then the user has explicitly disabled client metrics so we must use the `NoOpMetricsTrackerFactory` | ||
3. Else, use `LoggingMetricsTrackerFactory` | ||
|
||
Inside `LoggingMetricsTrackerFactory`, we must also derive `statsIntervalSeconds`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it a bit differently.
The producerStats, consumerStats, and its matching interval config will be marked deprecated and removed in 2 subsequent versions.
Users can choose a LoggingMetricsTrackerFactory, which they configure with a new parameter, or build on their own and provide the factory instance, already configured.
3. Finally, default to 60 seconds if `statsIntervalSeconds` is not configured anywhere | ||
|
||
### Supporting existing Consumer/Producer getStats API | ||
There are two major challenges with supporting the existing producer/consumer `getStats()` API: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have the two mechanisms in place, this problem disappears. This implementation will be deleted in two versions from now.
1. Consolidate to a new stats supplier class to handle both partitioned & non-partitioned cases | ||
2. Add setter/builder/supplier methods to the new stats supplier class to allow the metrics tracker to supply stat values | ||
|
||
#### Consolidated ProducerStatsSupplier/ConsumerStatsSupplier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the context here. Maybe you should explain in background knowledge what are those suppliers.
+1 for the motivation. |
TODO:
Producer
/Consumer
'sgetStats()
API)Motivation
Modifications
Verifying this change
(Please pick either of the following options)
This change is a trivial rework / code cleanup without any test coverage.
(or)
This change is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: