Description
Suggestion
A default deny may be warranted here. However, in the interest of having more data for a root cause analysis, I'm going to start by suggesting some blocklist additions to the default JMX exporter config. I note we have some in JIRA under the now-deprecated blacklistObjectNames parameter (we might want to update that as well).
Confluence seems to have some fairly nasty issues with the beans that the JMX exporter imports, particularly under:
com.atlassian.confluence.metrics
Example:
Bean events with category01=logging. These also appear under other metric names, such as OneMinuteRate and many more:
com_atlassian_confluence_metrics_OneMinuteRate{category00="confluence",category01="logging",
We are exporting logging events to the Prometheus exporter in the default jmx-config ConfigMap configuration.
This makes the number of unique series ingested from that endpoint grow without bound, and the time it takes to scrape the endpoint keeps increasing with it.
This isn't the only problematic metrics key we have. The category00=hazelcast keys also tend to create large numbers of unique series that grow without bound, again increasing the cardinality in Prometheus and the scrape time of the exporter endpoint.
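To help find which keys are actually growing, here is a minimal sketch that counts series per metric name and category00 value in a saved scrape body, then diffs two scrapes. The function names (count_series, growth), the SAMPLE text, and the name label in it are made up for illustration; real input would be the text returned by the exporter's metrics endpoint, saved at two points in time.

```python
import re
from collections import Counter

# One sample line of the Prometheus text format: name{labels} value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+\S+')
# One label pair inside the braces: key="value"
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def count_series(exposition_text, label="category00"):
    """Count series per (metric name, value of `label`) in a scrape body."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels = match.group(1), match.group(2) or ""
        labels = dict(LABEL_RE.findall(raw_labels))
        counts[(name, labels.get(label, ""))] += 1
    return counts


def growth(before, after):
    """Return only the keys whose series count increased between two scrapes."""
    return after - before  # Counter subtraction keeps positive differences only


# Hypothetical scrape body; the `name` label and all values are invented.
SAMPLE = '''\
# TYPE com_atlassian_confluence_metrics_OneMinuteRate untyped
com_atlassian_confluence_metrics_OneMinuteRate{category00="confluence",category01="logging",name="a"} 0.0
com_atlassian_confluence_metrics_OneMinuteRate{category00="confluence",category01="logging",name="b"} 0.0
com_atlassian_confluence_metrics_OneMinuteRate{category00="hazelcast",name="c"} 1.5
java_lang_Memory_HeapMemoryUsage_used 1024.0
'''

if __name__ == "__main__":
    for key, n in count_series(SAMPLE).most_common():
        print(n, key)
```

Running count_series on two scrapes taken some time apart and passing the results to growth should leave only the keys that gained series, which are the candidates for the exclude list.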
At the moment, the default config for the JMX exporter is dangerous to integrate directly into Prometheus. I began trying to put together a jmx-config ConfigMap that would help me blocklist the offenders but, to be frank, I am a bit out of my depth here and would love some help.
As a starting point I have this in my values yaml:
jmxExporterCustomConfig:
  jmx-config:
    excludeObjectNames:
      - 'com.atlassian.confluence.metrics:category01=logging,*'
      - 'com.atlassian.confluence.metrics:category00=http,category01=rest,name=request,*'
      - 'com.atlassian.confluence.metrics:category00=bandana,*'
      - 'com.atlassian.confluence.metrics:category00=hazelcast,*'
      - 'com.atlassian.confluence.metrics.Value:category00=bandana,*'
      - 'com.atlassian.confluence.metrics.999thPercentile:category00=bandana,*'
      - 'com.atlassian.confluence.metrics.MeanRate:category00=bandana,*'
      - 'com.atlassian.confluence.metrics.OneMinuteRate:category00=bandana,*'
      - 'com.atlassian.confluence.metrics:category00=hazelcast,*'
      - 'com.atlassian.confluence.metrics.Value:category00=hazelcast,*'
      - 'com.atlassian.confluence.metrics.Count:category00=hazelcast,*'
      - 'com.atlassian.confluence.metrics.FifteenMinuteRate:category00=hazelcast,*'
      - 'com.atlassian.confluence.metrics.FiveMinuteRate:category00=hazelcast,*'
      - 'com.atlassian.confluence.metrics.OneMinuteRate:category00=hazelcast,*'
      - 'com.atlassian.confluence.metrics.999thPercentile:category00=hazelcast,*'
      - 'com.atlassian.confluence.metrics.MeanRate:category00=hazelcast,*'
    rules:
      - pattern: '(java.lang)<type=(\w+)><>(\w+):'
        name: java_lang_$2_$3
      - pattern: 'java.lang<type=Memory><HeapMemoryUsage>(\w+)'
        name: java_lang_Memory_HeapMemoryUsage_$1
      - pattern: 'java.lang<name=G1 (\w+) Generation, type=GarbageCollector><>(\w+)'
        name: java_lang_G1_$1_Generation_$2
      - pattern: '.*'
But this seems to only deal with my logging events. I have tried a bunch of variations for excluding the category00 beans without success. I suspect I am just wrong about the syntax, and that this would be obvious to a Java developer.
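On the syntax question: as far as the JMX exporter documentation describes it, each excludeObjectNames entry is a JMX ObjectName pattern of the form domain:key=value,...,* and is matched against bean names, not against exported metric names. The sketch below is a simplified Python model of that matching, not the JMX implementation: it ignores quoted values and other corner cases, and the bean name in NAME is hypothetical. It assumes, from the exported name com_atlassian_confluence_metrics_OneMinuteRate, that suffixes like OneMinuteRate or Value are bean attributes appended by the exporter rather than part of the domain.

```python
import re

FORBIDDEN_IN_KEY = set(":,=*?")  # characters an ObjectName key may not contain


def _wildcard_to_regex(text):
    """Translate the ObjectName wildcards * and ? into an anchored regex."""
    parts = []
    for ch in text:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z", re.DOTALL)


def parse_object_name(text):
    """Split 'domain:k1=v1,k2=v2,*' into (domain, properties, open_ended)."""
    domain, sep, prop_text = text.partition(":")
    if not sep or not prop_text:
        raise ValueError(f"missing ':' or property list in {text!r}")
    props, open_ended = {}, False
    for item in prop_text.split(","):
        if item == "*":  # a bare * means "any further properties allowed"
            open_ended = True
            continue
        key, eq, value = item.partition("=")
        if not eq or not key or FORBIDDEN_IN_KEY & set(key):
            raise ValueError(f"malformed key property {item!r} in {text!r}")
        props[key] = value
    return domain, props, open_ended


def matches(pattern, name):
    """Return True if `name` is selected by `pattern` under this simplified model."""
    p_domain, p_props, open_ended = parse_object_name(pattern)
    n_domain, n_props, _ = parse_object_name(name)
    if not _wildcard_to_regex(p_domain).match(n_domain):
        return False
    if not open_ended and set(p_props) != set(n_props):
        return False
    return all(
        key in n_props and _wildcard_to_regex(value).match(n_props[key])
        for key, value in p_props.items()
    )


# Hypothetical bean name, invented for illustration only.
NAME = "com.atlassian.confluence.metrics:category00=hazelcast,category01=map,name=example"

if __name__ == "__main__":
    for pattern in (
        "com.atlassian.confluence.metrics:category00=hazelcast,*",
        "com.atlassian.confluence.metrics.Value:category00=hazelcast,*",
    ):
        print(pattern, matches(pattern, NAME))
```

Under this model, an entry whose domain carries an attribute suffix (for example com.atlassian.confluence.metrics.Value) selects nothing if the beans live in the plain com.atlassian.confluence.metrics domain, and a key containing a colon, such as category:00, is rejected as malformed. That does not by itself explain why the plain category00=hazelcast,* entry seems to have no effect, so listing the real bean names (for example with jconsole) and checking them against the patterns would be a reasonable next step.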
Anyway, I open the floor to suggestions, proposals, and rebukes. I would like to collaboratively dig through the list of metrics Confluence makes available to the JMX exporter and block the ones that grow dynamically.
Product
Confluence