Hoist-core provides a central metrics infrastructure built on Micrometer, enabling applications to publish observable metrics to platforms such as Prometheus, Grafana, and Datadog. The system is designed to work transparently across Hoist's clustered architecture, with automatic namespace prefixing, default tags, and cluster-wide scrape support.
The framework automatically publishes a range of built-in metrics covering JVM health, JDBC
connection pooling, WebSocket activity, client activity tracking, and Hoist monitor results.
Applications can register their own custom metrics using the standard Micrometer API via
MetricsService.registry.
- Central registry: `MetricsService` exposes a `CompositeMeterRegistry` that all meters register through.
- Export registries: built-in support for Prometheus (pull-based) and OTLP (push-based), configured via soft config. Additional registries (e.g. Datadog) can be added programmatically.
- Cluster-wide Prometheus scrape: a single endpoint can return metrics from all instances, each distinguished by an `xh.instance` tag.
- Built-in metrics: JVM (memory, GC, threads, classloader, CPU), JDBC pool, WebSocket channels, client activity tracking, and Hoist monitor results are instrumented out of the box.
- Admin Console: a cluster-wide metrics viewer is available via `MetricsAdminController`.
| File | Location | Role |
|---|---|---|
| `MetricsService.groovy` | `grails-app/services/io/xh/hoist/telemetry/` | Central Micrometer registry, namespace/tagging, export registries |
| `MetricsConfig.groovy` | `src/main/groovy/io/xh/hoist/telemetry/` | Typed wrapper around `xhMetricsConfig` |
| `MonitorMetricsService.groovy` | `grails-app/services/io/xh/hoist/monitor/` | Publishes Hoist monitor results as Micrometer metrics |
| `TrackMetricsService.groovy` | `grails-app/services/io/xh/hoist/track/` | Client activity metrics from track log entries |
| `MetricsAdminService.groovy` | `grails-app/services/io/xh/hoist/admin/` | Cluster-wide meter listing for admin UI |
| `MetricsAdminController.groovy` | `grails-app/controllers/io/xh/hoist/admin/cluster/` | REST endpoint for admin metrics viewer |
File: grails-app/services/io/xh/hoist/telemetry/MetricsService.groovy
The central service for all Micrometer metrics in a Hoist application. Initialized early in the
bootstrap sequence (before other services), it provides the CompositeMeterRegistry that all
framework and application meters register through.
Access the registry via `metricsService.registry`, a standard Micrometer
`CompositeMeterRegistry` that supports all Micrometer meter builders directly. For the common
cases, `MetricsService` exposes a family of methods that handle registration, default tags,
distribution config, and name-prefixing from an optional owner `BaseService`'s `telemetryPrefix`:
| Method | Use for |
|---|---|
| `configureTimer(name, description?, tags?, percentiles?, slos?, publishHistogram?, minExpected?, maxExpected?, owner?, useNamePrefix?)` | Configures distribution stats and default metadata for a named Timer (no concrete Timer registered). |
| `registerTimer(name, description?, tags?, owner?, useNamePrefix?)` | Registers a concrete Timer (uses distribution config from any prior `configureTimer`). |
| `configureCounter(name, description?, tags?, owner?, useNamePrefix?)` | Configures default metadata for a named Counter. |
| `registerCounter(name, description?, tags?, owner?, useNamePrefix?)` | Registers a concrete Counter for the name. |
| `registerGauge(name, valueFn, description?, tags?, baseUnit?, owner?, useNamePrefix?)` | Registers a Gauge whose value is read from `valueFn` on demand. |
| `registerFunctionCounter(name, countFn, description?, tags?, baseUnit?, owner?, useNamePrefix?)` | Registers a monotonically-increasing FunctionCounter from `countFn`. |
Pass owner: this from your service to have its telemetryPrefix prepended to the metric name
and an xh.owner tag added automatically. Set useNamePrefix: false to opt out of prefixing
when supplying a fully-qualified name.
```groovy
import io.xh.hoist.BaseService
import io.xh.hoist.telemetry.MetricsService

class MyService extends BaseService {
    String telemetryPrefix = 'myService'
    MetricsService metricsService

    void init() {
        // Published with the service's telemetryPrefix prepended to 'queueDepth'.
        metricsService.registerGauge(
            name: 'queueDepth',
            description: 'Current items in processing queue',
            valueFn: { queueSize() },   // queueSize() is defined elsewhere in this service
            owner: this
        )
    }
}
```

For meter shapes the above methods don't cover (e.g. `DistributionSummary`), use the underlying
`metricsService.registry` directly with the Micrometer builder API. Remember to prefix the
metric name yourself, e.g. `"${telemetryPrefix}.myMeter"`.
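As a minimal sketch of that direct-registry path, the example below builds a `DistributionSummary` with the Micrometer builder API. A `SimpleMeterRegistry` stands in for `metricsService.registry` so the snippet can run outside a Hoist app (assuming micrometer-core is on the classpath); the `batchSize` metric name and its unit are hypothetical.

```groovy
import io.micrometer.core.instrument.DistributionSummary
import io.micrometer.core.instrument.simple.SimpleMeterRegistry

// Stand-in for metricsService.registry (assumption, for a standalone run).
def registry = new SimpleMeterRegistry()

// In a service this would be the service's own telemetryPrefix.
String telemetryPrefix = 'myService'
String meterName = "${telemetryPrefix}.batchSize"   // hypothetical metric name

// Direct registration bypasses the helper methods, so the prefix is applied by hand.
def summary = DistributionSummary.builder(meterName)
    .description('Rows handled per processing batch')
    .baseUnit('rows')
    .register(registry)

summary.record(120)
summary.record(80)

println "${meterName}: count=${summary.count()}, total=${summary.totalAmount()}"
```

Inside a real service, swapping the stand-in for `metricsService.registry` should be the only change needed.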
All meters registered through the service automatically receive:
- Default tags:
  - `xh.application`: the application code (e.g. `myApp`)
  - `xh.instance`: the cluster instance name (e.g. `e36ca82b`)
  - `xh.source`: classifies the metric's origin (`'hoist'` or `'app'`)
Metrics tagged with `xh.instance=cluster` are only accepted on the primary instance. This prevents
duplicate registration of cluster-level aggregates (such as overall monitor status) across multiple
instances.
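How Hoist enforces this rule is not shown here. One way to picture the behavior is a Micrometer `MeterFilter` that denies cluster-scoped meters on non-primary instances. The sketch below is an illustration under that assumption, not Hoist's implementation: `buildRegistry` is a hypothetical helper, and the tag key is assumed to match the default `xh.instance` tag.

```groovy
import io.micrometer.core.instrument.Counter
import io.micrometer.core.instrument.config.MeterFilter
import io.micrometer.core.instrument.simple.SimpleMeterRegistry

// Hypothetical helper: builds a registry that rejects cluster-scoped meters
// unless this instance is the primary.
def buildRegistry = { boolean isPrimary ->
    def registry = new SimpleMeterRegistry()
    registry.config().meterFilter(
        MeterFilter.deny { id -> !isPrimary && id.getTag('xh.instance') == 'cluster' }
    )
    registry
}

def primary = buildRegistry(true)
def secondary = buildRegistry(false)

// Attempt the same cluster-level registration on both instances.
[primary, secondary].each { reg ->
    Counter.builder('demo.cluster.total').tag('xh.instance', 'cluster').register(reg)
}

println "primary has meter:   ${primary.find('demo.cluster.total').counter() != null}"
println "secondary has meter: ${secondary.find('demo.cluster.total').counter() != null}"
```

With a deny filter in place, the registration call on the secondary still returns a usable no-op meter, so calling code does not need to branch on primary status.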
When prometheusEnabled: true in xhMetricsConfig, a PrometheusMeterRegistry is added to the
composite registry. Prometheus scrapes are served by calling metricsService.prometheusData(), which
fans out to all cluster instances via Hazelcast, collects each instance's scrape output, and
concatenates the results. Each metric already carries an xh.instance tag distinguishing its source.
Applications expose this via a simple controller:
```groovy
import io.xh.hoist.BaseController
import io.xh.hoist.security.AccessAll

@AccessAll
class PrometheusController extends BaseController {
    def metricsService

    def index() {
        render(
            contentType: 'text/plain; version=0.0.4; charset=utf-8',
            text: metricsService.prometheusData()
        )
    }
}
```

This cluster-wide endpoint should be used instead of the Spring default, `/actuator/prometheus`,
which does not contain any Hoist metrics and is not configured by default.
Additional Prometheus configuration properties can be passed via the prometheusConfig map in
xhMetricsConfig. Keys are mapped to Micrometer's PrometheusConfig properties (e.g.
{"step": "PT30S"}).
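Micrometer config interfaces resolve each setting through a single `get(String key)` lookup, using keys prefixed with the registry's name (e.g. `prometheus.step`). The sketch below shows how a plain map could back such an interface. It uses `SimpleConfig` from micrometer-core as a stand-in so it runs without the Prometheus registry module; the exact mapping Hoist applies to `prometheusConfig` is not shown in this document and is assumed to be similar.

```groovy
import io.micrometer.core.instrument.simple.SimpleConfig
import java.time.Duration

// Same shape as the prometheusConfig map in xhMetricsConfig.
def props = [step: 'PT30S']

// Micrometer asks for prefixed keys such as 'simple.step'; strip the prefix
// and look the remainder up in the map. Unknown keys return null, so
// Micrometer falls back to its defaults.
def config = new SimpleConfig() {
    @Override
    String get(String key) {
        props[key - 'simple.']
    }
}

println "step resolved from map: ${config.step()}"
```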
When otlpEnabled: true, an OtlpMeterRegistry is added for push-based export (e.g. to
Grafana Cloud, New Relic, or any OTLP-compatible backend). Configuration properties are passed via
otlpConfig (e.g. {"url": "https://otlp.example.com/v1/metrics", "step": "PT60S"}).
OTLP export is suppressed by default when the app is running in local development, even when
otlpEnabled: true in xhMetricsConfig. This avoids polluting a shared OTLP backend with
developer-machine metrics during routine work. The same gating applies to trace export (see
tracing.md).
To opt in, set the otlpEnabledInLocalDev instance config to 'true'. Local-development
detection follows Utils.isLocalDevelopment, which reflects the Grails runtime mode
(Environment.isDevelopmentMode(): true when started via bootRun, false in a deployed war).
This is independent of the configured appEnvironment, so a deployed instance configured as
Development is not affected by this flag.
When OTLP export runs in local dev, the deployment.environment.name resource attribute is
suffixed with the OS username (e.g. Development-johndoe) so per-developer data can be
distinguished in a shared backend. Override ClusterConfig.getOtelResourceAttributes() if your
backend prefers a different scheme.
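The gating and naming rules above can be summarized as a small decision sketch. Both functions are hypothetical helpers written for illustration; they mirror the described behavior and are not Hoist's actual code. The comparison against the literal string `'true'` is an assumption based on the opt-in value quoted above.

```groovy
// Hypothetical helper: should OTLP export run on this instance?
boolean shouldExportOtlp(boolean otlpEnabled, boolean isLocalDev, String otlpEnabledInLocalDev) {
    if (!otlpEnabled) return false
    // Deployed instances export whenever otlpEnabled is set.
    if (!isLocalDev) return true
    // Local development additionally requires the instance-config opt-in.
    return otlpEnabledInLocalDev == 'true'
}

// Hypothetical helper: environment name reported as deployment.environment.name.
String environmentName(String appEnvironment, boolean isLocalDev, String osUser) {
    isLocalDev ? "${appEnvironment}-${osUser}" : appEnvironment
}

println shouldExportOtlp(true, true, null)
println environmentName('Development', true, System.getProperty('user.name'))
```

Note that a deployed instance whose `appEnvironment` is Development passes `isLocalDev = false` here, so it exports normally and keeps the unsuffixed name.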
Applications can add additional export registries programmatically:
```groovy
metricsService.registry.add(myDatadogRegistry)
```

Automatically bound at startup via Micrometer's standard binders:
| Metric prefix | Source | Description |
|---|---|---|
| `jvm.memory.*` | `JvmMemoryMetrics` | Heap and non-heap memory usage |
| `jvm.gc.*` | `JvmGcMetrics` | Garbage collection counts and pause times |
| `jvm.threads.*` | `JvmThreadMetrics` | Thread counts by state |
| `jvm.classes.*` | `ClassLoaderMetrics` | Loaded and unloaded class counts |
| `system.cpu.*` | `ProcessorMetrics` | CPU usage and available processors |
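For reference, binding these standard binders by hand looks roughly like the sketch below. Hoist does this automatically at startup, so application code does not need it; the example only illustrates where the metric prefixes in the table come from. It uses a standalone `SimpleMeterRegistry` and assumes micrometer-core is on the classpath.

```groovy
import io.micrometer.core.instrument.binder.jvm.ClassLoaderMetrics
import io.micrometer.core.instrument.binder.jvm.JvmGcMetrics
import io.micrometer.core.instrument.binder.jvm.JvmMemoryMetrics
import io.micrometer.core.instrument.binder.jvm.JvmThreadMetrics
import io.micrometer.core.instrument.binder.system.ProcessorMetrics
import io.micrometer.core.instrument.simple.SimpleMeterRegistry

def registry = new SimpleMeterRegistry()

// Each binder registers its own family of meters against the registry.
[
    new JvmMemoryMetrics(),
    new JvmGcMetrics(),
    new JvmThreadMetrics(),
    new ClassLoaderMetrics(),
    new ProcessorMetrics()
].each { it.bindTo(registry) }

// Summarize the registered meter names by their first two segments.
def prefixes = registry.meters
    .collect { it.id.name.tokenize('.').take(2).join('.') }
    .unique()
    .sort()
println prefixes
```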
Published by ConnectionPoolMonitoringService via the Tomcat JDBC pool:
| Metric | Type | Description |
|---|---|---|
| `jdbc.pool.size` | Gauge | Total connections (active + idle) |
| `jdbc.pool.active` | Gauge | Active/in-use connections |
| `jdbc.pool.idle` | Gauge | Idle connections |
| `jdbc.pool.waitCount` | Gauge | Threads waiting for a connection |
| `jdbc.pool.borrowed` | Counter | Cumulative connections borrowed |
| `jdbc.pool.returned` | Counter | Cumulative connections returned |
| `jdbc.pool.created` | Counter | Cumulative connections created |
| `jdbc.pool.released` | Counter | Cumulative connections destroyed |
| `jdbc.pool.reconnected` | Counter | Connections re-established after failure |
| `jdbc.pool.removeAbandoned` | Counter | Connections removed due to abandonment |
| `jdbc.pool.releasedIdle` | Counter | Idle connections released by evictor |
Published by WebSocketService:
| Metric | Type | Description |
|---|---|---|
| `websocket.channels` | Gauge | Active WebSocket channels |
| `websocket.messages.sent` | Counter | Messages sent successfully |
| `websocket.messages.received` | Counter | Messages received from clients |
| `websocket.messages.sendErrors` | Counter | Message send failures |
| `websocket.sessions.opened` | Counter | Sessions registered |
| `websocket.sessions.closed` | Counter | Sessions unregistered |
Published by MonitorMetricsService after each monitor evaluation cycle on the primary instance.
For each configured monitor, three metrics are published:
| Metric | Type | Description |
|---|---|---|
| `hoist.monitor.status.{code}` | Gauge | Status severity (0=INACTIVE .. 4=FAIL) |
| `hoist.monitor.value.{code}` | Gauge | Current numeric metric value |
| `hoist.monitor.executionTime.{code}` | Timer | Execution time of the monitor check |
Each carries an xh.instance tag indicating which cluster instance ran the check, or cluster for
aggregate status. Meters are automatically removed when monitors or instances are decommissioned.
See monitoring.md for full documentation of the Hoist monitoring system.
Published by TrackMetricsService, which subscribes to the xhTrackReceived Hazelcast topic on
the primary instance. These metrics are cluster-scoped (`xh.instance=cluster`) and tagged with
clientApp to distinguish activity from different client applications.
| Metric | Type | Description |
|---|---|---|
| `xh.client.track.messages` | Counter | All track log entries received |
| `xh.client.track.errors` | Counter | Client error track entries (`category == 'Client Error'`) |
| `xh.client.load.totalTime` | Timer | Total app load elapsed time |
| `xh.client.load.authTime` | Timer | App load authentication phase duration |
Load timers are recorded only for App / Loaded track entries that include a timings map in
their data payload, confirming they represent a standard Hoist client load event. Both timers
emit percentile histograms, supporting server-side aggregation (e.g. p90/p99) in Prometheus and
OTLP-receiving backends.
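In plain Micrometer terms, a timer that emits a percentile histogram is built roughly as follows. This is a sketch of the underlying mechanism rather than the code inside `TrackMetricsService`; the `clientApp` tag value is hypothetical, and a standalone `SimpleMeterRegistry` is used so the snippet runs outside a Hoist app.

```groovy
import io.micrometer.core.instrument.Timer
import io.micrometer.core.instrument.simple.SimpleMeterRegistry
import java.time.Duration
import java.util.concurrent.TimeUnit

def registry = new SimpleMeterRegistry()

// publishPercentileHistogram() asks export registries that support it to ship
// histogram buckets, so percentiles can be aggregated on the backend.
def loadTimer = Timer.builder('xh.client.load.totalTime')
    .description('Total app load elapsed time')
    .tag('clientApp', 'app')          // hypothetical client app code
    .publishPercentileHistogram()
    .register(registry)

// Record two load events, as would happen for two App / Loaded track entries.
loadTimer.record(Duration.ofMillis(1850))
loadTimer.record(Duration.ofMillis(2400))

println "loads=${loadTimer.count()}, totalMs=${loadTimer.totalTime(TimeUnit.MILLISECONDS)}"
```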
See activity-tracking.md for documentation of the track log system.
| Property | Value |
|---|---|
| Type | json |
| Default | See below |
| Client Visible | No |
| Purpose | Metrics infrastructure configuration: export registries and namespace. |
Default value:
```json
{
  "prometheusEnabled": false,
  "otlpEnabled": false,
  "prometheusConfig": {},
  "otlpConfig": {}
}
```

| Key | Type | Description |
|---|---|---|
| `prometheusEnabled` | Boolean | Enable the Prometheus export registry. Dynamic: takes effect on next config refresh. |
| `prometheusConfig` | Map | Additional Prometheus configuration properties (e.g. `{"step": "PT30S"}`). |
| `otlpEnabled` | Boolean | Enable the OTLP export registry. Dynamic. In local development, additionally gated (see Local-development gating). |
| `otlpConfig` | Map | OTLP configuration properties (e.g. `{"url": "...", "step": "PT60S"}`). |
When xhMetricsConfig is updated, the export registries are torn down and recreated with the
new settings. This is handled by clearCaches() responding to the xhConfigChanged event.
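The tear-down-and-recreate step maps onto Micrometer's `CompositeMeterRegistry.remove()` and `add()` calls. The sketch below illustrates that pattern with a hypothetical `replaceExportRegistry` helper and `SimpleMeterRegistry` instances standing in for real export registries; it is not the body of `clearCaches()`.

```groovy
import io.micrometer.core.instrument.Counter
import io.micrometer.core.instrument.MeterRegistry
import io.micrometer.core.instrument.composite.CompositeMeterRegistry
import io.micrometer.core.instrument.simple.SimpleMeterRegistry

// Hypothetical helper: swap one export registry for a freshly configured one.
MeterRegistry replaceExportRegistry(CompositeMeterRegistry composite, MeterRegistry old, MeterRegistry fresh) {
    if (old != null) {
        composite.remove(old)   // stop forwarding meters to the old registry
        old.close()             // release its resources
    }
    composite.add(fresh)        // meters already on the composite are added to the new one
    fresh
}

def composite = new CompositeMeterRegistry()
def first = replaceExportRegistry(composite, null, new SimpleMeterRegistry())
def events = Counter.builder('demo.events').register(composite)

// Simulate a config change: replace the export registry, then keep recording.
def second = replaceExportRegistry(composite, first, new SimpleMeterRegistry())
events.increment()

println "registries after swap: ${composite.registries.size()}"
```

Because meters are held by the composite, application code keeps its existing meter references across the swap.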
MetricsAdminController provides a listMetrics endpoint that fans out to all cluster instances
and returns a merged list of all registered meters. Each entry includes:
- `name`: the fully-qualified metric name (with namespace prefix)
- `type`: Micrometer meter type (GAUGE, COUNTER, TIMER, etc.)
- `value`: the current value (interpretation depends on type)
- `count`, `max`: for Timer/DistributionSummary types
- `description`: human-readable description
- `baseUnit`: unit of measurement
- `tags`: all tags including `xh.application`, `xh.instance`, `xh.source`
- `stats`: raw statistics map
This endpoint requires the HOIST_ADMIN_READER role.
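To give a feel for the shape of these entries, the sketch below derives a similar listing from a Micrometer registry using only the public `Meter.Id` and `measure()` APIs. It is an approximation under stated assumptions: the actual `MetricsAdminService` implementation is not shown in this document, the meter names are hypothetical, and the `value`, `count` and `max` fields are omitted for brevity.

```groovy
import io.micrometer.core.instrument.Counter
import io.micrometer.core.instrument.Gauge
import io.micrometer.core.instrument.simple.SimpleMeterRegistry
import java.util.concurrent.atomic.AtomicInteger
import java.util.function.Supplier

def registry = new SimpleMeterRegistry()
def queue = new AtomicInteger(7)

// Two hypothetical application meters to list.
Gauge.builder('myService.queueDepth', { queue.get() } as Supplier<Number>)
    .description('Current items in processing queue')
    .baseUnit('items')
    .tag('xh.source', 'app')
    .register(registry)
Counter.builder('myService.processed').tag('xh.source', 'app').register(registry).increment(3)

// Flatten each meter into a map resembling the entries described above.
def listing = registry.meters.collect { meter ->
    [
        name       : meter.id.name,
        type       : meter.id.type.name(),
        description: meter.id.description,
        baseUnit   : meter.id.baseUnit,
        tags       : meter.id.tags.collectEntries { [(it.key): it.value] },
        stats      : meter.measure().collectEntries { [(it.statistic.name()): it.value] }
    ]
}.sort { it.name }

listing.each { println it }
```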