This document provides a comprehensive overview of all metrics in the Apache Sling Distribution Journal bundle, organized by publisher and subscriber components.
Most publisher metrics are prefixed with sling_distribution_journal_publisher_ and include:
pub_name: Name of the publish agent
The queue_size gauge additionally includes clearable (see below).
- Type: Histogram
- Unit: Bytes
- Description: Histogram of the exported content package size in bytes
- Tags:
pub_name
- Type: Meter
- Description: Rate of requests returning a
DistributionRequestState.ACCEPTEDstate - Tags:
pub_name
- Type: Meter
- Description: Rate of requests returning a
DistributionRequestState.DROPPEDstate - Tags:
pub_name
- Type: Timer
- Unit: Milliseconds
- Description: Duration of building a content package
- Tags:
pub_name
- Type: Timer
- Unit: Milliseconds
- Description: Duration of adding a package to the queue
- Tags:
pub_name
- Type: Gauge
- Description: Max backlog depth for a subscriber cohort, measured from the minimum
lastProcessedOffsetin that cohort (tail size in the journal cache). Clearable cohort: subscribers with a non-null clear callback onQueueState. Non-clearable cohort: subscribers with aQueueStateand a null clear callback. Two time series per publisher (see tags). Values are cached and refreshed in the background every 30 seconds. Publisher throttling uses the clearable series only (clearable=true). - Tags:
pub_name,clearable(trueorfalse) - Migration: Previously this metric used only
pub_name. Series are now distinguished byclearable; update dashboards and alerts to include theclearabletag (e.g.clearable=truefor the former single series). - Staleness: The value can be up to ~30 seconds stale under normal conditions. When queue size computation scales linearly (O(n)) with queue size and takes longer than the refresh interval, staleness increases: refresh cycles can back up, and the displayed value may lag further behind the true queue size. See
sling_distribution_journal_publisher_queue_size_computation_durationfor monitoring computation cost.
- Type: Timer
- Unit: Milliseconds
- Description: Duration of computing both clearable and non-clearable max queue sizes for an agent during each background refresh. The computation is O(n) with queue size; slow values (>30s) indicate increased staleness.
- Tags:
pub_name
- Type: Counter
- Description: Count of fetch operations to feed the queue cache
- Tags:
pub_name
- Type: Counter
- Description: Count of queue access errors
- Tags:
pub_name
- Type: Gauge
- Description: Current number of subscribers for this publisher
- Tags:
pub_name
All subscriber metrics are prefixed with sling_distribution_journal_subscriber_ and typically include the following tags:
sub_name: Name of the subscriber agentpub_name: Name of the subscribed publish agent (first only if more than one)editable: Whether the queue is editable (true/false)
- Type: Histogram
- Unit: Bytes
- Description: Histogram of the imported content package size in bytes
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration of successful package import operations
- Tags:
sub_name,pub_name,editable
- Type: Gauge
- Unit: Milliseconds
- Description: Current duration of the package import operation in progress (0 if no import is currently happening)
- Tags:
sub_name,pub_name,editable
- Type: Meter
- Description: Rate of failures to import packages (increased on every failure to apply a package)
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration of packages successfully removed from an editable subscriber
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration of packages successfully removed automatically from a subscriber supporting error queue
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Counter for all the different package statuses. Tracks the number of packages that reached each status
- Tags:
sub_name,pub_name,editable,status - Status Values:
IMPORTED: Package was successfully importedREMOVED: Package was successfully removed from an editable subscriberREMOVED_FAILED: Package failed and was removed from error queue
- Type: Counter
- Description: Count of packages that failed before but then succeeded (transient errors that were eventually resolved)
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Count of permanent import errors (only counted in error queue setup)
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Count of packages that failed more than n times and thus caused a blocked queue (blocking import errors)
- Tags:
sub_name,pub_name,editable
- Type: Gauge
- Description: Number of packages with at least one failure to apply (current retry count)
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Number of import pre-processing requests
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Number of successful import pre-processing operations
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration of import pre-processing operations
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Number of import post-processing requests
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Number of successful import post-processing operations
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration of import post-processing operations
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Number of package installation operations
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration of package installation operations
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Number of invalidation process requests
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Number of successful invalidation process operations
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration of invalidation process operations
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration of distributing a distribution package. The timer starts when the package is enqueued and stops when the package is successfully imported
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration that a package spent in the distribution journal. The timer starts when the package is enqueued and stops when the package is consumed
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Milliseconds
- Description: Duration of sending a stored package status
- Tags:
sub_name,pub_name,editable
- Type: Timer
- Unit: Seconds
- Description: Duration until the subscriber becomes ready, categorized by the reason for readiness
- Tags:
sub_name,pub_name,editable,ready_reason - Ready Reason Values:
IDLE: Subscriber became ready after being idle for a configured durationMAX_RETRIES: Subscriber became ready because retries exceeded maximumLATENCY: Subscriber became ready because package message latency was below acceptable limitFORCE: Subscriber became ready due to forced readiness after timeout
- Type: Counter
- Description: Count of FileVault progress messages during package import/extraction operations. FileVault (Apache Jackrabbit FileVault) is used for installing content packages.
- Tags:
sub_name,pub_name,editable
- Type: Counter
- Description: Count of FileVault errors encountered during package import/extraction operations
- Tags:
sub_name,pub_name,editable
- Counter: A cumulative metric that represents a single monotonically increasing counter
- Gauge: A metric that represents a single numerical value that can arbitrarily go up and down
- Histogram: A metric that samples observations and counts them in configurable buckets
- Meter: A metric that measures the rate of events over time
- Timer: A metric that measures both the rate that a particular piece of code is called and the distribution of its duration