
Enhancing Observability of IBM MQ with OpenTelemetry and Prometheus


Overview

IBM MQ is the long-established messaging middleware from IBM, and OpenTelemetry is the open standard for system and application observability. The OpenMetrics protocol used by Prometheus is one of the major backends for OpenTelemetry metrics and had become a de facto standard for metrics even before OpenTelemetry. Because IBM MQ provides a rich API for querying metrics, OpenTelemetry and the Prometheus protocol offer a simple and efficient way to enhance the observability of IBM MQ middleware.

This document uses a simple and effective tool called OJR (OpenTelemetry Receivers by Java), a thin encapsulation of OpenTelemetry's Java SDK that greatly simplifies the logic of implementing OpenTelemetry receivers and provides end users with a number of directly usable agents. OJR supports all three OpenTelemetry signals (traces, metrics, logs), although the agents it ships are mainly metrics-related. OJR also provides a convenient way to expose OpenTelemetry metrics via Prometheus (without having to convert them through the OpenTelemetry Collector).

To use the IBM MQ Agent provided by OJR, you need Java 8 or later. An OJR agent is equivalent to an OpenTelemetry receiver plus an OTLP exporter: it collects the data and sends it to any of the many back-end systems and applications that support the OpenTelemetry protocol (for example, the OpenTelemetry Collector). Currently the OJR IBM MQ Agent supports metrics only.
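A quick way to confirm the Java requirement from a shell (nothing OJR-specific is assumed here):

java -version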

There are two ways of connecting to IBM MQ queue managers: in local mode, the OJR IBM MQ Agent runs on the same host as the queue manager; in non-local mode, it runs on a different host. The "isLocal" parameter selects the mode (isLocal: true for local mode, isLocal: false for non-local mode).

  • The local mode requires the related shared libraries (e.g. libmqjbnd.so) to be found, so users need to set the LD_LIBRARY_PATH environment variable (a quick check is shown after this list). For example:

    export LD_LIBRARY_PATH=/opt/mqm/java/lib64
    
  • The non-local mode requires a proper IBM MQ listener, a server connection channel, and the usual security setup, just like any remote IBM MQ client.
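For local mode, a quick way to confirm that the JNI library is present and that the environment variable is set (assuming the default MQ installation path shown above) is:

ls /opt/mqm/java/lib64/libmqjbnd.so
echo $LD_LIBRARY_PATH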

Refer to the IBM MQ documentation for more details on IBM MQ configuration. For OJR IBM MQ Agent related configuration, please refer to the OJR IBM MQ Agent documentation.

Installation and Configuration

Here is the overall architecture of the demo environment:

[Diagram: Demo of OJR IBM MQ Agent]

The following briefly describes the steps to make IBM MQ support Prometheus or OpenTelemetry via OJR. First, download the IBM MQ Agent for OJR. For example, use the following command to download the IBM MQ Agent provided with OJR release v0.6.2:

wget https://github.com/liurui-software/ojr/releases/download/v0.6.2/ojr-ibmmq-0.1.3.tar

Expand the Agent using the following command:

tar vxf ojr-ibmmq-0.1.3.tar
cd ojr-ibmmq-0.1.3
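You can list the extracted directory to confirm the layout referenced in the rest of this document (a bin/ directory with the launcher and a config/ directory with config.yaml):

ls bin config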

The next step is to configure the IBM MQ connection. Let's assume that the user has two IBM MQ queue managers, one called VENUS and one called SATURN. We use local binding to connect to VENUS and the Java client to connect to SATURN. Here is an example of the configuration file config/config.yaml:

instances:
  - queueManager: VENUS
    isLocal: true
    queuesMonitored: Q*
 
    ## Data collector properties:
    otel.poll.interval: 25
    otel.callback.interval: 30
    #otel.backend.url: http://localhost:4318
    otel.transport: prometheus
    prometheus.port: 16543
 
  - queueManager: SATURN
    isLocal: false
    #user: liurui
    #password: xxxxxxxx
    host: 192.168.1.1
    port: 1414
    queuesMonitored: Q*
    #customEventQueues:
    #keystore:
    #keystorePassword:
    #cipherSuite: TLS_RSA_WITH_AES_256_CBC_SHA256
 
    ## Data collector properties:
    otel.poll.interval: 25
    otel.callback.interval: 30
    #otel.backend.url: http://localhost:4318
    otel.transport: prometheus
    prometheus.port: 16543

As you can see from this configuration file, we have filtered the monitored queues (using the filter pattern string "Q*" as the "queuesMonitored" value). This is because IBM MQ has a lot of internal queues whose status is generally not of concern to the user. Please refer to the IBM MQ documentation for information on how to write filter pattern strings. Another point is that the "otel.transport" value is set to "prometheus" only, so only the Prometheus endpoint is exposed and no OpenTelemetry data is sent out.

Now it's time to start the OJR IBM MQ Agent. To run it synchronously (in the foreground), use the following command:

./bin/ojr-ibmmq
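If you prefer to keep the agent running in the background instead, a plain shell approach (not an OJR-specific feature) is:

nohup ./bin/ojr-ibmmq > ojr-ibmmq.log 2>&1 &
tail -f ojr-ibmmq.log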

You should then see log messages similar to the following:

Dec 31, 2024 1:02:19 PM com.ojr.core.BasicDcAgent start
INFO: DC No.1 is collecting data...
Dec 31, 2024 1:02:19 PM com.ojr.core.BasicDcAgent start
INFO: DC No.2 is collecting data...
Dec 31, 2024 1:02:20 PM com.ojr.ibmmq.MQDc collectData
INFO: Start to collect metrics
Dec 31, 2024 1:02:20 PM com.ojr.ibmmq.mqclient.MQClient connect
INFO: Connecting to VENUS in local binding mode...
Dec 31, 2024 1:02:20 PM com.ojr.ibmmq.MQDc collectData
INFO: Start to collect metrics
Dec 31, 2024 1:02:20 PM com.ojr.ibmmq.mqclient.MQClient connect
INFO: Connecting to SATURN in client mode...
Dec 31, 2024 1:02:21 PM com.ojr.core.metric.RawMetric$DataPoint setValue
INFO: New metric value: ibmmq.qmgr.metadata/default=1
Dec 31, 2024 1:02:21 PM com.ojr.core.metric.RawMetric$DataPoint setValue
INFO: New metric value: ibmmq.qmgr.cmd.level/default=941
Dec 31, 2024 1:02:21 PM com.ojr.core.metric.RawMetric$DataPoint setValue
INFO: New metric value: ibmmq.qmgr.max.handles/default=256
Dec 31, 2024 1:02:21 PM com.ojr.core.metric.RawMetric$DataPoint setValue
INFO: New metric value: ibmmq.qmgr.metadata/default=1
… …

Now the OJR IBM MQ Agent exposes the IBM MQ metrics on a Prometheus endpoint; you can read them by visiting "http://localhost:16543/metrics".
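As a quick check from the command line (assuming the agent runs on the same host and uses the prometheus.port value 16543 from the configuration above), you can fetch the endpoint with curl:

curl -s http://localhost:16543/metrics | grep '^ibmmq_' | head -n 20

A portion of the metrics output is shown below: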

# HELP ibmmq_channel_buffers_received_total The number of buffers received by the channel
# TYPE ibmmq_channel_buffers_received_total counter
ibmmq_channel_buffers_received_total{channel="TO.SATURN",host_name="LRPC",ojr="ibmmq",queueManager="VENUS"} 2.0
ibmmq_channel_buffers_received_total{channel="TO.SATURN",ojr="ibmmq",queueManager="SATURN"} 2.0
# HELP ibmmq_channel_buffers_sent_total The number of buffers sent by the channel
# TYPE ibmmq_channel_buffers_sent_total counter
ibmmq_channel_buffers_sent_total{channel="TO.SATURN",host_name="LRPC",ojr="ibmmq",queueManager="VENUS"} 2.0
ibmmq_channel_buffers_sent_total{channel="TO.SATURN",ojr="ibmmq",queueManager="SATURN"} 2.0
# HELP ibmmq_channel_bytes_received_total The number of bytes received by the channel
# TYPE ibmmq_channel_bytes_received_total counter
ibmmq_channel_bytes_received_total{channel="TO.SATURN",host_name="LRPC",ojr="ibmmq",queueManager="VENUS"} 296.0
ibmmq_channel_bytes_received_total{channel="TO.SATURN",ojr="ibmmq",queueManager="SATURN"} 296.0
# HELP ibmmq_channel_bytes_sent_total The number of bytes sent by the channel
# TYPE ibmmq_channel_bytes_sent_total counter
ibmmq_channel_bytes_sent_total{channel="TO.SATURN",host_name="LRPC",ojr="ibmmq",queueManager="VENUS"} 296.0
ibmmq_channel_bytes_sent_total{channel="TO.SATURN",ojr="ibmmq",queueManager="SATURN"} 296.0
# HELP ibmmq_channel_indoubt_status The in-doubt status of the channel (0: No, 1: Yes)
# TYPE ibmmq_channel_indoubt_status gauge
ibmmq_channel_indoubt_status{channel="TO.SATURN",host_name="LRPC",ojr="ibmmq",queueManager="VENUS"} 0.0
ibmmq_channel_indoubt_status{channel="TO.SATURN",ojr="ibmmq",queueManager="SATURN"} 0.0
# HELP ibmmq_channel_status The status of the channel (0: Inactive, 1: Binding, 2: Quiescing, 3: Running, 4: Stopping, 5: Retrying, 6: Stopped, 7: Requesting, 8: Paused, 13: Initializing, 14: Switching)
# TYPE ibmmq_channel_status gauge
ibmmq_channel_status{channel="TO.SATURN",host_name="LRPC",ojr="ibmmq",queueManager="VENUS"} 3.0
ibmmq_channel_status{channel="TO.SATURN",ojr="ibmmq",queueManager="SATURN"} 3.0
# HELP ibmmq_channel_type The type of the channel (1: Sender, 2: Server, 3: Receiver, 4: Requester, 5: All, 6: Client connection, 7: Server connection, 8: Cluster receiver, 9: Cluster sender, 10: Telemetry channel, 11: AMQP)
# TYPE ibmmq_channel_type gauge
ibmmq_channel_type{channel="TO.SATURN",host_name="LRPC",ojr="ibmmq",queueManager="VENUS"} 1.0
ibmmq_channel_type{channel="TO.SATURN",ojr="ibmmq",queueManager="SATURN"} 3.0
… …

Prometheus support

Prometheus can then scrape the new IBM MQ data source if we add the following to its configuration file:

scrape_configs:
  - job_name: 'ibmmq'
    scrape_interval: 60s
    static_configs:
      - targets: ['monitor-host:16543']
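After Prometheus reloads this configuration, you can verify that the new job is being scraped with a quick query against the Prometheus HTTP API (assuming Prometheus listens on its default port 9090):

curl -s 'http://localhost:9090/api/v1/query?query=ibmmq_channel_status'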

This way, we can use a Grafana dashboard to browse the IBM MQ metrics. Example dashboard screenshots are shown below:

[Screenshots: Grafana dashboard for IBM MQ (1) and (2)]

The Grafana Dashboard can be configured by referring to the following example:

Here is a video showing the Grafana dashboard for IBM MQ:

[An easy way to see the Grafana dashboard for your IBM MQ]

If you do not want to install and configure Prometheus and Grafana yourself but still want to see a Grafana dashboard for your IBM MQ immediately, there is an easy way to do that. You still need to install and configure the OJR IBM MQ Agent; the simplest setup is to put the agent on the same host as IBM MQ so that no IBM MQ security parameters need to be configured. Then you write a Prometheus configuration file to read the output from the OJR IBM MQ Agent; a sample is provided in config/prometheus.yml:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: 'ibmmq'
    scrape_interval: 60s
    static_configs:
      - targets: ['192.168.108.128:16543']

Revise the endpoint ('192.168.108.128:16543') to point at your OJR IBM MQ Agent. Then you can run a Docker container that launches Prometheus and Grafana in a few seconds.

  • Verify that the full path of the Prometheus configuration file is correct:

    ls $PWD/config/prometheus.yml

  • Run the container for Prometheus & Grafana with the predefined Prometheus configuration file ($PWD/config/prometheus.yml):

    docker run --name prom-graf-ibmmq -d -p 3300:3300 -v $PWD/config/prometheus.yml:/opt/svrs/prom/prometheus.yml quay.io/liurui/prom-graf-ibmmq:0.0.3
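To confirm the container is up and to see its startup logs (standard Docker commands, nothing specific to this image), you can run:

docker ps --filter name=prom-graf-ibmmq
docker logs -f prom-graf-ibmmq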

Finally, you can visit the Grafana dashboard on port 3300 (for example: http://localhost:3300). The default user ID is "admin" and the password is "ojr-password".

Supporting OpenTelemetry

Now let's look at backends that support OpenTelemetry directly. Many systems support OpenTelemetry; Prometheus is only one of them and can only be used for metrics. If we modify the OJR IBM MQ configuration file config/config.yaml and change the value of the "otel.transport" key from "prometheus" to "prometheus+http", meaning that both the Prometheus and "otlp/http" protocols are supported, the agent outputs OpenTelemetry data using the standard OTLP protocol as well as the Prometheus protocol. The "otlp/http" endpoint defaults to local port 4318, i.e. "http://localhost:4318". Please refer to OJR's online documentation for more information. Note that each MQ instance should be configured accordingly. Here is the entire configuration file:

instances:
  - queueManager: VENUS
    isLocal: true
    queuesMonitored: Q*
 
    ## Data collector properties:
    otel.poll.interval: 25
    otel.callback.interval: 30
    #otel.backend.url: http://localhost:4318
    otel.transport: prometheus+http
    prometheus.port: 16543
 
  - queueManager: SATURN
    isLocal: false
    #user: liurui
    #password: xxxxxxxx
    host: 192.168.1.1
    port: 1414
    queuesMonitored: Q*
    #customEventQueues:
    #keystore:
    #keystorePassword:
    #cipherSuite: TLS_RSA_WITH_AES_256_CBC_SHA256
 
    ## Data collector properties:
    otel.poll.interval: 25
    otel.callback.interval: 30
    #otel.backend.url: http://localhost:4318
    otel.transport: prometheus+http
    prometheus.port: 16543

Many backend applications can accept OpenTelemetry data; one of the more commonly used is the OpenTelemetry Collector. There are many distributions of the OpenTelemetry Collector; I am using opentelemetry-collector-contrib here. The OpenTelemetry Collector receives the metrics from IBM MQ, processes them, and sends them on to the next-level backend. Here is an example of sending the data to a ClickHouse database using the following configuration:

receivers:
  otlp:
    protocols:
      grpc:
      http:
        cors:
          allowed_origins:
            - "http://*"
            - "https://*"
 
exporters:
  debug:
    verbosity: detailed
  clickhouse:
    endpoint: tcp://127.0.0.1:9000?dial_timeout=10s
    username: default
    password: clickme
    database: default
    async_insert: true
    ttl: 72h
    compress: lz4
    create_schema: true
    logs_table_name: otel_logs
    traces_table_name: otel_traces
    timeout: 5s
    metrics_tables:
      gauge:
        name: "otel_metrics_gauge"
      sum:
        name: "otel_metrics_sum"
      summary:
        name: "otel_metrics_summary"
      histogram:
        name: "otel_metrics_histogram"
      exponential_histogram:
        name: "otel_metrics_exp_histogram"
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
 
processors:
  batch:
 
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug,clickhouse]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug,clickhouse]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug,clickhouse]
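With this configuration saved, the collector can be started from the command line. This is a sketch that assumes the contrib distribution's binary is named otelcol-contrib and the configuration above is saved as config.yaml:

otelcol-contrib --config=config.yaml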

After starting the OJR IBM MQ Agent, we see that the data has been sent to the ClickHouse database. This can be viewed using the following command:

MYPC. :) select TimeUnix,MetricName,Attributes,Value from otel_metrics_sum limit 50
 
SELECT
    TimeUnix,
    MetricName,
    Attributes,
    Value
FROM otel_metrics_sum
LIMIT 50
 
Query id: 9dd51d48-ed9e-497b-9609-a00368ba3830
 
    ┌──────────────────────TimeUnix─┬─MetricName─────────────────────┬─Attributes──────────────┬─Value─┐
 1. │ 2024-12-31 00:13:24.104000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
 2. │ 2024-12-31 00:13:55.082000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
 3. │ 2024-12-31 00:14:25.882000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
 4. │ 2024-12-31 00:14:56.757000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
 5. │ 2024-12-31 00:15:27.645000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
 6. │ 2024-12-31 00:15:58.517000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
 7. │ 2024-12-31 00:16:29.423000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
 8. │ 2024-12-31 00:17:00.313000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
 9. │ 2024-12-31 00:17:30.313000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
10. │ 2024-12-31 00:18:01.632000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     1 │
11. │ 2024-12-31 00:18:32.505000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     2 │
12. │ 2024-12-31 00:19:03.395000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     2 │
13. │ 2024-12-31 00:19:34.318000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     2 │
14. │ 2024-12-31 00:20:05.109000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     2 │
15. │ 2024-12-31 00:20:35.970000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     2 │
16. │ 2024-12-31 00:21:06.796000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     2 │
17. │ 2024-12-31 00:21:37.707000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     2 │
18. │ 2024-12-31 00:22:08.588000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     2 │
19. │ 2024-12-31 00:22:38.587000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     2 │
20. │ 2024-12-31 00:23:09.647000000 │ ibmmq.channel.buffers.received │ {'channel':'TO.SATURN'} │     3 │
… …

Postscript

As this article shows, using the OJR IBM MQ Agent to enhance the observability of IBM MQ is relatively easy. The OJR IBM MQ Agent is still being improved; if you find problems or places that need improvement, you can open a ticket for the author. The related code is at: "https://github.com/liurui-software/ojr/tree/main/ojr-ibmmq".

Appendix

IBM MQ Java applications have two ways of connecting to IBM MQ queue managers: the Java Binding mode and the Java Client mode.

  • The MQ Java Binding approach (also known as the local binding approach) uses JNI (Java Native Interface), similar to an MQ server application. It is the fastest way to connect to a queue manager, but it requires the MQ Java application and the MQ server to be on the same machine. Because it avoids the overhead of establishing a network connection, it should be used when connection performance is critical.

  • The MQ Java Client approach connects through a server connection channel defined on the server side, and the server side needs to run a listener. It is used when the Java client program and the server are not on the same machine, and it does not require the MQ Client or Server to be installed on the client machine (a minimal server-side setup sketch follows this list).
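For the client mode described above, the queue manager needs a listener and a server connection channel. A minimal MQSC sketch run against SATURN could look like the following (the channel and listener names are hypothetical, and the CHLAUTH/CONNAUTH security rules are left to the IBM MQ documentation):

runmqsc SATURN <<'EOF'
DEFINE CHANNEL(OJR.SVRCONN) CHLTYPE(SVRCONN) TRPTYPE(TCP)
DEFINE LISTENER(OJR.LISTENER) TRPTYPE(TCP) PORT(1414) CONTROL(QMGR)
START LISTENER(OJR.LISTENER)
EOF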

If you use the MQ Java Binding method, you need to set the following environment variable to specify the location of the JNI shared libraries (e.g. libmqjbnd.so), for example:

export LD_LIBRARY_PATH=/opt/mqm/java/lib64

Refer to the IBM MQ documentation for details on setting up security.