DataDog
diff --git a/n8n/README.md b/n8n/README.md
Lines changed: 106 additions & 21 deletions
diff --git a/n8n/assets/configuration/spec.yaml b/n8n/assets/configuration/spec.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/n8n/changelog.d/23635.changed b/n8n/changelog.d/23635.changed
Lines changed: 6 additions & 0 deletions
diff --git a/n8n/datadog_checks/n8n/check.py b/n8n/datadog_checks/n8n/check.py
Lines changed: 33 additions & 36 deletions
diff --git a/n8n/datadog_checks/n8n/data/conf.yaml.example b/n8n/datadog_checks/n8n/data/conf.yaml.example
Lines changed: 1 addition & 1 deletion
@@ -2,15 +2,15 @@
 
 ## Overview
 
-This check monitors [n8n][1] through the Datadog Agent. 
+This check monitors [n8n][1] through the Datadog Agent.
 
 Collect n8n metrics including:
-- Cache metrics: Hit and miss statistics.
-- Message event bus metrics: Event-related metrics.
-- Workflow metrics: Can include workflow ID labels.
-- Node metrics: Can include node type labels.
-- Credential metrics: Can include credential type labels.
-- Queue metrics
+- Cache metrics: hit, miss, and update counts.
+- Workflow metrics: started, success, and failed counters; audit workflow lifecycle counters; and, in n8n 2.x, an execution-duration histogram.
+- Node metrics: per-node started and finished counters emitted by worker processes in queue mode.
+- Queue metrics: queue depth; enqueued, dequeued, completed, failed, and stalled counters; and scaling-mode worker gauges.
+- HTTP metrics: request duration histograms tagged with status code.
+- Process and Node.js runtime metrics.
 
 
 ## Setup
@@ -40,13 +40,79 @@ N8N_METRICS_INCLUDE_CACHE_METRICS=true
 N8N_METRICS_INCLUDE_MESSAGE_EVENT_BUS_METRICS=true
 N8N_METRICS_INCLUDE_WORKFLOW_ID_LABEL=true
 N8N_METRICS_INCLUDE_API_ENDPOINTS=true
+N8N_METRICS_INCLUDE_QUEUE_METRICS=true
+
+# Optional: n8n 2.x adds workflow_statistics gauges (workflows, users, executions, ...) - opt in
+N8N_METRICS_INCLUDE_WORKFLOW_STATISTICS=true
 
 # Optional: Customize the metric prefix (default is 'n8n_')
 N8N_METRICS_PREFIX=n8n_
 ```
 
 For more details, see the n8n documentation on [enabling Prometheus metrics][10].
 
+If you change `N8N_METRICS_PREFIX` from its default of `n8n_`, you **must** also set `raw_metric_prefix` in the integration's `conf.yaml` to the same value. Otherwise the check will not recognize the exposed metric names and will silently submit nothing:
+
+```yaml
+instances:
+  - openmetrics_endpoint: http://localhost:5678/metrics
+    raw_metric_prefix: my_custom_prefix_
+```
+
+#### Event-driven counters
+
+Most n8n counters are registered dynamically the first time their underlying event fires. The integration ships mappings for around 70 of these event-bus counters, including:
+
+- Workflow lifecycle: `n8n.workflow.started.count`, `n8n.workflow.success.count`, `n8n.workflow.failed.count`, `n8n.workflow.cancelled.count`
+- Audit (workflow, user, credentials, package, variable, execution data): `n8n.audit.workflow.executed.count`, `n8n.audit.user.login.success.count`, `n8n.audit.user.credentials.created.count`, and similar
+- AI nodes: `n8n.ai.tool.called.count`, `n8n.ai.llm.generated.count`, `n8n.ai.vector.store.searched.count`, and similar
+- Runner, queue, and node lifecycle: `n8n.runner.task.requested.count`, `n8n.queue.job.completed.count`, `n8n.node.started.count`, `n8n.node.finished.count`
+
+These counters do not appear on the `/metrics` endpoint until the corresponding event has occurred. A healthy idle deployment will not produce data points for them until that activity fires. The complete list is in [`metadata.csv`][7].
+
+If a future n8n release exposes a new event-driven counter that is not yet covered by this integration, add it to the `extra_metrics` option in your instance configuration:
+
+```yaml
+instances:
+  - openmetrics_endpoint: http://n8n:5678/metrics
+    extra_metrics:
+      - some_new_n8n_event_total: some.new.n8n.event
+```
+
+The left-hand side is the Prometheus counter name as n8n exposes it (keep the `_total` suffix). The right-hand side is the dotted Datadog metric name to submit it as.
+
+#### Queue mode and workers
+
+In queue mode, n8n runs separate worker processes that execute jobs picked up from a Redis-backed queue. Each worker exposes its own `/metrics` endpoint and emits a different subset of metrics than the main process. Worker-observed metrics include `n8n.queue.job.dequeued.count`, `n8n.queue.job.stalled.count`, `n8n.node.started.count`, `n8n.node.finished.count`, and `n8n.runner.task.requested.count`. Main-only metrics include `n8n.instance.role.leader` and the `n8n.scaling.mode.queue.jobs.*` family.
+
+To expose worker metrics, set `QUEUE_HEALTH_CHECK_ACTIVE=true` and `QUEUE_HEALTH_CHECK_PORT=<port>` on each worker. **In n8n 2.x, port `5679` is reserved for the task runner broker, so pick a different port (for example `5680`).**
+
+For full coverage in queue deployments, configure one Datadog instance per n8n process exposing `/metrics`, including main and worker processes:
+
+```yaml
+instances:
+  - openmetrics_endpoint: http://n8n-main:5678/metrics
+  - openmetrics_endpoint: http://n8n-worker:5680/metrics
+```
+
+#### Version-specific metrics
+
+Several metric families were introduced in n8n 2.x and are not emitted on n8n 1.x:
+
+- `n8n.workflow.execution.duration.seconds.*` (histogram). Gated by `N8N_METRICS_INCLUDE_WORKFLOW_EXECUTION_DURATION`, which defaults to `true` in n8n 2.x.
+- `n8n.audit.workflow.activated.count`, `n8n.audit.workflow.deactivated.count`, `n8n.audit.workflow.executed.count`, `n8n.audit.workflow.resumed.count`, `n8n.audit.workflow.version.updated.count`, and `n8n.audit.workflow.waiting.count`
+- `n8n.embed.login.requests.count` (tagged with `result:success` or `result:failure`), `n8n.embed.login.failures.count` (tagged with `reason`)
+- `n8n.token.exchange.requests.count` (tagged with `result:success` or `result:failure`), `n8n.token.exchange.failures.count` (tagged with `reason`), `n8n.token.exchange.identity.linked.count`, `n8n.token.exchange.jit.provisioning.count`
+- `n8n.process.pss.bytes` (Linux only)
+- The `n8n.{production,manual,production.root}.executions`, `n8n.users.total`, `n8n.enabled.users`, `n8n.workflows.total`, and `n8n.credentials.total` family. Only emitted when `N8N_METRICS_INCLUDE_WORKFLOW_STATISTICS=true` is set.
+- The `n8n.expression.*` family (`evaluation.duration.seconds`, `code.cache.{hit,miss,eviction,size}`, `pool.{acquired,replenish.failed,scaled.up,scaled.to.zero}`). Only emitted when n8n is running the new VM-isolated expression engine *and* observability for it is on. Set `N8N_EXPRESSION_ENGINE=vm` and `N8N_EXPRESSION_ENGINE_OBSERVABILITY_ENABLED=true` on the n8n process; both default to off (the engine defaults to `legacy`). These metrics surface the per-expression evaluation latency, the compiled-expression LRU cache hit and miss rates, and the V8-isolate pool's idle scaling behavior. They are most useful for troubleshooting workflow latency that traces back to slow `{{ ... }}` evaluation.
+
+Some metrics only emit samples after the corresponding runtime event occurs. For example, failures-only counters (`*.failures.count`) need an authentication failure, audit workflow counters need the matching workflow state transition, and the libuv `n8n.nodejs.active.requests` gauge needs an in-flight libuv request. A healthy idle deployment may not produce data points for these metrics until that activity occurs.
+
+#### Tag cardinality
+
+When `N8N_METRICS_INCLUDE_WORKFLOW_ID_LABEL=true` is set, HTTP and workflow execution histograms are tagged with `workflow_id` (and similar labels for nodes). On deployments with many distinct workflows or nodes, this can produce high-cardinality metrics. Drop the label via `exclude_labels` or omit `N8N_METRICS_INCLUDE_WORKFLOW_ID_LABEL` to keep tag cardinality bounded.
+
 #### Configure the Datadog Agent
 
 1. Edit the `n8n.d/conf.yaml` file, in the `conf.d/` folder at the root of your Agent's configuration directory to start collecting your n8n performance data. See the [sample n8n.d/conf.yaml][4] for all available configuration options.
@@ -59,27 +125,32 @@ _Available for Agent versions >6.0_
 
 #### Enable n8n logging
 
-Configure n8n to output logs by setting the following environment variables:
+Configure n8n application logs by setting the following environment variables:
 
 ```bash
 # Set the log level (error, warn, info, debug)
 N8N_LOG_LEVEL=info
 
-# Output logs to console (for containerized environments) or file
+# Output application logs to console or file
 N8N_LOG_OUTPUT=console
 
-# If using file output, specify the log file location
+# Use JSON formatting so Datadog can parse n8n application log attributes
+N8N_LOG_FORMAT=json
+
+# If using file output, specify the application log file location
 N8N_LOG_FILE_LOCATION=/var/log/n8n/n8n.log
 ```
 
 #### Structured event logs
 
-n8n can output structured JSON logs to `n8nEventLog.log` containing detailed workflow execution events. Enable this by setting the log output to file:
+n8n also writes structured event bus logs to `n8nEventLog*.log`. These logs contain workflow, node, queue, runner, and audit events and are separate from the application logs controlled by `N8N_LOG_OUTPUT` and `N8N_LOG_FILE_LOCATION`.
 
-```bash
-N8N_LOG_OUTPUT=file
-N8N_LOG_FILE_LOCATION=/var/log/n8n/
-```
+By default, event bus log files are written under the n8n user folder, for example:
+
+- Host installations: `~/.n8n/n8nEventLog*.log`
+- Official Docker image: `/home/node/.n8n/n8nEventLog*.log`
+
+If you use a custom n8n user folder, collect the event bus logs from that folder instead. If you customize the event bus log file base name with `N8N_EVENTBUS_LOGWRITER_LOGBASENAME`, update the Datadog log path to match.
 
 The event log includes the following event types:
 
@@ -102,32 +173,46 @@ Each event contains rich metadata including `executionId`, `workflowId`, `workfl
    logs_enabled: true
    ```
 
-2. Add this configuration block to your `n8n.d/conf.yaml` file to start collecting your n8n logs:
+2. Add log collection entries to your `n8n.d/conf.yaml` file.
+
+   For a host-based n8n installation where the Agent can read local files, collect the application log file and the event bus log files:
 
    ```yaml
    logs:
      - type: file
        path: /var/log/n8n/*.log
        source: n8n
-       service: n8n
+       service: <SERVICE>
+     - type: file
+       path: /home/n8n/.n8n/n8nEventLog*.log
+       source: n8n
+       service: <SERVICE>
    ```
 
-   For containerized environments using Docker, use the following configuration instead:
+   Adjust `/home/n8n/.n8n/n8nEventLog*.log` to the n8n user folder on your host.
+
+   For a containerized n8n deployment, collect stdout and stderr from the n8n container for application logs, and make the n8n user folder available to the Agent for event bus file logs. For example, if the n8n data directory is mounted on the host at `/var/lib/n8n`, configure:
 
    ```yaml
    logs:
      - type: docker
        source: n8n
-       service: n8n
+       service: <SERVICE>
+     - type: file
+       path: /var/lib/n8n/n8nEventLog*.log
+       source: n8n
+       service: <SERVICE>
    ```
 
+   If the Agent runs in a container, mount the n8n data volume or host directory into the Agent container and use the path as seen from inside the Agent container.
+
 3. [Restart the Agent][5].
 
 ### Validation
 
 [Run the Agent's status subcommand][6] and look for `n8n` under the Checks section.
 
-## Data Collected
+## Data collected
 
 ### Metrics
 
@@ -137,7 +222,7 @@ See [metadata.csv][7] for a list of metrics provided by this integration.
 
 The n8n integration does not include any events.
 
-### Service Checks
+### Service checks
 
 See [service_checks.json][8] for a list of service checks provided by this integration.
 
 
@@ -12,7 +12,7 @@ files:
         openmetrics_endpoint.required: true
         openmetrics_endpoint.hidden: false
         openmetrics_endpoint.display_priority: 1
-        openmetrics_endpoint.value.example: http://localhost:5678
+        openmetrics_endpoint.value.example: http://localhost:5678/metrics
         openmetrics_endpoint.description: |
           Endpoint exposing the n8n's metrics in the OpenMetrics format. For more information, refer to:
           https://docs.n8n.io/hosting/logging-monitoring/monitoring/ 
 
@@ -0,0 +1,6 @@
+Improve the n8n metric coverage:
+
+  - Correct missing or incorrect metrics.
+  - Add metrics introduced in n8n 2.x (workflow execution duration, audit events, authentication, workflow and user statistics, expression engine, and process memory).
+  - Track n8n's dynamic events (workflow cancellations, audit activity, AI nodes, user and credential changes, package and variable changes).
+  - Add support for monitoring n8n worker processes alongside the main process.
@@ -2,58 +2,55 @@
 # All rights reserved
 # Licensed under a 3-clause BSD style license (see LICENSE)
 
-from urllib.parse import urljoin
+from functools import cached_property
+from typing import Any
+from urllib.parse import urljoin, urlparse
+
+from requests.exceptions import RequestException
 
 from datadog_checks.base import OpenMetricsBaseCheckV2
 from datadog_checks.n8n.metrics import METRIC_MAP, RENAME_LABELS_MAP
 
 from .config_models import ConfigMixin
 
-DEFAULT_READY_ENDPOINT = '/healthz/readiness'
+DEFAULT_READY_PATH = '/healthz/readiness'
 
 
 class N8nCheck(OpenMetricsBaseCheckV2, ConfigMixin):
     __NAMESPACE__ = 'n8n'
     DEFAULT_METRIC_LIMIT = 0
 
-    def __init__(self, name, init_config, instances=None):
-        super(N8nCheck, self).__init__(
-            name,
-            init_config,
-            instances,
-        )
-        self.openmetrics_endpoint = self.instance["openmetrics_endpoint"]
-        self.tags = self.instance.get('tags', [])
-        self._ready_endpoint = DEFAULT_READY_ENDPOINT
-
-    def get_default_config(self):
+    def get_default_config(self) -> dict[str, Any]:
         return {
             'metrics': [METRIC_MAP],
             'rename_labels': RENAME_LABELS_MAP,
             'raw_metric_prefix': 'n8n_',
         }
 
-    def _check_n8n_readiness(self):
-        endpoint = urljoin(self.openmetrics_endpoint, self._ready_endpoint)
-        response = self.http.get(endpoint)
-
-        # Determine metric value and status_code tag
-        if response.status_code is None:
-            self.log.warning("The readiness endpoint did not return a status code")
-            metric_value = 0
-            metric_tags = self.tags + ['status_code:null']
-        elif response.status_code == 200:
-            # Ready - submit 1
-            metric_value = 1
-            metric_tags = self.tags + [f'status_code:{response.status_code}']
-        else:
-            # Not ready - submit 0
-            metric_value = 0
-            metric_tags = self.tags + [f'status_code:{response.status_code}']
-
-        # Submit metric with appropriate value and status_code tag
-        self.gauge('readiness.check', metric_value, tags=metric_tags)
-
-    def check(self, instance):
-        super().check(instance)
+    @cached_property
+    def _readiness_endpoint(self) -> str:
+        parsed = urlparse(self.config.openmetrics_endpoint)
+        base = f'{parsed.scheme}://{parsed.netloc}'
+        return urljoin(base, DEFAULT_READY_PATH)
+
+    def _check_n8n_readiness(self) -> None:
+        endpoint = self._readiness_endpoint
+        tags = list(self.config.tags or ())
+
+        try:
+            response = self.http.get(endpoint)
+        except RequestException as e:
+            self.log.warning("Could not reach n8n readiness endpoint %s: %s", endpoint, e)
+            self.gauge('readiness.check', 0, tags=tags + ['status_code:none'])
+            return
+
+        is_ready = 200 <= response.status_code < 300
+        self.gauge(
+            'readiness.check',
+            1 if is_ready else 0,
+            tags=tags + [f'status_code:{response.status_code}'],
+        )
+
+    def check(self, instance: dict[str, Any]) -> None:
         self._check_n8n_readiness()
+        super().check(instance)
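
The readiness logic in the `check.py` hunk above can be tried outside the Agent. The following is a minimal standalone sketch, not the check itself: `readiness_endpoint` and `readiness_value` are hypothetical helper names, and only the URL handling and the 2xx rule are taken from the diff.

```python
from urllib.parse import urljoin, urlparse

# Same path as DEFAULT_READY_PATH in the diff.
READY_PATH = "/healthz/readiness"


def readiness_endpoint(openmetrics_endpoint: str) -> str:
    """Keep only scheme and host:port, then attach the readiness path."""
    parsed = urlparse(openmetrics_endpoint)
    base = f"{parsed.scheme}://{parsed.netloc}"
    return urljoin(base, READY_PATH)


def readiness_value(status_code: int) -> int:
    """Gauge value for a response: 1 for any 2xx status, otherwise 0."""
    return 1 if 200 <= status_code < 300 else 0


if __name__ == "__main__":
    print(readiness_endpoint("http://localhost:5678/metrics"))
    print(readiness_value(204), readiness_value(503))
```

Because the base is rebuilt from the scheme and network location only, the `/metrics` path (and any query string) in `openmetrics_endpoint` does not leak into the readiness URL.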
@@ -18,7 +18,7 @@ instances:
     ## https://docs.n8n.io/hosting/logging-monitoring/monitoring/ 
     ## https://docs.n8n.io/hosting/configuration/environment-variables/endpoints/
     #
-  - openmetrics_endpoint: http://localhost:5678
+  - openmetrics_endpoint: http://localhost:5678/metrics
 
     ## @param raw_metric_prefix - string - optional - default: n8n_
     ## The prefix prepended to all metrics from n8n.
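
To see why `raw_metric_prefix` has to match `N8N_METRICS_PREFIX`, the sketch below models prefix handling in a deliberately simplified way. It is an assumption-laden illustration, not the Agent's actual code: `EXAMPLE_MAP`, `resolve`, and the metric names are hypothetical, and the real mapping lives in `METRIC_MAP`.

```python
from typing import Optional

# Hypothetical one-entry mapping; the integration's real map is METRIC_MAP.
EXAMPLE_MAP = {"example_event_total": "example.event"}


def resolve(exposed_name: str, raw_metric_prefix: str = "n8n_") -> Optional[str]:
    """Strip the configured prefix if present, then look the name up.

    Returns None when the remaining name is not in the map, which models
    a metric that is silently skipped.
    """
    name = exposed_name
    if name.startswith(raw_metric_prefix):
        name = name[len(raw_metric_prefix):]
    return EXAMPLE_MAP.get(name)


if __name__ == "__main__":
    # Default prefix on both sides: the metric is recognized.
    print(resolve("n8n_example_event_total"))
    # n8n uses a custom prefix but the check still expects 'n8n_': no match.
    print(resolve("my_custom_prefix_example_event_total"))
    # Setting raw_metric_prefix to the same value restores the match.
    print(resolve("my_custom_prefix_example_event_total", "my_custom_prefix_"))
```

In this model, a mismatched prefix raises no error; the lookup simply misses, which is consistent with the README's warning that the check submits nothing.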