 # Guidance Document: Reliable Audit Logging with OpenTelemetry
 
-This document describes the recommended 3-tier architecture (Client SDK → OpenTelemetry Collector → Final Storage Sink) for highly reliable
-audit log delivery. It focuses on minimizing data loss, ensuring long retention, and keeping operational complexity under control.
+This guidance paper explains recommended practices and an operational architecture for building a highly reliable audit logging pipeline
+using OpenTelemetry (OTel). Its primary goal is to help architects, platform engineers and application developers design audit log delivery
+that minimizes data loss, supports long retention, and remains operationally manageable.
+
+Purpose and motivation:
+
+- Provide concise, actionable recommendations for each layer of the pipeline (Client SDK → Collector → Final Storage) so teams can make
+consistent design choices across environments.
+- Emphasize durability and predictable retention over low latency: audit logs often have legal and compliance requirements where losing
+events is unacceptable and duplicates are preferable to missing data.
+- Reduce operational complexity by recommending focused components (dedicated collector pipelines, node‑local buffering where appropriate,
+and clear monitoring/alerting points) rather than mixing audit logs with high‑volume telemetry.
+
+Audience:
+
+- Platform and SRE teams building or operating OTel collection and processing infrastructure.
+- Application and SDK developers who implement audit logging clients and integrations.
+- Security, compliance, and data governance teams evaluating retention and immutable storage requirements.
 
 ## Scope & Goals
 
+In scope: reliable delivery patterns, buffering strategies (client vs. agent), collector configuration recommendations, monitoring and
+runbooks, and guidance for final sink durability and compliance controls.
+
 Primary goals:
 
 - No audit event loss ("at least once" delivery – duplicates acceptable, loss is not).
@@ -14,6 +33,9 @@ Primary goals:
 
 Non-goals:
 
+Out of scope: vendor‑specific implementation details for every storage backend, full legal retention policy text, and high‑performance
+telemetry optimizations that trade durability for throughput.
+
 - Ultra low latency for audit logs (latency is secondary to durability).
 - Mixing audit and high-volume debug/info logs in the same pipeline.
 