Commit 4f715d2

Update acd.md
removed lucid chart and added a architectureOverview.png file to the root

Signed-off-by: Sanjay BS <sanjay.bangalore.shivanna@sap.com>
1 parent 72b7dce commit 4f715d2

1 file changed: +17 -23 lines changed

acd.md

Lines changed: 17 additions & 23 deletions
@@ -19,26 +19,25 @@
 ## 1. EXECUTIVE SUMMARY
 
 This Architectural Concept Document (ACD) presents a Proof of Concept (POC) for implementing OpenTelemetry (Otel) SDK’s logging features in
-a distributed architecture. The POC centers on a Recommendation Service generating log data, which traverses several processing layers en
-route to Audit Log Services V3. The objective is to test logging message integrity, identify data loss points, and optimize telemetry flows
+a distributed architecture. The POC centers on a Recommendation Service generating log data, which traverses several processing layers. The objective is to test logging message integrity, identify data loss points, and optimize telemetry flows
 for robust observability.
 
 ---
 
 ## 2. INTRODUCTION
 
 This document details a technical blueprint for leveraging OpenTelemetry’s logging SDK within a cloud-native architecture. The focus is to
-assess potential logging message loss and performance bottlenecks, primarily within the Recommendation Service and its downstream audit log
+assess potential logging message loss and performance bottlenecks, primarily within the Recommendation Service and its downstream log sink
 pipeline.
 
 ---
 
 ## 3. BUSINESS CASE
 
-Ensuring audit logs are reliably captured and transmitted is critical for compliance, troubleshooting, and operational visibility. The
+Ensuring logs are reliably captured and transmitted is critical for compliance, troubleshooting, and operational visibility. The
 adoption of OpenTelemetry promises unified observability but raises questions regarding potential data loss and reliability, particularly
 when logs traverse complex or unreliable network paths. This POC provides a structured method to evaluate, optimize, and ultimately
-standardize audit logging practices.
+standardize logging practices.
 
 ---
 
@@ -52,41 +51,36 @@ standardize audit logging practices.
 | **SDK Exporter** | In-process module that forwards log data to Otel Collector. |
 | **Otel Collector** | Middleware node aggregating, processing, and routing logs. |
 | **Processors** | Sub-components within Otel Collector (filtering, enriching, batching). |
-| **AuditLog Services V3 Exporter** | Sends the finalized log data to external Audit Log endpoint via internet. |
 
 ### 4.3 Architecture Diagram
 
-lucid chart link :
-<https://lucid.app/lucidchart/7a9fa1de-2640-4a2d-a038-0f7284a0800f/edit?page=p92ebrH0iSU9r&invitationId=inv_812faa02-bebb-4df2-aec3-882d5b027543#>
+![Architecture Overview] (ArchitectureOverview.png)
+
 
 ## 5. ARCHITECTURE DECISIONS
 
 Use OpenTelemetry SDK within application code for cross-vendor and standardized telemetry generation. Externalize processing to Otel
-Collector for operational flexibility without code deployment. Employ processors (filtering, batching) for scaling and compliance with
+Collector for operational flexibility. Employ processors (filtering, batching) for scaling and compliance with
 remote API limits. Decouple network transmission from application code, handing over all egress responsibilities to Otel Collector.
-Instrument with checkpoints and monitoring at each component boundary for reliability assessment. Select AuditLog Services V3 Exporter due
-to organizational integration requirements.
+Instrument with checkpoints and monitoring at each component boundary for reliability assessment.
 
 ## 6. OPEN POINTS
 
-Otel SDK & Collector Version Compatibility: Need to validate if all required features and data formats are supported. API Rate Limits &
-Back-pressure: How will surges and API slowdowns/throttling be gracefully handled? Data Privacy & Security: Ensure logging data is
-sanitized/encrypted as required before egress. Collector Failure Modes: What happens to logs if Otel Collector crashes or network partition
-occurs? Lossy Operations in Processors: Need clear bounds on filtering/batching impacts to log completeness.
+Otel SDK & Collector Version Compatibility: Need to validate if all required features and data formats are supported.
+API Rate Limits & Back-pressure: How will surges and API slowdowns/throttling be gracefully handled?
+Data Privacy & Security: Ensure logging data is sanitized/encrypted as required before egress.
+Collector Failure Modes: What happens to logs if Otel Collector crashes or network partition occurs?
+Lossy Operations in Processors: Need clear bounds on filtering/batching impacts to log completeness.
 
 ## 7. CONCLUSION AND NEXT STEPS
 
-This POC will validate the comprehensive logging flow’s reliability and highlight improvements for audit log delivery. Next steps include:
+This POC will validate the comprehensive logging flow’s reliability and highlights findings if there are any loss of logs as per the delivery gurantee.
 
-Building and deploying test harnesses for each stage. Executing validation and stress tests. Analyzing end-to-end message integrity/loss
-metrics. Tuning collector/processors for optimal throughput and minimal loss. Compiling a findings and recommendations report for broader
-system rollout.
+Next steps include: Building and deploying test harnesses for each stage. Executing validation and stress tests. Analyzing end-to-end message integrity/loss metrics. Tuning collector/processors for optimal throughput and minimal loss. Compiling a findings and recommendations report for broader system rollout.
 
 ## 8. DECISION PROTOCOL
 
-Decisions Tracked: All key design changes/choices documented in versioned change log. Review Frequency: Weekly checkpoints during POC,
+Decisions Tracked: All key design changes/choices documented in versioned change log.
+Review Frequency: Weekly checkpoints during POC,
 rolling up to steering committee.
 
-## 9. APPENDIX
-
-References to OpenTelemetry documentation Diagrams (link/attachments) API schemas and configs Example log events Test plans and scripts
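
The changed document asks for checkpoints at each component boundary (Section 5) and for end-to-end message integrity/loss metrics (Section 7), but does not say how the two would be reconciled. The following is a minimal sketch of one possible approach, not the POC's actual harness. It assumes every log record carries a unique ID and that each checkpoint can report the set of IDs it observed. The names `StageLoss` and `reconcile`, and the stage labels in the usage example, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageLoss:
    """Records seen at an upstream checkpoint but missing downstream."""
    upstream: str
    downstream: str
    sent: int            # distinct record IDs observed at the upstream checkpoint
    lost_ids: frozenset  # IDs that never reached the downstream checkpoint

    @property
    def loss_ratio(self) -> float:
        # Guard against an empty upstream stage to avoid division by zero.
        return len(self.lost_ids) / self.sent if self.sent else 0.0


def reconcile(checkpoints):
    """Compare each checkpoint with the next one in the pipeline.

    `checkpoints` is an ordered list of (stage_name, iterable_of_record_ids)
    pairs, listed from the producer to the final sink (assumed input shape).
    """
    report = []
    for (up_name, up_ids), (down_name, down_ids) in zip(checkpoints, checkpoints[1:]):
        up_set, down_set = set(up_ids), set(down_ids)
        report.append(
            StageLoss(up_name, down_name, len(up_set), frozenset(up_set - down_set))
        )
    return report


if __name__ == "__main__":
    # Illustrative data only: record 7 is dropped by the collector stage,
    # record 9 is dropped between the collector and the sink.
    observed = [
        ("recommendation-service", range(1, 11)),
        ("sdk-exporter", range(1, 11)),
        ("otel-collector", [1, 2, 3, 4, 5, 6, 8, 9, 10]),
        ("log-sink", [1, 2, 3, 4, 5, 6, 8, 10]),
    ]
    for stage in reconcile(observed):
        print(
            f"{stage.upstream} -> {stage.downstream}: "
            f"lost {sorted(stage.lost_ids)} ({stage.loss_ratio:.1%})"
        )
```

Comparing adjacent stages, rather than only producer against sink, localizes where records disappear. Records removed on purpose by a filtering processor would also show up as loss in this sketch, so a real harness would likely need to account for intentional drops separately, which relates to the "Lossy Operations in Processors" open point.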
