Commit f006b06

danielgblanco, maryliag, and dmathieu authored
OTel Blueprints project proposal (#3094)
* OTel Blueprints project proposal
* Remove usage of British spelling
* Modifications after initial feedback
* Add dmathiue
* Add @reyang as TC sponsor
* Add new contributors to the project
* Add tiffany76 to contributors
* Update projects/otel-blueprints.md

  Co-authored-by: Daniel Gomez Blanco <dgomezblanco@newrelic.com>
* Add Neil Fordyce

  Co-authored-by: Damien Mathieu <42@dmathieu.com>
* Add contributor names to cSpell config
* Add contributor names to cSpell config

---------

Co-authored-by: Marylia Gutierrez <maryliag@gmail.com>
Co-authored-by: Damien Mathieu <42@dmathieu.com>
1 parent 8cbb679 commit f006b06

2 files changed

+186
-0
lines changed

.cspell.yaml

Lines changed: 9 additions & 0 deletions
@@ -18,14 +18,17 @@ ignoreRegExpList:
  - GitHub Handle in YML
  words:
  - Abinet
+ - Alain
  - Alff
  - Arize
+ - Aronoff
  - Ashpole
  - automations
  - Baeyens
  - calendar-localization-ptbr
  - Causely
  - Cheler
+ - Ciukaj
  - Collibra
  - Coralogix
  - Cortez
@@ -64,6 +67,7 @@ words:
  - lightstep
  - logz
  - Luca
+ - Lukasz
  - maintainership
  - Makefiles
  - Marylia
@@ -76,12 +80,14 @@ words:
  - otep
  - otlp
  - passcodes
+ - Pham
  - proto
  - Purvi
  - pytest
  - isovalent
  - labs
  - Liudmila
+ - Mathieu
  - Nale
  - REXX
  - scaphandre
@@ -138,6 +144,7 @@ words:
  - endsigs
  - faas
  - fong
+ - Fordyce
  - frzifus
  - gbbr
  - genai
@@ -147,6 +154,7 @@ words:
  - heptio
  - hongalex
  - horovits
+ - Hrabusa
  - instrgen
  - jackjia
  - jaglowski
@@ -242,6 +250,7 @@ words:
  - sarif
  - scavarda
  - schäfer
+ - Schmitt
  - semconv
  - sergey
  - severin

projects/otel-blueprints.md

Lines changed: 177 additions & 0 deletions
@@ -0,0 +1,177 @@
# OTel Blueprints

## Background and description

This project aims to deliver a set of architecture blueprints, with the goal of facilitating and guiding adoption of best practices when deploying OpenTelemetry on a defined set of common environments.
We'd like these blueprints to be backed by evidence in the form of reference architectures shared by end users.

The end goal is to provide holistic, incremental, high-level guidance that any adopter can apply across their environments, resulting in mature architectures ready for production use at scale.

### Current challenges

These are some of the high-level adoption challenges that this project aims to help with, some of which are unique to OpenTelemetry as a cross-cutting concern.

#### Adopting OpenTelemetry is a cross-functional effort, likely involving many roles

Adopting OpenTelemetry implies changes in multiple parts of an organization.
The components required for a complete implementation are naturally distributed across different areas of responsibility.

For instance, application teams or library maintainers interact with the OpenTelemetry API to add domain-specific instrumentation.
Platform teams often aim to provide consistent SDK configuration, while supporting centralized telemetry pipelines.
Infrastructure teams, meanwhile, may be responsible for ensuring telemetry from hosts and other devices is collected in a standard fashion.

When these efforts are not coordinated, the resulting telemetry can become fragmented and adoption suffers, failing to deliver the end-to-end observability that is part of OTel's core promise.

#### There is no "one-size-fits-all" architecture

While OTel adoption may trigger new conversations about platform engineering strategy (e.g. a [Reverse Conway Maneuver](https://www.agileanalytics.cloud/blog/team-topologies-the-reverse-conway-manoeuvre)), the project's goal is to cater to all organizational structures, not to force a specific one.

The resulting architectures will (and should) look different depending on the organization's model.
For instance:

- A company with federated, autonomous teams might favor a pattern of team-level Collectors routing to a central gateway.
- An organization with a strong central platform team might provide a "paved road" via a fully managed Collector layer and base SDK configurations.

Both are valid approaches.
The challenge is that our guidance must be flexible enough to present these different patterns, acknowledging that a "one-size-fits-all" deployment model will not work.
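
As a hedged illustration of the first pattern, a team-level Collector that forwards to a central gateway might be configured roughly as follows. The gateway hostname `otel-gateway.example.internal` is a placeholder, and the pipeline layout is an assumption made for illustration, not a recommendation from this proposal. The `otlp` receiver, `batch` processor, and `otlp` exporter are standard Collector components.

```yaml
# Team-level Collector (sketch): receives OTLP from local services and
# forwards all signals to a central gateway Collector.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}

exporters:
  otlp:
    # Hypothetical gateway address; replace with your own.
    endpoint: otel-gateway.example.internal:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

In the "paved road" variant, a platform team would typically own a file like this and the gateway behind it, whereas in the federated variant each team would maintain its own copy.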

#### Documentation is typically focused on specific solutions, not challenges

The existing OpenTelemetry user documentation is rightly focused on providing and describing solutions.
It's great at explaining what a specific component is, how to configure an SDK, or how to deploy a Collector at scale.
This is essential for a technical project.

The gap, however, is in connecting these solutions to a path forward for common adoption challenges.
Adopters often start with a problem, such as "How do I provide stable SDK config across multiple languages?" or "How do I build a scalable, multi-tenant gateway?".

Blueprints must bridge this gap, starting from the problem and mapping it to a set of principles and actionable patterns.
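
To make the first of those questions concrete, one language-neutral starting point is the set of SDK environment variables defined by the OpenTelemetry specification. The sketch below assumes a Kubernetes container spec; the service name, resource attribute values, and Collector address are placeholders, and support for individual variables can vary between language SDKs.

```yaml
# Fragment of a container spec (hypothetical service and endpoint).
env:
  - name: OTEL_SERVICE_NAME
    value: checkout
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: service.namespace=shop,service.version=1.4.2
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: http://team-collector.example.internal:4318
  - name: OTEL_EXPORTER_OTLP_PROTOCOL
    value: http/protobuf
  - name: OTEL_TRACES_SAMPLER
    value: parentbased_traceidratio
  - name: OTEL_TRACES_SAMPLER_ARG
    value: "0.25"
```

A blueprint would be expected to go further than a fragment like this, explaining who owns such defaults and how they are rolled out across teams.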

#### Feedback is often component-specific, not strategic

Currently, feedback in OpenTelemetry is mainly gathered via surveys and interviews conducted by the End-User SIG.
These are normally focused on specific components, or helping specific SIGs prioritize work.

This creates a risk that development efforts in different parts of OTel are not always informed by the most pressing optimizations from the perspective of adoption.
We may be optimizing components in a silo, while a user's main pain point is connecting them.
These blueprints, by capturing common patterns, can serve as that feedback mechanism to help guide the project's priorities.

#### Sharing learnings from highly regulated environments

Signing up to an _OTel in Practice_ or _OTel Me_ session organized by the End-User SIG is not always easy, or even an option, for end users in highly regulated environments.
This is due to the inherent lack of framework or standard format in these sessions, paired with rules and regulations in place in these organizations to avoid publicly sharing sensitive information.

### Goals, objectives, and requirements
#### Goals

The high-level goals of this project are to:

- Enable scalable adoption of OpenTelemetry by providing clear, challenge-oriented guidance.
- Improve feedback loops from end users to maintainers, capturing common patterns and challenges to help guide future development.
- Provide a set of templates to capture reference architectures and design blueprints, allowing end users to easily communicate to stakeholders in their organization the type of information that will be publicly shared.

#### Objectives

To achieve these goals, this project will:

- Define a standard, repeatable process for capturing and publishing end-user reference architectures.
- Define a standard, strategic template for authoring blueprints that map common challenges to OTel-based solutions.
- Publish an initial set of 5 reference architectures from end users that have successfully adopted OpenTelemetry at scale.
- Identify the 3 most common environments and challenges as the basis for an initial set of blueprints.
- Publish this initial set of 3 blueprints, collating best practices as seen in the field.
- Establish a clear, discoverable location for this content on the OpenTelemetry website, managed by the End-User SIG.

**Note:** The DevEx SIG has already been documenting reference architectures with end users.
They have so far conducted 4 interviews and documented them as reference architectures.
Ideally, all of these reference architectures will be hosted in the same space as the others.

#### Why now?

OpenTelemetry has successfully moved past the "early adopter" stage.
New waves of adopters are typically composed of platform teams in large organizations.
They require common, vendor-neutral guidance to piece together a large-scale strategy from low-level component documentation.
They need a "paved road" and a set of proven best practices.
Providing this guidance is one of the biggest levers we can pull to accelerate widespread, successful adoption.

## Deliverables

This project will output two types of deliverables:

- **Reference architectures**: Similar to [CNCF reference architectures](https://architecture.cncf.io/architectures), scoped to OpenTelemetry (potentially cross-shared between the two).
  These will share how different companies or institutions, under different organizational structures and technology stacks, are approaching OpenTelemetry adoption, and the outcomes it has delivered.
- **Blueprints**: Focused on a given environment, these will give specific guidance to solve common challenges.
  The format of these blueprints will be discussed as part of this project; however, the general proposal is to follow popular forms of [strategic documentation](https://itsadeliverything.com/good-strategy-bad-strategy-the-difference-and-why-it-matters-by-richard-rumelt).
  For each of them, we'll identify:
  1. The main challenges the blueprint will solve, and the scope it applies to.
  2. The guiding principles and best practices that solve these challenges.
  3. Individual actions to implement these best practices, linking to more specific guidance in order to avoid duplication of existing parts of the OpenTelemetry documentation (e.g. getting started, SDK config, Collector deployment patterns).

For both of these, this project aims to define templates and processes that make it easier to contribute new reference architectures and blueprints.

After this project is complete, the End-User SIG will expand the library of reference architectures and blueprints as part of its business-as-usual (BAU) operations.

## Staffing / Help Wanted

### Industry outreach

End users were contacted during KubeCon NA, and they gave very positive feedback on this initiative and expressed willingness to contribute.

Solutions/observability architects/consultants from organizations like New Relic, Splunk and Grafana were contacted and are interested in joining this effort.

We will also reach out to past guests of sessions organized by the End-User SIG to encourage their participation.

### SIG
End-User SIG & DevEx SIG

### Required staffing
See [Project Staffing](/project-management.md#project-staffing)

#### Project Lead(s)
Dan Gomez Blanco (@danielgblanco)
Damien Mathieu (@dmathieu)

#### Other Staffing

- Contributors/architects willing to help coordinate with end users, create templates, analyze reference architectures, and write up blueprints:
  - Jacob Aronoff (@jaronoff97)
  - Lukasz Ciukaj (@luke6Lh43)
  - Alain Pham (@alainpham)
  - ChaosKyle (@ChaosKyle)
  - Brad Schmitt (@bpschmitt)
- End users willing to contribute reference architectures:
  - Neil Fordyce, Skyscanner
- Maintainers/approvers from Comms SIG to help with reviewing and copy editing:
  - Tiffany Hrabusa (@tiffany76)
- Others

### Sponsorship
See [Project Sponsorship](/project-management.md#project-sponsorship)

#### TC Sponsor
Reiley Yang (@reyang)

#### Delegated TC Sponsor (Optional)
TBD

#### GC Liaison
Marylia Gutierrez (@maryliag)

## Expected Timeline

- 1 month: Decide on initial format for reference architectures and blueprint documents, and which verticals/architecture types to write blueprints for.
- 3-6 months: Gather and document reference architectures from end users, identify most common challenges, and collate blueprints.

## Labels

`otel-blueprints`

## GitHub Project (Post-Approval)

**TO-DO**

## SIG Meetings, Roadmap, and Other Info (Post-Approval)

* Slack channel: [#otel-sig-end-user](https://cloud-native.slack.com/archives/C01RT3MSWGZ)
* Meeting notes: [End-User SIG Meeting Notes](https://docs.google.com/document/d/1e-UNZA3Tuno9b53RQbe--whUcO0VIXF3P81oXsrBK6g)
* Meeting times: Every other Thursday at 10:00 PT

**TO-DO**: Roadmap item will be added after the new GitHub project is created.
