Commit b4955ac

Author: Dhanisha Phadate
Add GSoC 2026 project proposal for Kubeflow SDK OpenTelemetry integration
Signed-off-by: Dhanisha Phadate <[email protected]>
1 parent b4eacc0 commit b4955ac

1 file changed: +42 -0 lines changed

content/en/events/upcoming-events/gsoc-2026.md

Lines changed: 42 additions & 0 deletions
@@ -102,5 +102,47 @@ To participate in GSoC with Kubeflow, you **must** meet the GSoC [eligibility re
- List briefly skills you expect from contributor
-

### Project X: Integrate Kubeflow SDK with OpenTelemetry

**Components:** [kubeflow/sdk](https://www.github.com/kubeflow/sdk)

**Mentors:** [@dhanishaphadate](https://github.com/dhanishaphadate), [@jaiakash](https://www.github.com/jaiakash)

**Contributor:**

**Details:**

The Kubeflow SDK enables users with limited Kubernetes knowledge to interact with the Kubeflow ecosystem using standard Python APIs. As AI/ML workloads become more complex and distributed, observability into pipeline execution, model training, and inference workflows becomes critical.

This project aims to integrate the Kubeflow SDK with OpenTelemetry (OTel) to provide standardized, vendor-neutral telemetry for Kubeflow-based workloads. The integration will enable end-to-end visibility into SDK operations by capturing distributed traces, metrics, and logs across pipeline compilation, submission, execution, and training lifecycles.
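End-to-end visibility depends on a trace context travelling with each operation, commonly as a W3C Trace Context `traceparent` header. The sketch below uses only the standard library to build and parse that header, showing what an SDK call would hand over to the workload it submits. A real integration would most likely rely on the OpenTelemetry propagators API instead. `TraceContext`, `new_traceparent`, `parse_traceparent`, and `child_of` are hypothetical helpers for illustration, not part of the Kubeflow SDK or OpenTelemetry.

```python
import re
import secrets
from typing import NamedTuple

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceContext(NamedTuple):
    trace_id: str  # 32 hex chars, shared by every span in the trace
    span_id: str   # 16 hex chars, identifies the calling span
    sampled: bool  # lowest bit of the trace-flags byte


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace, e.g. when a (hypothetical) SDK call begins."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex chars
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(header: str) -> TraceContext:
    """Validate a traceparent header and split it into its fields."""
    match = _TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    trace_id, span_id, flags = match.groups()
    # The specification treats all-zero IDs as invalid.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        raise ValueError("traceparent IDs must not be all zeros")
    return TraceContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def child_of(header: str) -> str:
    """Continue a trace on the receiving side: same trace ID, fresh span ID.

    Only the sampled bit is carried over; other flag bits are dropped.
    """
    parent = parse_traceparent(header)
    flags = "01" if parent.sampled else "00"
    return f"00-{parent.trace_id}-{secrets.token_hex(8)}-{flags}"
```

Because the trace ID is preserved while each hop gets its own span ID, a backend can stitch a client-side call and the resulting job into one trace. How the header would reach a training job or pipeline run is left open here.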

The project will also explore leveraging existing OpenTelemetry and Generative AI instrumentation patterns (such as span conventions for model execution, prompt handling, and inference steps) where applicable.
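As a rough illustration of what reusing those span conventions could look like, the sketch below builds a span name and attribute set for one model call. The `gen_ai.*` attribute names follow the OpenTelemetry GenAI semantic conventions, which are still in development and may be renamed. `genai_span` and its parameters are hypothetical and not part of any existing API.

```python
from typing import Dict, Optional, Tuple, Union

AttributeValue = Union[str, int]


def genai_span(
    operation: str,
    model: str,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
) -> Tuple[str, Dict[str, AttributeValue]]:
    """Return (span name, attributes) describing a single model call."""
    attributes: Dict[str, AttributeValue] = {
        "gen_ai.operation.name": operation,
        "gen_ai.request.model": model,
    }
    # Token usage is optional: record it only when the caller knows it.
    if input_tokens is not None:
        attributes["gen_ai.usage.input_tokens"] = input_tokens
    if output_tokens is not None:
        attributes["gen_ai.usage.output_tokens"] = output_tokens
    # The conventions name such spans "{operation} {model}".
    return f"{operation} {model}", attributes
```

With the OpenTelemetry Python API, the result could be passed along as `tracer.start_as_current_span(name, attributes=attributes)`. Keeping the naming logic in a plain function makes it easy to unit test without a tracer.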

[Kubeflow SDK Documentation](https://sdk.kubeflow.org/en/latest/index.html)

[OpenTelemetry for GenAI](https://opentelemetry.io/blog/2024/otel-generative-ai/)

[Issue](https://github.com/kubeflow/sdk/issues/164)

**Features Expected:**

- Add OpenTelemetry instrumentation to key Kubeflow SDK components
- Enable distributed tracing for pipeline execution and SDK operations
- Collect and export metrics related to AI/ML workloads
- Provide configurable OTel exporters and sampling options
- Provide documentation and examples demonstrating observability setup and usage
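One plausible shape for the configurable exporters and sampling options is to honor the standard OpenTelemetry environment variables. The sketch below resolves them into a small config object using only the standard library. The `OTEL_*` variable names and the `parentbased_always_on` default come from the OpenTelemetry specification. `TelemetryConfig`, `load_config`, and the `kubeflow-sdk` default service name are assumptions made for illustration, and only the six built-in sampler names are accepted.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_KNOWN_SAMPLERS = {
    "always_on",
    "always_off",
    "traceidratio",
    "parentbased_always_on",
    "parentbased_always_off",
    "parentbased_traceidratio",
}


@dataclass(frozen=True)
class TelemetryConfig:
    service_name: str
    traces_exporter: str
    otlp_endpoint: Optional[str]
    sampler: str
    sampler_ratio: Optional[float]  # set only for ratio-based samplers


def load_config(env: Optional[Mapping[str, str]] = None) -> TelemetryConfig:
    """Resolve telemetry settings from OTEL_* environment variables."""
    env = os.environ if env is None else env

    sampler = env.get("OTEL_TRACES_SAMPLER", "parentbased_always_on")
    if sampler not in _KNOWN_SAMPLERS:
        raise ValueError(f"unsupported sampler: {sampler!r}")

    ratio: Optional[float] = None
    if sampler.endswith("traceidratio"):
        # The specification defaults the ratio to 1.0 when no argument is given.
        ratio = float(env.get("OTEL_TRACES_SAMPLER_ARG", "1.0"))
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("sampling ratio must be within [0.0, 1.0]")

    return TelemetryConfig(
        # "kubeflow-sdk" is an assumed default, not a documented one.
        service_name=env.get("OTEL_SERVICE_NAME", "kubeflow-sdk"),
        traces_exporter=env.get("OTEL_TRACES_EXPORTER", "otlp"),
        otlp_endpoint=env.get("OTEL_EXPORTER_OTLP_ENDPOINT"),
        sampler=sampler,
        sampler_ratio=ratio,
    )
```

Accepting an explicit mapping keeps the function testable without touching the process environment, and reusing the standard variable names means existing collector deployments would not need SDK-specific settings.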

**Difficulty:** [intermediate|hard]

**Size:** [350 hours]

**Skills Required/Preferred:**

- Python
- Understanding of the Kubeflow ecosystem (preferred)
- OpenTelemetry (tracing, metrics, logging)
- Distributed systems and observability concepts
- Kubernetes and CRDs

---
