Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# --- Required ---
OPENAI_API_KEY=sk-...
GALILEO_API_KEY=...
GALILEO_CONSOLE_URL=https://app.galileo.ai

# --- Galileo project routing (used by the collector) ---
GALILEO_PROJECT=distributed-tracing-demo
GALILEO_LOG_STREAM=main

# --- Galileo API URL (derived from console URL; adjust if different) ---
GALILEO_API_URL=https://api.galileo.ai

# --- OTel Collector endpoint ---
# docker-compose overrides this to http://otel-collector:4318 (the bundled
# demo collector). For your real deployment, set it to the OTLP/HTTP endpoint
# of the collector you already run, e.g. https://otel-collector.internal:4318.
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# Secrets
.env

# Python
python-service/__pycache__/
python-service/*.py[cod]
python-service/.venv/

# Java / Maven
java-service/target/
java-service/*.class

# ChromaDB local data
chroma-data/

# OS
.DS_Store
164 changes: 164 additions & 0 deletions python/logging-samples/distributed-tracing-otel-python-java/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,164 @@
# Distributed Tracing: Java + Python with Galileo via OTEL

A runnable example showing how to trace a request across a **Java Spring Boot gateway** and a **Python FastAPI + LangGraph RAG service** — with a single trace ID stitched together end-to-end in [Galileo](https://galileo.ai).

```
User ──POST /ask──▶ Java Gateway ──POST /process──▶ Python RAG Service
│ Spring AI (classify) │ LangGraph (route → retrieve → generate)
│ OTel Java Agent │ LangChainInstrumentor + RetrieverSpan
└──────────────────┬─────────────┘
│ OTLP HTTP
OpenTelemetry Collector
│ OTLP HTTP (with Galileo auth)
Galileo
```

## What this demonstrates

- **Cross-language trace propagation** — the OTel Java agent injects a W3C `traceparent` header on every outbound HTTP call; the Python service continues the same trace automatically
- **Auto-instrumentation + targeted manual spans** — `LangChainInstrumentor` handles LangGraph nodes automatically; a Galileo `RetrieverSpan` surfaces retrieved source documents in the trace UI; the Java service adds `gen_ai.*` attributes for proper agent hierarchy rendering
- **Collector-agnostic** — both services export plain OTLP/HTTP to `OTEL_EXPORTER_OTLP_ENDPOINT`. Point them at the bundled demo collector, or at whatever OTel Collector you already run.
- **Financial services domain** — ChromaDB is pre-seeded with realistic financial policy documents (wire transfers, KYC/AML, mortgage guidelines, etc.)

## Expected trace in Galileo

```
invoke_agent financial-assistant ← Java: outer agent span (manual)
chat classify_question ← Java: LLM classification (manual span)
POST /process ← Java agent: auto-instrumented HTTP call
rag_pipeline ← Python: WorkflowSpan (Galileo)
LangGraph.invoke ← Python: auto-instrumented by LangChainInstrumentor
route ← Python: LangGraph node (auto)
retrieve ← Python: LangGraph node (auto)
chromadb_search ← Python: RetrieverSpan with source docs (Galileo)
generate ← Python: LangGraph node (auto)
openai.chat ← Python: LLM span (auto, OpenAIInstrumentor)
```

## Prerequisites

| Requirement | Version |
|---|---|
| Docker + Docker Compose | 24+ |
| OpenAI API key | — |
| Galileo account + API key | [console.galileo.ai](https://console.galileo.ai) |

## Quick start

**1. Clone and configure**

```bash
git clone https://github.com/rungalileo/sdk-examples.git
cd sdk-examples/python/logging-samples/distributed-tracing-otel-python-java
cp .env.example .env
```

Edit `.env` and fill in your keys:

```ini
OPENAI_API_KEY=sk-...
GALILEO_API_KEY=...
GALILEO_PROJECT=distributed-tracing-demo
GALILEO_LOG_STREAM=main
GALILEO_API_URL=https://api.galileo.ai
```

**2. Start all services**

```bash
docker compose up --build
```

On first run this builds the Java and Python images (~2–3 min) and seeds ChromaDB. Once you see `Started GatewayApplication`, everything is ready.

**3. Send a question**

```bash
curl -s -X POST http://localhost:8089/ask \
-H "Content-Type: application/json" \
-d '{"question": "What are the wire transfer limits?"}' | jq
```

Expected response:

```json
{
"answer": "Domestic wire transfers are limited to $50,000 per business day...",
"category": "FINANCIAL",
"sources": ["wire-transfer-policy"]
}
```

**4. View traces in Galileo**

Open your Galileo project and navigate to the log stream you configured in `.env`. You should see a trace with the full span hierarchy shown above.

## Service ports

| Service | Host port | Purpose |
|---|---|---|
| Java gateway | 8089 | `POST /ask` — entry point |
| Python RAG service | 8099 | `POST /process` — internal |
| ChromaDB | 8299 | Vector store |

## Architecture

### Java gateway (`java-service/`)

- **Spring Boot 3.4.1 + Spring AI 1.0.0** — handles routing and LLM classification
- **OTel Java agent** (bundled in Docker image) — auto-instruments all HTTP calls and injects `traceparent` headers
- **Manual spans** — two spans with `gen_ai.*` attributes tell Galileo to render this as an agent with an LLM sub-call; Spring AI doesn't yet emit these automatically

### Python RAG service (`python-service/`)

- **FastAPI + LangGraph** — three-node graph: `route → retrieve → generate`
- **`LangChainInstrumentor` + `OpenAIInstrumentor`** — auto-create spans for each LangGraph node and every OpenAI call
- **`RetrieverSpan`** (Galileo) — wraps the ChromaDB query so retrieved documents appear as source context in Galileo
- **`WorkflowSpan`** (Galileo) — top-level span for the `/process` endpoint

### How spans reach Galileo

Both services export OTLP/HTTP to whatever endpoint `OTEL_EXPORTER_OTLP_ENDPOINT` points at. That endpoint is always an **OpenTelemetry Collector** — the collector batches, retries on failure, and forwards to Galileo with the right auth headers. Galileo credentials live in the collector config, not in the application code.

#### Using the bundled demo collector

For running this repo end-to-end on a laptop, `docker compose up` starts a minimal collector alongside the two services. Its config is in [`otel-collector/config.yaml`](./otel-collector/config.yaml). Services automatically point at it:

```
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
```

This is a demo convenience — not something you deploy.

#### Using your own existing collector

In any real environment you almost certainly already run an OTel Collector. To point the services at it instead:

1. Add a Galileo exporter to your existing collector config (see [`otel-collector/config.yaml`](./otel-collector/config.yaml) for the `otlphttp/galileo` block to copy).
2. Set `OTEL_EXPORTER_OTLP_ENDPOINT` in `.env` (or your deployment config) to your collector's OTLP/HTTP endpoint, e.g. `https://otel-collector.internal.example.com:4318`.
3. Remove the `otel-collector` service from `docker-compose.yml` (or just stop pointing at it).

No application code changes. The instrumentation stays identical.

## Troubleshooting

**Java service fails to start**
- Check that `OPENAI_API_KEY` is set in `.env`
- Maven downloads dependencies on first build; allow 2–3 min

**No traces in Galileo**
- Verify `GALILEO_API_KEY` and `GALILEO_API_URL` in `.env`
- Check `GALILEO_PROJECT` and `GALILEO_LOG_STREAM` match what you're looking at in the UI

**ChromaDB connection refused**
- Python service waits up to 60 s for ChromaDB; check `docker compose logs python-service`

**Empty answers**
- Confirm `OPENAI_API_KEY` is valid and has quota
- Try a question in the FINANCIAL category (e.g., "What is the minimum credit score for a mortgage?")

## Further reading

[Distributed Tracing with OpenTelemetry — Galileo Docs](https://docs.galileo.ai/sdk-api/logging/distributed-tracing-otel)
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
services:

chromadb:
image: chromadb/chroma:0.6.3
ports:
- "8299:8000"
volumes:
- chroma-data:/chroma/chroma
environment:
- ANONYMIZED_TELEMETRY=false

otel-collector:
image: otel/opentelemetry-collector-contrib:0.120.0
command: ["--config", "/etc/otel-collector/config.yaml"]
volumes:
- ./otel-collector/config.yaml:/etc/otel-collector/config.yaml:ro
environment:
- GALILEO_API_URL=${GALILEO_API_URL}
- GALILEO_API_KEY=${GALILEO_API_KEY}
- GALILEO_PROJECT=${GALILEO_PROJECT}
- GALILEO_LOG_STREAM=${GALILEO_LOG_STREAM}
ports:
- "4317:4317" # OTLP gRPC
- "4318:4318" # OTLP HTTP

python-service:
build: ./python-service
ports:
- "8099:8099"
environment:
- OPENAI_API_KEY=${OPENAI_API_KEY}
- OTEL_SERVICE_NAME=python-rag-service
- OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
- CHROMADB_HOST=chromadb
- CHROMADB_PORT=8000
# Silences a harmless posthog telemetry error in chromadb-client.
- ANONYMIZED_TELEMETRY=False
depends_on:
- chromadb
- otel-collector

java-service:
build: ./java-service
ports:
- "8089:8089"
environment:
- OPENAI_API_KEY=${OPENAI_API_KEY}
- OTEL_SERVICE_NAME=java-gateway
- OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
- OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
- OTEL_TRACES_EXPORTER=otlp
- OTEL_METRICS_EXPORTER=none
- OTEL_LOGS_EXPORTER=none
- PYTHON_SERVICE_URL=http://python-service:8099
depends_on:
- python-service
- otel-collector

volumes:
chroma-data:
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
FROM eclipse-temurin:21-jdk AS build
WORKDIR /app

RUN apt-get update && apt-get install -y curl && \
curl -fsSL https://repo.maven.apache.org/maven2/org/apache/maven/apache-maven/3.9.9/apache-maven-3.9.9-bin.tar.gz \
| tar xz -C /opt && \
ln -s /opt/apache-maven-3.9.9/bin/mvn /usr/local/bin/mvn

COPY pom.xml .
RUN mvn dependency:resolve -q || true
COPY src ./src
RUN mvn clean package -DskipTests -q

FROM eclipse-temurin:21-jre
WORKDIR /app
COPY --from=build /app/target/*.jar app.jar
ADD https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v2.12.0/opentelemetry-javaagent.jar /app/opentelemetry-javaagent.jar

EXPOSE 8089
ENTRYPOINT ["java", "-javaagent:/app/opentelemetry-javaagent.jar", "-jar", "app.jar"]
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.4.1</version>
</parent>

<groupId>com.example</groupId>
<artifactId>gateway-service</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>Gateway Service</name>

<properties>
<java.version>21</java.version>
<spring-ai.version>1.0.0</spring-ai.version>
</properties>

<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-bom</artifactId>
<version>${spring-ai.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>

<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-starter-model-openai</artifactId>
<version>${spring-ai.version}</version>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-api</artifactId>
<version>1.46.0</version>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
Loading
Loading