|
| 1 | +# OpenTelemetry Collector Gateway Configuration |
| 2 | + |
| 3 | +This directory contains Kubernetes manifests for deploying the OpenTelemetry Collector in a **gateway deployment pattern**, where agents send telemetry data to a centralized gateway collector running on a **separate host** before it's forwarded to Splunk Observability Cloud. |
| 4 | + |
| 5 | +## Architecture Overview |
| 6 | + |
| 7 | +```text |
| 8 | +┌─────────────────────────────────────┐ ┌──────────────────────────────┐ |
| 9 | +│ Kubernetes Cluster │ │ Separate Host │ |
| 10 | +│ │ │ │ |
| 11 | +│ ┌─────────────────┐ │ │ ┌──────────────────┐ │ ┌──────────────────────┐ |
| 12 | +│ │ Applications │ │ │ │ OTel Gateway │ │ │ Splunk Observability │ |
| 13 | +│ │ & Services │ │ │ │ │ │ │ Cloud │ |
| 14 | +│ └────────┬────────┘ │ │ │ Standalone │────────┼─────>│ │ |
| 15 | +│ │ │ │ │ Collector │ │ │ - Traces (OTLP) │ |
| 16 | +│ v │ │ │ │ │ │ - Metrics (SignalFx)│ |
| 17 | +│ ┌─────────────────┐ │ │ └──────────────────┘ │ │ - Logs (HEC) │ |
| 18 | +│ │ OTel Agent │────────────────┼─────>│ │ └──────────────────────┘ |
| 19 | +│ │ (DaemonSet) │ HTTP │ │ 192.168.5.158 │ |
| 20 | +│ └─────────────────┘ OTLP │ │ :4318, :9943, :6060 │ |
| 21 | +│ │ │ │ |
| 22 | +└─────────────────────────────────────┘ └──────────────────────────────┘ |
| 23 | +``` |
| 24 | + |
| 25 | +**Key Architecture Points:** |
| 26 | + |
| 27 | +- **OTel Agents** run as a DaemonSet in the Kubernetes cluster |
| 28 | +- **OTel Gateway** runs on a separate host outside the cluster (IP: 192.168.5.158) |
| 29 | +- All telemetry flows from agents → gateway → Splunk Cloud |
| 30 | +- Gateway acts as a centralized aggregation and processing layer |
| 31 | + |
| 32 | +## Key Differences from Standard Deployment |
| 33 | + |
| 34 | +### 1. Gateway Location |
| 35 | + |
| 36 | +**Standard Configuration:** |
| 37 | + |
| 38 | +- Agents send directly to Splunk public cloud endpoints |
| 39 | +- No intermediate gateway |
| 40 | + |
| 41 | +**Gateway Configuration:** |
| 42 | + |
| 43 | +- Agents send to a **separate host** running the gateway collector |
| 44 | +- Gateway IP: `192.168.5.158` (external to Kubernetes cluster) |
| 45 | +- Gateway then forwards to Splunk public endpoints |
| 46 | + |
| 47 | +### 2. Agent Exporter Configuration |
| 48 | + |
| 49 | +**Standard Configuration (Direct to Splunk):** |
| 50 | + |
| 51 | +```yaml |
| 52 | +exporters: |
| 53 | + otlphttp: |
| 54 | + traces_endpoint: "https://ingest.[REALM].signalfx.com/v2/trace/otlp" |
| 55 | + metrics_endpoint: "https://ingest.[REALM].signalfx.com/v2/datapoint/otlp" |
| 56 | + signalfx: |
| 57 | + access_token: ${SPLUNK_OBSERVABILITY_ACCESS_TOKEN} |
| 58 | + realm: ${SPLUNK_REALM} |
| 59 | + ingest_url: "https://ingest.[REALM].signalfx.com" |
| 60 | + splunk_hec/platform_logs: |
| 61 | + endpoint: "[URL]:[PORT]/services/collector/event" |
| 62 | + token: ${SPLUNK_HEC_TOKEN} |
| 63 | +``` |
| 64 | + |
| 65 | +**Gateway Configuration (Agent to Separate Host):** |
| 66 | + |
| 67 | +```yaml |
| 68 | +exporters: |
| 69 | + otlphttp: |
| 70 | + traces_endpoint: "http://192.168.5.158:4318/v1/traces" # External gateway |
| 71 | + metrics_endpoint: "http://192.168.5.158:4318/v1/metrics" # External gateway |
| 72 | + logs_endpoint: "http://192.168.5.158:4318/v1/logs" # External gateway |
| 73 | +``` |
| 74 | + |
| 75 | +### 3. Environment Variables |
| 76 | + |
| 77 | +**Standard Configuration:** |
| 78 | + |
| 79 | +```yaml |
| 80 | +# Points directly to Splunk public endpoints |
| 81 | +splunk_trace_url: "https://ingest.[REALM].signalfx.com/v2/trace/otlp" |
| 82 | +splunk_api_url: "https://api.[REALM].signalfx.com" |
| 83 | +splunk_ingest_url: "https://ingest.[REALM].signalfx.com" |
| 84 | +splunk_hec_url: "https://[URL]:[PORT]/services/collector/event" |
| 85 | +``` |
| 86 | + |
| 87 | +**Gateway Configuration:** |
| 88 | + |
| 89 | +```yaml |
| 90 | +# Points to the external gateway host |
| 91 | +splunk_otlp_trace_url: "http://192.168.5.158:4318/v1/traces" |
| 92 | +splunk_otlp_metric_url: "http://192.168.5.158:4318/v1/metrics" |
| 93 | +splunk_otlp_log_url: "http://192.168.5.158:4318/v1/logs" |
| 94 | +splunk_api_url: "http://192.168.5.158:6060" |
| 95 | +splunk_ingest_url: "http://192.168.5.158:9943" |
| 96 | +``` |
| 97 | + |
| 98 | +### 4. Data Flow Changes |
| 99 | + |
| 100 | +**Standard Configuration:** |
| 101 | + |
| 102 | +```text |
| 103 | +K8s Agents ──────────────────> Splunk Observability Cloud |
| 104 | + (HTTPS, Public) |
| 105 | +``` |
| 106 | + |
| 107 | +**Gateway Configuration:** |
| 108 | + |
| 109 | +```text |
| 110 | +K8s Agents ────────> Gateway Host ────────> Splunk Observability Cloud |
| 111 | + (HTTP, LAN) 192.168.5.158 (HTTPS, Public) |
| 112 | +``` |
| 113 | + |
| 114 | +- Agents send **all telemetry** (traces, metrics, logs) via OTLP HTTP to the external gateway |
| 115 | +- Gateway performs centralized processing and exports to Splunk endpoints |
| 116 | +- Single protocol (OTLP) between agents and gateway |
| 117 | +- Gateway handles all authentication with Splunk Cloud |
| 118 | + |
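As a rough illustration of the single-protocol flow described above, the agent's pipelines might route all three signals through the one `otlphttp` exporter. This is a minimal sketch, not the contents of `configmap-agent.yaml`: the `otlp` receiver is an assumption, and a real agent would typically list more receivers and processors.

```yaml
# Sketch only: receiver names are assumptions, not taken from configmap-agent.yaml.
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp]   # uses traces_endpoint on the gateway
    metrics:
      receivers: [otlp]
      exporters: [otlphttp]   # uses metrics_endpoint on the gateway
    logs:
      receivers: [otlp]
      exporters: [otlphttp]   # uses logs_endpoint on the gateway
```

With this shape, no Splunk-specific exporter needs to be configured on the agent side.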
| 119 | +### 5. Gateway Receiver Configuration |
| 120 | + |
| 121 | +The gateway (running on a separate host) is configured to receive multiple protocols: |
| 122 | + |
| 123 | +```yaml |
| 124 | +receivers: |
| 125 | + otlp: |
| 126 | + protocols: |
| 127 | + grpc: |
| 128 | + endpoint: "${SPLUNK_LISTEN_INTERFACE}:4317" # Listens on gateway host |
| 129 | + http: |
| 130 | + endpoint: "${SPLUNK_LISTEN_INTERFACE}:4318" # Listens on gateway host |
| 131 | + jaeger: |
| 132 | + protocols: |
| 133 | + grpc: |
| 134 | + endpoint: "${SPLUNK_LISTEN_INTERFACE}:14250" |
| 135 | + thrift_http: |
| 136 | + endpoint: "${SPLUNK_LISTEN_INTERFACE}:14268" |
| 137 | + # ... other Jaeger protocols |
| 138 | + signalfx: |
| 139 | + endpoint: "${SPLUNK_LISTEN_INTERFACE}:9943" |
| 140 | + zipkin: |
| 141 | + endpoint: "${SPLUNK_LISTEN_INTERFACE}:9411" |
| 142 | +``` |
| 143 | + |
| 144 | +### 6. Gateway Resource Attributes |
| 145 | + |
| 146 | +The gateway adds a specific resource attribute to identify itself: |
| 147 | + |
| 148 | +```yaml |
| 149 | +processors: |
| 150 | + resource/add_mode: |
| 151 | + attributes: |
| 152 | + - action: insert |
| 153 | + value: "gateway" |
| 154 | + key: otelcol.service.mode |
| 155 | +``` |
| 156 | +
|
| 157 | +## Benefits of External Gateway Deployment |
| 158 | + |
| 159 | +1. **Reduced Egress Costs**: Single point of egress to Splunk Cloud instead of each Kubernetes node |
| 160 | +2. **Network Isolation**: Keep Splunk credentials off the Kubernetes cluster |
| 161 | +3. **Centralized Processing**: Complex processing (sampling, filtering, enrichment) done at gateway |
| 162 | +4. **Simplified Agent Configuration**: Agents use a lightweight configuration and send only to the local network |
| 163 | +5. **Better Resource Utilization**: Gateway can run on dedicated hardware, scaled independently |
| 164 | +6. **Enhanced Security**: Splunk access tokens only stored on gateway host, not in cluster secrets |
| 165 | +7. **Protocol Flexibility**: Gateway can receive multiple protocols and normalize to Splunk formats |
| 166 | +8. **Cross-Cluster Aggregation**: Multiple Kubernetes clusters can send to the same gateway |
| 167 | +9. **Reduced TLS Overhead**: Agent-to-gateway uses HTTP on trusted LAN, gateway handles HTTPS to cloud |
| 168 | + |
| 169 | +## Deployment Components |
| 170 | + |
| 171 | +### Gateway Collector (Separate Host - 192.168.5.158) |
| 172 | + |
| 173 | +- **Config**: `gateway_config.yaml` |
| 174 | +- **Location**: Runs on a separate host outside Kubernetes |
| 175 | +- **Purpose**: Receives telemetry from agents and forwards to Splunk Observability Cloud |
| 176 | +- **Ports**: |
| 177 | + - 4317 (OTLP gRPC) |
| 178 | + - 4318 (OTLP HTTP) - **Primary port used by agents** |
| 179 | + - 9943 (SignalFx) |
| 180 | + - 6060 (HTTP Forwarder) |
| 181 | + |
| 182 | +### Agent (Kubernetes DaemonSet) |
| 183 | + |
| 184 | +- **File**: `daemonset.yaml` |
| 185 | +- **ConfigMap**: `configmap-agent.yaml` |
| 186 | +- **Purpose**: Runs on each Kubernetes node to collect telemetry |
| 187 | +- **Exports to**: External gateway at 192.168.5.158 |
| 188 | + |
| 189 | +### Cluster Receiver (Kubernetes Deployment) |
| 190 | + |
| 191 | +- **File**: `deployment-cluster-receiver.yaml` |
| 192 | +- **ConfigMap**: `configmap-cluster-receiver.yaml` |
| 193 | +- **Purpose**: Collects cluster-level metrics (single replica) |
| 194 | + |
| 195 | +### Configuration & Secrets |
| 196 | + |
| 197 | +- **File**: `configmap-and-secrets.yaml` |
| 198 | +- **Contains**: External gateway endpoints and access tokens (for agents only) |
| 199 | + |
| 200 | +## Configuration Steps |
| 201 | + |
| 202 | +### 1. Deploy Gateway on Separate Host |
| 203 | + |
| 204 | +On the gateway host (192.168.5.158), install the Splunk OpenTelemetry Collector: |
| 205 | + |
| 206 | +```bash |
| 207 | +curl -sSL https://dl.signalfx.com/splunk-otel-collector.sh > /tmp/splunk-otel-collector.sh && \ |
| 208 | +sudo sh /tmp/splunk-otel-collector.sh --realm eu0 -- <ACCESS_TOKEN> --mode gateway |
| 209 | +``` |
| 210 | + |
| 211 | +### 2. Update Gateway Endpoints in ConfigMap |
| 212 | + |
| 213 | +Edit `configmap-and-secrets.yaml` to point to your gateway host IP: |
| 214 | + |
| 215 | +```yaml |
| 216 | +data: |
| 217 | + splunk_otlp_trace_url: "http://192.168.5.158:4318/v1/traces" |
| 218 | + splunk_otlp_metric_url: "http://192.168.5.158:4318/v1/metrics" |
| 219 | + splunk_otlp_log_url: "http://192.168.5.158:4318/v1/logs" |
| 220 | + splunk_api_url: "http://192.168.5.158:6060" |
| 221 | + splunk_ingest_url: "http://192.168.5.158:9943" |
| 222 | + k8s_cluster_name: "your-cluster-name" |
| 223 | + deployment_environment: "your-environment" |
| 224 | +``` |
| 225 | + |
| 226 | +**Important:** Update the IP address if your gateway is on a different host! |
| 227 | + |
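For context, ConfigMap keys like the ones above are usually surfaced to the collector container as environment variables. The fragment below is a sketch of how that could look in the DaemonSet pod spec. The ConfigMap name `splunk-otel-collector-config` and the container name are hypothetical, and the actual mapping in `daemonset.yaml` may differ.

```yaml
# Sketch only: the ConfigMap and container names are hypothetical.
containers:
  - name: otel-agent
    env:
      - name: SPLUNK_INGEST_URL
        valueFrom:
          configMapKeyRef:
            name: splunk-otel-collector-config
            key: splunk_ingest_url
      - name: SPLUNK_API_URL
        valueFrom:
          configMapKeyRef:
            name: splunk-otel-collector-config
            key: splunk_api_url
```

Running pods do not pick up changed environment values on their own, so the DaemonSet pods typically need a restart after the ConfigMap is edited.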
| 228 | +## Performance Considerations |
| 229 | + |
| 230 | +### High Availability Options |
| 231 | + |
| 232 | +- Deploy multiple gateway hosts behind a load balancer |
| 233 | +- Use DNS round-robin for the gateway endpoint |
| 234 | +- Configure agent retry logic for failover |
| 235 | + |
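The retry behavior mentioned above can be tuned on the agent's `otlphttp` exporter through its `retry_on_failure` and `sending_queue` settings. The following is a minimal sketch: the DNS name `otel-gateway.example.internal` is hypothetical, and the interval and queue values are illustrative rather than recommendations.

```yaml
# Sketch only: hostname and numeric values are assumptions.
exporters:
  otlphttp:
    traces_endpoint: "http://otel-gateway.example.internal:4318/v1/traces"
    retry_on_failure:
      enabled: true
      initial_interval: 5s     # wait before the first retry
      max_interval: 30s        # upper bound on the backoff between retries
      max_elapsed_time: 300s   # give up on a batch after this long
    sending_queue:
      enabled: true
      queue_size: 1000         # batches buffered while the gateway is unreachable
```

Retries are sent to the same configured endpoint, so failover to another gateway still depends on the load balancer or DNS record resolving to a healthy host.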
| 236 | +### Batching and Buffering |
| 237 | + |
| 238 | +- Gateway batches data before forwarding (see `batch` processor in `gateway_config.yaml`) |
| 239 | +- Adjust batch size based on network latency and throughput |
| 240 | +- The memory limiter helps prevent out-of-memory (OOM) failures on the gateway host |
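
The batching and memory settings above correspond to the `batch` and `memory_limiter` processors. The following is a sketch with illustrative values only. The numbers are assumptions and should be sized to the gateway host, and the pipeline's receiver and exporter names are placeholders rather than the contents of `gateway_config.yaml`.

```yaml
# Sketch only: all numeric values and pipeline component names are assumptions.
processors:
  memory_limiter:
    check_interval: 1s      # how often memory usage is checked
    limit_mib: 1500         # hard limit for collector memory
    spike_limit_mib: 300    # headroom reserved for short spikes
  batch:
    send_batch_size: 8192   # items per batch before sending
    timeout: 5s             # send a partial batch after this long

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]   # memory_limiter goes first
      exporters: [otlphttp]                 # placeholder
```

Placing `memory_limiter` first lets the collector refuse data before later processors allocate memory for it.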