Commit eb0122c

[rust/docs] Add kafka-es-indexer sample config file and update documentation
Add rust/config/kafka-es-indexer.yaml with comprehensive documentation of all configuration options, following the same pattern as rqd.yaml.

Documentation updates:
- Reference sample config file in kafka-es-indexer README
- Add config file section to rust/README.md listing all config files
- Update monitoring-reference.md with config file usage example
- Update deploying-monitoring.md with Docker mount example for config
- Update monitoring-development.md with config file example

Addresses review feedback to create a sample config file in the rust config directory with complete documentation of all options.
1 parent 56e7416 commit eb0122c

6 files changed, +169 -18 lines changed


docs/_docs/developer-guide/monitoring-development.md

Lines changed: 8 additions & 0 deletions
@@ -361,6 +361,14 @@ export ELASTICSEARCH_INDEX_PREFIX=opencue
 kafka-es-indexer
 ```
 
+Example with a config file:
+
+```bash
+kafka-es-indexer --config /path/to/kafka-es-indexer.yaml
+```
+
+A sample configuration file with complete documentation of all options is available at `rust/config/kafka-es-indexer.yaml`.
+
 ### Prometheus configuration
 
 | Property | Default | Description |

docs/_docs/getting-started/deploying-monitoring.md

Lines changed: 12 additions & 0 deletions
@@ -183,6 +183,18 @@ The `kafka-es-indexer` is a standalone Rust service that consumes events from Ka
 --index-prefix opencue
 ```
 
+Or with a configuration file (mount the config file into the container):
+
+```bash
+docker run -d --name kafka-es-indexer \
+--network your-network \
+-v /path/to/kafka-es-indexer.yaml:/etc/opencue/kafka-es-indexer.yaml \
+opencue/kafka-es-indexer \
+--config /etc/opencue/kafka-es-indexer.yaml
+```
+
+A sample configuration file with complete documentation is available at `rust/config/kafka-es-indexer.yaml`.
+
 3. Verify the indexer is running:
 
 ```bash

docs/_docs/reference/monitoring-reference.md

Lines changed: 8 additions & 0 deletions
@@ -304,6 +304,14 @@ kafka-es-indexer \
 --index-prefix opencue
 ```
 
+Example using a configuration file:
+
+```bash
+kafka-es-indexer --config /path/to/kafka-es-indexer.yaml
+```
+
+A sample configuration file with complete documentation of all options is available at `rust/config/kafka-es-indexer.yaml`.
+
 ### Prometheus configuration
 
 ```properties

rust/README.md

Lines changed: 5 additions & 0 deletions
@@ -8,6 +8,11 @@ Project crates:
 * opencue_proto: Wrapper around grpc's generated code for the project protobuf modules
 * kafka-es-indexer: Kafka to Elasticsearch indexer for OpenCue monitoring events
 
+Sample configuration files are available in the `config/` directory:
+* `config/rqd.yaml` - RQD configuration
+* `config/rqd.fake_linux.yaml` - RQD configuration for simulating Linux on macOS
+* `config/kafka-es-indexer.yaml` - Kafka-Elasticsearch indexer configuration
+
 ## Build Instructions
 
 Follow these steps to build and run the Rust-based RQD and Dummy Cuebot modules.

rust/config/kafka-es-indexer.yaml

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
+# Kafka-Elasticsearch Indexer Configuration File
+#
+# This file configures the kafka-es-indexer service that consumes OpenCue
+# monitoring events from Kafka and indexes them into Elasticsearch for
+# historical analysis and querying.
+#
+# Data Flow: Cuebot (Producer) -> Kafka -> kafka-es-indexer (Consumer) -> Elasticsearch
+
+# =============================================================================
+# KAFKA CONFIGURATION
+# =============================================================================
+kafka:
+  # Kafka bootstrap servers (comma-separated list)
+  # Multiple brokers can be specified for high availability
+  # Default: localhost:9092
+  bootstrap_servers: "localhost:9092"
+
+  # Consumer group ID
+  # All indexer instances with the same group_id will share partition
+  # assignments and coordinate offset commits. Use a unique ID per cluster.
+  # Default: opencue-elasticsearch-indexer
+  group_id: "opencue-elasticsearch-indexer"
+
+  # What to do when there is no initial offset in Kafka
+  # Options:
+  # earliest - Start from the oldest available message
+  # latest - Start from the newest message (skip historical)
+  # Default: earliest
+  auto_offset_reset: "earliest"
+
+  # Enable automatic offset commits
+  # When true, offsets are committed periodically based on auto_commit_interval_ms
+  # When false, offsets are committed manually after each message is processed
+  # Default: true
+  enable_auto_commit: true
+
+  # Interval between automatic offset commits (in milliseconds)
+  # Only used when enable_auto_commit is true
+  # Lower values reduce duplicate processing on restart but increase overhead
+  # Default: 5000 (5 seconds)
+  auto_commit_interval_ms: 5000
+
+  # Maximum number of records to fetch per poll
+  # Higher values improve throughput but increase memory usage
+  # Default: 500
+  max_poll_records: 500
+
+  # Kafka session timeout (in milliseconds)
+  # If the consumer doesn't send heartbeats within this interval,
+  # it will be removed from the consumer group and partitions will be rebalanced
+  # Default: 30000 (30 seconds)
+  session_timeout_ms: 30000
+
+  # Kafka topics to subscribe to
+  # These are the event topics published by Cuebot
+  # Default: all OpenCue event topics
+  topics:
+    - "opencue.job.events"
+    - "opencue.layer.events"
+    - "opencue.frame.events"
+    - "opencue.host.events"
+    - "opencue.proc.events"
+
+# =============================================================================
+# ELASTICSEARCH CONFIGURATION
+# =============================================================================
+elasticsearch:
+  # Elasticsearch URL
+  # Can be a single node or a load balancer in front of a cluster
+  # Default: http://localhost:9200
+  url: "http://localhost:9200"
+
+  # Username for Elasticsearch authentication (optional)
+  # Required when Elasticsearch has security features enabled
+  # Can also be set via ELASTICSEARCH_USERNAME environment variable
+  # username: "elastic"
+
+  # Password for Elasticsearch authentication (optional)
+  # Required when Elasticsearch has security features enabled
+  # Can also be set via ELASTICSEARCH_PASSWORD environment variable
+  # password: "changeme"
+
+  # Index name prefix for all OpenCue event indices
+  # Indices are created with pattern: {prefix}-{event-type}-{date}
+  # Example: opencue-frame-events-2024.11.29
+  # Default: opencue
+  index_prefix: "opencue"
+
+  # Number of primary shards for event indices
+  # More shards allow parallel indexing and searching
+  # For small deployments, 1 shard is sufficient
+  # For large deployments with many events, consider 3-5 shards
+  # Default: 1
+  num_shards: 1
+
+  # Number of replica shards for event indices
+  # Replicas provide redundancy and improve read throughput
+  # Set to 0 for development/testing, 1+ for production
+  # Default: 0
+  num_replicas: 0
+
+  # Maximum number of events to batch before sending to Elasticsearch
+  # Higher values improve throughput but increase latency and memory usage
+  # Events are also flushed based on flush_interval_ms
+  # Default: 100
+  bulk_size: 100
+
+  # Maximum time to wait before flushing events to Elasticsearch (in milliseconds)
+  # Events are flushed when either bulk_size is reached or this interval elapses
+  # Lower values reduce latency but increase indexing overhead
+  # Default: 5000 (5 seconds)
+  flush_interval_ms: 5000
+
+# =============================================================================
+# LOGGING CONFIGURATION
+# =============================================================================
+#
+# Logging is configured via the LOG_LEVEL environment variable or --log-level
+# CLI argument. This section documents the available options.
+#
+# Options: trace, debug, info, warn, error
+# Default: info
+#
+# Examples:
+# - trace: Very verbose, includes all debug info (development only)
+# - debug: Detailed info including each received event
+# - info: Standard operation logs (recommended for production)
+# - warn: Only warnings and errors
+# - error: Only errors
+#
+# Can also use RUST_LOG env var for fine-grained control:
+# RUST_LOG=kafka_es_indexer=debug,rdkafka=info
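
The sample config documents two behaviors that are easy to misread: the index naming pattern `{prefix}-{event-type}-{date}` and the rule that events are flushed when either `bulk_size` is reached or `flush_interval_ms` elapses. The Python sketch below illustrates both rules as the comments describe them. It is not the indexer's actual Rust implementation: `index_name`, `EventBatcher`, and the choice to measure the interval from the first pending event are hypothetical and used only for illustration.

```python
from datetime import date


def index_name(prefix: str, event_type: str, day: date) -> str:
    """Build an index name following {prefix}-{event-type}-{date}.

    The yyyy.mm.dd date format is taken from the example in the config
    comments (opencue-frame-events-2024.11.29).
    """
    return f"{prefix}-{event_type}-{day:%Y.%m.%d}"


class EventBatcher:
    """Hypothetical batcher: flush on size OR on elapsed time."""

    def __init__(self, bulk_size: int = 100, flush_interval_ms: int = 5000):
        self.bulk_size = bulk_size
        self.flush_interval_ms = flush_interval_ms
        self.pending = []
        self.first_pending_ms = None  # arrival time of the oldest pending event

    def add(self, event, now_ms: int):
        """Queue an event; return a batch if a flush is due, else None."""
        if not self.pending:
            self.first_pending_ms = now_ms
        self.pending.append(event)
        return self.maybe_flush(now_ms)

    def maybe_flush(self, now_ms: int):
        """Return and clear the pending batch when either limit is hit."""
        if not self.pending:
            return None
        full = len(self.pending) >= self.bulk_size
        stale = now_ms - self.first_pending_ms >= self.flush_interval_ms
        if full or stale:
            batch, self.pending = self.pending, []
            self.first_pending_ms = None
            return batch
        return None
```

With the defaults, a quiet farm still sees events indexed within about five seconds, while a busy one flushes every 100 events. Timestamps are passed in explicitly so the sketch can be exercised without a real clock.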

rust/crates/kafka-es-indexer/README.md

Lines changed: 4 additions & 18 deletions
@@ -50,28 +50,14 @@ kafka-es-indexer
 
 ### Configuration File
 
-```yaml
-# config.yaml
-kafka:
-  bootstrap_servers: "localhost:9092"
-  group_id: "opencue-elasticsearch-indexer"
-  auto_offset_reset: "earliest"
-  enable_auto_commit: true
-  auto_commit_interval_ms: 5000
-
-elasticsearch:
-  url: "http://localhost:9200"
-  index_prefix: "opencue"
-  num_shards: 1
-  num_replicas: 0
-  bulk_size: 100
-  flush_interval_ms: 5000
-```
+A sample configuration file with complete documentation is available at `rust/config/kafka-es-indexer.yaml`.
 
 ```bash
-kafka-es-indexer --config config.yaml
+kafka-es-indexer --config /path/to/kafka-es-indexer.yaml
 ```
 
+See the [sample config](../../config/kafka-es-indexer.yaml) for all available options and their descriptions.
+
 ## Docker
 
 Build the Docker image:
