# Monitor Llamastack & vLLM in OpenShift

Follow this README to configure an observability stack in OpenShift to visualize Llamastack telemetry and vLLM metrics.
First, ensure Llamastack and vLLM are configured to generate telemetry by following this [configuration guide](./run-configuration.md).

## Generate telemetry from Llamastack and vLLM

### vLLM

For vLLM, metrics are generated by default and exposed at `vllm-endpoint:port/metrics`. For a list of metrics,
you can `curl localhost:8000/metrics` from within a vLLM container.
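The `/metrics` output is Prometheus text format. As a self-contained sketch (the metric names below are illustrative samples, not an authoritative list), here is how you might filter the metric samples out of a captured scrape:

```shell
# In a vLLM container you would capture the scrape with:
#   curl -s localhost:8000/metrics
# Below, a small captured sample (illustrative metric names) stands in for live output.
sample='# HELP vllm:num_requests_running Number of requests currently running.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running 2
# HELP vllm:generation_tokens_total Total generation tokens processed.
# TYPE vllm:generation_tokens_total counter
vllm:generation_tokens_total 8113'

# Drop the HELP/TYPE comment lines, keeping only the metric samples.
printf '%s\n' "$sample" | grep -v '^#'
```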

### Llamastack

With Llamastack, you need to enable telemetry collection in `run-config.yaml` by configuring an OpenTelemetry sink.
Here's how to do that:

#### Updated manifests for telemetry trace collection with an OpenTelemetry receiver endpoint

This is for traces only. There is a similar `otel_metric` sink and `otel_metric_endpoint`; however, Llamastack currently
generates only 4 metrics, and these are duplicates of what vLLM provides.

[kubernetes/llama-stack/configmap.yaml](../llama-stack/configmap.yaml)
23-
24- ``` yaml
25- ---
26- telemetry :
27- - provider_id : meta-reference
28- provider_type : inline::meta-reference
29- config :
30- service_name : ${env.OTEL_SERVICE_NAME:llama-stack}
31- sinks : ${env.TELEMETRY_SINKS:console, otel_trace, sqlite} <-add otel_trace and/or otel_metric
32- otel_trace_endpoint : ${env.OTEL_TRACE_ENDPOINT:} <-add ONLY if opentelemetry receiver endpoint is available.
33- ---
34- ```
And, in [kubernetes/llama-stack/deployment.yaml](../llama-stack/deployment.yaml):

```yaml
---
env:
  - name: OTEL_SERVICE_NAME
    value: llamastack
  - name: OTEL_TRACE_ENDPOINT
    value: http://otel-collector-collector.observability-hub.svc.cluster.local:4318/v1/traces
  # - name: OTEL_METRIC_ENDPOINT
  #   value: http://otel-collector-collector.observability-hub.svc.cluster.local:4318/v1/metrics
---
```

The OTLP endpoint is `http://<otel-collector-service>.<otel-collector-namespace>.svc.cluster.local:4318/v1/traces` (or `/v1/metrics`)
when exporting to a central otel-collector. If using an otel-collector sidecar, it is `http://localhost:4318/v1/traces` (or `/v1/metrics`).
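As a quick sketch of how these endpoint URLs are assembled (the service and namespace names below are assumptions; substitute your own):

```shell
# Compose the OTLP/HTTP endpoint for a central in-cluster collector.
# Service and namespace names are assumptions; match your deployment.
OTC_SERVICE=otel-collector-collector
OTC_NAMESPACE=observability-hub
SIGNAL=traces   # or: metrics
echo "http://${OTC_SERVICE}.${OTC_NAMESPACE}.svc.cluster.local:4318/v1/${SIGNAL}"
```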

## OpenShift Observability Operators

```bash
oc create ns observability-hub
```

### Tracing Backend (Tempo with Minio for S3 storage)

In order to view distributed tracing data from Llamastack and/or vLLM, you must deploy a tracing backend. The supported tracing backend in OpenShift
is Tempo. See the OpenShift Tempo
[documentation](https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/distributed_tracing/distributed-tracing-platform-tempo#distr-tracing-tempo-install-tempostack-web-console_dist-tracing-tempo-installing)
for further details. Tempo must be paired with a storage solution; for this example, MinIO is used. The necessary resources can be created by
applying the `./tempo` manifests.

```bash
# edit storageclassName & secret as necessary
# secret and storage for testing only
oc apply --kustomize ./tempo -n observability-hub
```

### OpenTelemetryCollector deployment

OpenTelemetry Collector is used to aggregate telemetry from various workloads, process individual signals, and export
to various backends. This example will collect traces from various workloads and export them all as a single
authenticated stream to the in-cluster TempoStack. For in-cluster use only, the OpenTelemetry Collector is not necessary to collect
metrics: metrics are sent to the in-cluster user-workload-monitoring Prometheus by creating PodMonitors and ServiceMonitors.
However, if exporting off-cluster to a 3rd-party observability vendor, the collector is necessary for all signals.
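A minimal central-collector sketch, assuming the OpenTelemetry Operator's `OpenTelemetryCollector` CRD is available; the exporter endpoint and port are assumptions, so check your TempoStack gateway for the real values:

```yaml
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
  namespace: observability-hub
spec:
  mode: deployment
  config:
    receivers:
      otlp:
        protocols:
          http: {}
          grpc: {}
    exporters:
      # endpoint name/port are assumptions; point at your TempoStack gateway
      otlp/tempo:
        endpoint: tempo-tempostack-gateway.observability-hub.svc.cluster.local:8090
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlp/tempo]
```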

To inject the collector sidecar into a workload (for example, the vLLM deployment), annotate the deployment:

```bash
oc patch deployment <deployment-name> \
  -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.opentelemetry.io/inject":"vllm-otelsidecar"}}}}}'
```

### Cluster Observability Operator Tracing UIPlugin

The Jaeger frontend feature of TempoStack is no longer supported by Red Hat; it has been replaced by the COO UIPlugin. To create the UIPlugin for
Tracing, first ensure the TempoStack described above has been created. This is a prerequisite. Then, all that's necessary to view traces from
the OpenShift console at `Observe -> Traces` is to create the following [Tracing UIPlugin resource](./tracing-ui-plugin.yaml).
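For reference, a Tracing UIPlugin resource is roughly of the following shape (a sketch of what such a manifest typically contains; verify against the file in this repo and the COO documentation):

```yaml
apiVersion: observability.openshift.io/v1alpha1
kind: UIPlugin
metadata:
  name: distributed-tracing
spec:
  type: DistributedTracing
```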

```bash
oc apply -f ./tracing-ui-plugin.yaml
```

You should now see traces and metrics in the OpenShift console, under the `Observe` tab.

### Grafana

Most users are familiar with Grafana for visualizing and analyzing telemetry. To create the Grafana resources necessary to view
Llamastack and vLLM telemetry, follow the example below.

This example will deploy a Grafana instance with Prometheus & Tempo DataSources.
The Prometheus datasource is the user-workload-monitoring Prometheus running in the `openshift-user-workload-monitoring` namespace.
The Grafana console is configured with `username: rhel, password: rhel`.

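As an illustration of how a Tempo datasource can be declared for the grafana-operator (a hedged sketch: the API version, instance labels, and query-frontend URL are assumptions to adapt to your deployment):

```yaml
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
  name: tempo
  namespace: observability-hub
spec:
  # instanceSelector labels are an assumption; they must match your Grafana CR
  instanceSelector:
    matchLabels:
      dashboards: grafana
  datasource:
    name: Tempo
    type: tempo
    access: proxy
    # URL is an assumption; point at your TempoStack query frontend
    url: http://tempo-tempostack-query-frontend.observability-hub.svc.cluster.local:3200
```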
The dashboard is slightly modified from https://github.com/kevchu3/openshift4-gr

```bash
oc apply -n observability-hub -f cluster-metrics-dashboard/cluster-metrics.yaml
```