
Commit 133d7a4

scarlet25151 and chenyu.jiang authored
[Doc] add gateway prometheus config (#1954)
* add gateway prometheus config
* move the instruction into gateway plugin folder

Signed-off-by: chenyu.jiang <chenyu.jiang@bytedance.com>
Co-authored-by: chenyu.jiang <chenyu.jiang@bytedance.com>
1 parent 194cbc7 commit 133d7a4


1 file changed: +71 -3 lines changed


docs/source/features/gateway-plugins.rst

Lines changed: 71 additions & 3 deletions
@@ -159,6 +159,8 @@ Below are routing strategies gateway supports:
 * ``prefix-cache-preble``: routes requests considering both prefix cache hits and pod load. The implementation is based on Preble: Efficient Distributed Prompt Scheduling for LLM Serving: https://arxiv.org/abs/2407.00023.
 * ``vtc-basic``: routes requests using a hybrid score balancing fairness (user token count) and pod utilization. It is a simple variant of the Virtual Token Counter (VTC) algorithm. See more details at https://github.com/Ying1123/VTC-artifact
 
+Some routing strategies rely on metrics queried from the Prometheus HTTP API (PromQL). See :ref:`prometheus-api-access` for configuration.
+
 .. code-block:: bash
 
     curl -v http://${ENDPOINT}/v1/chat/completions \
@@ -205,7 +207,7 @@ How session affinity works:
 - This is especially useful for **multi-turn chat applications** where maintaining context on the same backend instance improves performance and consistency.
 
 .. note::
-    The x-session-id header is not a security token; it only encodes network location. Do not rely on it for authentication or authorization.
+    The x-session-id header is not a security token; it only encodes network location. Do not rely on it for authentication or authorization.
 
 Rate Limiting
 -------------
@@ -233,7 +235,7 @@ To set up rate limiting, add the user header in the request, like this:
 
 
 External Filter
-===============
+---------------
 The ``external-filter`` header is evaluated **before** the routing strategy selects the optimal target pod. It allows users to dynamically restrict the target Pods using Kubernetes ``labelSelector`` expressions.
 
 The header value follows the Kubernetes label selector syntax:
@@ -261,7 +263,7 @@ https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
 
 .. note::
     1. Filtering happens **before** the routing strategy. It never changes which Pods are considered “optimal” by the routing strategy.
-    2. The ``external-filter`` only takes effect when a ``routing-strategy``` is set.
+    2. The ``external-filter`` only takes effect when a ``routing-strategy`` is set.
     3. It only narrows the set of Pods selected by ``model.aibrix.ai/name`` by applying extra label constraints.
     4. If the filter eliminates all Pods, the request fails with ``no ready pods for routing``, the same as when there is no target pod.
     5. ``external-filter`` is optional. When omitted, no extra filtering is applied.
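
The filtering rules in the note above can be illustrated with a small Python sketch. It is not the gateway's implementation: the ``Pod`` type and the ``apply_external_filter`` and ``parse_equality_selector`` helpers are hypothetical names, and the sketch assumes only equality-based selectors (``key=value`` or ``key==value``, comma-separated), whereas Kubernetes label selectors also support inequality and set-based expressions. It shows that the filter only narrows the candidate set before a routing strategy runs, and that an empty result fails with the error quoted in the note.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical stand-in for a candidate Pod and its labels."""
    name: str
    labels: dict = field(default_factory=dict)


def parse_equality_selector(selector):
    """Parse 'k1=v1,k2==v2' into a dict (equality requirements only)."""
    requirements = {}
    for part in selector.split(","):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        # Reject terms this sketch does not model ('!=', 'in (...)', bare keys).
        if not sep or key.endswith("!"):
            raise ValueError(f"unsupported selector term: {part!r}")
        requirements[key.strip()] = value.lstrip("=").strip()
    return requirements


def apply_external_filter(pods, selector=None):
    """Narrow the candidate Pods before a routing strategy picks one."""
    if not selector:
        # The header is optional: no extra filtering is applied.
        return list(pods)
    required = parse_equality_selector(selector)
    kept = [
        pod for pod in pods
        if all(pod.labels.get(k) == v for k, v in required.items())
    ]
    if not kept:
        raise RuntimeError("no ready pods for routing")
    return kept
```

A routing strategy would then choose among the returned Pods only, for example ``apply_external_filter(pods, "env=prod")``.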
@@ -430,3 +432,69 @@ Below are starting pointers to help debug.
     aibrix-redis-master-7d6b77c794-bcqxc 1/1 Running 0 22m
 
     kubectl logs aibrix-gateway-plugins-6bd9fcd5b9-2bwpr -n aibrix-system
+
+.. _prometheus-api-access:
+
+Prometheus API Access
+---------------------
+
+Some routing strategies rely on metrics queried from the Prometheus HTTP API (PromQL). Configure the API endpoint and optional Basic Auth with the following environment variables.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 40 18 60
+
+   * - Environment Variable
+     - Default
+     - Description
+   * - ``PROMETHEUS_ENDPOINT``
+     - (empty)
+     - Prometheus HTTP API base URL (for example: ``http://prometheus-operated.prometheus.svc:9090``). If empty, PromQL-based metrics are skipped.
+   * - ``PROMETHEUS_BASIC_AUTH_SECRET_NAME``
+     - (empty)
+     - Kubernetes Secret name that stores the Basic Auth credentials. When set, it takes precedence over the plaintext env vars below.
+   * - ``PROMETHEUS_BASIC_AUTH_SECRET_NAMESPACE``
+     - ``aibrix-system``
+     - Namespace of the Secret specified by ``PROMETHEUS_BASIC_AUTH_SECRET_NAME``.
+   * - ``PROMETHEUS_BASIC_AUTH_USERNAME_KEY``
+     - ``username``
+     - Key in ``Secret.data`` used as the Basic Auth username.
+   * - ``PROMETHEUS_BASIC_AUTH_PASSWORD_KEY``
+     - ``password``
+     - Key in ``Secret.data`` used as the Basic Auth password.
+   * - ``PROMETHEUS_BASIC_AUTH_USERNAME``
+     - (empty)
+     - Basic Auth username, used only when ``PROMETHEUS_BASIC_AUTH_SECRET_NAME`` is not set.
+   * - ``PROMETHEUS_BASIC_AUTH_PASSWORD``
+     - (empty)
+     - Basic Auth password, used only when ``PROMETHEUS_BASIC_AUTH_SECRET_NAME`` is not set.
+
+Example (plaintext env vars)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: bash
+
+    export PROMETHEUS_ENDPOINT="http://prometheus-operated.prometheus.svc:9090"
+    export PROMETHEUS_BASIC_AUTH_USERNAME="prom_user"
+    export PROMETHEUS_BASIC_AUTH_PASSWORD="prom_pass"
+
+Example (Kubernetes Secret)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: yaml
+
+    apiVersion: v1
+    kind: Secret
+    metadata:
+      name: prometheus-basic-auth
+      namespace: aibrix-system
+    type: Opaque
+    stringData:
+      username: prom_user
+      password: prom_pass
+
+.. code-block:: bash
+
+    export PROMETHEUS_ENDPOINT="http://prometheus-operated.prometheus.svc:9090"
+    export PROMETHEUS_BASIC_AUTH_SECRET_NAME="prometheus-basic-auth"
+    export PROMETHEUS_BASIC_AUTH_SECRET_NAMESPACE="aibrix-system"
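
The precedence rules in the environment variable table can be sketched in Python. This is an illustration under stated assumptions, not the gateway's code: ``resolve_basic_auth``, ``build_query_request``, and the ``read_secret`` callback (a stand-in for a Kubernetes API lookup that returns the already decoded Secret values) are hypothetical names. The sketch assumes that a non-empty ``PROMETHEUS_BASIC_AUTH_SECRET_NAME`` wins over the plaintext variables and that an empty ``PROMETHEUS_ENDPOINT`` means no PromQL query is issued. ``/api/v1/query`` is the Prometheus HTTP API path for instant queries.

```python
import base64
from urllib.parse import urlencode


def resolve_basic_auth(env, read_secret):
    """Return (username, password), or None when no credentials are configured."""
    secret_name = env.get("PROMETHEUS_BASIC_AUTH_SECRET_NAME", "")
    if secret_name:
        # The Secret takes precedence over the plaintext variables.
        namespace = env.get("PROMETHEUS_BASIC_AUTH_SECRET_NAMESPACE") or "aibrix-system"
        user_key = env.get("PROMETHEUS_BASIC_AUTH_USERNAME_KEY") or "username"
        pass_key = env.get("PROMETHEUS_BASIC_AUTH_PASSWORD_KEY") or "password"
        data = read_secret(namespace, secret_name)
        return data[user_key], data[pass_key]
    username = env.get("PROMETHEUS_BASIC_AUTH_USERNAME", "")
    password = env.get("PROMETHEUS_BASIC_AUTH_PASSWORD", "")
    if username or password:
        return username, password
    return None


def build_query_request(env, promql, read_secret):
    """Return (url, headers) for an instant query, or None if no endpoint is set."""
    endpoint = env.get("PROMETHEUS_ENDPOINT", "")
    if not endpoint:
        # Mirrors the table: with an empty endpoint, PromQL metrics are skipped.
        return None
    url = endpoint.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})
    headers = {}
    auth = resolve_basic_auth(env, read_secret)
    if auth is not None:
        token = base64.b64encode(f"{auth[0]}:{auth[1]}".encode()).decode()
        headers["Authorization"] = "Basic " + token
    return url, headers
```

Passing ``os.environ`` as ``env`` would apply the same logic to the exported variables shown in the examples above; the returned URL and headers can be handed to any HTTP client.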
