Commit 7b52180

fix: add CLOUD_API_URL, fix GLM-5 model name, enable JSON logs
- Add CLOUD_API_URL=https://cloud-api.near.ai to all Rust proxy services (small-models, Qwen3.5-122B) so usage is reported
- Fix MODEL_NAME in GLM-5.yaml: zai-org/GLM-5 -> zai-org/GLM-5-FP8 (exact name cloud-api expects, was causing 404 on usage reporting)
- Add LOG_FORMAT=json to all proxy services for structured logging
1 parent 33ca4ea commit 7b52180
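
The commit message ties the 404 on usage reporting to a MODEL_NAME that did not exactly match the name cloud-api expects, and it adds two variables to the proxy environment. A minimal sketch of a pre-deploy check along those lines, in Python. The helper names (`parse_env_list`, `missing_vars`, `model_name_matches`) and the idea of running such a check are assumptions for illustration; none of this is part of the commit or of the proxy itself.

```python
"""Hypothetical sanity check for compose-style `environment:` lists."""

# Variables this commit adds to the proxy services. Treating them as
# "required" is an assumption drawn from the commit message.
REQUIRED_VARS = ("CLOUD_API_URL", "LOG_FORMAT")


def parse_env_list(entries):
    """Turn ["KEY=VALUE", ...] into a dict, splitting on the first '='.

    A bare "KEY" entry (no '=') is valid in compose and means the value
    is taken from the host environment, so it is stored as None here.
    """
    env = {}
    for entry in entries:
        key, sep, value = entry.partition("=")
        env[key.strip()] = value if sep else None
    return env


def missing_vars(env, required=REQUIRED_VARS):
    """Return required names that are absent, empty, or not set statically."""
    return [name for name in required if not env.get(name)]


def model_name_matches(env, expected):
    """Exact, case-sensitive comparison of MODEL_NAME with the expected name."""
    return env.get("MODEL_NAME") == expected


if __name__ == "__main__":
    # Values copied from the diff in this commit.
    proxy_env = parse_env_list([
        "NVIDIA_VISIBLE_DEVICES=all",
        "CLOUD_API_URL=https://cloud-api.near.ai",
        "LOG_FORMAT=json",
        "MODEL_NAME=zai-org/GLM-5-FP8",
    ])
    print(missing_vars(proxy_env))                               # prints []
    print(model_name_matches(proxy_env, "zai-org/GLM-5-FP8"))    # prints True
```

With the pre-fix value `MODEL_NAME=zai-org/GLM-5`, `model_name_matches` returns False, which is the kind of mismatch the commit message describes.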

3 files changed

Lines changed: 6 additions & 1 deletion

GLM-5.yaml

Lines changed: 2 additions & 1 deletion
@@ -31,6 +31,7 @@ x-vllm-proxy-common: &vllm-proxy-common
   restart: unless-stopped
   environment:
     - NVIDIA_VISIBLE_DEVICES=all
+    - LOG_FORMAT=json
   logging: *logging-conf

 services:
@@ -57,7 +58,7 @@ services:
       - /var/run/dstack.sock:/var/run/dstack.sock
       - certs:/etc/letsencrypt:ro
     environment:
-      - MODEL_NAME=zai-org/GLM-5
+      - MODEL_NAME=zai-org/GLM-5-FP8
       - TOKEN=${PROXY_TOKEN}
       - VLLM_BASE_URL=http://glm:8000
       - TLS_CERT_PATH=/etc/letsencrypt/live/completions.near.ai/fullchain.pem

Qwen3.5-122B.yaml

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,8 @@ x-vllm-proxy-common: &vllm-proxy-common
   restart: unless-stopped
   environment:
     - NVIDIA_VISIBLE_DEVICES=all
+    - CLOUD_API_URL=https://cloud-api.near.ai
+    - LOG_FORMAT=json
   logging: *logging-conf

 x-sglang-healthcheck: &sglang-healthcheck

small-models.yaml

Lines changed: 2 additions & 0 deletions
@@ -41,6 +41,8 @@ x-vllm-proxy-common: &vllm-proxy-common
   restart: unless-stopped
   environment:
     - NVIDIA_VISIBLE_DEVICES=all
+    - CLOUD_API_URL=https://cloud-api.near.ai
+    - LOG_FORMAT=json
   logging: *logging-conf

 x-vllm-env:
