Commit 9a10260

Updated: README and test_tokens.py
1 parent 0cdcc02 commit 9a10260

2 files changed, with 98 additions and 81 deletions

testing/maas_billing_tests_independent/tests/README.MD

Lines changed: 39 additions & 58 deletions
@@ -18,7 +18,7 @@ It covers Admin, Free, and Premium and shows how to generate HTML/JUnit reports.
 Create a venv and install deps:
 
 ```bash
-cd maas_billing_tests_independent_v5_full
+cd maas_billing_tests_independent
 python3 -m venv .venv
 source .venv/bin/activate
 pip install -r requirements.txt
@@ -38,32 +38,25 @@ oc login https://api.<cluster>:6443 --token '<your-user-token>'
 oc whoami
 ```
 
-Maas API base URLs:
-
-# Preferred: apps-domain host (TLS + correct Host header)
-APPS=$(oc get ingresses.config/cluster -o jsonpath='{.spec.domain}')
-export MAAS_API_BASE_URL="https://maas-api.${APPS}"
-export USAGE_API_BASE="${MAAS_API_BASE_URL}" # used by usage tests
-
-# Fallback (commented): ELB endpoint + explicit Host header
-# HOST=$(oc -n openshift-ingress get gateway openshift-ai-inference -o jsonpath='{.status.addresses[0].value}')
-# curl -H "Host: maas-api.${APPS}" "http://${HOST}/v1/models"
-
+Get Gateway Endpoint and set base URLs:
 
+```bash
+HOST=$(oc -n openshift-ingress get gateway openshift-ai-inference -o jsonpath='{.status.addresses[0].value}')
+export MAAS_API_BASE_URL="http://${HOST}/maas-api"
+export USAGE_API_BASE="${MAAS_API_BASE_URL}" # used by usage tests
+```
 Export the **current user’s** OpenShift token into `FREE_OC_TOKEN`
 (the tests use this name for “who you are right now”):
 
 ```bash
 export FREE_OC_TOKEN="$(oc whoami -t)"
 ```
-
 Pick a **MODEL_NAME** from the catalog (`id` field):
 
 ```bash
 curl -s -H "Authorization: Bearer ${FREE_OC_TOKEN}" "${MAAS_API_BASE_URL}/v1/models" | jq -r '.data[] | [.id,.name,.url] | @tsv'
 export MODEL_NAME="<paste-id-from-output>" # e.g., facebook-opt-125m-simulated
 ```
-
 ---
 
 ## 2) Configure limits the tests will use
@@ -74,16 +67,16 @@ Read your gateway **RateLimitPolicy** values and export them so the tests know
 what to expect:
 
 ```bash
-# Free (update the jsonpath if your CR layout differs)
+# FREE request-rate burst (per window)
 export RATE_LIMIT_BURST_FREE=$(
-  oc -n openshift-ingress get ratelimitpolicies.gateway.networking.k8s.io gateway-rate-limits \
+  oc -n openshift-ingress get ratelimitpolicies.kuadrant.io gateway-rate-limits \
     -o jsonpath='{.spec.limits.free.rates[0].limit}'
 )
 
 # Premium (optional; only needed for the Free-vs-Premium test)
 export RATE_LIMIT_BURST_PREMIUM=$(
-  oc -n openshift-ingress get ratelimitpolicies.gateway.networking.k8s.io gateway-rate-limits \
-    -o jsonpath='{.spec.limits.premium.rates[0].limit}'
+  oc -n openshift-ingress get ratelimitpolicies.kuadrant.io gateway-rate-limits \
+    -o jsonpath='{.spec.limits.enterprise.rates[0].limit}'
 )
 ```
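
Review note on the hunk above: the two exported values are what the tests compare against, so it may help to see how a test could consume them. The following is a minimal Python sketch, not code from the suite. The helper name `load_burst_limits` and its behaviour for unset variables are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def load_burst_limits(env: Mapping[str, str] = os.environ) -> dict:
    """Read the exported burst limits (hypothetical helper, not from conftest)."""

    def _as_int(name: str) -> Optional[int]:
        raw = env.get(name, "").strip()
        # An empty value may mean the jsonpath did not match the CR layout.
        return int(raw) if raw else None

    free = _as_int("RATE_LIMIT_BURST_FREE")
    premium = _as_int("RATE_LIMIT_BURST_PREMIUM")
    if free is None:
        raise RuntimeError("RATE_LIMIT_BURST_FREE is unset or empty")
    if premium is not None and premium < free:
        # The README expects Premium to be no worse than Free.
        raise ValueError(f"premium burst {premium} is lower than free burst {free}")
    return {"free": free, "premium": premium}
```

Failing fast on an empty `RATE_LIMIT_BURST_FREE` is a design choice of this sketch: it surfaces a jsonpath that no longer matches (for example after the `premium` to `enterprise` key change in this commit) before any burst test runs.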
@@ -108,9 +101,11 @@ The suite assumes **`FREE_OC_TOKEN`** holds the *current* user’s token.
 ### A) Admin (sanity / wiring)
 
 ```bash
-pytest -q tests/test_tokens.py::test_minted_token_is_jwt
-pytest -q tests/test_models_user.py
-pytest -q tests/test_gateway_endpoints.py::test_chat_completion_works
+
+pytest -q testing/maas_billing_tests_independent/tests/test_tokens.py::test_minted_token_is_jwt
+pytest -q testing/maas_billing_tests_independent/tests/test_models_user.py
+pytest -q testing/maas_billing_tests_independent/tests/test_gateway_endpoints.py::test_chat_completion_works
+
 ```
 
 ### B) Free user (authz + request-rate burst + usage)
@@ -120,35 +115,35 @@ pytest -q tests/test_gateway_endpoints.py::test_chat_completion_works
 export FREE_OC_TOKEN="$(oc whoami -t)"
 
 # basics
-pytest -q tests/test_tokens.py::test_minted_token_is_jwt
-pytest -q tests/test_models_user.py
-pytest -q tests/test_gateway_endpoints.py::test_chat_completion_works
+pytest -q testing/maas_billing_tests_independent/tests/test_tokens.py::test_minted_token_is_jwt
+pytest -q testing/maas_billing_tests_independent/tests/test_models_user.py
+pytest -q testing/maas_billing_tests_independent/tests/test_gateway_endpoints.py::test_chat_completion_works
 
 # request-rate burst (expects some 429s after RATE_LIMIT_BURST_FREE)
-pytest -q tests/test_quota_global.py::test_rate_limit_burst
+pytest -q testing/maas_billing_tests_independent/tests/test_quota_global.py::test_rate_limit_burst
 
 # usage (optional; requires USAGE_API_BASE)
-pytest -q tests/test_usage_logs.py
+pytest -q testing/maas_billing_tests_independent/tests/test_usage_logs.py
 ```
 
 #### Token‑rate for Free
 Trigger token-based limiting by making each call expensive in tokens:
 ```bash
 export TOKENS_PER_CALL_LARGE=1200
-pytest -q tests/test_token_ratelimit.py
+pytest -q testing/maas_billing_tests_independent/tests/test_token_ratelimit.py
 ```
 
 #### Interplay for Free — which limiter fires first?
 **Request‑rate first:** many *cheap* calls
 ```bash
 export TOKENS_PER_CALL_SMALL=16
 export BURST_SLEEP=0.05
-pytest -q tests/test_quota_global.py::test_rate_limit_burst
+pytest -q testing/maas_billing_tests_independent/tests/test_quota_global.py::test_rate_limit_burst
 ```
 **Token‑rate first:** few *expensive* calls
 ```bash
 export TOKENS_PER_CALL_LARGE=1200
-pytest -q tests/test_token_ratelimit.py
+pytest -q testing/maas_billing_tests_independent/tests/test_token_ratelimit.py
 ```
 
 ### C) Premium user (same flow + Free-vs-Premium comparison)
@@ -158,31 +153,31 @@ pytest -q tests/test_token_ratelimit.py
 export FREE_OC_TOKEN="$(oc whoami -t)" # current user’s token again
 export PREMIUM_OC_TOKEN="$FREE_OC_TOKEN" # used by the test to mint for premium
 
-pytest -q tests/test_gateway_endpoints.py::test_chat_completion_works
+pytest -q testing/maas_billing_tests_independent/tests/test_gateway_endpoints.py::test_chat_completion_works
 
 # Compare Free vs Premium burst; Premium must not be worse than Free
 # (uses RATE_LIMIT_BURST_FREE / RATE_LIMIT_BURST_PREMIUM)
-pytest -q tests/test_quota_per_user.py::test_free_vs_premium_quota
+pytest -q testing/maas_billing_tests_independent/tests/test_quota_per_user.py::test_free_vs_premium_quota
 ```
 
 #### Token‑rate for Premium
 Run the token limiter test while logged in as your Premium user:
 ```bash
 export TOKENS_PER_CALL_LARGE=1200
-pytest -q tests/test_token_ratelimit.py
+pytest -q testing/maas_billing_tests_independent/tests/test_token_ratelimit.py
 ```
 
 #### Interplay for Premium — which limiter fires first?
 **Request‑rate first:** many *cheap* calls (uses `RATE_LIMIT_BURST_PREMIUM` if you exported it)
 ```bash
 export TOKENS_PER_CALL_SMALL=16
 export BURST_SLEEP=0.05
-pytest -q tests/test_quota_global.py::test_rate_limit_burst
+pytest -q testing/maas_billing_tests_independent/tests/test_quota_global.py::test_rate_limit_burst
 ```
 **Token‑rate first:** few *expensive* calls
 ```bash
 export TOKENS_PER_CALL_LARGE=1200
-pytest -q tests/test_token_ratelimit.py
+pytest -q testing/maas_billing_tests_independent/tests/test_token_ratelimit.py
 ```
 
 ### D) Token-rate (current user – Free **or** Premium)
@@ -191,7 +186,7 @@ If you want to *exercise* token-rate limiting, increase tokens per call to make
 
 ```bash
 export TOKENS_PER_CALL_LARGE=1200 # example value to drive token usage
-pytest -q testing/maas_billing_tests_independent/tests/test_token_ratelimit.py
+pytest -q testing/maas_billing_tests_independent/tests/test_token_ratelimit.py
 ```
 
 ---
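
Review note on the "which limiter fires first" sections: the traffic shaping in the README (many cheap calls versus few expensive calls) can be checked with a little arithmetic. The sketch below is a simplification and not how the gateway is implemented. It assumes both limits share one window, every call costs the same number of tokens, and a call is rejected once prior usage has reached the limit. The function name and the numbers in the usage example are illustrative only.

```python
import math


def first_limiter(request_limit: int, token_limit: int, tokens_per_call: int) -> str:
    """Return which limiter rejects first for a burst of identical calls.

    Assumes a call is rejected once the number of prior calls reaches
    request_limit, or once prior token usage reaches token_limit.
    """
    # Calls admitted before each limiter would start returning 429.
    calls_until_request_limit = request_limit
    calls_until_token_limit = math.ceil(token_limit / tokens_per_call)
    if calls_until_request_limit < calls_until_token_limit:
        return "request-rate"
    if calls_until_token_limit < calls_until_request_limit:
        return "token-rate"
    return "tie"


# Hypothetical policy: 5 requests and 2000 tokens per window.
print(first_limiter(5, 2000, 16))    # cheap calls -> request-rate
print(first_limiter(5, 2000, 1200))  # expensive calls -> token-rate
```

With these made-up limits, `TOKENS_PER_CALL_SMALL=16` lets the request counter run out long before the token budget, while `TOKENS_PER_CALL_LARGE=1200` exhausts the token budget after two calls.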
@@ -206,12 +201,12 @@ By shaping traffic as above (many *cheap* calls vs few *expensive* calls), you c
 ## 4) Reports (HTML & JUnit)
 
 ```bash
-mkdir -p reports
+mkdir -p testing/maas_billing_tests_independent/reports
 
 # Example: run everything for the current user and produce reports
-pytest -q \
-  --html=reports/current.html --self-contained-html \
-  --junitxml=reports/current.xml
+pytest -q testing/maas_billing_tests_independent/tests \
+  --html=testing/maas_billing_tests_independent/reports/current.html --self-contained-html \
+  --junitxml=testing/maas_billing_tests_independent/reports/current.xml
 ```
 
 Open `reports/current.html` in your browser.
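
Review note on the reports step: the JUnit file written by `--junitxml` can also be summarized without a browser. The sketch below is one possible way to do that with the standard library. The function name `summarize_junit` and the inline sample document are assumptions for illustration; a real report carries more attributes and nested `<testcase>` elements.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> dict:
    """Sum the counters on every <testsuite> element of a JUnit XML report."""
    root = ET.fromstring(xml_text)
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    # root.iter() also yields the root itself, so a bare <testsuite> root works too.
    for suite in root.iter("testsuite"):
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    return totals


# Made-up sample standing in for the contents of current.xml.
sample = """<testsuites>
  <testsuite name="pytest" tests="4" failures="1" errors="0" skipped="1"/>
</testsuites>"""
print(summarize_junit(sample))
# prints {'tests': 4, 'failures': 1, 'errors': 0, 'skipped': 1}
```

To use it on a real run, pass the text of `testing/maas_billing_tests_independent/reports/current.xml`, for example via `pathlib.Path(...).read_text()`.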
@@ -235,21 +230,7 @@ Open `reports/current.html` in your browser.
 
 ---
 
-## 6) Troubleshooting
-
-- **401 Unauthorized** – ensure you exported `FREE_OC_TOKEN="$(oc whoami -t)"` in this shell.
-- **404 on chat** – the test already posts to **`<model-url>/v1/chat/completions`**.
-  If you edited anything, make sure you didn’t send to `/maas-api/v1/chat/completions`.
-- **Burst test returns 429 too early** – your exported `RATE_LIMIT_BURST_FREE` is higher than
-  the actual policy. Re-read the CR and export the real value (or lower `N_BURST` if you set it).
-- **Never see 429** – increase `N_BURST` or verify the RateLimitPolicy is **Accepted/Enforced**
-  in the `openshift-ingress` project.
-- **WSL vs PowerShell** – they’re separate shells; log in and re-export vars in whichever one
-  you use to run `pytest`.
-
----
-
-## 7) PowerShell equivalents (Windows)
+## 6) PowerShell equivalents (Windows)
 
 ```powershell
 # venv
@@ -319,10 +300,10 @@ export TOKENS_PER_CALL_SMALL=16
 export BURST_SLEEP=0.05
 
 # run a few
-pytest -q tests/test_tokens.py::test_minted_token_is_jwt
-pytest -q tests/test_models_user.py
-pytest -q tests/test_gateway_endpoints.py::test_chat_completion_works
-pytest -q tests/test_quota_global.py::test_rate_limit_burst
+pytest -q testing/maas_billing_tests_independent/tests/test_tokens.py::test_minted_token_is_jwt
+pytest -q testing/maas_billing_tests_independent/tests/test_models_user.py
+pytest -q testing/maas_billing_tests_independent/tests/test_gateway_endpoints.py::test_chat_completion_works
+pytest -q testing/maas_billing_tests_independent/tests/test_quota_global.py::test_rate_limit_burst
 
 # report
 mkdir -p reports && pytest -q --html=reports/current.html --self-contained-html --junitxml=reports/current.xml

testing/maas_billing_tests_independent/tests/test_tokens.py

Lines changed: 59 additions & 23 deletions
@@ -1,51 +1,87 @@
-from conftest import bearer, parse_usage_headers, USAGE_HEADERS, ensure_free_key, ensure_premium_key
-import os, json, base64
+from conftest import bearer, parse_usage_headers, USAGE_HEADERS, ensure_free_key
+import json, base64
 
 def _b64url_decode(s):
     pad = "=" * (-len(s) % 4)
     return base64.urlsafe_b64decode((s + pad).encode("utf-8"))
 
-def test_minted_token_is_jwt(http, base_url, maas_key):
+def test_minted_token_is_jwt(maas_key):
     parts = maas_key.split(".")
     assert len(parts) == 3
     hdr = json.loads(_b64url_decode(parts[0]).decode("utf-8"))
     assert isinstance(hdr, dict)
 
 def test_tokens_issue_201_and_schema(http, base_url):
-    from conftest import FREE_OC_TOKEN, mint_maas_key
-    key, body, _ = mint_maas_key(http, base_url, FREE_OC_TOKEN, minutes=10)
-    assert isinstance(body, dict) and key and len(key) > 10
+    from conftest import FREE_OC_TOKEN, mint_maas_key, bearer as bh
+    # mint_maas_key returns a single string (the MaaS key)
+    key = mint_maas_key(http, base_url, FREE_OC_TOKEN, minutes=10)
+    assert isinstance(key, str) and len(key) > 10
+    # prove the key works and don’t hang forever
+    r_ok = http.get(f"{base_url}/v1/models", headers=bh(key), timeout=30)
+    assert r_ok.status_code == 200
 
 def test_tokens_invalid_ttl_400(http, base_url):
-    from conftest import FREE_OC_TOKEN, http_post, bearer
+    from conftest import FREE_OC_TOKEN, http_post
     url = f"{base_url}/v1/tokens"
-    code, body, r = http_post(http, url, headers=bearer(FREE_OC_TOKEN), json={"ttl":"4hours"})
+    code, body, r = http_post(
+        http,
+        url,
+        headers=bearer(FREE_OC_TOKEN),
+        json={"expiration": "4hours"},
+        timeout=30, # add timeout so it can’t hang
+    )
     assert code == 400
 
 def test_tokens_models_happy_then_revoked_fails(http, base_url, model_name):
-    from conftest import FREE_OC_TOKEN, mint_maas_key, revoke_maas_key, bearer as bh
-    key, _, _ = mint_maas_key(http, base_url, FREE_OC_TOKEN, minutes=10)
-    r_ok = http.get(f"{base_url}/v1/models", headers=bh(key))
-    assert r_ok.status_code == 200
+    from conftest import FREE_OC_TOKEN, mint_maas_key, revoke_maas_key, bearer
+
+    # 1) Mint a MaaS key from the current OC user token
+    key = mint_maas_key(http, base_url, FREE_OC_TOKEN, minutes=10)
+
+    # 2) Discover the model URL
+    models = http.get(f"{base_url}/v1/models", headers=bearer(key), timeout=30).json()
+    items = models.get("data") or models.get("models") or []
+    target = next((m for m in items if m.get("id")==model_name or m.get("name")==model_name), None)
+    assert target and target.get("url"), "model not found or missing url"
+    murl = target["url"]
+
+    payload = {"model": model_name,
+               "messages":[{"role":"user","content":"hi"}],
+               "max_tokens": 32}
 
+    # 3) Works before revoke
+    r_ok = http.post(f"{murl}/v1/chat/completions", headers=bearer(key), json=payload, timeout=60)
+    assert r_ok.status_code in (200, 201)
+
+    # 4) Revoke the key
     r_del = revoke_maas_key(http, base_url, FREE_OC_TOKEN, key)
-    assert r_del.status_code in (200,202,204)
+    assert r_del.status_code in (200, 202, 204)
 
-    r_again = http.get(f"{base_url}/v1/models", headers=bh(key))
-    assert r_again.status_code in (401,403)
+    # 5) Fails after revoke
+    r_bad = http.post(f"{murl}/v1/chat/completions", headers=bearer(key), json=payload, timeout=60)
+    assert r_bad.status_code in (401, 403)
 
 def test_usage_headers_present(http, base_url, model_name):
+    from conftest import bearer, ensure_free_key, parse_usage_headers
+
     key = ensure_free_key(http)
+
+    # discover model URL
+    models = http.get(f"{base_url}/v1/models", headers=bearer(key), timeout=30).json()
+    items = models.get("data") or models.get("models") or []
+    target = next((m for m in items if m.get("id")==model_name or m.get("name")==model_name), None)
+    assert target and target.get("url"), "model not found or missing url"
+    murl = target["url"]
+
     r = http.post(
-        f"{base_url}/v1/chat/completions",
+        f"{murl}/v1/chat/completions",
         headers=bearer(key),
-        json={
-            "model": model_name,
-            "messages": [{"role":"user","content":"Say hi"}],
-            "temperature": 0,
-        },
+        json={"model": model_name, "messages":[{"role":"user","content":"Say hi"}], "temperature":0},
         timeout=60,
     )
-    assert r.status_code in (200,201), f"unexpected {r.status_code}: {r.text[:200]}"
+    assert r.status_code in (200, 201), f"unexpected {r.status_code}: {r.text[:200]}"
+
     usage = parse_usage_headers(r)
-    assert any(h in usage for h in USAGE_HEADERS), f"No usage headers: {dict(r.headers)}"
+    # assert presence and non-negative total
+    assert "x-odhu-usage-total-tokens" in usage, f"No usage headers present: {dict(r.headers)}"
+    assert int(usage["x-odhu-usage-total-tokens"]) >= 0
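
Review note on test_tokens.py: the lookup that turns `model_name` into a model URL appears twice in this diff with identical logic. One way to factor it out is sketched below. `find_model_url` is a hypothetical helper, not something defined in `conftest`, and the sample catalog (including its URL) is made up for the example.

```python
from typing import Any, Mapping, Optional


def find_model_url(catalog: Mapping[str, Any], model_name: str) -> Optional[str]:
    """Return the URL of the catalog entry whose id or name equals model_name."""
    # The catalog may list models under "data" or "models", as in the diff.
    items = catalog.get("data") or catalog.get("models") or []
    for entry in items:
        if entry.get("id") == model_name or entry.get("name") == model_name:
            # Treat an empty URL the same as a missing one.
            return entry.get("url") or None
    return None


# Made-up catalog in the shape the tests read from /v1/models.
catalog = {
    "data": [
        {"id": "facebook-opt-125m-simulated", "url": "https://models.example.invalid/opt"},
        {"id": "no-url-model"},
    ]
}
print(find_model_url(catalog, "facebook-opt-125m-simulated"))
```

A test could then write `murl = find_model_url(models, model_name)` followed by `assert murl, "model not found or missing url"`, keeping the failure message used in the diff.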
