Commit 34b5a97

Author: yingying0906
Merge upstream/main into pr/qwen3vl-multitile-batching
2 parents 7744040 + ce2e3d0 commit 34b5a97

149 files changed

Lines changed: 10067 additions & 3577 deletions

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+---
+description: QVAC coding-agent integration stack — ai-sdk-provider, OpenCode plugin, CLI serve, models.dev, docs, layer ownership, releases
+globs:
+  - packages/ai-sdk-provider/**
+  - plugins/opencode/**
+  - packages/cli/src/serve/**
+  - packages/cli/docs/serve-openai.md
+  - docs/website/content/docs/cli/http-server/**
+  - docs/architecture/AGENT-INTEGRATIONS.md
+alwaysApply: false
+---
+
+When working on QVAC coding-agent integrations, read `docs/architecture/AGENT-INTEGRATIONS.md` first. It is the detailed repo reference for `@qvac/ai-sdk-provider`, `@qvac/opencode-plugin`, `qvac serve openai`, QVAC docs, and models.dev provider metadata.
+
+## Package Stack
+
+```text
+OpenCode / coding agent
+  -> @qvac/opencode-plugin # OpenCode-specific turnkey UX
+    -> @qvac/ai-sdk-provider managed mode # spawn/reuse local qvac serve
+      -> @qvac/cli qvac serve openai # OpenAI-compatible HTTP adapter
+        -> @qvac/sdk # model loading, inference, registry
+```
+
+Manual/custom-provider integrations skip the plugin and point directly at `qvac serve openai`.
+
+## Layer Ownership
+
+- `packages/sdk`: core inference semantics, model loading, registry downloads, tool-call parsers/dialects, cancellation primitives, structured errors, generated model constants.
+- `packages/cli/src/serve`: OpenAI-compatible HTTP API, route validation, request/response translation, same-model queueing, client-disconnect cancellation, CORS/auth/OpenAPI, generic OpenAI-client compatibility.
+- `packages/ai-sdk-provider`: Vercel AI SDK provider wrapper, `createQvac`, external/managed modes, managed serve lifecycle/reuse, friendly model catalog (`qvacCatalog`), typed model metadata exports.
+- `plugins/opencode`: OpenCode plugin hooks, provider injection, project default model selection, host process, OpenCode startup/TUI behavior, temporary OpenCode/OpenAI-compatible shims.
+- `models.dev`: external provider/model metadata only. Do not encode QVAC runtime behavior there.
+- QVAC docs/READMEs: plugin-first OpenCode setup, manual server setup as advanced/custom-provider path, model selection, release-relevant behavior.
+
+Keep changes in the lowest correct layer. If a plugin shim becomes generally true for all OpenAI-compatible clients, move it down to CLI serve and remove it from the plugin.
+
+## Model Naming
+
+There are three model naming layers:
+
+- Friendly id: `qwen3.5-9b` (`@qvac/ai-sdk-provider` `qvacCatalog`, mirrored in models.dev).
+- SDK constant: `GPT_OSS_20B_INST_Q4_K_M` / `GEMMA4_31B_MULTIMODAL_Q4_K_M` (generated SDK model constants).
+- Serve alias: HTTP `model` value used by clients; managed mode can use the friendly id or raw constant as the alias.
+
+`@qvac/opencode-plugin` accepts both friendly ids and raw QVAC chat-model constants. Do not document the plugin as if only Qwen3.5 friendly ids are usable.
+
+## OpenCode Documentation Rules
+
+- Lead with `@qvac/opencode-plugin`.
+- Manual `qvac serve openai` and custom provider JSON are advanced paths.
+- Avoid "no provider block / no second terminal / no QVAC_MODEL prefix" framing. State positive behavior: the plugin starts managed QVAC serve, registers `qvac`, and selects the project model.
+- Mention stronger raw constants such as `GPT_OSS_20B_INST_Q4_K_M` when recommending agent-capable local models.
+- Keep npm README expectations clear: npmjs.com only updates after publishing a new package version.
+
+## Release Guidance
+
+Release dependency order for multi-layer changes:
+
+1. `@qvac/sdk`
+2. `@qvac/cli`
+3. `@qvac/ai-sdk-provider`
+4. `@qvac/opencode-plugin`
+5. docs / models.dev as appropriate
+
+If a fix is transitive through caret ranges, verify with a fresh install before deciding whether upper packages need re-release.
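
The release order in the rule file above can be sketched as a small planning helper. This is an illustration only: `planRelease` and its return shape are hypothetical and not part of the repository. The package list is copied from the numbered list in the rule file; the "docs / models.dev" step is left out because it is not a package in that list.

```javascript
// Release order copied from the "Release Guidance" list (lowest layer first).
const RELEASE_ORDER = [
  '@qvac/sdk',
  '@qvac/cli',
  '@qvac/ai-sdk-provider',
  '@qvac/opencode-plugin',
];

// Hypothetical helper, not part of the repository.
// Given the packages that actually changed, return:
//   release: the changed packages, sorted into dependency order
//   verifyWithFreshInstall: unchanged packages above the lowest changed
//     layer, which may pick the fix up transitively through caret ranges
function planRelease(changedPackages) {
  const changed = new Set(changedPackages);
  const unknown = [...changed].filter((name) => !RELEASE_ORDER.includes(name));
  if (unknown.length > 0) {
    throw new Error(`Unknown package(s): ${unknown.join(', ')}`);
  }

  const release = RELEASE_ORDER.filter((name) => changed.has(name));
  if (release.length === 0) {
    return { release, verifyWithFreshInstall: [] };
  }

  const lowestIndex = RELEASE_ORDER.indexOf(release[0]);
  const verifyWithFreshInstall = RELEASE_ORDER
    .slice(lowestIndex + 1)
    .filter((name) => !changed.has(name));

  return { release, verifyWithFreshInstall };
}

console.log(planRelease(['@qvac/opencode-plugin', '@qvac/sdk']));
```

The second list is only a checklist of packages to test with a fresh install. Whether any of them needs a re-release is still the judgment call the rule file describes.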
Lines changed: 241 additions & 0 deletions
@@ -0,0 +1,241 @@
+name: SDK Device Farm - Wait For Running
+description: Wait for a scheduled SDK Device Farm run to enter RUNNING before starting the producer.
+
+inputs:
+  platform:
+    description: "Human-readable platform name for logs (android or ios)."
+    required: true
+  run-arns-directory:
+    description: "Directory containing one .txt file per scheduled Device Farm run ARN."
+    required: true
+  start-timeout-minutes:
+    description: "Maximum minutes to wait for at least one run to enter RUNNING."
+    required: false
+    default: "90"
+  poll-interval-seconds:
+    description: "Seconds between get-run polling attempts."
+    required: false
+    default: "15"
+
+runs:
+  using: composite
+  steps:
+    - name: Wait for Device Farm runs
+      shell: node {0}
+      env:
+        PLATFORM: ${{ inputs.platform }}
+        RUN_ARNS_DIRECTORY: ${{ inputs.run-arns-directory }}
+        START_TIMEOUT_MINUTES: ${{ inputs.start-timeout-minutes }}
+        POLL_INTERVAL_SECONDS: ${{ inputs.poll-interval-seconds }}
+      run: |
+        const fs = require('fs');
+        const path = require('path');
+        const { execFileSync } = require('child_process');
+
+        const pendingStatuses = new Set([
+          'PENDING',
+          'PENDING_CONCURRENCY',
+          'PENDING_DEVICE',
+          'PROCESSING',
+          'SCHEDULING',
+          'PREPARING',
+          'UNKNOWN',
+        ]);
+        const terminalStatuses = new Set([
+          'COMPLETED',
+          'STOPPED',
+          'ERRORED',
+          'FAILED',
+        ]);
+
+        function parsePositiveInteger(rawValue, fallback, label) {
+          const parsed = Number.parseInt(String(rawValue ?? ''), 10);
+          if (Number.isFinite(parsed) && parsed > 0) {
+            return parsed;
+          }
+          console.log(`[device-farm-wait] Invalid ${label}=${rawValue}; using ${fallback}`);
+          return fallback;
+        }
+
+        function sleep(ms) {
+          return new Promise((resolve) => setTimeout(resolve, ms));
+        }
+
+        function appendSummary(line) {
+          if (!process.env.GITHUB_STEP_SUMMARY) {
+            return;
+          }
+          fs.appendFileSync(process.env.GITHUB_STEP_SUMMARY, `${line}\n`, 'utf8');
+        }
+
+        function readRunArns(directory) {
+          if (!fs.existsSync(directory)) {
+            throw new Error(`Run ARN directory does not exist: ${directory}`);
+          }
+
+          const files = fs.readdirSync(directory)
+            .filter((fileName) => fileName.endsWith('.txt'))
+            .sort();
+
+          const runs = [];
+          for (const fileName of files) {
+            const filePath = path.join(directory, fileName);
+            const arn = fs.readFileSync(filePath, 'utf8').trim();
+            if (!arn) {
+              console.log(`[device-farm-wait] Ignoring empty ARN file: ${filePath}`);
+              continue;
+            }
+            runs.push({
+              name: path.basename(fileName, '.txt'),
+              arn,
+              status: 'UNKNOWN',
+              result: 'UNKNOWN',
+              message: '',
+              finishedBeforeProducer: false,
+            });
+          }
+
+          if (runs.length === 0) {
+            throw new Error(`No Device Farm run ARN files found in ${directory}`);
+          }
+
+          return runs;
+        }
+
+        function awsJson(args) {
+          const stdout = execFileSync('aws', args, {
+            encoding: 'utf8',
+            stdio: ['ignore', 'pipe', 'pipe'],
+          });
+          return JSON.parse(stdout);
+        }
+
+        function getRun(run) {
+          const response = awsJson([
+            'devicefarm',
+            'get-run',
+            '--arn',
+            run.arn,
+            '--query',
+            'run',
+            '--output',
+            'json',
+          ]);
+
+          return {
+            status: response.status ?? 'UNKNOWN',
+            result: response.result ?? 'UNKNOWN',
+            message: response.message ?? '',
+          };
+        }
+
+        function stopRun(run, reason) {
+          if (terminalStatuses.has(run.status)) {
+            console.log(
+              `[device-farm-wait] Not stopping ${run.name}; status=${run.status} reason=${reason}`,
+            );
+            return;
+          }
+
+          try {
+            console.log(`[device-farm-wait] Stopping ${run.name}: status=${run.status} reason=${reason}`);
+            execFileSync('aws', ['devicefarm', 'stop-run', '--arn', run.arn], {
+              encoding: 'utf8',
+              stdio: ['ignore', 'pipe', 'pipe'],
+            });
+          } catch (error) {
+            const stderr = error.stderr ? String(error.stderr).trim() : '';
+            console.log(
+              `[device-farm-wait] stop-run failed for ${run.name}: ${error.message}${stderr ? ` stderr=${stderr}` : ''}`,
+            );
+          }
+        }
+
+        function stopActiveRuns(runs, reason) {
+          for (const run of runs) {
+            stopRun(run, reason);
+          }
+        }
+
+        function statusSummary(runs) {
+          return runs
+            .map((run) => `${run.name}:${run.status}${run.result !== 'UNKNOWN' ? `/${run.result}` : ''}`)
+            .join(', ');
+        }
+
+        async function main() {
+          const platform = process.env.PLATFORM || 'mobile';
+          const runArnsDirectory = process.env.RUN_ARNS_DIRECTORY || './run-arns';
+          const timeoutMinutes = parsePositiveInteger(process.env.START_TIMEOUT_MINUTES, 90, 'START_TIMEOUT_MINUTES');
+          const pollIntervalSeconds = parsePositiveInteger(process.env.POLL_INTERVAL_SECONDS, 15, 'POLL_INTERVAL_SECONDS');
+          const timeoutMs = timeoutMinutes * 60 * 1000;
+          const pollIntervalMs = pollIntervalSeconds * 1000;
+          const startedAt = Date.now();
+          const runs = readRunArns(runArnsDirectory);
+
+          console.log(`[device-farm-wait] Waiting for ${runs.length} ${platform} Device Farm run(s); producer starts when the first run enters RUNNING`);
+          console.log(`[device-farm-wait] timeout=${timeoutMinutes}m pollInterval=${pollIntervalSeconds}s`);
+          appendSummary(`### ${platform} Device Farm readiness`);
+          appendSummary(`Waiting up to ${timeoutMinutes} minute(s) for at least one of ${runs.length} run(s) to enter \`RUNNING\`.`);
+
+          while (true) {
+            for (const run of runs) {
+              try {
+                const latest = getRun(run);
+                run.status = latest.status;
+                run.result = latest.result;
+                run.message = latest.message;
+              } catch (error) {
+                console.log(`[device-farm-wait] get-run failed for ${run.name}: ${error.message}`);
+                run.result = 'UNKNOWN';
+                run.message = error.message;
+              }
+            }
+
+            const elapsedSeconds = Math.round((Date.now() - startedAt) / 1000);
+            console.log(`[device-farm-wait] ${elapsedSeconds}s status: ${statusSummary(runs)}`);
+
+            const runningRun = runs.find((run) => run.status === 'RUNNING');
+            if (runningRun) {
+              console.log(`[device-farm-wait] ${runningRun.name} entered RUNNING after ${elapsedSeconds}s; starting producer`);
+              appendSummary(`${runningRun.name} reached \`RUNNING\` after ${elapsedSeconds}s; producer can start.`);
+              return;
+            }
+
+            for (const run of runs) {
+              if (terminalStatuses.has(run.status) && !run.finishedBeforeProducer) {
+                const message = run.message ? ` message=${run.message}` : '';
+                run.finishedBeforeProducer = true;
+                console.log(
+                  `[device-farm-wait] ${run.name} reached ${run.status} before producer start; result=${run.result}${message}`,
+                );
+              }
+            }
+
+            const hasPendingRun = runs.some((run) => pendingStatuses.has(run.status));
+            if (!hasPendingRun) {
+              console.log(
+                `[device-farm-wait] All Device Farm runs finished before any RUNNING state was observed: ${statusSummary(runs)}`,
+              );
+              appendSummary(`Device Farm readiness failed: all runs finished before a \`RUNNING\` state was observed.`);
+              process.exit(1);
+            }
+
+            const elapsedMs = Date.now() - startedAt;
+            if (elapsedMs >= timeoutMs) {
+              console.log(
+                `[device-farm-wait] Timed out after ${timeoutMinutes}m before any run entered RUNNING: ${statusSummary(runs)}`,
+              );
+              stopActiveRuns(runs, `did not enter RUNNING within ${timeoutMinutes}m`);
+              appendSummary(`Device Farm readiness timed out after ${timeoutMinutes} minute(s): ${statusSummary(runs)}.`);
+              process.exit(1);
+            }
+
+            await sleep(pollIntervalMs);
+          }
+        }
+
+        main().catch((error) => {
+          console.error(`[device-farm-wait] ${error.stack || error.message}`);
+          process.exit(1);
+        });
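
The order of checks inside the polling loop of the action above can be isolated into a pure function, which would make the decision logic testable without calling the AWS CLI. The following is a minimal sketch under that assumption: `decideNextStep` and its return labels are hypothetical names, not part of the action, and the pending set is copied from the script.

```javascript
// Same "still waiting" statuses as the action's inline script.
const pendingStatuses = new Set([
  'PENDING',
  'PENDING_CONCURRENCY',
  'PENDING_DEVICE',
  'PROCESSING',
  'SCHEDULING',
  'PREPARING',
  'UNKNOWN',
]);

// Hypothetical helper mirroring the loop's check order:
//   1. any run RUNNING            -> the producer can start
//   2. no run left in a pending   -> every run ended early, fail
//      status
//   3. the timeout has elapsed    -> stop runs and fail
//   4. otherwise                  -> sleep and poll again
function decideNextStep(runs, elapsedMs, timeoutMs) {
  if (runs.some((run) => run.status === 'RUNNING')) {
    return 'start-producer';
  }
  if (!runs.some((run) => pendingStatuses.has(run.status))) {
    return 'fail-all-finished';
  }
  if (elapsedMs >= timeoutMs) {
    return 'fail-timeout';
  }
  return 'keep-polling';
}

console.log(decideNextStep([{ status: 'PENDING' }, { status: 'RUNNING' }], 0, 1000));
console.log(decideNextStep([{ status: 'SCHEDULING' }], 10, 1000));
```

Because the RUNNING check comes first, a run that reaches RUNNING on the same poll in which the timeout expires still starts the producer, which matches the script's behavior.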

.github/sdk-pod-checks.json

Lines changed: 8 additions & 0 deletions
@@ -6,6 +6,14 @@
     "needs_bare": true,
     "tests_bare": true
   },
+  {
+    "package": "bare-sdk",
+    "path": "packages/bare-sdk",
+    "pkg_manager": "bun",
+    "needs_bare": true,
+    "tests_bare": true,
+    "sdk_sources": ["workspace"]
+  },
   {
     "package": "cli",
     "path": "packages/cli",

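The `bare-sdk` entry added to `.github/sdk-pod-checks.json` suggests a simple per-package shape. The sketch below checks an entry against that shape. The field types and which fields are required are inferred from this hunk alone and are assumptions, as is treating `sdk_sources` as optional; `validatePodCheck` is a hypothetical helper, not something the repository is known to contain.

```javascript
// Hypothetical validator; the rules are inferred from the diff hunk only.
function validatePodCheck(entry) {
  const problems = [];

  // String fields seen in the added entry.
  for (const key of ['package', 'path', 'pkg_manager']) {
    if (typeof entry[key] !== 'string' || entry[key] === '') {
      problems.push(`${key} must be a non-empty string`);
    }
  }

  // Boolean flags seen in the added entry.
  for (const key of ['needs_bare', 'tests_bare']) {
    if (typeof entry[key] !== 'boolean') {
      problems.push(`${key} must be a boolean`);
    }
  }

  // Assumption: sdk_sources is optional and, when present, a list of strings.
  if (entry.sdk_sources !== undefined) {
    const isStringList = Array.isArray(entry.sdk_sources)
      && entry.sdk_sources.every((source) => typeof source === 'string');
    if (!isStringList) {
      problems.push('sdk_sources must be an array of strings when present');
    }
  }

  return problems;
}

const bareSdk = {
  package: 'bare-sdk',
  path: 'packages/bare-sdk',
  pkg_manager: 'bun',
  needs_bare: true,
  tests_bare: true,
  sdk_sources: ['workspace'],
};
console.log(validatePodCheck(bareSdk));
```

An empty array means the entry matches the assumed shape. Returning a list of problems rather than throwing lets a caller report every issue in one pass.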
.github/workflows/on-merge-transcription-whispercpp.yml

Lines changed: 0 additions & 22 deletions
@@ -337,25 +337,3 @@ jobs:
         with:
           repository: ${{ github.repository }}
           ref: ${{ github.sha }}
-
-  benchmark:
-    name: Trigger Benchmark (Whispercpp)
-    runs-on: ubuntu-latest
-    environment: release
-    needs:
-      - post-build-gate
-      - label-gate
-    if: "needs.label-gate.outputs.authorised == 'true' && (!cancelled() && needs.post-build-gate.outputs.should_run_tests == 'true')"
-    steps:
-      - name: Trigger benchmark workflow
-        env:
-          GH_TOKEN: ${{ secrets.PAT_TOKEN }}
-        run: |
-          gh workflow run benchmark-transcription-whispercpp.yml \
-            --repo ${{ github.repository }} \
-            --ref ${{ github.ref_name }} \
-            -f dataset_type=librispeech \
-            -f language=english \
-            -f model_size=tiny \
-            -f streaming_mode=false \
-            -f workdir=packages/transcription-whispercpp
