-
Notifications
You must be signed in to change notification settings - Fork 33
Expand file tree
/
Copy pathREQUIREMENTS.html
More file actions
59 lines (59 loc) · 3.26 KB
/
REQUIREMENTS.html
File metadata and controls
59 lines (59 loc) · 3.26 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Distill Local Model Requirements</title>
<style>
body { font-family: system-ui, sans-serif; margin: 24px; line-height: 1.4; color: #17202a; }
header, section { max-width: 980px; margin: 0 auto 24px; }
.grid { display: grid; grid-template-columns: repeat(auto-fit, minmax(220px, 1fr)); gap: 12px; }
.card { border: 1px solid #d7dde5; border-radius: 8px; padding: 14px; background: #fbfcfe; }
.ok { border-left: 4px solid #217346; }
.risk { border-left: 4px solid #b25e09; }
table { border-collapse: collapse; width: 100%; }
th, td { border: 1px solid #d7dde5; padding: 8px; text-align: left; vertical-align: top; }
th { background: #f0f4f8; }
code { background: #eef2f7; padding: 1px 4px; border-radius: 4px; }
</style>
</head>
<body>
<header>
<h1>Distill Local Model Onboarding</h1>
<p><strong>TL;DR:</strong> make Distill default to its own local model, start the local OpenAI-compatible server on demand, and keep external APIs available through explicit config.</p>
</header>
<section class="grid">
<div class="card ok"><strong>Default</strong><br>Provider defaults to <code>local</code> for fresh and existing config resolution.</div>
<div class="card ok"><strong>Mac</strong><br>Apple Silicon uses MLX model <code>samuelfaj/distill-1.7B-4bit-MLX</code>.</div>
<div class="card ok"><strong>Other OS</strong><br>Linux, Windows, and non-Apple-Silicon Mac use llama.cpp GGUF.</div>
<div class="card risk"><strong>Concurrency</strong><br>Default max concurrent local requests is <code>5</code>, configurable.</div>
</section>
<section>
<h2>Acceptance Criteria</h2>
<table>
<tr><th>Requirement</th><th>Status</th><th>Proof Target</th></tr>
<tr><td>Onboarding offers local model and external API.</td><td>Confirmed</td><td>CLI entry test</td></tr>
<tr><td>Local model is default.</td><td>Confirmed</td><td>Config default test</td></tr>
<tr><td>External API still works when explicitly selected.</td><td>Confirmed</td><td>Config + LLM test</td></tr>
<tr><td>Local server starts before local LLM calls.</td><td>Confirmed</td><td>LLM/local-server tests</td></tr>
<tr><td>Runtime auto-install attempts when missing.</td><td>Confirmed</td><td>Local-server install tests</td></tr>
<tr><td>Mac uses MLX; Linux/Windows use llama.cpp.</td><td>Confirmed</td><td>Backend selection tests</td></tr>
<tr><td>Concurrency defaults to 5 and is configurable.</td><td>Confirmed</td><td>Config + args tests</td></tr>
</table>
</section>
<section>
<h2>Scope</h2>
<div class="grid">
<div class="card"><strong>In scope</strong><br>Config keys, env overrides, onboarding, local server manager, runtime command construction, unit/integration tests.</div>
<div class="card"><strong>Out of scope</strong><br>Bundling model weights, adding a UI, production service management, changing prompt rules.</div>
</div>
</section>
<section>
<h2>Decision Log</h2>
<ul>
<li>User chose MLX on Mac and llama.cpp elsewhere.</li>
<li>User chose auto-install when runtime is missing.</li>
<li>User chose forcing local default unless <code>provider=external</code>.</li>
</ul>
</section>
</body>
</html>