| 1 | +# LLM Research Profile Implementation Plan |
| 2 | + |
| 3 | +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. |
| 4 | + |
| 5 | +**Goal:** Make the tested LLM provider order the default for discovery, rule promotion, and cross-platform verification without committing secrets or changing trading behavior. |
| 6 | + |
| 7 | +**Architecture:** Add one shell profile loader that fills missing `OPENAI_*` provider settings from the benchmark-derived profile. Source it from the three mainline LLM scripts after `.env.local` loads and before they snapshot provider variables. Keep all keys external and preserve explicit user overrides unless force mode is enabled. |
| 8 | + |
| 9 | +**Tech Stack:** Bash shell scripts, Python `pytest`, existing OpenAI-compatible client in `poly_strategy/openai_rules.py`. |
| 10 | + |
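
The override rule described under **Architecture** can be modeled in a few lines of Python. This is a minimal sketch for reasoning about expected values, not the loader itself: `fill_missing` and its `force` argument are hypothetical names, and the sketch assumes that an empty value counts as missing.

```python
def fill_missing(env, defaults, force=False):
    """Return a copy of env with profile defaults applied.

    A default is written only when the variable is unset or empty,
    unless force is true, in which case the default always wins.
    """
    merged = dict(env)
    for name, value in defaults.items():
        if force or not merged.get(name):
            merged[name] = value
    return merged


# Illustrative inputs: one explicit override, two profile defaults.
defaults = {"OPENAI_MODEL": "deepseek-v3-2-251201", "OPENAI_API_MODE": "messages"}
explicit = {"OPENAI_MODEL": "manual-model"}

kept = fill_missing(explicit, defaults)                 # override survives
forced = fill_missing(explicit, defaults, force=True)   # override replaced
```

Without `force`, only the missing `OPENAI_API_MODE` is filled in; with `force`, both variables take the profile values.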
| 11 | +--- |
| 12 | + |
| 13 | +## Files |
| 14 | + |
| 15 | +- Create: `scripts/load_llm_research_profile.sh` |
| 16 | + - Single responsibility: export benchmark-derived provider defaults. |
| 17 | +- Create: `tests/test_llm_research_profile.py` |
| 18 | + - Single responsibility: run the loader in controlled shell environments and assert exported variables. |
| 19 | +- Modify: `scripts/refresh_discovery_watchlist.sh` |
| 20 | + - Source loader after `.env.local` and proxy setup, before provider variables are copied. |
| 21 | +- Modify: `scripts/run_rule_promotion_once.sh` |
| 22 | + - Same loader source position. |
| 23 | +- Modify: `scripts/run_cross_platform_scan_once.sh` |
| 24 | + - Same loader source position before cross-platform verifier command. |
| 25 | + |
| 26 | +## Task 1: Add Loader Tests First |
| 27 | + |
| 28 | +**Files:** |
| 29 | +- Create: `tests/test_llm_research_profile.py` |
| 30 | + |
| 31 | +- [ ] **Step 1: Write failing tests** |
| 32 | + |
| 33 | +Create `tests/test_llm_research_profile.py`: |
| 34 | + |
| 35 | +```python |
| 36 | +import os |
| 37 | +import subprocess |
| 38 | +from pathlib import Path |
| 39 | + |
| 40 | + |
| 41 | +ROOT = Path(__file__).resolve().parents[1] |
| 42 | +LOADER = ROOT / "scripts" / "load_llm_research_profile.sh" |
| 43 | + |
| 44 | + |
| 45 | +def run_loader(env): |
| 46 | + command = ( |
| 47 | + "set -euo pipefail; " |
| 48 | +        f"source '{LOADER}'; " |
| 49 | + "printf 'OPENAI_MODEL=%s\n' \"${OPENAI_MODEL-}\"; " |
| 50 | + "printf 'OPENAI_BASE_URL=%s\n' \"${OPENAI_BASE_URL-}\"; " |
| 51 | + "printf 'OPENAI_API_MODE=%s\n' \"${OPENAI_API_MODE-}\"; " |
| 52 | + "printf 'OPENAI_SECONDARY_MODEL=%s\n' \"${OPENAI_SECONDARY_MODEL-}\"; " |
| 53 | + "printf 'OPENAI_SECONDARY_BASE_URL=%s\n' \"${OPENAI_SECONDARY_BASE_URL-}\"; " |
| 54 | + "printf 'OPENAI_SECONDARY_API_MODE=%s\n' \"${OPENAI_SECONDARY_API_MODE-}\"; " |
| 55 | + "printf 'OPENAI_BACKUP_MODEL=%s\n' \"${OPENAI_BACKUP_MODEL-}\"; " |
| 56 | + "printf 'OPENAI_BACKUP_BASE_URL=%s\n' \"${OPENAI_BACKUP_BASE_URL-}\"; " |
| 57 | + "printf 'OPENAI_BACKUP_API_MODE=%s\n' \"${OPENAI_BACKUP_API_MODE-}\"; " |
| 58 | + "printf 'OPENAI_FALLBACK_MODEL=%s\n' \"${OPENAI_FALLBACK_MODEL-}\"; " |
| 59 | + "printf 'OPENAI_FALLBACK_BASE_URL=%s\n' \"${OPENAI_FALLBACK_BASE_URL-}\"; " |
| 60 | + "printf 'OPENAI_FALLBACK_API_MODE=%s\n' \"${OPENAI_FALLBACK_API_MODE-}\"" |
| 61 | + ) |
| 62 | + clean_env = { |
| 63 | + "PATH": os.environ.get("PATH", ""), |
| 64 | + "OPENAI_API_KEY": "primary-key", |
| 65 | + "OPENAI_SECONDARY_API_KEY": "secondary-key", |
| 66 | + "OPENAI_BACKUP_API_KEY": "backup-key", |
| 67 | + "OPENAI_FALLBACK_API_KEY": "fallback-key", |
| 68 | + **env, |
| 69 | + } |
| 70 | + result = subprocess.run( |
| 71 | + ["/bin/bash", "-lc", command], |
| 72 | + cwd=ROOT, |
| 73 | + env=clean_env, |
| 74 | + text=True, |
| 75 | + capture_output=True, |
| 76 | + check=True, |
| 77 | + ) |
| 78 | + return dict(line.split("=", 1) for line in result.stdout.strip().splitlines()) |
| 79 | + |
| 80 | + |
| 81 | +def test_balanced_profile_exports_benchmark_provider_order(): |
| 82 | + values = run_loader({}) |
| 83 | + |
| 84 | + assert values["OPENAI_MODEL"] == "deepseek-v3-2-251201" |
| 85 | + assert values["OPENAI_BASE_URL"] == "https://windhub.cc/v1" |
| 86 | + assert values["OPENAI_API_MODE"] == "messages" |
| 87 | + assert values["OPENAI_SECONDARY_MODEL"] == "gemini-3.1-pro-preview" |
| 88 | + assert values["OPENAI_SECONDARY_BASE_URL"] == "https://api.xn--chy-js0fk50c.top/v1" |
| 89 | + assert values["OPENAI_SECONDARY_API_MODE"] == "chat" |
| 90 | + assert values["OPENAI_BACKUP_MODEL"] == "longcat-flash-chat" |
| 91 | + assert values["OPENAI_BACKUP_BASE_URL"] == "https://elysiver.h-e.top/v1" |
| 92 | + assert values["OPENAI_BACKUP_API_MODE"] == "chat" |
| 93 | + assert values["OPENAI_FALLBACK_MODEL"] == "gpt-5.4" |
| 94 | + assert values["OPENAI_FALLBACK_BASE_URL"] == "https://api.wwcloud.app" |
| 95 | + assert values["OPENAI_FALLBACK_API_MODE"] == "responses" |
| 96 | + |
| 97 | + |
| 98 | +def test_semantic_profile_uses_high_recall_primary_only(): |
| 99 | + values = run_loader({"LLM_RESEARCH_PROFILE": "semantic"}) |
| 100 | + |
| 101 | + assert values["OPENAI_MODEL"] == "doubao-seed-1-8-251228" |
| 102 | + assert values["OPENAI_BASE_URL"] == "https://windhub.cc/v1" |
| 103 | + assert values["OPENAI_API_MODE"] == "messages" |
| 104 | + assert values["OPENAI_BACKUP_MODEL"] == "longcat-flash-chat" |
| 105 | + |
| 106 | + |
| 107 | +def test_loader_preserves_explicit_values_without_force(): |
| 108 | + values = run_loader( |
| 109 | + { |
| 110 | + "OPENAI_MODEL": "manual-model", |
| 111 | + "OPENAI_BASE_URL": "https://manual.example/v1", |
| 112 | + "OPENAI_API_MODE": "chat", |
| 113 | + } |
| 114 | + ) |
| 115 | + |
| 116 | + assert values["OPENAI_MODEL"] == "manual-model" |
| 117 | + assert values["OPENAI_BASE_URL"] == "https://manual.example/v1" |
| 118 | + assert values["OPENAI_API_MODE"] == "chat" |
| 119 | + |
| 120 | + |
| 121 | +def test_force_replaces_explicit_values(): |
| 122 | + values = run_loader( |
| 123 | + { |
| 124 | + "LLM_RESEARCH_PROFILE_FORCE": "1", |
| 125 | + "OPENAI_MODEL": "manual-model", |
| 126 | + "OPENAI_BASE_URL": "https://manual.example/v1", |
| 127 | + "OPENAI_API_MODE": "chat", |
| 128 | + } |
| 129 | + ) |
| 130 | + |
| 131 | + assert values["OPENAI_MODEL"] == "deepseek-v3-2-251201" |
| 132 | + assert values["OPENAI_BASE_URL"] == "https://windhub.cc/v1" |
| 133 | + assert values["OPENAI_API_MODE"] == "messages" |
| 134 | + |
| 135 | + |
| 136 | +def test_off_profile_makes_no_changes(): |
| 137 | + values = run_loader({"LLM_RESEARCH_PROFILE": "off"}) |
| 138 | + |
| 139 | + assert all(value == "" for value in values.values()) |
| 140 | + |
| 141 | + |
| 142 | +def test_missing_role_key_skips_that_role(): |
| 143 | + values = run_loader({"OPENAI_BACKUP_API_KEY": ""}) |
| 144 | + |
| 145 | + assert values["OPENAI_MODEL"] == "deepseek-v3-2-251201" |
| 146 | + assert values["OPENAI_BACKUP_MODEL"] == "" |
| 147 | + assert values["OPENAI_BACKUP_BASE_URL"] == "" |
| 148 | + assert values["OPENAI_BACKUP_API_MODE"] == "" |
| 149 | + |
| 150 | + |
| 151 | +def test_verbose_output_does_not_print_keys(): |
| 152 | + env = { |
| 153 | + "PATH": os.environ.get("PATH", ""), |
| 154 | + "OPENAI_API_KEY": "primary-secret", |
| 155 | + "OPENAI_SECONDARY_API_KEY": "secondary-secret", |
| 156 | + "OPENAI_BACKUP_API_KEY": "backup-secret", |
| 157 | + "OPENAI_FALLBACK_API_KEY": "fallback-secret", |
| 158 | + "LLM_RESEARCH_PROFILE_VERBOSE": "1", |
| 159 | + } |
| 160 | + result = subprocess.run( |
| 161 | +        ["/bin/bash", "-lc", f"source '{LOADER}'"], |
| 162 | + cwd=ROOT, |
| 163 | + env=env, |
| 164 | + text=True, |
| 165 | + capture_output=True, |
| 166 | + check=True, |
| 167 | + ) |
| 168 | + |
| 169 | + combined = result.stdout + result.stderr |
| 170 | + assert "primary-secret" not in combined |
| 171 | + assert "secondary-secret" not in combined |
| 172 | + assert "backup-secret" not in combined |
| 173 | + assert "fallback-secret" not in combined |
| 174 | + assert "deepseek-v3-2-251201" in combined |
| 175 | +``` |
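
The `run_loader` helper above turns the shell's `NAME=value` output into a dictionary by splitting each line on the first `=` only. The sketch below isolates that parsing step to show why the `maxsplit` argument of `1` matters. `parse_env_lines` is a hypothetical name, and the query string in the sample URL is made up for illustration.

```python
def parse_env_lines(text):
    """Parse NAME=value lines, splitting on the first '=' only."""
    return dict(line.split("=", 1) for line in text.strip().splitlines())


# Hypothetical output: a value containing '=' and a variable left empty.
sample = (
    "OPENAI_MODEL=manual-model\n"
    "OPENAI_BASE_URL=https://manual.example/v1?tier=a\n"
    "OPENAI_BACKUP_MODEL=\n"
)
values = parse_env_lines(sample)
```

Splitting on the first `=` keeps a value that itself contains `=` in one piece, and an unset variable is reported as an empty string rather than a missing key, which is what the `off` and missing-key tests rely on.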
| 176 | + |
| 177 | +- [ ] **Step 2: Run tests to verify they fail** |
| 178 | + |
| 179 | +Run: |
| 180 | + |
| 181 | +```bash |
| 182 | +/Users/ww/Project/poly_strategy/.venv/bin/python -m pytest tests/test_llm_research_profile.py -q |
| 183 | +``` |
| 184 | + |
| 185 | +Expected: every test fails with `subprocess.CalledProcessError`, because `scripts/load_llm_research_profile.sh` does not exist yet, so the `bash` command exits non-zero. |
| 186 | + |
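
The failure surfaces through `check=True`: `subprocess.run` raises `CalledProcessError` whenever the child exits non-zero. The sketch below shows that mechanism in isolation. It substitutes a Python child process for the `bash` invocation so it runs anywhere, and `run_checked` is a hypothetical helper name.

```python
import subprocess
import sys


def run_checked(args):
    """Run a command the same way the tests do: capture output, raise on failure."""
    return subprocess.run(args, text=True, capture_output=True, check=True)


failure = None
try:
    # Stand-in for sourcing a missing file: a child that exits with status 1.
    run_checked([sys.executable, "-c", "import sys; sys.exit(1)"])
except subprocess.CalledProcessError as exc:
    failure = exc
```

In the real tests the exception is not caught, so pytest reports each test as failed and includes the captured stderr from the shell.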
| 187 | +## Task 2: Implement Profile Loader |
| 188 | + |
| 189 | +**Files:** |
| 190 | +- Create: `scripts/load_llm_research_profile.sh` |
| 191 | + |
| 192 | +- [ ] **Step 1: Implement minimal loader** |
| 193 | + |
| 194 | +Create `scripts/load_llm_research_profile.sh`: |
| 195 | + |
| 196 | +```bash |
| 197 | +#!/usr/bin/env bash |
| 198 | + |
| 199 | +llm_profile_is_true() { |
| 200 | + case "${1:-}" in |
| 201 | + 1|true|TRUE|yes|YES|on|ON) return 0 ;; |
| 202 | + *) return 1 ;; |
| 203 | + esac |
| 204 | +} |
| 205 | + |
| 206 | +llm_profile_set_default() { |
| 207 | + local name="$1" |
| 208 | + local value="$2" |
| 209 | + if llm_profile_is_true "${LLM_RESEARCH_PROFILE_FORCE:-0}" || [[ -z "${!name:-}" ]]; then |
| 210 | + export "$name=$value" |
| 211 | + fi |
| 212 | +} |
| 213 | + |
| 214 | +llm_profile_set_role() { |
| 215 | + local key_name="$1" |
| 216 | + local model_name="$2" |
| 217 | + local base_url_name="$3" |
| 218 | + local api_mode_name="$4" |
| 219 | + local model_value="$5" |
| 220 | + local base_url_value="$6" |
| 221 | + local api_mode_value="$7" |
| 222 | + [[ -n "${!key_name:-}" ]] || return 0 |
| 223 | + llm_profile_set_default "$model_name" "$model_value" |
| 224 | + llm_profile_set_default "$base_url_name" "$base_url_value" |
| 225 | + llm_profile_set_default "$api_mode_name" "$api_mode_value" |
| 226 | +} |
| 227 | + |
| 228 | +llm_profile_print_summary() { |
| 229 | + llm_profile_is_true "${LLM_RESEARCH_PROFILE_VERBOSE:-0}" || return 0 |
| 230 | + { |
| 231 | + echo "llm_research_profile profile=${LLM_RESEARCH_PROFILE:-balanced}" |
| 232 | + echo "llm_provider role=primary model=${OPENAI_MODEL:-} api_mode=${OPENAI_API_MODE:-} base_url=${OPENAI_BASE_URL:-}" |
| 233 | + echo "llm_provider role=secondary model=${OPENAI_SECONDARY_MODEL:-} api_mode=${OPENAI_SECONDARY_API_MODE:-} base_url=${OPENAI_SECONDARY_BASE_URL:-}" |
| 234 | + echo "llm_provider role=backup model=${OPENAI_BACKUP_MODEL:-} api_mode=${OPENAI_BACKUP_API_MODE:-} base_url=${OPENAI_BACKUP_BASE_URL:-}" |
| 235 | + echo "llm_provider role=fallback model=${OPENAI_FALLBACK_MODEL:-} api_mode=${OPENAI_FALLBACK_API_MODE:-} base_url=${OPENAI_FALLBACK_BASE_URL:-}" |
| 236 | + } >&2 |
| 237 | +} |
| 238 | + |
| 239 | +llm_research_profile="${LLM_RESEARCH_PROFILE:-balanced}" |
| 240 | +case "$llm_research_profile" in |
| 241 | + off|none|disabled|0) |
| 242 | + return 0 2>/dev/null || exit 0 |
| 243 | + ;; |
| 244 | + balanced) |
| 245 | + llm_profile_set_role OPENAI_API_KEY OPENAI_MODEL OPENAI_BASE_URL OPENAI_API_MODE \ |
| 246 | + "deepseek-v3-2-251201" "https://windhub.cc/v1" "messages" |
| 247 | + ;; |
| 248 | + semantic) |
| 249 | + llm_profile_set_role OPENAI_API_KEY OPENAI_MODEL OPENAI_BASE_URL OPENAI_API_MODE \ |
| 250 | + "doubao-seed-1-8-251228" "https://windhub.cc/v1" "messages" |
| 251 | + ;; |
| 252 | + *) |
| 253 | + echo "unsupported LLM_RESEARCH_PROFILE: $llm_research_profile" >&2 |
| 254 | + return 2 2>/dev/null || exit 2 |
| 255 | + ;; |
| 256 | +esac |
| 257 | + |
| 258 | +llm_profile_set_role OPENAI_SECONDARY_API_KEY OPENAI_SECONDARY_MODEL OPENAI_SECONDARY_BASE_URL OPENAI_SECONDARY_API_MODE \ |
| 259 | + "gemini-3.1-pro-preview" "https://api.xn--chy-js0fk50c.top/v1" "chat" |
| 260 | +llm_profile_set_role OPENAI_BACKUP_API_KEY OPENAI_BACKUP_MODEL OPENAI_BACKUP_BASE_URL OPENAI_BACKUP_API_MODE \ |
| 261 | + "longcat-flash-chat" "https://elysiver.h-e.top/v1" "chat" |
| 262 | +llm_profile_set_role OPENAI_FALLBACK_API_KEY OPENAI_FALLBACK_MODEL OPENAI_FALLBACK_BASE_URL OPENAI_FALLBACK_API_MODE \ |
| 263 | + "gpt-5.4" "https://api.wwcloud.app" "responses" |
| 264 | + |
| 265 | +llm_profile_print_summary |
| 266 | +``` |
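
As a cross-check on the expected test values, the role gating in `llm_profile_set_role` can be mirrored in Python. This is a sketch of the balanced profile only, using the models and URLs from the loader above. `ROLES` and `apply_balanced_profile` are hypothetical names and are not part of the plan's deliverables.

```python
ROLES = [
    # (key variable, variable prefix, model, base URL, API mode)
    ("OPENAI_API_KEY", "OPENAI",
     "deepseek-v3-2-251201", "https://windhub.cc/v1", "messages"),
    ("OPENAI_SECONDARY_API_KEY", "OPENAI_SECONDARY",
     "gemini-3.1-pro-preview", "https://api.xn--chy-js0fk50c.top/v1", "chat"),
    ("OPENAI_BACKUP_API_KEY", "OPENAI_BACKUP",
     "longcat-flash-chat", "https://elysiver.h-e.top/v1", "chat"),
    ("OPENAI_FALLBACK_API_KEY", "OPENAI_FALLBACK",
     "gpt-5.4", "https://api.wwcloud.app", "responses"),
]


def apply_balanced_profile(env, force=False):
    """Return a copy of env with defaults for every role that has a key."""
    merged = dict(env)
    for key_name, prefix, model, base_url, api_mode in ROLES:
        if not merged.get(key_name):
            continue  # no key for this role: leave all of its settings alone
        settings = (("MODEL", model), ("BASE_URL", base_url), ("API_MODE", api_mode))
        for suffix, value in settings:
            name = f"{prefix}_{suffix}"
            if force or not merged.get(name):
                merged[name] = value
    return merged


# Keys for primary and fallback only, plus one explicit override.
result = apply_balanced_profile({
    "OPENAI_API_KEY": "primary-key",
    "OPENAI_FALLBACK_API_KEY": "fallback-key",
    "OPENAI_BASE_URL": "https://manual.example/v1",
})
```

Here the secondary and backup roles stay unconfigured because their keys are absent, and the explicit `OPENAI_BASE_URL` is preserved while the other primary settings are filled in.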
| 267 | + |
| 268 | +- [ ] **Step 2: Run loader tests** |
| 269 | + |
| 270 | +Run: |
| 271 | + |
| 272 | +```bash |
| 273 | +/Users/ww/Project/poly_strategy/.venv/bin/python -m pytest tests/test_llm_research_profile.py -q |
| 274 | +``` |
| 275 | + |
| 276 | +Expected: all tests in `tests/test_llm_research_profile.py` pass. |
| 277 | + |
| 278 | +- [ ] **Step 3: Commit** |
| 279 | + |
| 280 | +```bash |
| 281 | +git add scripts/load_llm_research_profile.sh tests/test_llm_research_profile.py |
| 282 | +git commit -m "Add LLM research provider profile" |
| 283 | +git push |
| 284 | +``` |
| 285 | + |
| 286 | +## Task 3: Source Loader from Mainline Scripts |
| 287 | + |
| 288 | +**Files:** |
| 289 | +- Modify: `scripts/refresh_discovery_watchlist.sh` |
| 290 | +- Modify: `scripts/run_rule_promotion_once.sh` |
| 291 | +- Modify: `scripts/run_cross_platform_scan_once.sh` |
| 292 | + |
| 293 | +- [ ] **Step 1: Write integration test** |
| 294 | + |
| 295 | +Append to `tests/test_llm_research_profile.py`: |
| 296 | + |
| 297 | +```python |
| 298 | +def test_mainline_scripts_source_research_profile_loader(): |
| 299 | + scripts = [ |
| 300 | + ROOT / "scripts" / "refresh_discovery_watchlist.sh", |
| 301 | + ROOT / "scripts" / "run_rule_promotion_once.sh", |
| 302 | + ROOT / "scripts" / "run_cross_platform_scan_once.sh", |
| 303 | + ] |
| 304 | + |
| 305 | + for script in scripts: |
| 306 | + text = script.read_text(encoding="utf-8") |
| 307 | + assert "source scripts/load_llm_research_profile.sh" in text |
| 308 | +``` |
| 309 | + |
| 310 | +- [ ] **Step 2: Run integration test to verify it fails** |
| 311 | + |
| 312 | +Run: |
| 313 | + |
| 314 | +```bash |
| 315 | +/Users/ww/Project/poly_strategy/.venv/bin/python -m pytest tests/test_llm_research_profile.py::test_mainline_scripts_source_research_profile_loader -q |
| 316 | +``` |
| 317 | + |
| 318 | +Expected: fail because the scripts do not source the loader yet. |
| 319 | + |
| 320 | +- [ ] **Step 3: Modify each script** |
| 321 | + |
| 322 | +In each target script, add this after `.env.local` loading and proxy setup, before provider variables are assigned: |
| 323 | + |
| 324 | +```bash |
| 325 | +if [[ -f scripts/load_llm_research_profile.sh ]]; then |
| 326 | + # shellcheck disable=SC1091 |
| 327 | + source scripts/load_llm_research_profile.sh |
| 328 | +fi |
| 329 | +``` |
| 330 | + |
| 331 | +For `scripts/run_cross_platform_scan_once.sh`, add it after `.env.local` is sourced and before LLM verification command variables are consumed. |
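
The integration test only checks that the `source` line is present, not where it sits. If placement also needs checking, one option is an ordering helper like the sketch below. `sourced_in_order`, the marker strings, and the example script text are all assumptions for illustration; the real scripts may load `.env.local` and copy provider variables differently.

```python
def sourced_in_order(script_text, before_marker, loader_line, after_marker):
    """True when loader_line appears after before_marker and before after_marker."""
    start = script_text.find(before_marker)
    loader = script_text.find(loader_line, start + 1) if start != -1 else -1
    end = script_text.find(after_marker, loader + 1) if loader != -1 else -1
    return -1 not in (start, loader, end)


LOADER_LINE = "source scripts/load_llm_research_profile.sh"

# Hypothetical script body with the loader in the intended position.
good = """\
source .env.local
source scripts/load_llm_research_profile.sh
PRIMARY_MODEL="${OPENAI_MODEL:-}"
"""

# Same lines with the loader sourced too early.
bad = """\
source scripts/load_llm_research_profile.sh
source .env.local
PRIMARY_MODEL="${OPENAI_MODEL:-}"
"""

ok = sourced_in_order(good, ".env.local", LOADER_LINE, "OPENAI_MODEL")
too_early = sourced_in_order(bad, ".env.local", LOADER_LINE, "OPENAI_MODEL")
```

A plain substring search is deliberately simple: it would be fooled by markers inside comments, so it complements rather than replaces a manual review of each script.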
| 332 | + |
| 333 | +- [ ] **Step 4: Run integration test** |
| 334 | + |
| 335 | +Run: |
| 336 | + |
| 337 | +```bash |
| 338 | +/Users/ww/Project/poly_strategy/.venv/bin/python -m pytest tests/test_llm_research_profile.py -q |
| 339 | +``` |
| 340 | + |
| 341 | +Expected: all profile tests pass. |
| 342 | + |
| 343 | +- [ ] **Step 5: Commit** |
| 344 | + |
| 345 | +```bash |
| 346 | +git add scripts/refresh_discovery_watchlist.sh scripts/run_rule_promotion_once.sh scripts/run_cross_platform_scan_once.sh tests/test_llm_research_profile.py |
| 347 | +git commit -m "Wire LLM profile into main scripts" |
| 348 | +git push |
| 349 | +``` |
| 350 | + |
| 351 | +## Task 4: Final Verification |
| 352 | + |
| 353 | +**Files:** |
| 354 | +- Verify all changed files. |
| 355 | + |
| 356 | +- [ ] **Step 1: Static shell syntax** |
| 357 | + |
| 358 | +Run: |
| 359 | + |
| 360 | +```bash |
| 361 | +bash -n scripts/load_llm_research_profile.sh |
| 362 | +bash -n scripts/refresh_discovery_watchlist.sh |
| 363 | +bash -n scripts/run_rule_promotion_once.sh |
| 364 | +bash -n scripts/run_cross_platform_scan_once.sh |
| 365 | +``` |
| 366 | + |
| 367 | +Expected: no output and exit code 0. |
| 368 | + |
| 369 | +- [ ] **Step 2: Python compile** |
| 370 | + |
| 371 | +Run: |
| 372 | + |
| 373 | +```bash |
| 374 | +/Users/ww/Project/poly_strategy/.venv/bin/python -m py_compile scripts/*.py poly_strategy/*.py |
| 375 | +``` |
| 376 | + |
| 377 | +Expected: no output and exit code 0. |
| 378 | + |
| 379 | +- [ ] **Step 3: Full test suite** |
| 380 | + |
| 381 | +Run: |
| 382 | + |
| 383 | +```bash |
| 384 | +/Users/ww/Project/poly_strategy/.venv/bin/python -m pytest -q |
| 385 | +``` |
| 386 | + |
| 387 | +Expected: all tests pass. |
| 388 | + |
| 389 | +- [ ] **Step 4: Inspect git status** |
| 390 | + |
| 391 | +Run: |
| 392 | + |
| 393 | +```bash |
| 394 | +git status --short --branch |
| 395 | +``` |
| 396 | + |
| 397 | +Expected: a single line, `## main...origin/main`, with no changed files listed below it and no `[ahead N]` or `[behind N]` marker. |
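
For an automated version of this last check, the output of `git status --short --branch` can be compared against the single expected header line. The sketch below works on captured text only, and `is_clean_and_synced` is a hypothetical helper name.

```python
def is_clean_and_synced(status_output, expected="## main...origin/main"):
    """True when the status output is exactly the expected branch header.

    Any changed file adds a line, and an unpushed or unpulled commit
    appends a bracketed ahead/behind marker to the header, so both
    cases fail the comparison.
    """
    return status_output.splitlines() == [expected]


clean = is_clean_and_synced("## main...origin/main\n")
dirty = is_clean_and_synced("## main...origin/main\n M scripts/run_rule_promotion_once.sh\n")
unpushed = is_clean_and_synced("## main...origin/main [ahead 1]\n")
```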