Commit 9cf3bd1

Merge branch 'master' into fix/mcp-image-content-panic
2 parents 38952ed + 415b561 commit 9cf3bd1

653 files changed

Lines changed: 58361 additions & 3023 deletions

.agents/backend-signing.md

Lines changed: 10 additions & 4 deletions
@@ -16,7 +16,8 @@ side (`pkg/oci/cosignverify` plus the gallery YAML).
1616
per-arch manifest before checking signatures.
1717
- **Storage:** Signatures are written as OCI 1.1 referrers
1818
(`--registry-referrers-mode=oci-1-1`) in the new Sigstore bundle format
19-
(`--new-bundle-format`). No `:sha256-<hex>.sig` tag clutter.
19+
(current cosign releases do this by default; no `--new-bundle-format`
20+
flag). No `:sha256-<hex>.sig` tag clutter.
2021
- **Consumer:** `pkg/oci/cosignverify` discovers the bundle via the
2122
referrers API, hands it to `sigstore-go`, and verifies it against the
2223
policy declared in the gallery YAML (`Gallery.Verification`).
@@ -33,22 +34,27 @@ to sign. The job needs:
3334

3435
- `permissions: { id-token: write, contents: read }` at the job level so
3536
the runner can exchange its GitHub OIDC token for a Fulcio cert.
36-
- `sigstore/cosign-installer@v3` step (cosign ≥ 2.2 for
37-
`--new-bundle-format`).
37+
- `sigstore/cosign-installer@v3` step (current cosign releases already
38+
default to the new bundle format).
3839
- After each `docker buildx imagetools create`, resolve the resulting
3940
list digest with `docker buildx imagetools inspect <tag> --format
4041
'{{.Manifest.Digest}}'` and sign:
4142

4243
```sh
4344
cosign sign --yes --recursive \
44-
--new-bundle-format \
4545
--registry-referrers-mode=oci-1-1 \
4646
"${REGISTRY_REPO}@${DIGEST}"
4747
```
4848

4949
Sign by digest, never by tag — signing by tag binds the signature to
5050
whatever the tag points at *now*, and a subsequent tag push orphans it.
5151

52+
`--registry-referrers-mode=oci-1-1` is still gated behind
53+
`COSIGN_EXPERIMENTAL=1` in cosign v2.4.x (set at the job env level in
54+
`backend_merge.yml`). Re-evaluate when bumping the pinned cosign release
55+
— newer versions are expected to graduate this flag and the env var can
56+
then be dropped.
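The resolve-then-sign flow described above can be sketched as a small shell function. This is a minimal sketch, not the actual step in `backend_merge.yml`: the function name `sign_manifest_list` and its two arguments are hypothetical, and it assumes `COSIGN_EXPERIMENTAL=1` is already exported at the job level.

```sh
# Hypothetical helper (not from the workflow): resolve a pushed tag to its
# manifest-list digest, then sign that digest rather than the tag.
sign_manifest_list() {
  tag="$1"   # tag just written by `docker buildx imagetools create`
  repo="$2"  # registry/repository without a tag, i.e. ${REGISTRY_REPO}

  digest="$(docker buildx imagetools inspect "$tag" --format '{{.Manifest.Digest}}')" || return 1

  # Refuse to sign anything that does not look like a sha256 digest.
  case "$digest" in
    sha256:?*) ;;
    *) echo "unexpected digest for $tag: '$digest'" >&2; return 1 ;;
  esac

  cosign sign --yes --recursive \
    --registry-referrers-mode=oci-1-1 \
    "${repo}@${digest}"
}
```

Failing early on a malformed digest keeps a bad `inspect` result from turning into a signature on the wrong reference.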
57+
5258
`backend_build_darwin.yml` builds and pushes single-arch darwin images
5359
that bypass the manifest-list merge. If/when those entries get a gallery
5460
`verification:` policy, the equivalent cosign step has to land there

.agents/building-and-testing.md

Lines changed: 32 additions & 0 deletions
@@ -15,3 +15,35 @@ Let's say the user wants to build a particular backend for a given platform. For
1515
- Unless the user explicitly asks you to run the command, just print it: not all agent frontends handle long-running jobs well, and the output may overflow your context.
1616
- The user may say they want to build AMD or ROCM instead of hipblas, or Intel instead of SYCL or NVIDIA instead of l4t or cublas. Ask for confirmation if there is ambiguity.
1717
- Sometimes the user may need extra parameters to be added to `docker build` (e.g. `--platform` for cross-platform builds or `--progress` to view the full logs), in which case you can generate the `docker build` command directly.
18+
19+
## Test coverage gate
20+
21+
The core Go suites (`./pkg`, `./core`, plus the in-process integration suite `./tests/e2e`) are covered by a **strict, monotonic coverage ratchet**:
22+
23+
- `make test-coverage` — runs the suites with `covermode=atomic` instrumentation and writes a merged profile to `coverage/coverage.out`. Uses the same prerequisites as `make test`.
24+
- **`--coverpkg` (`COVERAGE_COVERPKG = core/...,pkg/...`):** coverage is attributed to the core+pkg packages, not just the package under test. This is what lets the in-process `tests/e2e` suite (which drives the real HTTP server over loopback via `application.New`) credit the `core/http/endpoints/...` handlers it exercises — folding it in roughly doubled endpoint coverage (e.g. `endpoints/openai` 13.6% → 52%). The denominator is therefore *all* of `core`+`pkg` (minus generated proto, dropped via `COVERAGE_EXCLUDE_RE`), so the number isn't comparable to a plain per-package figure.
25+
- **Integration suites (`COVERAGE_E2E_ROOTS = ./tests/e2e`)** run non-recursively (excludes `tests/e2e/distributed`, which needs containers) with `--label-filter=!real-models` (those need a downloaded model) against the mock backend built by `prepare-test`. `tests/integration` is deliberately excluded — it needs `make backends/local-store`, which the coverage CI job doesn't build.
26+
- **Flake note:** folding integration tests into a *strict* gate means a hard e2e failure (or a spec that silently stops running) can fail the coverage gate, not just the test. `--flake-attempts` absorbs transient retryable failures; covermode=atomic keeps line coverage deterministic otherwise.
27+
- **Why one ginkgo run per root (`scripts/run-coverage.sh`):** passing several recursive roots to a *single* ginkgo invocation (e.g. `ginkgo -r ./pkg ./core`) only merges **one** root's coverprofile into `--output-dir`/`--coverprofile` — the others are silently dropped. Verified with ginkgo 2.29.0: `-r ./pkg ./core` yields only `./pkg` coverage, while `-r ./core` alone yields all 34 core packages. So the script runs each root separately and concatenates the (disjoint) profiles. Don't "simplify" it back to a single multi-root invocation — that's how `core/` (including all of `core/http`, ~7.4k statements) silently vanished from the number before.
28+
- **Build tags (`COVERAGE_TAGS`, passed via `GINKGO_TAGS`):** defaults to `debug auth`. The `auth` tag is required to compile the real (sqlite-backed) auth implementation and its ~150 `//go:build auth` tests — without it those files aren't built, the tests don't run, and the gate scores auth against a stub (~3.7% instead of ~38%). If you add new tag-gated tests, extend `COVERAGE_TAGS` or they won't count (and likely won't run in CI at all).
29+
- `make test-coverage-check` — runs `test-coverage`, then `scripts/coverage-check.sh` fails the build if total coverage is **below** the committed baseline in `coverage-baseline.txt`. The Linux job in `.github/workflows/test.yml` runs this instead of `make test`.
30+
- `make test-coverage-baseline` — regenerates and overwrites `coverage-baseline.txt` from the current run.
31+
- `make install-hooks` — sets `core.hooksPath` to the versioned `.githooks/`, whose `pre-commit` runs checks scoped to what's staged: Go changes → `make lint` + `make test-coverage-check`; `core/http/react-ui/` changes → `make test-ui-coverage-check` (Playwright e2e + UI coverage gate). A commit touching neither is skipped; bypass with `git commit --no-verify`. The hook resolves golangci-lint's new-from base to `upstream/master` → `origin/master` → `master`, so it works from a fork clone where `origin/master` is stale (passed to `make lint` via `LINT_NEW_FROM`).
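The per-root workaround described above amounts to concatenating cover profiles while keeping a single `mode:` header. A minimal sketch of that merge step, assuming each root has already written its own profile in `atomic` mode; the function name is hypothetical and this is not the contents of `scripts/run-coverage.sh`:

```sh
# Hypothetical helper: merge disjoint Go cover profiles into one file.
# Each profile starts with a "mode: ..." line; the merged file keeps only one.
merge_coverprofiles() {
  out="$1"
  shift
  printf 'mode: atomic\n' > "$out"
  for profile in "$@"; do
    # Drop each input's header; tolerate a profile with no data lines.
    grep -v '^mode:' "$profile" >> "$out" || true
  done
}
```

It could be called as `merge_coverprofiles coverage/coverage.out <per-root profiles...>`; only the output path comes from the text above, the input names are up to the caller.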
32+
33+
### React UI coverage
34+
35+
The React UI (`core/http/react-ui/`) has **no component/unit tests** — its only tests are the Playwright e2e specs in `e2e/`, which run against the real app served by `tests/e2e-ui/ui-test-server` (the dist is `//go:embed`ed, so the server is rebuilt per coverage run). Those specs do genuinely exercise the UI (clicks, `fill`, `setInputFiles`, `getByRole`/`getByText`, visibility/value assertions).
36+
37+
- `make test-ui-coverage` — builds an istanbul-instrumented bundle (`COVERAGE=true`, via `vite-plugin-istanbul` with `forceBuildInstrument: true` — the plugin skips production builds otherwise), re-embeds it into `ui-test-server` (the dist is `//go:embed`ed), runs the Playwright specs, and writes an `nyc` report to `core/http/react-ui/coverage/`. The specs import `{ test, expect }` from `e2e/coverage-fixtures.js` (re-exports Playwright's, plus harvests `window.__coverage__` into `.nyc_output/` after each test). Instrumentation is off unless `COVERAGE=true`, so dev/prod builds and plain `make test-ui-e2e` are unaffected (the fixture no-ops when `window.__coverage__` is absent).
38+
- **Browser:** the flake dev shell ships `chromium` and exports `PLAYWRIGHT_CHROMIUM_PATH`; `playwright.config.js` uses it via `launchOptions.executablePath`, and the Makefile skips `playwright install` when it's set. This avoids Playwright's downloaded browser, which can't resolve system libs (`libglib-2.0`, …) on NixOS. In CI (no `PLAYWRIGHT_CHROMIUM_PATH`) the Makefile falls back to `playwright install --with-deps chromium`.
39+
- The app is a React SPA, so coverage accumulates across in-app navigation within a test; a full `page.goto`/reload resets it.
40+
- `.nycrc.json` uses `all: true`, so **every `src/**` file is in the report**, including 0%-coverage ones — that's how you spot features with no test at all (sort the HTML report or `coverage-summary.json` by line% ascending).
41+
- **UI coverage gate:** `make test-ui-coverage-check` runs the suite then `scripts/ui-coverage-check.sh`, failing if total line coverage drops more than `UI_COVERAGE_TOLERANCE` below `core/http/react-ui/coverage-baseline.txt`. `make test-ui-coverage-baseline` regenerates the baseline. Runs in CI (`tests-ui-e2e.yml`) and pre-commit on `core/http/react-ui/` changes.
42+
- **Why it has a tolerance (unlike the strict Go gate):** UI e2e coverage is *non-deterministic*. Specs that assert on state and end while async/lazy render work is still in flight collect those lines only when the render beats the coverage teardown — so the total drifts with machine speed/load (a fast local box reads higher than a slow CI runner), diffusely across many specs. The tolerance absorbs that drift, so set the baseline *below* the slow-CI floor, never to a fast-local `make test-ui-coverage-baseline` number, or CI flaps.
43+
- **Raising coverage is cheap:** a *render-smoke* spec (navigate to a route, assert its header renders) mounts a lazy page and runs its full render + initial effects, capturing most of its lines in a few lines of test — see `e2e/page-render-smoke.spec.js`. Auth is disabled in the test server (`isAdmin=true`), so `RequireAdmin`/`RequireFeature` routes render without a mock. The most *deterministic* win is removing a race: make a spec `await` a rendered element before ending (see `e2e/agents.spec.js` → AgentCreate) so its lines count every run.
44+
45+
Rules (both gates):
46+
- **Install the hooks:** `make install-hooks` once per clone so lint + coverage run pre-commit. Don't lean on CI for what the hook catches.
47+
- **Don't work around the gate:** never `git commit --no-verify`, and never hand-lower a baseline or widen a tolerance to turn a red gate green. The ratchet only moves up.
48+
- If a change drops coverage, **add tests** (sort `coverage-summary.json` by line% ascending to find untested code) rather than editing the baseline. When coverage legitimately rises, commit the regenerated baseline (`make test-coverage-baseline` / `test-ui-coverage-baseline`).
49+
- The Go gate is **strict — no tolerance**; `covermode=atomic` keeps it deterministic. The UI gate keeps a small tolerance only because its e2e coverage isn't.
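The difference between the two gates comes down to one comparison. A minimal sketch, assuming the current total and the baseline are plain percentages; `coverage_gate` is a hypothetical name and this is not the logic of `scripts/coverage-check.sh` or `scripts/ui-coverage-check.sh`:

```sh
# Hypothetical gate: succeed when current coverage is no more than
# `tolerance` points below the baseline. Tolerance defaults to 0 (strict).
coverage_gate() {
  current="$1"
  baseline="$2"
  tolerance="${3:-0}"
  awk -v c="$current" -v b="$baseline" -v t="$tolerance" \
    'BEGIN { exit !((c + t) >= (b + 0)) }'
}
```

With no third argument the check behaves like the strict Go gate; passing a small tolerance models the UI gate, for example `coverage_gate "$total" "$(cat coverage-baseline.txt)"` where `$total` is whatever the coverage run reported.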

.agents/coding-style.md

Lines changed: 11 additions & 0 deletions
@@ -50,6 +50,17 @@ Do not mix styles within a package. If you are extending tests in a package that
5050

5151
This is enforced by `golangci-lint` via the `forbidigo` linter (see `.golangci.yml`); calls like `t.Errorf` / `t.Fatalf` / `t.Run` / `t.Skip` / `t.Logf` are flagged. Run `make lint` locally before submitting; the same check runs in CI (`.github/workflows/lint.yml`).
5252

53+
## Outbound HTTP
54+
55+
All outbound HTTP must go through `github.com/mudler/LocalAI/pkg/httpclient` rather than the standard library's default client. Use `httpclient.New(...)` (no body deadline — safe for streaming/SSE) or `httpclient.NewWithTimeout(d, ...)` (simple request/response). Both **refuse redirects by default** and set a TLS 1.2 floor.
56+
57+
The reason is GHSA-3mj3-57v2-4636: the std default client follows redirects, and on a *cross-host* redirect Go forwards custom credential headers (e.g. Anthropic's `x-api-key`) to the redirect target, leaking the secret. `httpclient` fails closed instead.
58+
59+
- Need to follow redirects (download CDNs, registry blobs, GitHub asset URLs)? Pass `httpclient.WithFollowRedirects()` — it still strips credential headers on any cross-host hop.
60+
- Have a custom transport (IP-pinned dialer, HTTP/2 tuning, a credential-injecting `RoundTripper`)? Pass `httpclient.WithTransport(rt)`, basing the transport on `httpclient.HardenedTransport()` to keep the TLS floor. Handed a `*http.Client` by a library? `httpclient.Harden(c)` applies the policy in place.
61+
62+
This is enforced by `forbidigo` (see `.golangci.yml`): `http.DefaultClient` and `http.Get`/`Post`/`PostForm`/`Head` are flagged. The `&http.Client{}` composite literal can't be matched precisely by forbidigo without also flagging legitimate `*http.Client` type references, so that form is caught by review — don't construct raw clients.
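Because the `&http.Client{}` literal is left to review, a rough text search can help a reviewer find candidates. The following is a sketch of such a review aid, not part of the repository's tooling: the function name is hypothetical, and the pattern is deliberately approximate (it also matches comments and would flag the `httpclient` package itself).

```sh
# Hypothetical review aid: list Go lines that use the std default client,
# the http.Get/Post/PostForm/Head shortcuts, or a raw &http.Client{...}.
find_raw_http_clients() {
  dir="${1:-.}"
  # /dev/null forces grep to print file names even for a single match file.
  find "$dir" -type f -name '*.go' -exec \
    grep -nE 'http\.DefaultClient|http\.(Get|Post|PostForm|Head)\(|&http\.Client\{' /dev/null {} +
}
```

The exit status is not meaningful here (it reflects `grep` finding nothing in some batch), so callers should look at the output instead.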
63+
5364
## Documentation
5465

5566
The project documentation is located in `docs/content`. When adding new features or changing existing functionality, it is crucial to update the documentation to reflect these changes. This helps users understand how to use the new capabilities and ensures the documentation stays relevant.

.agents/ds4-backend.md

Lines changed: 28 additions & 0 deletions
@@ -68,6 +68,34 @@ go test -count=1 -timeout=30m -v ./tests/e2e-backends/...
6868

6969
CI does not load the model; the suite is opt-in via env vars.
7070

71+
## Distributed mode
72+
73+
ds4 supports **layer-split** distributed inference (a model too big for one host,
74+
split by transformer layer; the GGUF must be present on every machine, each loads
75+
only its slice). Topology is **inverted** vs llama.cpp: the coordinator listens,
76+
workers dial in.
77+
78+
- **`ds4-worker` binary**: built and packaged next to `grpc-server` (`package.sh`
79+
copies it into `package/`). Links the same engine objects plus `ds4_distributed.o`;
80+
**no gRPC/protobuf dependency** (speaks ds4's own TCP transport), so it builds
81+
even where `grpc-server` can't. Runs the worker serving loop (`ds4_dist_run`).
82+
- **Coordinator wiring**: the ds4 `grpc-server` acts as coordinator when `LoadModel`
83+
`ModelOptions.Options` (from model-YAML `options:`) carry:
84+
- `ds4_role:coordinator` (enables distributed mode; absent → single-node, back-compat)
85+
- `ds4_layers:0:19` (coordinator's own slice, inclusive; `N:output` includes the head)
86+
- `ds4_listen:0.0.0.0:1234` (address workers dial into)
87+
- `ds4_route_timeout:60` (optional; seconds Predict/PredictStream wait for the route
88+
to form before returning gRPC `UNAVAILABLE`; default 60)
89+
- **Worker CLI**: `local-ai worker ds4-distributed -- <ds4-worker args>` resolves the
90+
ds4 backend and execs the packaged `ds4-worker` (raw passthrough), e.g.
91+
`--role worker --model /models/ds4flash.gguf --layers 20:output --coordinator <host> 1234`.
92+
93+
Opt-in e2e in `tests/e2e-backends/backend_test.go`, gated by
94+
`BACKEND_TEST_DS4_DISTRIBUTED=1` (plus `BACKEND_TEST_DS4_WORKER_BINARY`,
95+
`BACKEND_TEST_DS4_WORKER_LAYERS`, `BACKEND_TEST_DS4_COORDINATOR_LAYERS`,
96+
`BACKEND_TEST_DS4_LISTEN`). Design spec:
97+
`docs/superpowers/specs/2026-05-30-ds4-distributed-inference-design.md`.
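For the simplest topology (one coordinator plus one worker), the worker's slice is whatever the coordinator's inclusive range leaves over. A minimal sketch under that two-node assumption; `worker_layers_for` is a hypothetical helper, not something shipped with the backend:

```sh
# Hypothetical helper for a two-node split: given the coordinator's inclusive
# slice (e.g. "0:19"), print the worker slice covering the rest ("20:output").
worker_layers_for() {
  coord="$1"
  case "$coord" in
    *:*) ;;
    *) echo "expected START:END, got '$coord'" >&2; return 1 ;;
  esac
  last="${coord#*:}"
  # A coordinator slice ending in "output" already holds the head.
  case "$last" in
    ''|*[!0-9]*) echo "coordinator slice '$coord' has no numeric end" >&2; return 1 ;;
  esac
  echo "$((last + 1)):output"
}
```

The result can be passed straight through to the worker, e.g. `--layers "$(worker_layers_for 0:19)"` in the `local-ai worker ds4-distributed -- ...` invocation shown above.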
98+
7199
## Importer
72100

73101
`core/gallery/importers/ds4.go` (`DS4Importer`) auto-detects ds4 weights by

.dockerignore

Lines changed: 25 additions & 0 deletions
@@ -4,6 +4,7 @@
44
.devcontainer
55
models
66
backends
7+
volumes
78
examples/chatbot-ui/models
89
backend/go/image/stablediffusion-ggml/build/
910
backend/go/*/build
@@ -21,3 +22,27 @@ __pycache__
2122
# backend virtual environments
2223
**/venv
2324
backend/python/**/source
25+
26+
# In-place llama.cpp clone + per-variant build copies. The Makefile
27+
# clones llama.cpp itself at the pinned LLAMA_VERSION; if a stale
28+
# local checkout is COPY'd into the image, the `llama.cpp:` target
29+
# sees the directory and skips re-cloning, so grpc-server.cpp ends
30+
# up compiled against whatever (likely older) commit the host had.
31+
backend/cpp/llama-cpp/llama.cpp
32+
backend/cpp/llama-cpp-*-build
33+
34+
# Rust backend build output (sources are tracked; target/ is generated)
35+
backend/rust/*/target
36+
37+
# Local-only artifacts that bloat the build context but the image never needs.
38+
# Saved image tarballs, locally-installed backends, the host-built binary, and
39+
# assorted tool/scratch dirs. None of these are git-tracked.
40+
backend-images
41+
local-backends
42+
local-ai
43+
.crush
44+
protoc
45+
tests
46+
47+
# Installed via npm inside the build stage; no need to ship the host copy.
48+
**/node_modules

.githooks/pre-commit

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
#!/usr/bin/env sh
#
# LocalAI pre-commit hook. Install it (once per clone) with:
#
#   make install-hooks
#
# Runs only the checks relevant to what's staged:
#   - Go files -> make lint + make test-coverage-check
#   - core/http/react-ui -> make test-ui-coverage-check (Playwright e2e + gate)
# A commit touching neither is skipped entirely (docs/YAML/etc. can't change
# lint findings, Go coverage, or the UI).
#
# To bypass for a single commit (e.g. a WIP checkpoint): git commit --no-verify
set -eu

repo_root="$(git rev-parse --show-toplevel)"
cd "$repo_root"

staged="$(git diff --cached --name-only --diff-filter=ACMRD)"

go_changed=0
ui_changed=0
if echo "$staged" | grep -qE '\.go$'; then go_changed=1; fi
if echo "$staged" | grep -qE '^core/http/react-ui/'; then ui_changed=1; fi

if [ "$go_changed" -eq 0 ] && [ "$ui_changed" -eq 0 ]; then
    echo "pre-commit: no Go or React UI changes staged — skipping."
    exit 0
fi

if [ "$go_changed" -eq 1 ]; then
    # Resolve the ref golangci-lint's new-from-merge-base should compare
    # against. .golangci.yml pins origin/master, which is correct in CI
    # (origin == the canonical repo) but wrong from a fork clone, where
    # origin/master lags behind and lint would report the whole upstream
    # backlog. Prefer upstream/master, then origin/master, then master.
    lint_base=""
    for ref in upstream/master origin/master master; do
        if git rev-parse --verify --quiet "${ref}^{commit}" >/dev/null 2>&1; then
            lint_base="$ref"
            break
        fi
    done

    echo "pre-commit ▶ golangci-lint (make lint${lint_base:+, new-from $lint_base})"
    make lint LINT_NEW_FROM="$lint_base"

    echo "pre-commit ▶ coverage gate (make test-coverage-check) — builds and runs the"
    echo " pkg/core suites plus tests/e2e; can take a few minutes."
    make test-coverage-check
fi

if [ "$ui_changed" -eq 1 ]; then
    echo "pre-commit ▶ React UI e2e + coverage gate (make test-ui-coverage-check) —"
    echo " rebuilds the UI + ui-test-server, runs the Playwright specs, and"
    echo " fails if line coverage regressed; can take a couple of minutes."
    make test-ui-coverage-check
fi

echo "pre-commit ✓ all relevant checks passed"
