QVAC-18793: split LlamaFinetuner out of LlamaModel #1996

Open

jpgaribotti wants to merge 10 commits into tetherto:main from jpgaribotti:llm-refactor-finetune

Conversation

@jpgaribotti Contributor

🎯 What problem does this PR solve?

  • LlamaModel had grown to own both the inference path and the entire LoRA finetune pipeline (pause/resume checkpoint state, training loop, dataset prep, optimizer/scheduler), which made lifetime and locking hard to reason about.
  • Finetune-only members (checkpointStateMutex_, currentCheckpointState_, pausedCheckpointState_) and ~20 private helpers cluttered LlamaModel.hpp even though they're only relevant on the finetune path.
  • Deprecation warnings from llama_adapter_lora_free (deprecated in current llama.cpp: "adapters are now freed together with the associated model") were adding noise to every build.

📝 How does it solve it?

  • Introduces a dedicated LlamaFinetuner class that owns the finetune pipeline and pause/resume state. The class holds a reference back to its owning LlamaModel (composition; lifetime is guaranteed by member ordering: the finetuner is declared last, so it is destroyed first). See the sketch after this list.
  • LlamaModel exposes the finetuner via finetuner() accessors, and process() delegates the finetune dispatch to finetuner_.finetune(...). A friend class LlamaFinetuner declaration lets the new class access the parts of LlamaModel it still needs (stateMtx_, state_, metadata_, reload(), getContext(), getModel()); the friendship is intended to be removed once those access points get small accessor helpers.
  • FinetuneTerminalResult and the ProgressCallback alias move with the implementation. FinetuneConfigOverrides stays on LlamaModel because reload() / tuneConfigMap() on the inference path consume it.
  • Pure move otherwise: method bodies, locking, and the STANDALONE_TEST_BUILD guards are preserved.
  • Drops the now-deprecated llama_adapter_lora_free deleter on the resume path. Adapters are tied to model lifetime in current llama.cpp, and the surrounding reload(FinetuneConfigOverrides{}) calls on both the happy and error paths destroy + rebuild the model.
  • AddonJs.hpp updated to call through llamaModel->finetuner().{isFinetuneRunning,requestPause,waitUntilFinetuningPauseComplete}() and to use LlamaFinetuner::ProgressCallback.
  • CMakeLists.txt compiles LlamaFinetuner.cpp into both the addon and cli_tool targets.
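A minimal sketch of the resulting ownership shape (stubbed bodies; any name not quoted in the bullets above is illustrative):

```cpp
#include <mutex>

class LlamaModel;  // forward declaration

// Finetune-only state and API move here.
class LlamaFinetuner {
public:
    explicit LlamaFinetuner(LlamaModel& model) : model_(model) {}

    // Stub: the real signature moves over with the implementation.
    void finetune(/* dataset, overrides, ProgressCallback, ... */) {}

    bool isFinetuneRunning() const { return false; }  // stub
    void requestPause() {}                            // stub
    void waitUntilFinetuningPauseComplete() {}        // stub

private:
    LlamaModel& model_;                // back-reference to the owner
    std::mutex checkpointStateMutex_;  // moved from LlamaModel
    // currentCheckpointState_ / pausedCheckpointState_ move here too
};

class LlamaModel {
    friend class LlamaFinetuner;  // temporary, until accessor helpers land
public:
    LlamaFinetuner&       finetuner()       { return finetuner_; }
    const LlamaFinetuner& finetuner() const { return finetuner_; }

private:
    // inference-path members (stateMtx_, state_, metadata_, ...) stay here
    LlamaFinetuner finetuner_{*this};  // declared last => destroyed first
};
```

Call sites such as AddonJs.hpp then go through the accessor, e.g. llamaModel->finetuner().requestPause().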

🧪 How was it tested?

  • Existing finetune integration tests in packages/llm-llamacpp/test/integration/ exercise the moved code paths (train + pause/resume + adapter save). No test signatures changed.
  • C++ unit tests build with STANDALONE_TEST_BUILD; LlamaFinetuner.cpp is fully guarded by #ifndef STANDALONE_TEST_BUILD, so the test build sees only the inline ctor/dtor (which is all LlamaModel.cpp references in that mode). The guard shape is sketched below.
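
A sketch of that guard, assuming the ctor/dtor are inline in the header (illustrative, not the actual file):

```cpp
// LlamaFinetuner.cpp: every moved method body sits inside the guard, so a
// STANDALONE_TEST_BUILD compile of LlamaModel.cpp needs nothing from this
// translation unit beyond the inline ctor/dtor already in the header.
#ifndef STANDALONE_TEST_BUILD

#include "LlamaFinetuner.hpp"

void LlamaFinetuner::finetune(/* ... */) {
    // full training loop, dataset prep, optimizer/scheduler, pause/resume
}

// ... remaining moved method bodies ...

#endif  // STANDALONE_TEST_BUILD
```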

Move the LoRA finetune pipeline and pause/resume checkpoint state into
a dedicated `LlamaFinetuner` class that holds a reference to the owning
`LlamaModel`. `LlamaModel` exposes the finetuner via a `finetuner()`
accessor and delegates the in-`process()` finetune dispatch to it.
`FinetuneTerminalResult` and the `ProgressCallback` alias move with
the implementation; `FinetuneConfigOverrides` stays on `LlamaModel`
since it is also consumed by `reload()` / `tuneConfigMap()` on the
inference path.

Pure move: all method bodies, locking, and the `STANDALONE_TEST_BUILD`
guards are preserved. The split isolates finetune-only state from the
inference path, making lifetime/locking easier to reason about and
opening up follow-up simplifications (collapsible clear/pause helpers,
dropping the `friend` once cache-flush is exposed on `LlamaModel`).

Also drops the now-deprecated llama_adapter_lora_free deleter on the resume path — adapters are tied to the model lifetime in current llama.cpp, and the surrounding reload(FinetuneConfigOverrides{}) calls already destroy + rebuild the model on both happy and error paths.
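
For the deleter drop, the change is roughly the following (the unique_ptr shape on the old resume path is an assumption; llama_adapter_lora_init and the deprecated llama_adapter_lora_free are the real llama.cpp calls):

```cpp
#include <memory>    // for the commented-out "before" shape
#include "llama.h"   // llama_adapter_lora_init / llama_adapter_lora_free

void loadAdapter(llama_model* model, const char* loraPath) {
    // Before (assumed shape): a custom deleter calling the deprecated API,
    // which is what produced the -Wdeprecated-declarations warnings.
    //
    //   std::unique_ptr<llama_adapter_lora, decltype(&llama_adapter_lora_free)>
    //       adapter{llama_adapter_lora_init(model, loraPath),
    //               &llama_adapter_lora_free};

    // After: a plain pointer, no deleter. The adapter is freed together with
    // the model, and the surrounding reload(FinetuneConfigOverrides{}) calls
    // destroy + rebuild the model on both the happy and error paths.
    llama_adapter_lora* adapter = llama_adapter_lora_init(model, loraPath);
    (void)adapter;  // real code would hand this to llama_set_adapter_lora
}
```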
@gianni-cor Contributor

Could you also update packages/llm-llamacpp/docs/finetuning.md as part of this refactor?

A few sections still describe the old ownership/API shape: LlamaModel::finetune(), LlamaModel::requestPause(), checkpoint state stored in LlamaModel, and requestPause() directly calling llama_opt_request_stop(ctx). With this PR, LlamaModel::process() delegates to finetuner_.finetune(...), pause/resume state lives on LlamaFinetuner, and requestPause() only sets pauseRequested; the optimizer stop happens later from the finetuning helper callback path.

This is docs-only, but the current text will be misleading for anyone debugging pause/resume or following the backend flow diagrams after the split.
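
A hypothetical sketch of the flow the updated docs should describe (onTrainStep and the flag's exact form are illustrative; only requestPause()/pauseRequested come from the PR):

```cpp
#include <atomic>

class LlamaFinetuner {
public:
    // requestPause() only sets the flag; it no longer touches the optimizer.
    void requestPause() { pauseRequested_.store(true); }

private:
    // Hypothetical helper invoked from the finetuning callback path during
    // training; this is where the optimizer stop actually happens now.
    bool onTrainStep(/* progress info */) {
        if (pauseRequested_.load()) {
            // checkpoint state, then signal the optimizer to stop
            return false;  // illustrative "stop" result for the loop
        }
        return true;
    }

    std::atomic<bool> pauseRequested_{false};
};
```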

@github-actions Bot commented May 13, 2026

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

jpgaribotti and others added 5 commits May 13, 2026 19:22
Bump @qvac/llm-llamacpp from 0.20.0 to 0.21.0 and add the matching
CHANGELOG.md entry for PR tetherto#1996 (LlamaFinetuner refactor).

The 0.21.0 release is a pure internal C++ refactor of the addon: the
LoRA finetuning pipeline now lives in its own LlamaFinetuner class
instead of inside LlamaModel. No JS API change, no behaviour change.
Also drops the deprecated llama_adapter_lora_free deleter on the
resume path, eliminating the three -Wdeprecated-declarations warnings
called out in 0.20.0.
Bring the implementation notes, UML sequence diagrams, and component
tables in finetuning.md back in sync with the LlamaFinetuner refactor:
@gianni-cor Contributor

/review
