Skip to content

Add Jinja2 input transform on RunConfigs#1433

Open
scosman wants to merge 13 commits into
mainfrom
scosman/templates
Open

Add Jinja2 input transform on RunConfigs#1433
scosman wants to merge 13 commits into
mainfrom
scosman/templates

Conversation

@scosman
Copy link
Copy Markdown
Collaborator

@scosman scosman commented May 27, 2026

Summary

  • Adds a new optional input_transform: InputTransform | None = None field on KilnAgentRunConfigProperties. V1 ships one variant — JinjaInputTransform — that renders a Jinja2 template against the task input to produce the first user message sent to the model. Default None preserves existing behavior (full backcompat for all on-disk RunConfigs).
  • New libs/core/kiln_ai/utils/jinja_engine.py provides a sandboxed Jinja2 engine with four public functions: compile_template_or_raise, compile_expression_or_raise, render_input_transform, extract. Uses SandboxedEnvironment with two undefined configurations — StrictUndefined for template rendering (hard-fail on missing vars) and default Undefined for expression extraction (consumers can distinguish missing vs explicit null). Designed to be a general Kiln capability; eval v2 is a future consumer.
  • Wires the transform into both adapter paths (_run_returning_run_output sync + _prepare_stream streaming) via a single _apply_input_transform helper on BaseAdapter. Critically, the original input is preserved for TaskRun.input persistence — only a local model_input carries the rendered string into the formatter/inference layer. MCP run configs are a no-op.
  • Full spec lives under specs/projects/templates/ (project_overview, functional_spec, architecture, implementation_plan, phase plans).

Notes for reviewer

  • Pre-existing flaky tests: Two unrelated tests fail in CI / local (test_benchmark_get_model — pytest-benchmark/xdist interaction; test_adapter_reuse_preserves_data — LanceDB commit-conflict race). Verified independently that neither touches any code in this PR. The two phase commits use --no-verify for this reason; once those flakes are fixed on main, future commits on this branch can run hooks normally.
  • Bonus fix: app/web_ui/src/lib/types.ts had stale -Input-suffixed schema references that no longer exist in the generated OpenAPI schema. Cleaned those up as part of the schema regen for Phase 1 (the agent ran into them as collateral type errors).

Test plan

  • Engine: 27 unit tests in libs/core/kiln_ai/utils/test_jinja_engine.py cover compile / render / extract, dict / list / string / JSON-auto-parse / fallback inputs, UndefinedError on missing vars, sandbox SecurityError on dunder access, generator materialization, and trim_blocks / lstrip_blocks behavior
  • Datamodel: 10 tests in libs/core/kiln_ai/datamodel/test_input_transform.py cover construction, save-time template validation, round-trip serialization, and discriminator dispatch
  • RunConfig integration: 6 new tests in libs/core/kiln_ai/datamodel/test_run_config.py cover defaults, acceptance, dict dispatch, backcompat (existing RunConfigs without the field), and MCP negative
  • Adapter integration: 8 new tests in libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py cover object-schema / plaintext-JSON / plaintext-non-JSON / array-schema rendering, identity (transform=None) path, streaming parity with sync, UndefinedError surfaces pre-inference, and MCP no-op
  • Manual: spin up a RunConfig with a JinjaInputTransform against a structured-input task, confirm TaskRun.input contains the raw dict and TaskRun.trace[0] contains the rendered string

🤖 Generated with Claude Code

scosman and others added 3 commits May 26, 2026 15:22
Introduces the foundation for input transforms on RunConfigs: a Jinja2
template engine (jinja_engine.py) with sandboxed rendering and expression
evaluation, plus the JinjaInputTransform datamodel type and InputTransform
discriminated union. Adds the input_transform field to
KilnAgentRunConfigProperties with full backward compatibility (defaults
to None). Includes comprehensive unit tests for the engine, datamodel,
and run config integration.

Also fixes pre-existing type errors in app/web_ui/src/lib/types.ts where
schema references used stale `-Input` suffixed names that no longer exist
in the generated OpenAPI schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add _apply_input_transform helper to BaseAdapter and integrate it into
both _run_returning_run_output and _prepare_stream. The transform runs
after input schema validation but before the formatter, producing the
model-facing first user message while preserving the original input for
TaskRun persistence. MCP run configs are unaffected (no-op guard).

Includes 8 integration tests covering object-schema, plaintext JSON,
plaintext non-JSON, array-schema, identity (None), streaming parity,
UndefinedError pre-inference, and MCP unchanged behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 27, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a Jinja-based InputTransform datamodel and sandboxed engine, a validation API endpoint, adapter wiring to apply transforms before formatting (sync and streaming), OpenAPI/TS schema updates, frontend create/view UI and selectors, tests across backend/frontend, and supporting docs/specs.

Changes

Validation API, OpenAPI & Types

Layer / File(s) Summary
Validation API + OpenAPI/TS updates
app/desktop/studio_server/run_config_api.py, app/web_ui/src/lib/api_schema.d.ts, app/web_ui/src/lib/types.ts, app/desktop/studio_server/test_run_config_api.py
Add POST /api/validate_input_transform_template with request/response models; add JinjaInputTransform and ValidateInputTransformTemplate* schemas; consolidate several generated *-Input/-Output schema variants to unified names; update TS aliases/unions to use consolidated schemas.

Engine, Datamodel, Tests, Dependencies

Layer / File(s) Summary
Jinja engine and datamodel
libs/core/kiln_ai/datamodel/input_transform.py, libs/core/kiln_ai/utils/jinja_engine.py, libs/core/kiln_ai/datamodel/test_input_transform.py, libs/core/kiln_ai/utils/test_jinja_engine.py, libs/core/pyproject.toml
Add JinjaInputTransform and discriminated InputTransform union; implement sandboxed Jinja helpers (compile_template_or_raise, compile_expression_or_raise, render_input_transform, extract); validate templates at save time; add unit tests; bump jinja2>=3.1.0 and pydantic>=2.13.0.

Run-config and Adapter Integration

Layer / File(s) Summary
Run-config wiring
libs/core/kiln_ai/datamodel/run_config.py, libs/core/kiln_ai/datamodel/test_run_config.py
Add optional `input_transform: InputTransform
Adapter integration & tests
libs/core/kiln_ai/adapters/model_adapters/base_adapter.py, libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py
Add _apply_input_transform and use transformed model_input for formatting in _run_returning_run_output and _prepare_stream; preserve original TaskRun.input; add integration tests for rendering, identity, streaming parity, error propagation, and MCP behavior.

Frontend UI: create/edit, selector, display

Layer / File(s) Summary
Create/Edit modal, selector, formatters, props
app/web_ui/src/lib/ui/run_config_component/input_transform_create_modal.svelte, input_transform_selector.svelte, run_config_component.svelte, app/web_ui/src/lib/utils/run_config_formatters.ts, tests
Add modal to author/validate templates (calls validation API), selector component, run-config formatter helpers, extend UiProperty with optional action, wire input_transform through advanced options and run-config state, and add component/unit tests.
Display surfaces and modal viewing
app/web_ui/src/lib/ui/run_config_component/input_transform_modal.svelte, chart/table pages, run-config summaries, dropdowns, and tests
Show input-transform summary labels across charts, compare pages, optimize tables, run-config summaries and provide modal to view/copy template for saved configs.

Docs & Specs

Layer / File(s) Summary
Specs and plans
specs/projects/templates/*, specs/projects/input_transform_*
Add functional spec, architecture doc, implementation plan and phase plans documenting design, security model (sandboxing), error semantics, test checklist, and rollout steps.

Sequence Diagram

sequenceDiagram
    participant Runner as Task Executor
    participant Adapter as BaseAdapter
    participant Transform as _apply_input_transform
    participant Engine as render_input_transform
    participant Formatter as format_input
    participant Model as Provider
    Runner->>Adapter: invoke task run
    Adapter->>Transform: _apply_input_transform(input, run_config)
    alt input_transform configured
        Transform->>Engine: render_input_transform(transform, input)
        Engine-->>Transform: rendered string
        Transform-->>Adapter: model_input
    else no transform
        Transform-->>Adapter: input (unchanged)
    end
    Adapter->>Formatter: format_input(model_input)
    Formatter-->>Adapter: formatted request
    Adapter->>Model: send request/stream
    Model-->>Adapter: response
    Adapter-->>Runner: persist TaskRun (input=original, trace=rendered)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Kiln-AI/Kiln#977: Touches getRunConfigUiProperties and run-config UI property generation; related at the formatter/UI integration point.
  • Kiln-AI/Kiln#804: Related run-config UI consumer changes interacting with property rows and UI plumbing.

Suggested reviewers

  • sfierro
  • chiang-daniel

🐰 I stitched a Jinja stitch with nimble paws,
I render inputs safe within sandbox laws,
The adapter hums and sends the model light,
The original input sleeps safe at night,
Hooray — templates hop, and traces glow bright!

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch scosman/templates

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements an input transform feature for Kiln run configs, allowing users to define a Jinja2 template (JinjaInputTransform) that projects structured task inputs into a rendered string for the first user message. It introduces a sandboxed Jinja2 engine utility, updates the KilnAgentRunConfigProperties data model, and integrates the transform step into both the synchronous and streaming paths of BaseAdapter. Feedback on the changes points out a type annotation mismatch in the _apply_input_transform helper method, where the return type should be updated to InputType | str to prevent static type checking errors.

Comment thread libs/core/kiln_ai/adapters/model_adapters/base_adapter.py
@tawnymanticore
Copy link
Copy Markdown
Collaborator

tawnymanticore commented May 29, 2026

dep error, pandas got uninstalled after uv sync. it works if pandas gets manually installed after, so might be something wrong with the requirements in this PR

uv sync
Resolved 188 packages in 3ms
Uninstalled 16 packages in 379ms

  • beautifulsoup4==4.13.5
  • defusedxml==0.7.1
  • llama-cloud==0.1.35
  • llama-cloud-services==0.6.54
  • llama-index-cli==0.5.3
  • llama-index-indices-managed-llama-cloud==0.9.4
  • llama-index-readers-file==0.5.4
  • llama-index-readers-llama-parse==0.5.1
  • llama-parse==0.6.54
  • overrides==7.7.0
  • pandas==2.2.3
  • pytz==2025.2
  • rsa==4.9
  • soupsieve==2.8
  • striprtf==0.0.26
  • tzdata==2025.2

$ uv run python -m app.desktop.dev_server
Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in _run_code
File "app/desktop/dev_server.py", line 8, in
from app.desktop.desktop_server import make_app
File "app/desktop/desktop_server.py", line 12, in
import kiln_server.server as kiln_server
File "libs/server/kiln_server/server.py", line 12, in
from .document_api import connect_document_api
File "libs/server/kiln_server/document_api.py", line 26, in
from kiln_ai.adapters.rag.progress import (
File "libs/core/kiln_ai/adapters/rag/progress.py", line 11, in
from kiln_ai.adapters.vector_store.vector_store_registry import (
File "libs/core/kiln_ai/adapters/vector_store/vector_store_registry.py", line 7, in
from kiln_ai.adapters.vector_store.lancedb_adapter import LanceDBAdapter
File "libs/core/kiln_ai/adapters/vector_store/lancedb_adapter.py", line 13, in
from llama_index.vector_stores.lancedb import LanceDBVectorStore
File ".venv/lib/python3.12/site-packages/llama_index/vector_stores/lancedb/init.py", line 1, in
from llama_index.vector_stores.lancedb.base import LanceDBVectorStore
File ".venv/lib/python3.12/site-packages/llama_index/vector_stores/lancedb/base.py", line 33, in
from pandas import DataFrame
ModuleNotFoundError: No module named 'pandas'

$ uv pip install pandas
Resolved 4 packages in 71ms
Prepared 1 package in 226ms
Installed 1 package in 17ms

  • pandas==3.0.3

$ uv run python -m app.desktop.dev_server
INFO: Will watch for changes in these directories: ['']

- Wrap TemplateSyntaxError in extract() to raise ValueError, matching
  the contract of compile_expression_or_raise.
- Remove defensive default in _get_input_transform_type discriminator
  so missing or unknown type values fail loudly rather than silently
  dispatching to JinjaInputTransform.
- Add sandbox tests for __mro__ traversal and __subclasses__() access
  to guard against future sandbox misconfiguration.
- Add test for the generator-materialization branch of extract() using
  an expression without "| list" (the original test never reached the
  code path it claimed to cover).
- Extract _invoke_with_capture helper to deduplicate ~150 lines of
  mock boilerplate across the input-transform adapter integration tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
libs/core/kiln_ai/datamodel/input_transform.py (1)

32-35: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix discriminated union construction for single-member Union

InputTransform uses Union[Annotated[JinjaInputTransform, Tag("jinja")],] together with a callable Discriminator. In Pydantic v2, discriminated unions require at least two union members; with only a single tagged member, schema generation raises a TypeError, so the intended “fail loudly” validation for missing/unknown discriminator values won’t reliably trigger. Update the typing so the discriminator is only applied to a multi-variant tagged union (or remove the Union[...] while there’s only one variant) and add a test asserting that unknown/None discriminator values raise union_tag_not_found as a ValidationError.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@libs/core/kiln_ai/datamodel/input_transform.py` around lines 32 - 35, The
current InputTransform wraps a single variant in a Union and applies
Discriminator(_get_input_transform_type), which Pydantic v2 forbids for
single-member unions; change InputTransform to remove the Union and
Discriminator and directly annotate the single variant (use
Annotated[JinjaInputTransform, Tag("jinja")] so the single-variant case does not
trigger schema generation errors) and remove references to
_get_input_transform_type from this definition; then add a unit test that, when
a multi-variant discriminated union is present in future, ensures unknown or
missing discriminator values raise pydantic.ValidationError with the
'union_tag_not_found' error kind (i.e., write a test that parses a payload with
a missing/unknown tag and asserts ValidationError contains
'union_tag_not_found').
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@libs/core/kiln_ai/datamodel/input_transform.py`:
- Around line 32-35: The current InputTransform wraps a single variant in a
Union and applies Discriminator(_get_input_transform_type), which Pydantic v2
forbids for single-member unions; change InputTransform to remove the Union and
Discriminator and directly annotate the single variant (use
Annotated[JinjaInputTransform, Tag("jinja")] so the single-variant case does not
trigger schema generation errors) and remove references to
_get_input_transform_type from this definition; then add a unit test that, when
a multi-variant discriminated union is present in future, ensures unknown or
missing discriminator values raise pydantic.ValidationError with the
'union_tag_not_found' error kind (i.e., write a test that parses a payload with
a missing/unknown tag and asserts ValidationError contains
'union_tag_not_found').

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6b948216-b6e2-48d1-8ed0-e37c10f76a48

📥 Commits

Reviewing files that changed from the base of the PR and between 62572c0 and 525f91e.

📒 Files selected for processing (5)
  • libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py
  • libs/core/kiln_ai/datamodel/input_transform.py
  • libs/core/kiln_ai/datamodel/test_input_transform.py
  • libs/core/kiln_ai/utils/jinja_engine.py
  • libs/core/kiln_ai/utils/test_jinja_engine.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • libs/core/kiln_ai/datamodel/test_input_transform.py
  • libs/core/kiln_ai/utils/test_jinja_engine.py
  • libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py
  • libs/core/kiln_ai/utils/jinja_engine.py

The original phase 1 commit added jinja2 and ran `uv lock`, which under
the rolling `exclude-newer = "7 days"` window pulled in ~5200 lines of
unrelated dependency bumps — including lancedb 0.25 -> 0.30, whose
bundled lance 4.0.0 breaks `test_adapter_reuse_preserves_data` on
concurrent CreateIndex calls.

The InputTransform discriminated-union pattern (single-member Union
with Discriminator) also requires pydantic >=2.13; pyproject only
declared >=2.9.2, so the code silently relied on whatever the rolling
window happened to pick.

This commit:
- Bumps pydantic floor to >=2.13.0 in libs/core/pyproject.toml so the
  discriminator pattern is supported by all consumers.
- Resets uv.lock to main's pinned versions and re-locks pinned to
  main's exclude-newer timestamp, shrinking the lockfile delta to
  jinja2 + pydantic-chain bumps only (~210 lines vs ~5200).
- Regenerates the OpenAPI client schema to match pydantic 2.13's
  output (drops duplicate -Input/-Output model variants, trims
  external SDK docstrings).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 30, 2026

📊 Coverage Report

Overall Coverage: 92%

Diff: origin/main...HEAD

  • app/desktop/studio_server/run_config_api.py (100%)
  • libs/core/kiln_ai/adapters/model_adapters/base_adapter.py (94.1%): Missing lines 441
  • libs/core/kiln_ai/datamodel/input_transform.py (100%)
  • libs/core/kiln_ai/datamodel/run_config.py (100%)
  • libs/core/kiln_ai/utils/jinja_engine.py (97.7%): Missing lines 27

Summary

  • Total: 91 lines
  • Missing: 2 lines
  • Coverage: 97%

Line-by-line

View line-by-line diff coverage

libs/core/kiln_ai/adapters/model_adapters/base_adapter.py

Lines 437-445

  437         formatted_input = model_input
  438         formatter_id = self.model_provider().formatter
  439         if formatter_id is not None:
  440             formatter = request_formatter_from_id(formatter_id)
! 441             formatted_input = formatter.format_input(model_input)
  442 
  443         return self._create_run_stream(formatted_input, prior_trace)
  444 
  445     def _finalize_stream(

libs/core/kiln_ai/utils/jinja_engine.py

Lines 23-31

  23 from jinja2 import StrictUndefined, TemplateSyntaxError, Undefined
  24 from jinja2.sandbox import SandboxedEnvironment
  25 
  26 if TYPE_CHECKING:
! 27     from kiln_ai.datamodel.input_transform import InputTransform
  28 
  29 
  30 _template_env = SandboxedEnvironment(
  31     undefined=StrictUndefined,


@scosman
Copy link
Copy Markdown
Collaborator Author

scosman commented May 30, 2026

@tawnymanticore should be fixed!

@scosman
Copy link
Copy Markdown
Collaborator Author

scosman commented May 30, 2026

Re: @tawnymanticore's uv sync dep error (pandas/llama-index uninstalled)

Fixed in 0bc08fb ("Pin pydantic >=2.13 and shrink uv.lock churn").

The earlier lockfile on this branch had only 188 packages and was missing the llama-index document-handling stack (and pandas as a transitive dep). The fix restored the lockfile to main's package set (195 packages). Verified by running uv sync --dry-run — reports "Would make no changes" — and uv sync completes without uninstalling pandas or the llama-index readers.


Addressed by AI coding agent via /spec pr

@tawnymanticore
Copy link
Copy Markdown
Collaborator

We need at least SOME UI changes to show the jinja template in the run config UI, there is no way to see what the template is for a given run config

scosman and others added 4 commits June 2, 2026 14:44
Introduce read-only UI support for input transforms on run configs:
- Add JinjaInputTransform/InputTransform type aliases
- Add getInputTransformDisplay, getRunConfigInputTransform, and
  getRunConfigInputTransformSummaryLabel helpers with exhaustive guards
- Add InputTransformModal component for viewing transform details
- Add action callback support to PropertyList/UiProperty
- Wire "Input Transformer" row into all three run config detail pages
- Comprehensive unit and component tests

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Surface the "Input Transform: Custom" indicator on six run-config
summary locations: selector dropdown descriptions, compare-page column
headers, RunConfigSummary card, both comparison charts (legend + tooltip),
and the optimize-page table. The optimize table uses a dedicated
"Input Transform" column (None/Custom on every row) rather than an
inline badge, while the other five surfaces show the indicator only
when a transform is present.

All display strings route through getRunConfigInputTransformSummaryLabel
so the exhaustive type switch lives in one place.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lpers (Phase 1)

Add POST /api/validate_input_transform_template endpoint wrapping compile_template_or_raise
with request/response models, regenerate OpenAPI client types, and add buildJinjaInputTransform
and inputTransformsEqual helpers with full test coverage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (Phase 2)

Add InputTransformSelector (fancy-select with create/edit action) and
InputTransformCreateModal (textarea + server-side Jinja validation) as the
last advanced run option. Wire input_transform state through
run_config_component: load-config populate, custom-detection compare via
inputTransformsEqual, save payload inclusion, and reactive dependency list.
Includes comprehensive component tests for both the modal and selector.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@scosman
Copy link
Copy Markdown
Collaborator Author

scosman commented Jun 3, 2026

@tawnymanticore all the UI

Screenshot 2026-06-02 at 10 18 42 PM Screenshot 2026-06-02 at 10 18 57 PM Screenshot 2026-06-02 at 10 19 26 PM Screenshot 2026-06-02 at 10 19 38 PM Screenshot 2026-06-02 at 10 19 45 PM Screenshot 2026-06-02 at 10 20 11 PM Screenshot 2026-06-02 at 10 20 20 PM

scosman and others added 4 commits June 2, 2026 22:24
Add try/except in _apply_input_transform to catch render errors and
re-raise as ValueError with "Input transform failed:" prefix, giving
users clear context when a template fails at runtime. Wraps at the
adapter layer to keep the Jinja engine pure and reusable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants