TalkPipe LLM segments need two values for every call:
source— which backend provides the model (for exampleollama,openai, oranthropicfor chat).model— the model id on that backend (for examplellama3.2,gpt-4o, ormxbai-embed-large).
You can set these on each segment, in ~/.talkpipe.toml, via TALKPIPE_* environment variables, or through ChatterLang $key substitution. This guide explains how those layers interact. For logging, security, and general config mechanics, see Configuration architecture.
- Day-to-day usage
- Supported sources
- How values are resolved
- Configuration keys
- Segment parameters
- Examples
- Troubleshooting
- Related documentation
TalkPipe gives you several ways to configure model and source — segment parameters, ChatterLang $key substitution, TALKPIPE_* environment variables, and ~/.talkpipe.toml. The setup below is not the only valid one; it is intended to be the simplest, lowest-maintenance path for day-to-day usage. The rest of this guide details each layer so you can deviate when you need to.
The pattern assumes you have a single "main" LLM source for most calls and only reach for alternatives occasionally:
- Set provider credentials in your shell environment for whichever sources you actually use. These are SDK-level keys, not TalkPipe config keys:
- Ollama: nothing required if running on
localhost:11434; otherwise setTALKPIPE_OLLAMA_SERVER_URL. - OpenAI:
OPENAI_API_KEY. - Anthropic:
ANTHROPIC_API_KEY.
- Ollama: nothing required if running on
- Set the
default_*keys for the model and source you reach for most often (chat and embeddings) — once, in~/.talkpipe.tomlor asTALKPIPE_*environment variables. - Write pipelines without specifying
model/sourceby default. Only override on the specific segments where you genuinely want a different model.
To make that concrete, here is what an Ollama-first setup looks like end-to-end. Start by writing the defaults to ~/.talkpipe.toml once and forgetting about them:
default_model_name = "llama3.2"
default_model_source = "ollama"
default_embedding_model_name = "mxbai-embed-large"
default_embedding_model_source = "ollama"Then add provider credentials to your shell for any non-Ollama backends you might use:
export OPENAI_API_KEY=sk-...With those defaults in place, day-to-day ChatterLang lets a bare llmPrompt pick up the configured model and source — no per-call boilerplate:
INPUT FROM echo[data="Quick question"]
| llmPrompt
| print
When a specific call needs a stronger (or just different) model, override only on that segment with [model=..., source=...]:
INPUT FROM echo[data="Tougher question"]
| llmPrompt[model="gpt-4o", source="openai"]
| print
The same pattern works in the Python pipe API. Default usage relies on the config:
# skip-extract
from talkpipe.pipe import io
from talkpipe.llm.chat import LLMPrompt
pipeline = io.echo(data="Summarize the latest meeting notes.") | LLMPrompt()
list(pipeline.as_function(single_out=False)())And per-call overrides are the same as in ChatterLang — just constructor arguments:
# skip-extract
from talkpipe.pipe import io
from talkpipe.llm.chat import LLMPrompt
careful = io.echo(data="Draft a careful legal summary.") | LLMPrompt(
model="gpt-4o",
source="openai",
)
list(careful.as_function(single_out=False)())- Provider credentials belong in the environment because the underlying SDKs read them directly and they are sensitive.
default_*keys belong in~/.talkpipe.tomlbecause they are stable preferences, not secrets, and you want them shared across every shell, notebook, and script.- Per-segment overrides belong in code because the choice of model is usually tied to the specific task — and the segment parameter is the highest-precedence layer, so it always wins.
The remaining sections fill in the details behind that pattern: which source values are actually accepted, exactly how model and source are resolved when you omit them, every configuration key that participates, and the segment-by-segment parameter reference.
The first piece of the pattern above is the source value itself. Sources are registered in talkpipe.llm.config:
| Segment | Registered sources |
|---|---|
llmPrompt (chat) |
ollama, openai, anthropic |
llmVisionPrompt (multimodal chat) |
ollama, openai, anthropic |
llmEmbed (embeddings) |
ollama, openai, model2vec |
Additional sources can be registered at runtime with registerPromptAdapter or registerEmbeddingAdapter (see Extending TalkPipe).
Install optional provider dependencies as needed:
| Extra | Purpose |
|---|---|
talkpipe[ollama] |
Ollama chat and embeddings |
talkpipe[openai] |
OpenAI chat and embeddings |
talkpipe[anthropic] |
Anthropic chat |
talkpipe[model2vec] |
In-process static embeddings via model2vec (lightweight; included in [all]) |
talkpipe[all] |
All optional features including model2vec, providers, PDF, and images |
pip install talkpipe[ollama]
pip install talkpipe[openai]
pip install talkpipe[model2vec]
pip install talkpipe[all] # includes model2vecThe day-to-day pattern relies on TalkPipe quietly filling in model and source when you omit them. Here is the full rule it uses.
When LLMPrompt or LLMEmbed is constructed, TalkPipe fills in missing model / source from get_config() (merged ~/.talkpipe.toml plus TALKPIPE_* environment variables). If either is still missing, construction raises an error.
flowchart TD
segmentParams["Segment parameters model and source"]
chatterlangDollar["ChatterLang $key at parse time"]
getConfig["get_config: TALKPIPE env then talkpipe.toml"]
providerSdk["Provider SDK env OPENAI_API_KEY etc"]
segmentParams -->|"highest for LLM segments"| resolved["Resolved model and source"]
chatterlangDollar --> segmentParams
getConfig --> segmentParams
providerSdk -->|"credentials only"| adapters["OpenAI and Anthropic adapters"]
| Layer | How it applies | Example |
|---|---|---|
| Segment parameters | Explicit model / source on the segment always win |
llmPrompt[model="gpt-4o", source="openai"] |
ChatterLang $key |
Resolved at parse time from get_config() |
llmPrompt[model=$default_model_name, source=$default_model_source] |
| Environment variables | TALKPIPE_ + exact config key name |
export TALKPIPE_default_model_name=llama3.2 |
| Configuration file | ~/.talkpipe.toml |
default_model_name = "llama3.2" |
Within get_config(), environment variables override file values. ChatterLang $key precedence for CLI overrides is documented in Configuration architecture: command-line --key value beats TALKPIPE_key beats TOML.
Provider credentials (API keys) are separate: OpenAI and Anthropic adapters use their official SDKs, which read OPENAI_API_KEY and ANTHROPIC_API_KEY from the environment—not TalkPipe default_* keys.
The precedence table above refers to "config" generically. This section lists the specific keys TalkPipe looks for — the ones the Day-to-day section recommended setting in ~/.talkpipe.toml, plus the separate set used by the RAG CLIs.
Used by llmPrompt, llmVisionPrompt, and llmEmbed when model / source are omitted:
| Purpose | TOML / config key | Environment variable |
|---|---|---|
Default chat model (also used by llmVisionPrompt) |
default_model_name |
TALKPIPE_default_model_name |
Default chat source (also used by llmVisionPrompt) |
default_model_source |
TALKPIPE_default_model_source |
| Default embedding model | default_embedding_model_name |
TALKPIPE_default_embedding_model_name |
| Default embedding source | default_embedding_model_source |
TALKPIPE_default_embedding_model_source |
| Ollama server URL | OLLAMA_SERVER_URL |
TALKPIPE_OLLAMA_SERVER_URL |
llmVisionPrompt shares the chat defaults — there is no separate default_vision_model_* key. If your default_model_name is a text-only model (for example llama3.2), passing it to llmVisionPrompt will fail at the provider rather than at construction. In practice, set model (and usually source) explicitly on llmVisionPrompt, or set default_model_name to a vision-capable model and override text-only llmPrompt calls when you need a smaller model.
Example ~/.talkpipe.toml:
default_model_name = "llama3.2"
default_model_source = "ollama"
default_embedding_model_name = "mxbai-embed-large"
default_embedding_model_source = "ollama"
OLLAMA_SERVER_URL = "http://localhost:11434"makevectordatabase and serverag read these when you omit --embedding_model, --embedding_source, --completion_model, and --completion_source:
| Purpose | TOML / config key | Environment variable |
|---|---|---|
| Embedding model | DEFAULT_EMBEDDING_MODEL |
TALKPIPE_DEFAULT_EMBEDDING_MODEL |
| Embedding source | DEFAULT_EMBEDDING_SOURCE |
TALKPIPE_DEFAULT_EMBEDDING_SOURCE |
| Completion model | DEFAULT_LLM_MODEL |
TALKPIPE_DEFAULT_LLM_MODEL |
| Completion source | DEFAULT_LLM_SOURCE |
TALKPIPE_DEFAULT_LLM_SOURCE |
If a CLI flag is omitted and the matching DEFAULT_* key is unset, the value passed into the RAG pipeline may be None, and inner llmEmbed / llmPrompt segments fall back to the default_* keys above.
Recommendation: set default_* once for most workflows. Add DEFAULT_* only when you want different defaults specifically for the RAG commands. See makevectordatabase and serverag.
For llmPrompt, llmVisionPrompt, and llmEmbed, only model and source fall back to default_* config keys when omitted. Every other segment parameter must be set on the segment (ChatterLang or Python); it is not read from ~/.talkpipe.toml or TALKPIPE_* unless noted below for a specific higher-level segment.
Required (directly or via config): model, source.
INPUT FROM prompt[data="Summarize this:"]
| llmPrompt[model="llama3.2", source="ollama", field="data"]
| print
from talkpipe.llm.chat import LLMPrompt
segment = LLMPrompt(model="gpt-4o", source="openai", system_prompt="You are concise.")Memory and compaction options (memory_mode, context_token_trigger, etc.) are described in ChatterLang memory controls.
Required (directly or via config): model, source. Required as a segment parameter: image_field (the item field holding the image path, URL, bytes, or ImageResult).
llmVisionPrompt resolves model / source from the same default_model_name / default_model_source keys as llmPrompt. There is no separate vision-specific default, so if your chat default is text-only you should override model (and usually source) on llmVisionPrompt:
INPUT FROM loadImage[path="/path/to/diagram.png", set_as="image"]
| llmVisionPrompt[image_field="image", model="llava", source="ollama"]
| print
# skip-extract
from talkpipe.llm.vision import LLMVisionPrompt
segment = LLMVisionPrompt(
image_field="/path/to/image",
model="gpt-4o",
source="openai",
prompt="Describe the chart.",
)| Parameter | From config? | Notes |
|---|---|---|
model |
Yes — default_embedding_model_name |
Required if not passed on the segment |
source |
Yes — default_embedding_model_source |
Required if not passed on the segment |
field |
No | Text field to embed on structured items |
set_as |
No | Field on the item where the vector is stored |
batch_size |
No | Scalar items per provider call (default 1) |
fail_on_error |
No | Default true; applies to non-length failures (network, auth, etc.) |
on_token_overflow |
No | Default error — when embed fails as too long: error, truncate, or chunk_pool |
truncate_side |
No | For truncate: head, tail (default), or middle |
num_chunks |
No | For chunk_pool: segments to split into (default 2, minimum 2) |
Sizing text: Chunk or split documents before llmEmbed (e.g. splitText, processDocuments,
makevectordatabase --chunk_size). on_token_overflow is failure recovery when a chunk is
still too long for the model—not a substitute for upstream chunking.
Token overflow: TalkPipe classifies provider “too long” errors and applies on_token_overflow.
truncate retries with 20% shorter character slices per attempt; chunk_pool embeds num_chunks
contiguous parts and mean-pools to one vector per stream item. If a batch embed fails, TalkPipe
retries per item so you can see which chunk failed.
Estimated pre-truncation: set max_estimated_tokens on llmEmbed to truncate input before
calling the embedding provider. This uses a lightweight estimate, not the provider tokenizer; it
combines word count, character count, an intentionally conservative non-ASCII-heavy text estimate,
and a denser estimate for encoded-looking text such as PDF extraction artifacts.
on_token_overflow remains the fallback if the estimate is optimistic. truncate_side is shared
by both truncation paths:
estimated pre-truncation before the provider call, and
on_token_overflow="truncate" after the provider reports a token overflow.
Batching: set batch_size greater than 1 on llmEmbed to call the provider with multiple
texts per request. The stream still has one input item and one output item per document;
batching is internal only. llmEmbed does not accept list-shaped stream items (flatten or
emit items individually upstream). field and set_as follow
AbstractFieldSegment on each scalar item. With fail_on_error=False, non-length failures skip
items when per-item fallback runs after a batch failure.
INPUT FROM echo[data="Hello world"]
| llmEmbed[model="mxbai-embed-large", source="ollama", set_as="vector"]
| print
| llmEmbed[on_token_overflow="truncate", truncate_side="tail"]
| llmEmbed[on_token_overflow="chunk_pool", num_chunks=4]
| llmEmbed[max_estimated_tokens=8192, truncate_side="tail"]
Higher-level segments forward model settings to inner LLM segments:
| Segment / app | Parameters |
|---|---|
makeVectorDatabase, searchVectorDatabase |
embedding_model, embedding_source |
ragToText, ragToBinaryAnswer, etc. |
embedding_model, embedding_source, completion_model, completion_source |
makevectordatabase, serverag CLIs |
--embedding_model, --embedding_source, --completion_model, --completion_source |
Not a segment parameter by default. Set OLLAMA_SERVER_URL in config or TALKPIPE_OLLAMA_SERVER_URL in the environment when Ollama is not on localhost.
INPUT FROM prompt[data="Hello"]
| llmPrompt[model="llama3.2", source="ollama"]
| print
With default_model_name and default_model_source set in ~/.talkpipe.toml:
INPUT FROM prompt[data="Hello"]
| llmPrompt
| print
export TALKPIPE_default_model_name=llama3.2
export TALKPIPE_default_model_source=ollama
export TALKPIPE_default_embedding_model_name=mxbai-embed-large
export TALKPIPE_default_embedding_model_source=ollama
chatterlang_script --script 'INPUT FROM prompt[data="Hi"] | llmPrompt | print'Because TALKPIPE_* variables override the TOML file (see Precedence), this pattern is convenient for building a single generic TalkPipe container image that does not bake in any particular model or backend. The base image ships pipelines that omit model / source, and derived images — or runtime docker run -e ... / Kubernetes env — parameterize the deployment by setting TALKPIPE_default_model_name, TALKPIPE_default_model_source, and the embedding equivalents (plus OPENAI_API_KEY / ANTHROPIC_API_KEY / TALKPIPE_OLLAMA_SERVER_URL as needed). For example, two derived images can target different backends from the same base:
FROM my-org/talkpipe-pipelines:1.0
ENV TALKPIPE_default_model_name=llama3.2 \
TALKPIPE_default_model_source=ollama \
TALKPIPE_default_embedding_model_name=mxbai-embed-large \
TALKPIPE_default_embedding_model_source=ollama \
TALKPIPE_OLLAMA_SERVER_URL=http://ollama:11434FROM my-org/talkpipe-pipelines:1.0
ENV TALKPIPE_default_model_name=gpt-4o \
TALKPIPE_default_model_source=openai \
TALKPIPE_default_embedding_model_name=text-embedding-3-small \
TALKPIPE_default_embedding_model_source=openaiOPENAI_API_KEY is intentionally left to the runtime (a Docker secret, Kubernetes secret, or --env flag) rather than baked into the image.
chatterlang_script --script 'INPUT FROM prompt[data="Hi"] | llmPrompt[model=$default_model_name, source=$default_model_source] | print' \
--default_model_name llama3.2 \
--default_model_source ollamafrom talkpipe.llm.chat import LLMPrompt
# Uses default_model_name / default_model_source from config when omitted
segment = LLMPrompt(system_prompt="You are helpful.")| Symptom | What to check |
|---|---|
Model name and source must be provided |
Set model and source on the segment, or add default_model_name and default_model_source (or embedding equivalents for llmEmbed). |
Unknown source |
Chat / vision: use ollama, openai, or anthropic. Embeddings: use ollama or openai (or register additional adapters). |
llmVisionPrompt errors at the provider with model-not-found / unsupported-input |
llmVisionPrompt reads default_model_name / default_model_source (the chat defaults). Set model and source explicitly on the segment, or change the chat defaults to a vision-capable model. |
| Ollama connection refused | Run ollama serve or set OLLAMA_SERVER_URL / TALKPIPE_OLLAMA_SERVER_URL. |
| OpenAI / Anthropic auth errors | Set OPENAI_API_KEY or ANTHROPIC_API_KEY; these are not read from TALKPIPE_* model keys. |
| RAG CLI uses unexpected models | Check DEFAULT_* keys and CLI flags; then check segment default_* fallbacks. |
- Configuration architecture — full config system, precedence, and security
- ChatterLang — DSL syntax and
llmPromptmemory - makevectordatabase and serverag — RAG workflow
- Quickstart — first pipeline examples
- Developer handbook — standard
~/.talkpipe.tomlkeys