Standard Library

Core — always available

Function	Description
`load(path)`	Load CSV or JSON as a Context
`save(data, path)`	Write rows to CSV or JSON
`filter(pred)`	Keep rows matching condition — `it` is current row
`map(expr)`	Transform every element — `it` is current element
`mapi(expr)`	Map with index — `it` is `{ idx, val }`
`reduce(init, fn)`	Fold list into a single value
`add(field: expr)`	Add a new field to every row
`drop(field)`	Remove a field
`select(fields...)`	Keep only specified fields
`rename(old: new)`	Rename a field
`sort(by, dir?)`	Sort rows — `dir:` `"asc"` (default) or `"desc"`
`take(n)`	Keep first n rows
`unique(by?)`	Deduplicate rows — `by:` deduplicates on a single field
`each(by:, |> ...)`	Run a sub-pipe per group and concatenate the results. Accepts block or lambda form
`collapse(by:, ...)`	Aggregate rows, optionally grouped. Values can be agg fns or a lambda receiving the group
`join(other, on)`	Inner join on a shared key
`recover(field: expr)`	Move error rows back into data with a fallback value — use after a step that may fail
`sum(col.field)`	Sum — use in `collapse` or `add`. Handles vector columns element-wise
`mean(col.field)`	Mean — use in `collapse` or `add`. Handles vector columns element-wise
`count()`	Row count — use in `collapse`
`min(col.field)`	Minimum — use in `collapse` or `add`
`max(col.field)`	Maximum — use in `collapse` or `add`
`rank(col.field, by?, dir?)`	Rank rows by a column — use in `add`
`rolling(col.field, window, fn, by?)`	Rolling window aggregation — use in `add`
`len(list)`	Number of elements
`concat(a, b, ...)`	Concatenate lists
`slice(list, start, end)`	Slice a list (inclusive end)
`get(list, i)`	Get element at index — always positional
`find(table, col, value)`	Find first row where `col` equals `value` — returns `none` if not found
`print(value)`	Print and pass through
`halt(message?)`	Stop execution immediately with exit code 1
`str(value)`	Convert to string
`int(value)`	Convert to integer
`float(value)`	Convert to float
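
A few of the core functions have semantics that are easy to misread from a one-line description. The sketch below restates three of them in plain Python: `slice` with its inclusive end, `collapse(by:, ...)` as group-then-aggregate, and `join` as an inner join, plus `reduce` as a fold. It is an illustration of the documented behavior, not Peppermint's implementation; the helper names (`slice_inclusive`, `collapse_by`, `inner_join`) are made up for this example, and zero-based indexing is an assumption.

```python
from collections import defaultdict
from functools import reduce


def slice_inclusive(items, start, end):
    """Return items[start..end] with the end index included (assumes zero-based indices)."""
    return items[start:end + 1]


def collapse_by(rows, by, **aggs):
    """Group rows on `by`, then apply each aggregate function to the group's rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row)
    return [
        {by: key, **{name: fn(group) for name, fn in aggs.items()}}
        for key, group in groups.items()
    ]


def inner_join(left, right, on):
    """Keep only the left/right row pairs whose `on` values match."""
    index = defaultdict(list)
    for right_row in right:
        index[right_row[on]].append(right_row)
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in index.get(left_row[on], [])
    ]


rows = [
    {"team": "a", "score": 3},
    {"team": "b", "score": 5},
    {"team": "a", "score": 4},
]

# One output row per team, with a summed score and a row count.
totals = collapse_by(
    rows,
    "team",
    total=lambda group: sum(r["score"] for r in group),
    n=len,
)

# reduce(init, fn): fold the whole list into a single value.
grand_total = reduce(lambda acc, r: acc + r["score"], rows, 0)
```

Note the difference from Python's own slicing, where the end index is excluded: `slice_inclusive(xs, 1, 2)` returns two elements, while `xs[1:2]` returns one.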

Annotations

Use annotations instead of kwargs for execution behavior:

Annotation	Description
`@concurrent(n)`	Run the step over each row using n threads. Works on any step or declaration
`@retry(n)`	Retry the step up to n times on exception
`@until(cond, max: n)`	Retry step or `( )` block on rows where condition is false, up to max rounds. Rows still failing go to `.errors`
`@cache`	Cache this step's result. Step-level for whole-dataframe ops; row-level for `ml.llm` and `ml.embed`

# Parallel embed with caching
|> add(embedding: ml.embed(...))
    @concurrent(50)
    @cache

# LLM with retry, until, and cache — the full pattern
|> add(label: ml.llm(...))
    @concurrent(10)
    @retry(3)
    @until(it.label != none, max: 5)
    @cache

# Cache a deterministic expensive step
|> ml.kmeans(k: 5..12, on: "embedding", out: "cluster")
    @cache

# On a declaration — applies every time it's used
gpt = ml.llm("classify: {it.title}", source: "openai", model: "gpt-4o", apikey: env.OPENAI_API_KEY)
  @concurrent(10)
  @retry(3)
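
To make the difference between `@retry` and `@until` concrete, here is a plain-Python sketch of the two behaviors as described in the table: `@retry` reacts to exceptions, while `@until` re-runs the step on rows whose condition is still false and sends the leftovers to errors. The function names `with_retry` and `run_until` are hypothetical, and the sketch assumes `@retry(n)` means one initial attempt plus up to n retries, which the table does not spell out.

```python
def with_retry(step, retries):
    """Wrap `step` so a raised exception triggers up to `retries` further attempts."""
    def wrapped(row):
        for attempt in range(retries + 1):
            try:
                return step(row)
            except Exception:
                if attempt == retries:
                    raise  # out of attempts: let the error propagate
    return wrapped


def run_until(rows, step, cond, max_rounds):
    """Apply `step` per row, re-running it on rows where `cond` is false.

    Returns (data, errors): rows that satisfied `cond` within `max_rounds`
    rounds, and rows that never did.
    """
    data, pending = [], list(rows)
    for _ in range(max_rounds):
        still_failing = []
        for row in pending:
            result = step(row)
            (data if cond(result) else still_failing).append(result)
        pending = still_failing
        if not pending:
            break
    return data, pending


calls = {"n": 0}

def flaky_label(row):
    """Stand-in for an LLM call: returns no label on its first two calls."""
    calls["n"] += 1
    label = None if calls["n"] <= 2 else "ok"
    return {**row, "label": label}


data, errors = run_until(
    [{"title": "a"}, {"title": "b"}],
    flaky_label,
    cond=lambda r: r["label"] is not None,
    max_rounds=5,
)

attempts = {"n": 0}

def fails_twice(row):
    """Stand-in for a step that raises on its first two calls."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient")
    return row


retried = with_retry(fails_twice, retries=3)({"x": 1})
```

In this sketch both rows come back unlabeled in the first round and succeed in the second, so `errors` ends up empty. Rows that pass in a later round are appended after earlier ones, so output order is not preserved here.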

`recover`

|> add(label: ml.llm(...))
|> recover(label: "unknown")         # literal fallback
|> recover(label: it.title)          # expression fallback

Moves all rows currently in `.errors` back into `.data`, applying the fallback expression per row, then clears `.errors`.
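
The behavior described above can be sketched in plain Python as follows. The `Context` dataclass is a hypothetical stand-in with only the two fields that matter here, and returning a new object rather than mutating in place is an assumption of this sketch, not something the source states.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Hypothetical stand-in for a Peppermint Context: good rows plus failed rows."""
    data: list = field(default_factory=list)
    errors: list = field(default_factory=list)


def recover(ctx, field_name, fallback):
    """Move every error row back into data, setting `field_name` from `fallback`.

    `fallback` is either a literal value or a function of the row.
    """
    recovered = [
        {**row, field_name: fallback(row) if callable(fallback) else fallback}
        for row in ctx.errors
    ]
    return Context(data=ctx.data + recovered, errors=[])


ctx = Context(
    data=[{"title": "intro", "label": "news"}],
    errors=[{"title": "untitled draft", "label": None}],
)

literal = recover(ctx, "label", "unknown")                   # literal fallback
from_row = recover(ctx, "label", lambda row: row["title"])   # expression fallback
```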

Context fields

After a pipe, the result is a Context. Access its fields with dot notation on the name the pipe was assigned to:

posts = load("data.csv") |> ml.kmeans(k: 3, on: "embedding", out: "cluster")

posts.data      # the rows
posts.errors    # rows that failed any step
posts.kmeans    # { model, k } — written by ml.kmeans
posts.umap      # { model } — written by ml.umap
posts.viz       # { plot } — written by viz.*

`use env`

Function	Description
`env.KEY`	Read environment variable — errors if not set (preferred)
`env.get("KEY")`	Read environment variable — returns `Err` if not set

use env
key = env.OPENAI_API_KEY
val = env.get("OPTIONAL_KEY")
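
For readers coming from Python, the two access styles map roughly onto indexing `os.environ` versus calling `os.environ.get`. This is only an analogy: Python returns `None` where Peppermint returns `Err`, and the `DEMO_*` variable names are invented for the example.

```python
import os

os.environ["DEMO_API_KEY"] = "sk-demo"       # set up a variable for the demo
os.environ.pop("DEMO_OPTIONAL_KEY", None)    # make sure this one is unset

# Like env.KEY: fail loudly when the variable is missing.
key = os.environ["DEMO_API_KEY"]

# Like env.get("KEY"): hand back a value the caller can inspect instead of raising.
# None stands in for Peppermint's Err here.
val = os.environ.get("DEMO_OPTIONAL_KEY")

try:
    os.environ["DEMO_OPTIONAL_KEY"]
    missing_raises = False
except KeyError:
    missing_raises = True
```

The strict form suits required settings such as API keys, where a missing value should stop the run early; the `get` form suits optional settings that have a sensible default.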

`use math`

Function	Description
`math.log(x)`	Natural log
`math.sqrt(x)`	Square root
`math.pow(x, exp)`	x raised to exp
`math.abs(x)`	Absolute value
`math.round(x)`	Round to nearest integer
`math.floor(x)`	Floor
`math.ceil(x)`	Ceiling
`math.clamp(x, lo, hi)`	Clamp x to [lo, hi]
`math.mean(list)`	Mean of a list
`math.median(list)`	Median of a list
`math.std(list)`	Standard deviation
`math.min(list)`	Minimum of a list
`math.max(list)`	Maximum of a list
`math.sum(list)`	Sum of a list
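
Two entries in this table leave a detail open: `math.round` does not say how ties are broken, and `math.std` does not say whether it is the population or the sample standard deviation. The Python sketch below shows both variants of each so the difference is visible, along with a one-line `clamp`. It uses only the Python standard library and makes no claim about which variant Peppermint implements.

```python
import math
import statistics


def clamp(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))


values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)
median = statistics.median(values)

# Two common definitions of standard deviation.
population_std = statistics.pstdev(values)   # divides by n
sample_std = statistics.stdev(values)        # divides by n - 1

# Two common tie-breaking rules for rounding 2.5.
half_even = round(2.5)                # Python rounds ties to the even neighbour
half_up = math.floor(2.5 + 0.5)       # rounds ties upward
```

If results need to match another tool exactly, it is worth checking a known input such as the list above against Peppermint's output before relying on either convention.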

`use ml`

pip install "peppermint-lang[ml]"

Function	Description
`ml.embed(text, source:, model:, apikey?)`	Embed a single string — use inside `add` with `@concurrent(N)` for batch calls
`ml.llm(prompt, source:, model:, apikey?, format?)`	Single LLM call — use inside `add`. `source:` `"openai"`, `"anthropic"`, or `"deepinfra"`. `format: "json"` strips fences and parses response
`ml.kmeans(k:, on:, out:, method?, model?)`	K-means — `k:` accepts a range for auto-select; `method:` `"silhouette"` (default) or `"elbow"`; writes `.kmeans` artifact
`ml.umap(dims:, on:, out:, neighbors?, min_dist?, metric?, model?)`	Dimensionality reduction — writes `.umap` artifact
`ml.ols(on:, out:, model?)`	OLS regression — adds predicted and residual columns; writes `.ols` artifact
`ml.dist(a, b, metric?)`	Distance between two vectors — use inside `add`; `metric:` `"cosine"` (default) or `"euclidean"`
`ml.silhouette(on:)`	Score current clustering — prints silhouette score to stderr
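
As a reference for what the two `ml.dist` metrics usually mean, here is a small pure-Python sketch. It assumes cosine distance is defined as 1 minus cosine similarity, which is the common convention but is not stated in the table; the function name `dist` simply mirrors the entry above.

```python
import math


def dist(a, b, metric="cosine"):
    """Distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if metric == "euclidean":
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    if metric == "cosine":
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        if norm_a == 0 or norm_b == 0:
            raise ValueError("cosine distance is undefined for a zero vector")
        return 1.0 - dot / (norm_a * norm_b)
    raise ValueError(f"unknown metric: {metric!r}")
```

Cosine distance ignores vector length, so `[1, 2]` and `[2, 4]` are at distance 0; Euclidean distance does not, which matters when embeddings are not normalized.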

The `model:` shorthand on `ml.kmeans`, `ml.umap`, and `ml.ols` loads the model from the given file if it exists; otherwise it fits the model and saves it to that file.
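
The load-or-fit pattern behind `model:` can be sketched in a few lines of Python. The use of `pickle` as the on-disk format and the names `load_or_fit` and `fit_centroids` are assumptions made for this illustration; the source does not say how Peppermint serializes models.

```python
import os
import pickle
import tempfile


def load_or_fit(path, fit):
    """Return the model stored at `path`; if there is none, call `fit()` and save the result."""
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    model = fit()
    with open(path, "wb") as fh:
        pickle.dump(model, fh)
    return model


fit_calls = []

def fit_centroids():
    """Stand-in for an expensive fit; records each call so reuse is visible."""
    fit_calls.append(1)
    return {"k": 3, "centroids": [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]}


with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "kmeans.pkl")
    first = load_or_fit(model_path, fit_centroids)    # file missing: fits and saves
    second = load_or_fit(model_path, fit_centroids)   # file present: loads from disk
```

One consequence of this pattern is that a stale file is reused silently, so deleting or renaming the model file is the way to force a refit after the data changes.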

`ml.embed` and `ml.llm` use row-level caching when `--cache` is enabled, so only new rows hit the API on rerun.
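
Row-level caching can be pictured as a lookup keyed on each row's content, as in the Python sketch below. The in-memory dict, the SHA-256 key over the JSON-encoded row, and the names `cached_map` and `fake_embed` are all assumptions for illustration; a real cache would presumably persist between runs and might also key on the model or prompt.

```python
import hashlib
import json


def cached_map(rows, fn, cache):
    """Apply `fn` to each row, reusing results for rows already present in `cache`."""
    results = []
    for row in rows:
        # Key on the row's content so identical rows share one cache entry.
        key = hashlib.sha256(json.dumps(row, sort_keys=True).encode()).hexdigest()
        if key not in cache:
            cache[key] = fn(row)
        results.append(cache[key])
    return results


api_calls = []

def fake_embed(row):
    """Stand-in for an API call; records each call so cache hits are visible."""
    api_calls.append(row["text"])
    return [float(len(row["text"]))]


cache = {}
cached_map([{"text": "a"}, {"text": "bb"}], fake_embed, cache)

# Rerun with one extra row: only the new row triggers a call.
rerun = cached_map([{"text": "a"}, {"text": "bb"}, {"text": "ccc"}], fake_embed, cache)
```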

`use viz`

pip install "peppermint-lang[viz]"

All viz functions write a `.viz.plot` artifact to the Context and open the plot immediately. Pass `file:` to also save the plot to disk.

Function	Description
`viz.scatter(x:, y:, color?, size?, file?, display?)`	Scatter plot — `display: { label: "col", legend, axes, title: "...", dotsize: N | "col" }`
`viz.line(x:, y:, color?, size?, file?, display?)`	Line chart — `display: { legend, axes, title: "...", dotsize: N }`
`viz.histogram(col:, file?)`	Histogram
`viz.heatmap(file?)`	Correlation heatmap of all numeric columns
`viz.plot(file?)`	Auto-plot based on data shape
`viz.grid(..., file?)`	Multiple plots side by side

`use text`

Function	Description
`text.parse(s)`	Parse a JSON string — useful for embedding columns loaded from CSV
`text.trim(s)`	Strip whitespace
`text.lower(s)`	Lowercase
`text.upper(s)`	Uppercase
`text.replace(s, old, new)`	Replace substring
`text.split(s, sep)`	Split into list
`text.join(parts, sep)`	Join list into string
`text.contains(s, sub)`	True if substring present
`text.starts_with(s, prefix)`	True if starts with prefix
`text.ends_with(s, suffix)`	True if ends with suffix
`text.length(s)`	String length
`text.match(s, pattern)`	True if regex matches
`text.slice(s, start, end?)`	Substring by index
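
The note on `text.parse` refers to a common situation: a CSV file stores every cell as text, so a vector column such as an embedding comes back as a string and has to be decoded before use. The Python sketch below shows the same round trip with the standard `csv` and `json` modules; the sample data is made up.

```python
import csv
import io
import json

# A CSV stores every cell as text, so a vector column comes back as a string.
raw = 'id,embedding\n1,"[0.1, 0.2, 0.3]"\n2,"[0.4, 0.5, 0.6]"\n'

rows = list(csv.DictReader(io.StringIO(raw)))
before = rows[0]["embedding"]            # still a str at this point

# The equivalent of text.parse: decode the JSON text into a list of floats.
for row in rows:
    row["embedding"] = json.loads(row["embedding"])
```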

Writing Python libs

Plain Python files work out of the box: Peppermint wraps public functions automatically. For more control, use the `peppermint.bridge` decorators.

Simple case — no decorators needed

# mylib.py
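def normalize(rows):
    """Add a `pct` field holding each row's share of the total `value`."""
    total = sum(r["value"] for r in rows)
    # A zero total raises ZeroDivisionError, which Peppermint turns into Err.
    return [{**r, "pct": r["value"] / total} for r in rows]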

use "./mylib.py" as mylib
load("data.csv") |> mylib.normalize() |> print()

Functions receive plain Python values (`list[dict]`, `str`, `int`, etc.). Exceptions become `Err` automatically.
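
What "exceptions become `Err`" might look like from the Python side can be sketched with a small wrapper. The `Err` dataclass and the `wrap` function here are hypothetical stand-ins, not Peppermint's actual types or bridge code; `normalize` is the function from the simple case above.

```python
from dataclasses import dataclass


@dataclass
class Err:
    """Hypothetical error value; Peppermint's real Err type may differ."""
    message: str


def wrap(fn):
    """Return a version of `fn` that yields an Err value instead of raising."""
    def wrapped(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            return Err(f"{type(exc).__name__}: {exc}")
    return wrapped


def normalize(rows):
    total = sum(r["value"] for r in rows)
    return [{**r, "pct": r["value"] / total} for r in rows]


safe_normalize = wrap(normalize)
ok = safe_normalize([{"value": 1}, {"value": 3}])
failed = safe_normalize([{"value": 0}])      # total is 0, so the division fails
```

The practical upshot for library authors is that ordinary `raise` statements are enough to signal failure; there is no need to build error values by hand.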

Using decorators

from peppermint.bridge import pep_fn
from peppermint.stdlib.core import pep_signature

@pep_fn
@pep_signature("mylib.top(data, n: Int) -> List<Row>")
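def top(data, n=10):
    """Return the top n rows by the first numeric column."""
    if not data:
        return []
    # Use the first column whose value in the first row is numeric (bools excluded).
    col = next((k for k, v in data[0].items()
                if isinstance(v, (int, float)) and not isinstance(v, bool)), None)
    if col is None:
        raise ValueError("top: no numeric column found")
    return sorted(data, key=lambda r: r[col], reverse=True)[:n]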

def build_mylib_env():
    return {"top": top}

Decorators

Decorator	Behavior
`@pep_fn`	Default. Auto-evaluates unevaluated kwargs. Exceptions become `Err`.
`@pep_fn_lazy`	Alias for `@pep_fn`.
`@pep_fn_static`	No evaluation step — args pass straight through.

`@pep_signature("lib.fn(args) -> ReturnType")` attaches the signature shown in LSP hover tooltips.
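
One plausible way a signature decorator of this kind could work is by storing the string on the function object, where tooling can read it back later. The sketch below is a guess at the general mechanism, not Peppermint's code: the `signature` factory and the `__pep_signature__` attribute name are invented for the example.

```python
def signature(sig):
    """Hypothetical decorator factory: store a signature string on the function."""
    def decorate(fn):
        fn.__pep_signature__ = sig   # attribute name is illustrative only
        return fn
    return decorate


@signature("mylib.top(data, n: Int) -> List<Row>")
def top(data, n=10):
    """Return the first n rows."""
    return data[:n]


# A language server could combine the stored signature with the docstring.
hover_text = f"{top.__pep_signature__}\n\n{top.__doc__}"
```

Because the decorator returns the original function unchanged, it does not affect how the function runs; it only adds metadata.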
