Peppermint

A pipe-first language for data and ML work, running on top of Python. Every operation is a pipeline step and errors propagate automatically. The Python ecosystem (pandas, scikit-learn, or your own code) is accessible from within the language.

Install

pip install peppermint-lang
pip install "peppermint-lang[ml]"    # + scikit-learn, umap, openai
pip install "peppermint-lang[lsp]"   # + language server
pip install "peppermint-lang[all]"   # everything

Or from source:

git clone https://github.com/chayapatr/peppermint
cd peppermint
pip install -e ".[all]"

Run

pep file.pep  # run a file
pep           # interactive REPL
pep lsp       # start language server (stdio)

Examples

load("employees.csv")
  |> filter(it.age > 18)
  |> add(tax: it.salary * 0.2)
  |> sort(by: "salary", dir: "desc")
  |> print()

Each step prints a live summary:

|> filter    → List  843 rows × 5 cols  (157 dropped)
|> add       → List  843 rows × 6 cols  (+tax)
|> sort      → List  843 rows × 6 cols
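To make the summary format concrete, here is a minimal Python sketch of how a line like the ones above could be derived from a step's input and output. It is not Peppermint's implementation: the `summarize` helper and the list-of-dicts row representation are assumptions made for illustration only.

```python
def summarize(step: str, before: list[dict], after: list[dict]) -> str:
    """Build a one-line summary for a pipeline step (hypothetical helper)."""
    # Assumption: a table is a list of dict rows sharing the same keys.
    cols_before = list(before[0]) if before else []
    cols_after = list(after[0]) if after else []

    notes = []
    dropped = len(before) - len(after)
    if dropped > 0:
        notes.append(f"{dropped} dropped")
    added = [c for c in cols_after if c not in cols_before]
    if added:
        notes.append("+" + ", +".join(added))

    suffix = f"  ({'; '.join(notes)})" if notes else ""
    return f"|> {step:<9} → List  {len(after)} rows × {len(cols_after)} cols{suffix}"


rows = [{"age": 20, "salary": 100}, {"age": 10, "salary": 50}]
adults = [r for r in rows if r["age"] > 18]
taxed = [dict(r, tax=r["salary"] * 0.2) for r in adults]

print(summarize("filter", rows, adults))
print(summarize("add", adults, taxed))
```

The sketch only compares row counts and column names before and after a step, which is enough to reproduce the "dropped" and "+column" annotations.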

use ml
use viz
use env

load("data.csv")
  |> add(embedding: ml.embed(it.text,
      source: "deepinfra", model: "Qwen/Qwen3-Embedding-4B",
      apikey: env.DEEPINFRA_TOKEN))
      @concurrent(10)
  |> ml.kmeans(k: 2..8, on: "embedding", out: "cluster")
  |> ml.umap(dims: 2, on: "embedding", out: "umap")
  |> viz.scatter(x: "umap_1", y: "umap_2", color: "cluster", display: { label: "text", legend })

Aggregate

load("sales.csv")
  |> collapse(by: "region",
      avg: mean(col.revenue),
      n:   count()
  )
  |> sort(by: "avg", dir: "desc")
  |> print()

Top N per group

load("sales.csv")
  |> each(by: "region",
      |> add(rank: rank(col.revenue, dir: "desc"))
      |> filter(it.rank <= 3)
      |> drop("rank")
  )
  |> print()

LLM enrichment with retry and caching

use ml
use env

result = load("posts.csv")
  |> add(label: ml.llm(it.text,
      source: "openai", model: "gpt-4o",
      apikey: env.OPENAI_API_KEY, format: "json"))
      @concurrent(10)
      @retry(3)
      @until(it.label != none, max: 5)
      @cache

match(len(result.errors),
  == 0: result.data |> save("output.csv"),
  _:    halt("rerun to retry {len(result.errors)} failed rows")
)

@cache on ml.llm caches each row by content hash. Failed rows are never cached, so rerunning retries them automatically.
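The caching rule can be sketched in plain Python. This is an illustration of the general idea (key each row by a hash of its content, store only successful results), not the code Peppermint runs; `content_key`, `cached_map`, and the in-memory dict cache are hypothetical names and choices.

```python
import hashlib
import json


def content_key(row: dict) -> str:
    """Hash a row's content so identical rows map to the same cache entry."""
    payload = json.dumps(row, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def cached_map(rows, fn, cache):
    """Apply fn to each row, reusing cached results and never caching failures."""
    results, errors = [], []
    for row in rows:
        key = content_key(row)
        if key in cache:
            results.append(cache[key])  # cache hit: fn is not called again
            continue
        try:
            value = fn(row)
        except Exception as exc:
            # Failures are recorded but not stored, so a rerun retries them.
            errors.append((row, str(exc)))
            continue
        cache[key] = value
        results.append(value)
    return results, errors
```

Because failed rows leave no cache entry, calling `cached_map` a second time with the same cache re-invokes `fn` only for the rows that failed earlier.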

Error handling

result = load("data.csv")
  |> filter(it.score > 0.5)

match(result,
  Ok(data): data |> print(),
  Err(msg):  print(msg)
)
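For readers coming from Python, the Ok/Err pattern above resembles a small result type where a failure short-circuits the remaining steps. The following is a rough sketch under that assumption; the `Ok`, `Err`, and `step` definitions are illustrative and are not taken from Peppermint's runtime.

```python
from dataclasses import dataclass
from typing import Any, Callable, Union


@dataclass
class Ok:
    data: Any


@dataclass
class Err:
    msg: str


Result = Union[Ok, Err]


def step(result: Result, fn: Callable[[Any], Any]) -> Result:
    """Run one pipeline step; an earlier Err passes through untouched."""
    if isinstance(result, Err):
        return result
    try:
        return Ok(fn(result.data))
    except Exception as exc:
        return Err(f"{type(exc).__name__}: {exc}")


scores = Ok([0.2, 0.7, 0.9])
result = step(scores, lambda xs: [x for x in xs if x > 0.5])

if isinstance(result, Ok):
    print(result.data)
else:
    print(result.msg)
```

Once any step returns `Err`, every later call to `step` returns that same value, which is one way to get the "errors propagate automatically" behavior without try/except at each stage.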

Python bridge

use "./transforms.py" as t

load("data.csv")
  |> t.clean()
  |> print()

Python functions receive and return plain Python types. Conversion is automatic.
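As a sketch of what transforms.py might contain, the function below takes and returns ordinary Python values. The exact shape the bridge passes in is not specified here, so the list-of-dicts row format and the cleaning rules are assumptions for illustration.

```python
# transforms.py (hypothetical contents; the row shape is an assumption)
from typing import Any

Row = dict[str, Any]


def clean(rows: list[Row]) -> list[Row]:
    """Strip whitespace from string fields and drop rows with no values left."""
    cleaned = []
    for row in rows:
        stripped = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in row.items()
        }
        # Keep the row only if at least one field is neither empty nor None.
        if any(value not in ("", None) for value in stripped.values()):
            cleaned.append(stripped)
    return cleaned
```

The function has no Peppermint-specific imports, so it can be unit-tested with plain Python before being called from a pipeline as `t.clean()`.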

Editor support

The LSP server (pep lsp) provides diagnostics, hover docs, completions, and go-to-definition for any LSP-capable editor.

VS Code: install the extension from ecosystem/vscode-peppermint/. It auto-discovers pep via mise, pyenv, or Homebrew, so no PATH setup is needed.

Neovim:

vim.lsp.start({ name = "peppermint", cmd = { "pep", "lsp" }, root_dir = vim.fn.getcwd() })

Helix (~/.config/helix/languages.toml):

[language-server.peppermint-lsp]
command = "pep"
args = ["lsp"]

See docs/language.md for the language reference, docs/stdlib.md for all stdlib functions, and docs/ecosystem.md for full editor setup.
