This repository contains a Pi Coding Agent extension that sends Pi runtime telemetry to Langfuse. The code is small, event-driven, and stateful. Most changes affect lifecycle ordering, payload shaping, or session isolation rather than UI behavior.
Use this file as the working guide for code changes in this repo. For user-facing installation and
feature documentation, prefer `README.md` and `README_CN.md`.

- `index.ts`: extension entrypoint; registers Pi commands and hooks all Pi lifecycle events.
- `src/handlers/agent.ts`: starts and finishes the root Langfuse agent observation and trace IO.
- `src/handlers/generation.ts`: tracks provider requests, response metadata, TTFT, and generation completion.
- `src/handlers/tool.ts`: tracks tool observations, correlates them by `toolCallId`, and records tool error scores.
- `src/handlers/turn.ts`: creates turn-level span wrappers so generations and tools can nest under a turn.
- `src/state.ts`: session-scoped mutable runtime state built on `AsyncLocalStorage`.
- `src/config.ts`: config loading, first-run setup UI, and config persistence.
- `src/langfuse.ts`: Langfuse runtime bootstrap, score client, flush/shutdown, and REST fallback ingestion.
- `src/utils.ts`: payload shaping, truncation, extraction helpers, and defensive parsing.
- `src/constants.ts`: payload size and truncation limits.
- `src/types.ts`: shared runtime and observation typings.
- `test/state.test.ts`: verifies per-session state isolation and overlapping async session safety.
- `test/utils.test.ts`: verifies payload shaping limits and circular handling.
- `.agents/skills/langfuse/`: local Langfuse skill docs and references used by agents working in this repo.

The extension maps Pi events onto one Langfuse trace tree:

- One Pi agent run becomes one `pi-agent` trace with a root `agent` observation.
- Provider requests become `llm-generation` observations.
- Tool calls become `tool` observations.
- Turns become `span` observations that can parent generations and tool calls.
- Session-level bookkeeping is keyed by Pi session ID, not by global process state.

The main event flow is:

1. `session_start`: ensure config and reset run state for the session.
2. `before_agent_start` / `agent_start`: create the root agent observation if missing.
3. `turn_start`: open a turn span.
4. `before_provider_request`: start a generation.
5. `after_provider_response`: attach provider metadata and early error status.
6. `message_update`: record TTFT and capture the latest assistant output.
7. `message_end`: finalize the active generation.
8. `tool_execution_start` / `tool_call`: start a tool observation.
9. `tool_result` / `tool_execution_end`: finalize the matching tool observation.
10. `turn_end`: close the turn and synthesize a fallback generation if Pi skipped normal generation events.
11. `agent_end`: close the root observation, update trace IO, and send aggregate scores.
12. `session_shutdown`: close dangling observations and flush Langfuse runtime state.

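
The nesting implied by the event flow above can be sketched as a small in-memory model. This is only an illustration, not the code in the handlers: `RunModel`, its method names, and the rule that an observation opened outside a turn attaches to the root are all assumptions made for this sketch.

```typescript
// Hypothetical model of the trace tree; it does not talk to Langfuse.
interface Observation {
  id: string;
  label: "agent" | "span" | "llm-generation" | "tool";
  parentId?: string;
  ended: boolean;
}

class RunModel {
  readonly observations: Observation[] = [];
  private root: Observation | undefined;
  private turn: Observation | undefined;
  private nextId = 1;

  private open(label: Observation["label"], parent?: Observation): Observation {
    const obs: Observation = {
      id: `obs-${this.nextId++}`,
      label,
      parentId: parent?.id,
      ended: false,
    };
    this.observations.push(obs);
    return obs;
  }

  // before_agent_start / agent_start: create the root only if it is missing.
  agentStart(): Observation {
    if (!this.root) this.root = this.open("agent");
    return this.root;
  }

  // turn_start: open a span under the root.
  turnStart(): Observation {
    this.turn = this.open("span", this.agentStart());
    return this.turn;
  }

  // before_provider_request: nest under the active turn (assumed: else the root).
  generationStart(): Observation {
    return this.open("llm-generation", this.turn ?? this.agentStart());
  }

  // tool_execution_start / tool_call: same parenting rule as generations.
  toolStart(): Observation {
    return this.open("tool", this.turn ?? this.agentStart());
  }

  // turn_end: close the active turn span.
  turnEnd(): void {
    if (this.turn) this.turn.ended = true;
    this.turn = undefined;
  }
}
```

Calling `agentStart()` twice returns the same root, which mirrors the "create if missing" wording in the flow.
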

- Preserve session isolation. `src/state.ts` uses `AsyncLocalStorage` so overlapping handlers do not leak counters or active observations across Pi sessions.
- Preserve idempotency around lifecycle hooks. `before_agent_start` and `agent_start`, and similarly tool/generation start-end pairs, may both fire; handlers are written to tolerate duplicate entry points.
- Keep tool correlation keyed by `toolCallId`. This is important for concurrent tool execution.
- Maintain defensive payload shaping. Large objects, circular references, deep trees, and JSON-like strings are intentionally normalized before being sent to Langfuse.
- Do not bypass `shapePayload()`, `truncate()`, or related helpers when adding new telemetry fields.
- Treat config and credentials as sensitive. Never hardcode keys or commit local config artifacts.
- Prefer minimal metadata additions. Langfuse payloads should stay readable and bounded.

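
A minimal sketch of the first three points, assuming a per-session state object bound through `AsyncLocalStorage` and a tool map keyed by `toolCallId`. The names `runInSession`, `getState`, `startTool`, and `finishTool` are hypothetical and are not the API of `src/state.ts`.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical per-session state; the real shape lives in src/state.ts.
interface SessionState {
  sessionId: string;
  toolCount: number;
  activeTools: Map<string, { name: string; startedAt: number }>;
}

const sessions = new Map<string, SessionState>();
const current = new AsyncLocalStorage<SessionState>();

// Run a handler with the state of one session bound to the async context.
function runInSession<T>(sessionId: string, fn: () => T): T {
  let state = sessions.get(sessionId);
  if (!state) {
    state = { sessionId, toolCount: 0, activeTools: new Map() };
    sessions.set(sessionId, state);
  }
  return current.run(state, fn);
}

function getState(): SessionState {
  const state = current.getStore();
  if (!state) throw new Error("no active session context");
  return state;
}

// Idempotent start: a second start event for the same toolCallId is ignored.
function startTool(toolCallId: string, name: string): boolean {
  const state = getState();
  if (state.activeTools.has(toolCallId)) return false;
  state.activeTools.set(toolCallId, { name, startedAt: Date.now() });
  state.toolCount += 1;
  return true;
}

// Idempotent finish: returns false when nothing matches the toolCallId.
function finishTool(toolCallId: string): boolean {
  return getState().activeTools.delete(toolCallId);
}
```

Because every helper reads its state from the async context, two sessions can reuse the same `toolCallId` without touching each other's counters.
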
Config precedence is:

1. `~/.pi/agent/pi-langfuse/config.json`
2. `LANGFUSE_PUBLIC_KEY` / `LANGFUSE_SECRET_KEY`
3. `LANGFUSE_BASE_URL` or `LANGFUSE_HOST`
4. Interactive `/langfuse-setup` in Pi UI when config is missing

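
The precedence above could be expressed as a pure function along these lines. This is a sketch under assumptions: the `LangfuseConfig` field names and `resolveConfig` are hypothetical, and preferring `LANGFUSE_BASE_URL` over `LANGFUSE_HOST` when both are set is a guess rather than documented behavior.

```typescript
// Hypothetical config shape; field names are not taken from src/config.ts.
interface LangfuseConfig {
  publicKey: string;
  secretKey: string;
  baseUrl?: string;
}

type Env = Record<string, string | undefined>;

// Returns undefined when neither source is complete, in which case the
// caller would fall back to the interactive /langfuse-setup flow.
function resolveConfig(
  saved: Partial<LangfuseConfig> | undefined,
  env: Env,
): LangfuseConfig | undefined {
  // 1. Saved config file wins when it carries both keys.
  if (saved?.publicKey && saved?.secretKey) {
    return {
      publicKey: saved.publicKey,
      secretKey: saved.secretKey,
      baseUrl: saved.baseUrl,
    };
  }
  // 2. Otherwise use environment variables.
  const publicKey = env.LANGFUSE_PUBLIC_KEY;
  const secretKey = env.LANGFUSE_SECRET_KEY;
  if (publicKey && secretKey) {
    return {
      publicKey,
      secretKey,
      // Assumption: LANGFUSE_BASE_URL is preferred over LANGFUSE_HOST.
      baseUrl: env.LANGFUSE_BASE_URL || env.LANGFUSE_HOST,
    };
  }
  return undefined;
}
```

Passing the environment in as an argument keeps the function testable without mutating `process.env`.
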
Relevant implementation details:

- `src/config.ts` loads saved config first, then env vars.
- First-run setup is only attempted once per session via `state.setupAttemptedThisSession`.
- Manual `/langfuse-setup` clears cached config and shuts down the runtime before reconfiguring.

- The runtime is created lazily in `src/langfuse.ts`.
- OpenTelemetry export is the primary path.
- If OTel accepts spans but the trace never becomes visible, the extension falls back to Langfuse REST ingestion.
- Scores are sent separately through the Langfuse client; they are not part of the OTel span export path.
- Root trace IO is mirrored from the root agent observation when `setTraceIO()` is available.

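
The "export, poll for visibility, then fall back" ordering described above might look roughly like the following. This guide does not say how visibility is checked or how many attempts are made, so `ExportDeps`, `exportWithFallback`, and the retry parameters are hypothetical placeholders injected for illustration.

```typescript
// Hypothetical dependencies; the real logic lives in src/langfuse.ts.
interface ExportDeps {
  flushOtel: () => Promise<void>;
  isTraceVisible: (traceId: string) => Promise<boolean>;
  ingestViaRest: (traceId: string) => Promise<void>;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Flush the primary OTel path, poll for the trace, and only use REST
// ingestion when the trace never becomes visible.
async function exportWithFallback(
  traceId: string,
  deps: ExportDeps,
  attempts = 3,
  delayMs = 200,
): Promise<"otel" | "rest"> {
  await deps.flushOtel();
  for (let i = 0; i < attempts; i++) {
    if (await deps.isTraceVisible(traceId)) return "otel";
    // Wait between polls, but not after the last one.
    if (i < attempts - 1) await sleep(delayMs);
  }
  await deps.ingestViaRest(traceId);
  return "rest";
}
```

Keeping the flush before the first poll matters here: polling before the exporter has flushed would trigger the fallback unnecessarily and could duplicate data.
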
When editing Langfuse integration code, be careful with:
- flush and shutdown ordering
- trace visibility polling and fallback ingestion
- observation parent/child nesting
- score attribution to trace vs observation IDs
- Read `index.ts` and the affected handler together before making changes.
- Keep fallback paths intact. Many branches exist because Pi events can arrive in different combinations.
- If adding a new event hook, make sure it behaves correctly for multi-session execution.
- Add logic in `src/utils.ts` first, then consume it from handlers.
- Favor tolerant extraction over strict schema assumptions because Pi/provider event payloads vary.
- Keep truncation limits centralized in `src/constants.ts`.
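
As an illustration of bounded, cycle-safe shaping with centralized limits, here is a minimal sketch. It is not the repo's `shapePayload()` or `truncate()`: the names `LIMITS`, `truncateText`, and `shapeValue`, the limit values, and the placeholder strings are all assumptions, and JSON-like string normalization is left out.

```typescript
// Hypothetical limits; the real values live in src/constants.ts.
const LIMITS = { maxStringLength: 2000, maxDepth: 4, maxArrayItems: 20 } as const;

function truncateText(text: string, max: number = LIMITS.maxStringLength): string {
  if (text.length <= max) return text;
  return `${text.slice(0, max)}...[truncated ${text.length - max} chars]`;
}

// Produce a JSON-safe copy: long strings are cut, deep trees are capped,
// and circular references are replaced with a marker.
function shapeValue(
  value: unknown,
  depth = 0,
  seen: WeakSet<object> = new WeakSet<object>(),
): unknown {
  if (typeof value !== "object" || value === null) {
    if (typeof value === "string") return truncateText(value);
    if (typeof value === "function" || typeof value === "symbol" || typeof value === "bigint") {
      return `[${typeof value}]`;
    }
    return value;
  }
  if (seen.has(value)) return "[circular]";
  if (depth >= LIMITS.maxDepth) return "[max depth]";

  seen.add(value);
  let shaped: unknown;
  if (Array.isArray(value)) {
    shaped = value
      .slice(0, LIMITS.maxArrayItems)
      .map((item) => shapeValue(item, depth + 1, seen));
  } else {
    shaped = Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, shapeValue(item, depth + 1, seen)]),
    );
  }
  // Track only the current path so repeated (non-circular) references survive.
  seen.delete(value);
  return shaped;
}
```

A handler would then pass `shapeValue(rawEventPayload)` into metadata instead of the raw object, which keeps the result serializable with `JSON.stringify`.
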
- Update both the computation path and any documentation that describes the score names.
- Keep trace-level scores in `finishAgentRun()` and tool-level error scores in `finishToolObservation()` unless there is a clear reason to move them.
- Keep tests focused on behavior that is easy to regress: session isolation, payload shaping, event-order safety, and truncation behavior.
- Avoid broad snapshot-style tests for Langfuse payloads unless a specific regression justifies them.
Run these checks after substantive changes:

    npm run typecheck
    node --test test/*.test.ts

For integration-sensitive changes, also run Pi with the extension enabled and confirm in Langfuse that:

- a trace is created for each prompt
- the root agent observation contains prompt input and final output
- generations and tool observations are nested correctly
- tool errors are marked as `ERROR`
- aggregate scores are attached to the trace

- Breaking session scoping by storing new mutable state outside `src/state.ts`.
- Ending observations twice or forgetting to mark them as ended.
- Losing fallback generation coverage when no normal provider lifecycle completes.
- Adding large raw payloads directly to metadata or output fields.
- Forgetting that self-hosted Langfuse may require the REST fallback path.
- Documenting behavior in `AGENTS.md` or `README.md` that no longer matches the actual handlers.

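
One way to guard against the double-end pitfall listed above is to wrap each observation in a handle that remembers whether it has been ended. This is a sketch only: `ObservationHandle`, `closeDangling`, and the minimal `EndableObservation` interface are hypothetical and stand in for whatever the handlers actually track.

```typescript
// Minimal stand-in for an observation; only end() is assumed here.
interface EndableObservation {
  end(): void;
}

class ObservationHandle {
  private ended = false;
  private readonly inner: EndableObservation;

  constructor(inner: EndableObservation) {
    this.inner = inner;
  }

  get isEnded(): boolean {
    return this.ended;
  }

  // Ends the observation once; later calls are no-ops and return false.
  end(): boolean {
    if (this.ended) return false;
    this.ended = true;
    this.inner.end();
    return true;
  }
}

// Shutdown helper: close whatever is still open and report how many were closed.
function closeDangling(handles: Iterable<ObservationHandle>): number {
  let closed = 0;
  for (const handle of handles) {
    if (handle.end()) closed += 1;
  }
  return closed;
}
```

With this shape, duplicate finish events and a final shutdown sweep can both call `end()` without producing a second end on the underlying observation.
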
- `README.md`: package usage, installation, configuration, and trace model.
- `AGENTS_CN.md`: Chinese version of the repo guide.
- `.agents/skills/langfuse/SKILL.md`: local Langfuse skill entry.
- `.agents/skills/langfuse/references/`: Langfuse CLI, instrumentation, migration, and troubleshooting notes.