| 1 | +# PTC-Lisp Design Guidelines |
| 2 | + |
| 3 | +Rules for deciding what belongs in PTC-Lisp and when it should diverge from Clojure. |
| 4 | + |
| 5 | +PTC-Lisp is a Clojure-shaped language for LLM-generated, sandboxed programs. Clojure conformance is valuable because models already know Clojure idioms and conformance tests catch subtle behavior bugs, but conformance is not the top-level goal. The top-level goal is deterministic, bounded, recoverable data transformation inside an agent loop. |
| 6 | + |
| 7 | +See also: [PTC-Lisp Specification](ptc-lisp-specification.md), [Function Reference](function-reference.md), and [Clojure Conformance Gaps](clojure-conformance-gaps.md). |
| 8 | + |
| 9 | +## Design Priorities |
| 10 | + |
| 11 | +Apply these in order when adding syntax, functions, or interop: |
| 12 | + |
| 13 | +1. **Sandbox safety** - programs must be bounded in time, memory, and host access. |
| 14 | +2. **Recoverability for LLM code** - common bad inputs should produce guardable signal values when recovery is useful. |
| 15 | +3. **Clojure familiarity** - use Clojure names, arities, truthiness, collection behavior, and data idioms unless a higher priority overrides them. |
| 16 | +4. **Determinism** - avoid ambient state, uncontrolled time, random behavior, filesystem access, and network access except through explicit tools. |
| 17 | +5. **Small surface area** - prefer a compact set of predictable primitives over full language completeness. |
| 18 | +6. **Boundary clarity** - distinguish PTC-Lisp data functions, tool calls, and Java-named compatibility methods by name and behavior. |
| 19 | + |
| 20 | +## What To Include |
| 21 | + |
| 22 | +Include a feature when it: |
| 23 | + |
| 24 | +- Helps with data transformation, filtering, aggregation, validation, string processing, JSON, or tool-result shaping. |
| 25 | +- Runs eagerly within sandbox limits. |
| 26 | +- Is deterministic and has no hidden global state. |
| 27 | +- Does not expose host capabilities, filesystem I/O, arbitrary class access, or runtime code loading. |
| 28 | +- Can be documented with a few examples that LLMs are likely to generate correctly. |
| 29 | +- Preserves the Clojure contract or has an explicit `DIV-*` rationale. |
| 30 | + |
| 31 | +Good candidates: pure collection functions, predicates, threading forms, destructuring patterns, bounded regex helpers, JSON helpers, and small Java compatibility shims that models commonly generate. |
| 32 | + |
| 33 | +Poor candidates: lazy or infinite sequences, macros, `eval`, `read-string`, mutable references, arbitrary host interop, dynamic vars, filesystem I/O, exception machinery, protocols, multimethods, and large abstraction systems. |
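
As a rough illustration of the kind of program these criteria favor, the sketch below is written in plain Clojure. The `orders` data is hypothetical, and whether each function used here (`filter`, `map`, `reduce`, `range`, `vec`, `def`) is available in PTC-Lisp is an assumption to check against the function reference. It shapes a finite tool result eagerly and contrasts a bounded range with the lazy, infinite form that the poor-candidates list rules out.

```clojure
;; Hypothetical tool result: a finite vector of maps.
(def orders
  [{:id 1 :status "paid" :total 40}
   {:id 2 :status "open" :total 15}
   {:id 3 :status "paid" :total 25}])

;; Pure, deterministic aggregation over finite input.
(def paid-total
  (->> orders
       (filter (fn [o] (= "paid" (:status o))))
       (map :total)
       (reduce + 0)))
;; paid-total => 65

;; Bounded: the size is stated up front.
(def bounded (vec (range 5)))
;; bounded => [0 1 2 3 4]

;; Lazy and infinite in Clojure: this only terminates because `take`
;; cuts an endless sequence, which an eager evaluator cannot rely on.
;; (take 5 (range))
```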
| 34 | + |
| 35 | +## Clojure Conformance Rules |
| 36 | + |
| 37 | +Use Clojure behavior as the default for Clojure-named functions and forms: |
| 38 | + |
| 39 | +- Preserve truthiness: only `nil` and `false` are falsey. |
| 40 | +- Preserve return-value idioms: `and`/`or` return actual values, `seq` returns `nil` for empty collections, `some` returns the first truthy result. |
| 41 | +- Preserve nil-friendly data access: `get`, keyword lookup, `get-in`, `first`, `last`, and `nth` should remain easy to compose with `some->` and `when`. |
| 42 | +- Preserve names and arities when the Clojure contract is safe and bounded. |
| 43 | +- Test supported Clojure-compatible behavior against the conformance suite when practical. |
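
The idioms in the list above can be sketched as a handful of expressions. The values in the comments are standard Clojure results; the guideline is that PTC-Lisp should agree wherever it supports the same form. The use of `def` and the sample `user` map are assumptions made only to keep the sketch self-contained.

```clojure
;; Truthiness: only nil and false are falsey, so 0 and "" are truthy.
(def truthy-zero  (if 0 :truthy :falsey))    ; => :truthy
(def truthy-empty (if "" :truthy :falsey))   ; => :truthy
(def falsey-nil   (if nil :truthy :falsey))  ; => :falsey

;; and/or return actual values, not booleans.
(def or-value  (or nil false 3))    ; => 3
(def and-value (and 1 "x" :last))   ; => :last
(def and-stop  (and 1 nil :last))   ; => nil

;; seq returns nil for an empty collection.
(def empty-seq (seq []))            ; => nil

;; some returns the first truthy result, here the first :id found.
(def first-id (some :id [{:name "a"} {:id 7} {:id 9}]))  ; => 7

;; nil-friendly access composes with some-> without extra guards.
(def user {:profile {:name "Ada"}})
(def name-length (some-> user :profile :name count))   ; => 3
(def no-email    (some-> user :profile :email count))  ; => nil
(def missing     (get-in user [:profile :age]))        ; => nil
(def no-first    (first []))                           ; => nil
```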
| 44 | + |
| 45 | +If a feature is marked supported in the audits but behaves differently from Clojure, either fix it or move it to an intentional `DIV-*` entry with rationale. |
| 46 | + |
| 47 | +## Intentional Divergence Rules |
| 48 | + |
| 49 | +Diverge from Clojure when matching Clojure would make LLM-generated sandbox code less safe, less bounded, or less recoverable. |
| 50 | + |
| 51 | +Prefer an intentional divergence when one of these applies: |
| 52 | + |
| 53 | +- **Clojure raises for bad input data.** PTC-Lisp has no `try`/`catch`; raising terminates the program. For Clojure-named helpers, prefer signal values such as `nil`, `""`, `false`, or an empty collection when the caller can reasonably continue. |
| 54 | +- **Clojure relies on laziness or infinity.** PTC-Lisp is eager and bounded. Require finite inputs and explicit limits. |
| 55 | +- **Clojure relies on mutable or global runtime state.** Omit it unless there is a narrow deterministic substitute. |
| 56 | +- **Clojure exposes host power.** Omit it or provide a tiny whitelisted compatibility surface. |
| 57 | +- **Clojure's exact behavior would create plausible wrong output.** Prefer a clean signal over a value that looks valid but means "miss". |
| 58 | + |
| 59 | +Do not signal when collapsing distinguishable failures would hide a likely code bug. The practical line is: properties of input data may signal; properties of the program should raise. For example, `(parse-long "abc")` returns `nil` because external text failed to parse, but `(+ 1 nil)`, invalid arity, or an unknown symbol raises because the generated program is wrong. |
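
A minimal sketch of that line, written in plain Clojure 1.11 or later, where `parse-long` already returns `nil` for unparseable text. The use of `def`, `or`, and `inc` is an assumption about what a PTC-Lisp program would have available; the point is the caller idiom, not the exact helpers.

```clojure
;; Property of the input data: unparseable text yields a nil signal.
(def parsed-bad (parse-long "abc"))   ; => nil
(def parsed-ok  (parse-long "42"))    ; => 42

;; Guard the signal before it reaches arithmetic.
(def with-default (or (parse-long "abc") 0))      ; => 0
(def incremented  (some-> "42" parse-long inc))   ; => 43
(def skipped      (some-> "abc" parse-long inc))  ; => nil

;; Property of the program: this raises instead of signalling,
;; because no sensible caller meant to add nil.
;; (+ 1 nil)
```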
| 60 | + |
| 61 | +Every intentional divergence must be documented in [Clojure Conformance Gaps](clojure-conformance-gaps.md#intentional-divergences--by-design-not-bugs) with: |
| 62 | + |
| 63 | +- the Clojure behavior, |
| 64 | +- the PTC-Lisp behavior, |
| 65 | +- the reason for diverging, |
| 66 | +- the expected caller idiom, |
| 67 | +- links from the function reference or spec when the function is user-visible, |
| 68 | +- a regression test that fails if the divergence is accidentally "fixed" back to Clojure behavior. |
| 69 | + |
| 70 | +## Signal Values |
| 71 | + |
| 72 | +Signal values are ordinary return values that let code keep running and branch explicitly: |
| 73 | + |
| 74 | +| Signal | Use when | |
| 75 | +|--------|----------| |
| 76 | +| `nil` | Missing value, parse failure, absent match, invalid non-critical input | |
| 77 | +| `""` | String extraction miss where the result is naturally string-shaped | |
| 78 | +| `false` | Predicate cannot prove the property or receives unsupported input | |
| 79 | +| `[]` / `{}` / `#{}` | Collection result is naturally empty and the operation succeeded | |
| 80 | + |
| 81 | +Use a signal value only when it is unlikely to hide a serious programmer fault. Arithmetic with `nil`, invalid arity, unknown symbols, and malformed tool calls should still raise because continuing would hide a bug in the generated program. |
| 82 | + |
| 83 | +When choosing between signals, keep the output type stable. A string function should usually return a string signal such as `""`; a lookup or parser should usually return `nil`; a predicate should return `false`. |
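
To show why a stable output type matters, the following sketch uses plain Clojure functions whose results are certain (`clojure.string/upper-case`, `get`, `filterv`, `contains?`). The `row` map is hypothetical, and the availability of each function in PTC-Lisp is an assumption. Each signal is consumed by code that expects that same type, so no extra type check is needed.

```clojure
(require '[clojure.string :as str])

;; String-shaped signal: downstream string code keeps working on "".
(def shout-miss (str/upper-case ""))          ; => ""
;; A nil here would not compose: (str/upper-case nil) raises in Clojure.

;; nil from a lookup: supply a default with `or`, or branch with `when`.
(def row {"name" "Ada"})                      ; string keys, as at a tool boundary
(def email (get row "email"))                 ; => nil
(def email-or-default (or email "unknown"))   ; => "unknown"

;; Empty collection: the operation succeeded and found nothing.
(def evens (filterv even? [1 3 5]))           ; => []
(def even-count (count evens))                ; => 0

;; Predicate: false is the signal, so it can gate a branch directly.
(def label (if (contains? row "email") :has-email :no-email))  ; => :no-email
```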
| 84 | + |
| 85 | +## Error Rules |
| 86 | + |
| 87 | +Raise an execution error for programmer faults: |
| 88 | + |
| 89 | +- syntax and parse errors, |
| 90 | +- invalid arity, |
| 91 | +- unknown symbols or tools, |
| 92 | +- non-callable values in call position, |
| 93 | +- type errors where no useful recovery signal exists, |
| 94 | +- invalid tool or catalog arguments, |
| 95 | +- sandbox limits such as timeout, memory, recursion, or iteration caps. |
| 96 | + |
| 97 | +Return signal values for world faults and expected data misses: |
| 98 | + |
| 99 | +- missing keys, |
| 100 | +- no regex match, |
| 101 | +- failed parse of external text, |
| 102 | +- absent JSON or malformed JSON in data supplied by a tool, |
| 103 | +- upstream failures that are explicitly modeled as recoverable. |
| 104 | + |
| 105 | +This split is part of the language contract: program bugs should be loud; messy data should be composable. |
| 106 | + |
| 107 | +Callers should convert a signal into `(fail ...)` when the missing or invalid value means the agent cannot complete the requested task. Otherwise, guard or filter the signal locally. See [Getting Started](guides/subagent-getting-started.md) for the multi-turn `return` / `fail` flow. |
| 108 | + |
| 109 | +## Java-Named Methods |
| 110 | + |
| 111 | +Java-named methods keep Java semantics. |
| 112 | + |
| 113 | +Dot-prefixed forms such as `.substring`, `.indexOf`, `.length`, and date/time methods exist because LLMs often generate Java-shaped code. The dot prefix signals that the caller opted into Java compatibility. These methods should follow Java's arity, index, sentinel, and error behavior unless a specific method is documented otherwise. |
| 114 | + |
| 115 | +This is intentionally different from Clojure-named helpers. For example: |
| 116 | + |
| 117 | +- `.indexOf` returns `-1` when not found, matching Java. |
| 118 | +- `index-of` returns `nil` when not found, matching the safer Clojure-shaped PTC-Lisp idiom. |
| 119 | +- `.substring` raises on invalid indices, matching Java. |
| 120 | +- `subs` returns string-shaped signal values for out-of-range cases, per `DIV-22`. |
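
The first two bullets can be sketched in plain Clojure, where `clojure.string/index-of` stands in for the unqualified `index-of` named above (that mapping is an assumption, as is the use of `def`). The sketch shows why the two sentinels call for different caller idioms: `-1` is truthy, so a Java-style result must be compared explicitly, while a `nil` signal can gate a branch directly.

```clojure
(require '[clojure.string :as str])

;; Java-named: -1 sentinel on a miss.
(def java-miss (.indexOf "hello" "z"))                             ; => -1
;; -1 is truthy, so using it as a condition gives the wrong branch.
(def naive     (if (.indexOf "hello" "z") :found :missing))        ; => :found
;; The Java idiom is an explicit comparison.
(def checked   (if (neg? (.indexOf "hello" "z")) :missing :found)) ; => :missing

;; Clojure-named: nil signal on a miss, usable directly as a condition.
(def clj-miss (str/index-of "hello" "z"))                          ; => nil
(def guarded  (if (str/index-of "hello" "z") :found :missing))     ; => :missing
(def hit      (str/index-of "hello" "l"))                          ; => 2
```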
| 121 | + |
| 122 | +Do not silently soften Java-named methods unless the method's name or docs make the new contract obvious. If safer behavior is needed, prefer an existing Clojure/PTC-named wrapper or choose a plain descriptive name that states the operation, not a `safe-*` prefix. |
| 123 | + |
| 124 | +## Feature Review Checklist |
| 125 | + |
| 126 | +Before adding a function or form, answer: |
| 127 | + |
| 128 | +- What LLM-generated task does this make easier? |
| 129 | +- Is the operation pure and deterministic? |
| 130 | +- Can it be bounded without lazy evaluation? |
| 131 | +- Does it expose host capabilities or new side effects? |
| 132 | +- If Clojure would raise, should PTC-Lisp raise or return a signal value? |
| 133 | +- What is the smallest useful arity set? |
| 134 | +- Does it fit the existing data model, especially vector-first sequential data and string-keyed tool boundaries? |
| 135 | +- Should it appear in `priv/functions.exs`, `priv/function_audit.exs`, the function reference, the spec, or the conformance gaps doc? |
| 136 | +- What regression test pins the intended divergence? |
| 137 | + |
| 138 | +If the main argument for a feature is that "models might generate it", prefer a narrow compatibility shim over a broad subsystem. |
| 139 | + |
| 140 | +## Conformance Workflow |
| 141 | + |
| 142 | +When checking Clojure conformance: |
| 143 | + |
| 144 | +1. Run the existing conformance tests or add a minimal reproducer against SCI, Babashka, Joker, or direct Clojure output. |
| 145 | +2. Classify the result as a bug, a missing candidate, not relevant, or an intentional divergence. |
| 146 | +3. Fix bugs that violate the design priorities or common Clojure idioms. |
| 147 | +4. Document intentional divergences as `DIV-*` entries. |
| 148 | +5. Add a regression test that encodes the PTC-Lisp contract, not only the Clojure comparison. |
| 149 | +6. Link user-visible divergences from the spec and generated registry metadata. |
| 150 | + |
| 151 | +Conformance should keep the language familiar. It should not force PTC-Lisp to inherit features that make sandboxed LLM programs harder to execute safely. |