Implicit parameters for Unison — RFC + working prototype#6238
Open
runarorama wants to merge 21 commits into
Open
Implicit parameters for Unison — RFC + working prototype#6238runarorama wants to merge 21 commits into
runarorama wants to merge 21 commits into
Conversation
Polish on top of the implicits RFC commit so the full
`class Monoid m = { op, zero }` / `foldMap` example from the design
discussion typechecks, elaborates, runs, and round-trips through
`view`.
Surface syntax
--------------
* New `class T a = { f1 : T1, … }` keyword. Sugar over a record-style
`unique type`: generates only getters (no setters / modifiers), and
each getter's signature carries the class as an implicit (`=>`)
parameter so a call like `Monoid.op acc x` lets implicit
resolution thread the dictionary in.
Typechecker
-----------
* New `=>I` rule in `checkWanted`: when a `Lam'` is checked against
an `ImplicitArrow'`, bind the lambda variable and register it as a
lexical given for the body. Together with a parser pre-pass that
injects `\_implicit_<name>_<i> ->` for each leading `=>` on a
top-level (or `given`) binding, polymorphic constraints emitted
inside the body now resolve against the function's own
`=>`-parameter instead of failing for lack of an ambient instance.
* `peelLeadingImplicits` eagerly emits a `ConstraintGoal` per leading
`=>` when synthesising a `Var`/`Ref`. The matching dictionary
insertion is done by `GivenApply.wrapImplicitLeaves`, which
consults a per-location decision map (so multi-binding files
elaborate without bindings stealing each other's decisions). The
resolver's `UnresolvedMetavarInGoal` short-circuit is dropped:
goal-side inference variables are treated as flexible in
head-unification.
* `GivenApply.buildDictionary` substitutes a `Term.var` when the
chosen given is a file-local one (`Local.given.<name>`) — closes
the runtime gap where the synthetic builtin reference couldn't be
evaluated.
Codebase plumbing
-----------------
* `ambientGivensFromBranch` now walks sub-branches recursively so
nested given-tagged refs (e.g. `Monoid.nat` under a `Monoid`
namespace) are found by the L1 ambient pool.
* `Result.UnresolvedImplicit` is surfaced by the file loader as
`Output.UnresolvedImplicits` with a rendered diagnostic. Previously
the file silently failed to load.
* `TypecheckedUnisonFile` carries the parser's `givenBindings` set
through to `update`/`add`, which auto-marks each `given`-tagged
binding in namespace metadata — no follow-up `mark.given` needed.
Pretty-printer (ADR-015 elide-mode)
-----------------------------------
* New `Ann.Synthetic` variant flags elaborator-inserted terms in
memory; `stripSyntheticArgs` removes them before watch-output
printing.
* For `view`, where annotations have been stripped by codebase
storage, `stripImplicitArgsByType` derives implicit slots from
the function's declared type, and `stripLeadingImplicitLambdas`
removes the parser-injected `\_implicit_*` lambdas from the
binding term. Result: `view foo` renders exactly as written, but
`dependencies foo` still lists `Monoid.nat` — the hash of the
chosen given remains baked into `foo`'s hash.
Tests
-----
* The override-detection test is flipped: a compound-with-widened
annotation is no longer treated as an `@`-override (it was
conflicting with list literals and other natural-widening forms;
the parser-emitted `@`-override path remains gated by the leaf
predicate).
* The resolver's `UnresolvedMetavarInGoal` test is reworked to
assert successful resolution against a unique candidate.
* The let-given-shadows-ambient test no longer expects the ambient
given to appear elsewhere in the decision list — the spurious
goal at the definition site that produced that decision is gone
now that `=>I` consumes the binding's leading lambda.
…essors
The class-accessor annotation was using the field type as it appeared in
the raw 'SynDataDecl' (parser output), where external references like
'Text' are still 'Var's. Checking the generated accessor body against
that annotation would fail with
I don't know about the type Text
Fix: pull each field's type out of the resolved 'DataDeclaration' in
'UF.datas env' instead. 'environmentFor' has already run 'bindNames'
over those constructor types, so every external reference is bound to
its namespace ref. Records have exactly one constructor whose type is
'forall tyvars. T1 -> ... -> Tn -> Self'; the input chain gives the
field types in declaration order, which we zip back against the
parsed field names.
Surface additions:
* `give f dict args` lowers f's leading `=>` arrows to `->` at the
use site so the caller supplies dictionaries positionally instead
of via implicit resolution. Carried through the typechecker as
an `Ann.Lowered` marker on the head and demoted in synthesis via
`lowerLeadingImplicit`.
* `class T tyvars = { f1 : T1, ... }` parses identically to a
record-style data declaration but the parser also records the
type name in a side channel; `add` / `update` then marks the
type's namespace metadata with the `##Builtin.Class` sentinel.
`view` walks the sentinel, emits the `class` keyword, and renders
the constructor with `{ field : type, ... }` syntax. Hash
invariance is preserved: re-parsing the rendered output produces
the same hash.
Notable pieces:
* `Unison.Codebase.Classes` (new module) mirrors `Givens` for the
`types_` namespace Star.
* `DeclPrinter.IsClassRef` predicate + `hashClassFieldAccessors` /
`getClassFieldNames` discover class accessors by reconstructing
the `=>`-bearing accessor type and looking up its hash.
* `Output.Typechecked` carries a `Set Name` of class bindings so
the slurp summary renders `+ class T` rather than `+ type T`.
* `GivenApply.rewriteApply`'s `lowerAll` now descends through
`Forall` quantifiers; previously a head with type
`forall a. C a => …` fell through unchanged and `interleave`
popped a sibling binding's locally-bound dictionary off the FIFO
queue, leaking it as a free var and crashing `hashTermComponents`.
* `prettyUnisonFile` consumes the file's `classBindings` so the
dependents-update re-render path keeps the `class` keyword.
Catches up `unison-src/transcripts/idempotent/` files that drifted
from their `.output.md` rendering since the implicits prototype
landed. The drift is the disappearing transient region message
("Okay, I'm searching the branch…") and the `-- given` comment
scaffold being replaced by the now-emitted `given` keyword.
Verified idempotent on a second pass.
The implicits prototype accumulated a lot of in-source pointers
to design ADRs, phase/chunk tracking, and historical implementation
state. None of it is useful to a reviewer reading the code today;
all of it is now removed.
* Deleted `docs/architecture/decisions/` (ADR-001 through ADR-023,
README, TEMPLATE) and the planning docs `docs/implicits-plan.md`,
`docs/implicits-phase-2-chunks.md`.
* Stripped `ADR-NNN`, `chunk Xn`, `Phase-2 chunk`, and `chunks.md`
pointers from comments throughout `parser-typechecker`,
`unison-cli`, `unison-syntax`, `unison-core`, `unison-runtime`,
`codebase2`, and the test suite. Substantive technical claims
that the pointer was attached to are kept; pointer-only comments
are dropped.
* Removed implementation-diary commentary ("previously", "this
used to", "before X landed", "spike port", "carry-over from
chunk N").
93a648d to
fbe07a3
Compare
* `class` and `give` are reserved keywords for parsing source code,
but a definition stored in the namespace may still have a name
that collides with a keyword (e.g. an `@unison/base` term named
`class`). `Name.parseTextEither` / `HashQualified.parseText` now
fall back to escaping reserved-word segments with backticks
before retrying, so namespace deserialization succeeds for these
names.
* `unison-src/transcripts/idempotent/pattern-match-coverage.md`
defined an ability constructor `give : a -> {Give a} Unit`; that
name is now reserved, so renamed to `giv` throughout the file.
* `unison-syntax/test/Unison/Test/Unison.hs`: `summon` is no longer
a reserved keyword (it's a regular builtin term reference), so
the lexer test expectations are updated to expect a wordy
identifier instead of `Reserved \"summon\"`.
* Re-ran all idempotent transcripts via the `transcripts` binary,
which writes corrected output in-place. Several transcripts had
drifted from their current rendering and are now caught up.
A function declared with an implicit parameter — e.g. `mpower : Monoid a => Nat -> a -> a` — could not call itself recursively: typechecking would report "no matching given for Monoid a" at the recursive call site even though the surrounding binding's own `=>` slot was in scope. Four interacting fixes were required. 1. `Unison.Typechecker.Context.subtype` had no clauses for `ImplicitArrow'`, so any subsumption involving a `=>` arrow fell through to the catchall `TypeMismatch`. Three clauses now handle contravariant input + covariant output (parallel to the existing `Arrow'` clause) plus the two "drop a leading implicit on either side" cases for subtyping `(C => T) <: T`. 2. `GivenResolver.matchHead`'s flex set previously contained only the candidate's freshened type variables plus goal-side `Var.Inference` metavars. A lexical given like `_implicit_ident_0 : Box a` — registered by `=>I` inside the surrounding binding — has `a` free in its conclusion referring to an outer universal, but when the recursive call site emits its goal, surrounding inference may resolve that universal through a *different* freshening, so the two `a`s have distinct ids even though they refer to the same logical variable. Adding the candidate's outer-scope free vars to flex lets the unifier bind them to whatever the goal exposes. Ambient givens are unaffected: their universally quantified tyvars are already in `fresh`. 3. Goal-side typechecker existentials weren't reaching the resolver. `Context.InfoNote.ConstraintGoal` carries a `TypeVar`-wrapped goal type, but `TypeVar.lowerType` strips the `Existential`/`Universal` distinction before the resolver sees it. `extractConstraintGoals` now collects the existential vars *before* lowering and threads them through `PendingGoal.pgExistentials` to a new `resolveWithExistentials` entry point, which adds them to `matchHead`'s flex set so the unifier can bind them. 4. When `wrapImplicitParams` injects `\_implicit_<name>_0 -> body` for a binding with an explicit signature, the term reaching `checkWanted` is `Ann (Lam _ ...) (forall .. => ..)`. After `checkScoped` strips the `Forall`, the type is `ImplicitArrow'` but the term is still `Ann`-wrapped, so the `Lam' + ImplicitArrow'` `=>I` clause does not fire and the goal is emitted by the apply-shaped fallback with an empty lexical scope — the lambda binder is never registered as a local given. A new narrow clause strips the redundant annotation specifically for `Ann (Lam _ ...) _` against `ImplicitArrow' _ _`, letting `=>I` fire and register the binder.
The previous commit added the candidate's outer-scope free type
variables to the unifier's flex set in 'matchHead', motivated by the
self-recursive '=>' case where the lexical given's universal had a
different fresh id than the goal's universal. That change is too
permissive: in a multi-constraint binding like
'Monoid.tuple : forall a b. (Monoid a, Monoid b) => Monoid (a,b)',
both '_implicit_Monoid.tuple_0 : Monoid a' and
'_implicit_Monoid.tuple_1 : Monoid b' end up matching any 'Monoid X'
goal, so resolving the binding's body's 'Semigroup'-derived 'Monoid'
premise becomes ambiguous.
The other three pieces of the previous commit — the subtype clauses
for 'ImplicitArrow'', the 'pgExistentials' threading, and the
'Ann' (Lam _ ...)' strip in 'checkWanted' — turn out to be enough on
their own. The Ann'-strip in particular eliminates the double-ForallI
that was causing the goal's universal id to drift away from the
lexical given's, so structural unification now succeeds in the
self-recursive case without the resolver having to be made
permissive.
Both the self-recursive case ('mpower : Monoid a => Nat -> a -> a')
and the multi-constraint algebra library typecheck cleanly with this
revert.
|
The ambiguity errors need a bit of work : once several Additionally, if I have a given instance in my codebase and an identical given instance in my scratch file, the elaborator seems to be unable to disambiguate the two : |
Two bugs visible to users of the implicits feature:
1. 'view' on a 'given' binding printed two stanzas — a signature
line ('given name : T') followed by a separate binding line
('name = body') — which is invalid surface syntax. The parser's
'givenBindingBody' accepts only the single-declaration form
'given name : T = body', so the printed output failed to
re-parse with "I was surprised to find an end of stanza here".
'Unison.Syntax.TermPrinter.prettyGivenBinding' now produces the
single-stanza form. It strips any leading synthetic
'\\_implicit_*' lambdas (inserted by 'wrapImplicitParams' for
bindings whose declared type contains '=>' arrows) before
rendering the body, so what comes out matches what the user
originally typed.
2. Hovering on a function call inside a 'given' binding's body
showed the type of the parser-injected '_implicit_<base>_<i>'
binder instead of the function being hovered. Those binders
wrap the entire body and so intersect every cursor position in
the body, shadowing real in-scope variables in the LSP's
per-position interval map.
'Unison.LSP.FileAnalysis' now skips 'VarBinding' / 'VarMention'
notes for variables whose name begins with '_implicit_', and
'Unison.LSP.Hover' refuses to surface them even if they're
reached via 'nodeAtPositionMatching'. The hover for the actual
function being called is restored.
Two related rough edges in given resolution that were surfaced by real use: 1. The 'Ambiguous' diagnostic rendered each candidate as the raw hash of its reference rather than the surface name. The resolver carries 'Reference' values that don't have a name on their own; pass them through 'PPE.termName' so users see 'Monoid.nat' (or its hash-prefixed fallback) instead of the bare unsuffixifiable hash. 2. When a scratch file redefined a 'given' that already existed in the codebase — i.e. the user was effectively running 'update' on an instance in place — both definitions ended up in the resolver's candidate pool and the typechecker reported an ambiguous resolution. A file-level definition should always shadow the codebase definition of the same name during the typecheck, just as ordinary term names do. 'AmbientGiven' now carries the candidate's surface 'Name' (when the caller can produce one). 'ambientGivensFromBranch' tracks the namespace path while walking and records the full name for each given. 'computeTypecheckingEnvironment' filters the ambient pool to drop any entry whose name matches a top-level 'given' binding in the scratch 'UnisonFile' before threading the pool to the typechecker.
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR proposes adding Scala-3-style implicit parameters (
given/=>) to Unison and includes a working prototype. It exists to anchor RFC discussion: read the design, run the examples below, push back on anything that surprises you.The end-to-end example from the design discussion typechecks, elaborates, runs, and round-trips through
view:UCM after
add:A polymorphic-with-premise given also works end-to-end:
Show.list's premiseShow ais discharged against the function's own lexical=>parameter while its body is checked, and atfoo's call site the resolver chainsShow.list(premiseShow a) →Show.natto buildShow [Nat]. The chosen dictionaries are wired intofoo's hash, andview fooround-trips to exactly what the user wrote.dependencies foolistsShow.natandShow.list— the resolved givens are permanently wired intofoo's hash. The same scratch works split across files: dropclass Show/given Show.natinto one file,add, then write the rest in a new scratch and resolution finds the givens from the namespace automatically.How implicits/givens fit into Unison
Five principles are load-bearing:
Givens are sugar for ordinary parameters. After elaboration, a constraint is
an ordinary positional argument. Runtime, codegen, ABT, and term hashing learn
nothing new — the entire feature lives in the elaborator.
Resolution is baked into the term hash. A term that referenced
Show Natfreezes the chosen dictionary's hash at typecheck time. Old terms never change
instance. Unison gets something Scala and Haskell can't: implicits that are
immune to "did adding this instance silently change my program?" — already-elaborated
terms are pinned by content addressing.
Namespace = instance set. No orphan rules, no global registry. Two libraries
that both define
given Show.natproduce ambiguity at any call site that seesboth, surfaced through the same machinery TDNR uses today.
Givenness is namespace metadata, not a term-level concept. Marking a
definition
givenupdates the namespace; term hashes are unchanged. The branch(causal) hash reflects the change. Aliasing, importing, forking work without
rehashing terms.
Givens extend TDNR rather than replace it. TDNR resolves "which named
definition fits this type"; given resolution adds "which value of this type
exists, with chaining." Same elaborator pass, same ambiguity reporting.
Surface syntax
A class declaration generates getter accessors whose dictionary parameter is an
=>arrow (no setters, no modifiers):Plain records are still available with
type Show a = Show (a -> Text); the difference is thatclassaccessors carry the constraint implicitly, so callers writeshow xinstead ofShow.show dict x.Declaring givens:
The
givenkeyword tags the binding so that, onadd/update, namespace metadata is automatically updated — no follow-upmark.givenneeded (but the explicit command is still there if you want to flip an existing definition).Consuming givens:
The
summonbuiltin retrieves the unique resolvable dictionary at the surrounding context's expected type. Its declared type isforall a. a => a— the leading=>makes the typechecker emit a constraint goal at every reference, which the resolver fills from lexical / ambient givens. At runtimesummonis the identity function. Use it when you want to bind a dictionary explicitly:summonis an ordinary term reference, so user code may shadow it.The
givekeyword is the dual:give fdemotes every leading=>inf's declared type to->, so the dictionaries become ordinary positional arguments at the call site. Use it for testing, mocking, or any time you want to bypass the resolver entirely:givehas to be a keyword (not a builtin) because the type transformation can't be expressed as a regular polymorphic signature — the existing implicit-resolution rule would fire first on a builtin reference and the dictionary would be auto-filled beforegivecould see it. Partial overrides (override one dict, let the resolver pick the rest) use the existinglet givenshadowing.Resolution
For each constraint hole the elaborator searches visible givens whose conclusion
unifies with the goal type, recursively resolving each candidate's premises.
Ambiguity surfaces as the existing TDNR-style error. Cycles fail per-branch (the
candidate, not the whole search); a global depth limit (default 50) catches
runaway chains. Resolutions are memoized per top-level invocation, caching
successes and failures.
Specificity: A is strictly more specific than B iff some σ satisfies
σ(B's declared conclusion) = A's declared conclusion, with σ binding only B'squantified variables. Local
givenbeats outergiven. Otherwise → ambiguity.Coherence by content-addressing
The interplay between resolution and Unison's hashes is what makes this design
distinctive. A term containing a constraint hole elaborates to one containing
the chosen dictionary's hash. If the namespace later gains a second matching
given, the elaborated term is unchanged — its hash already commits to the
original. Only new terms see ambiguity.
This is stronger than Haskell's coherence (depends on global instance discovery
and orphan rules) and stronger than Scala's (can silently change on recompile).
Unison's content addressing makes coherence a non-issue for already-elaborated
code, by construction.
What's not in v1
deriving/ macro derivationType,Ability, andarrow kinds)
What's in this PR
=>constraints,givendeclarations,let given,classdeclarations,@-positional explicit overrides.##Builtin.Givenand##Builtin.Classsentinels +MdValuesplumbing on the term and type Stars respectively(
Unison.Codebase.Givens/Unison.Codebase.Classes); UCM commandsmark.given,unmark.given,givens; sharing-API round-trip;unison-mergeconflict resolution for given-set diffs.given-declared bindings andclass-declared types are auto-markedon
add/update, and the slurp summary surfaces theclass/givenkeyword in place oftype/ the plain term signature.Type.Fextended withImplicitArrow; lexical givenenvironment threaded through every binder; constraint goals emitted at apply
sites and at any
Var/Refuse of a function with leading=>;=>Irulerecognises the parser-injected leading lambda for a binding's implicit
parameters and registers it as a lexical given for the body, so a polymorphic
function's own
=>-param resolves constraints raised inside its own body(this is what makes
foldMapand polymorphicShow.listabove work).given-taggedbindings are exposed to every goal in the file via the resolver's ambient
pool, sidestepping
minimize's reordering of independent SCCs (which couldotherwise leave a sibling
giveninvisible to an earlier user). Lexicalregistration is skipped for top-level givens so they don't tie with a
function's own
=>-parameter on theLexical 0scope tag.annotateLetRecBindings'sannotation-less redundancy pass and
noteTopLevelType's redundancy-checksynthesis both would leak
ConstraintGoalnotes with incomplete lexicalscopes (because
=>Inever fires without the user's annotation). AdiscardInfoNotescombinator wraps both passes so only theofficial-typecheck constraint goals reach the resolver.
summonbuiltin —summon : forall a. a => a, runtime identity. The=>on its declared type drives implicit resolution through the normalmachinery; the user writes
(summon : T)(or relies on context) and getsback the unique resolvable given of type
T.givekeyword —give fis a prefix syntactic transformation thatdemotes every leading
=>inf's declared type to->. Tagged withAnn.Loweredat parse time; the typechecker swapspeelLeadingImplicitsfor
lowerLeadingImplicitat lowered references;GivenApply.rewriteApplypre-demotes the type — descending through any leading
forallquantifiersfirst — before
interleavesees it, so the decision queue stays alignedeven for generalized signatures like
forall a. C a => a -> a.Type.existentializeArrowsno longer inserts anEffectrow betweenadjacent
=>s — implicit arrows carry no abilities. This was breakingthe
=>IcheckWanted pattern on chained constraints likeC1 => C2 => T.TDNR fixed point; resolved decisions applied in the term as ordinary
Appnodes against dictionary hashes. Decisions are looked up by source location
to keep multi-binding files coherent.
decomposeGivenTypestrips the emptyeffect wrapper that generalization leaves around a constraint's conclusion,
so a generalized polymorphic given unifies with the plain-shape goal at the
call site.
NoGiven/Ambiguous/DepthExceeded/Cyclerendering. Resolution failures now surface through the fileloader (previously the file silently failed to add).
diagnostics and code actions for resolution failures.
=>round-trips throughview/edit/update. Elide-mode default: dictionary arguments inserted by theelaborator are not shown in surface output (so
view foomatcheswhat the user wrote); the corresponding leading
\_implicit_*lambdas on the binding are also elided. The hash dependency is
preserved —
dependencies foostill lists the chosen given.viewforclass-tagged types emits theclasskeywordand renders the constructor with
{ field : type, ... }record syntax(driven by an
IsClassRefpredicate threaded throughDeclPrinter,and
hashClassFieldAccessorswhich reproduces the=>-bearingaccessor hashes to recover field names from the namespace).
viewforgiven-tagged definitions emits thegivenkeyword;viewfor terms that usedgiveemits thegivekeyword with theuser-supplied dictionary preserved (detected by walking each apply
chain whose head has a
=>-prefixed type and checking whether thedictionary argument is a namespace-tagged given or a local lexical
given — anything else is treated as user-supplied via
giveand getsa sentinel
Type.giveMarkerRefascription that the surface printerrecognises). The dependents-update path (
prettyUnisonFile) consumesthe file's
classBindingsso aclassre-printed for re-typecheckingkeeps the
classkeyword and re-parses to the same hash.=>/given/summon/@); printer round-trip property tests; typecheckerintegration; full resolver test matrix including diamond dependencies
and HKT; source-level end-to-end test pinning the parser → resolver
wiring; UCM transcripts covering the update-semantics scenarios
(updating a given changes the dependent's hash; breaking a given
leaves old code valid; a new ambiguous given doesn't disturb old
code).
stack build --fastclean.stack test --fast unison-parser-typecheckerpasses(452/452).
Not ready yet
givenandgivekeyword migration — the prototype hard-breaks identifiers usingthe
givenandgivenames. The deprecation cycle is a release-engineering concern.unison-mergepipeline integration — the given-set conflict moduleis sidecar; wiring it into the merge engine is a follow-on.
override-leaf parens diagnostic, partial-application handling, and more.