Date: 2023-12-25
Last updated: 2025-12-28
This is the first entry in what I intend as a multi-part series.
1ML is a type system designed by Andreas Rossberg and described in a collection of papers by him. When it comes to integrating an ML-style module system into a new language, I think 1ML currently provides the best overall story (although it's not perfect).
Unfortunately, Rossberg's papers are pretty technical. He is a good writer, but I feel that right now there is still a communication barrier between academics who are in a position to discuss 1ML in depth and people who are in a position to write new compilers. I see a lot more discussion online that merely references 1ML than I do discussion that seems to deeply understand it.
I've been studying 1ML for the last several months, and working on my own prototype implementation (which unfortunately isn't yet in a publishable state). In an effort to help these ideas percolate down towards mainstream adoption, I want to help demystify some things that I found confusing.
This series will not substitute for reading the paper, but I hope it can be a useful companion. I'll be working from the version of the paper published in the Journal of Functional Programming in 2018. It'll be very helpful to be able to look at that paper as you read this. In fact, you can press this button to pull it up on the side in this page: Open (It probably doesn't look great on mobile, unfortunately.)
In writing this, I assume the reader:
- Has some familiarity with inference rule notation and how type systems are defined using a combination of grammars and inference rules.
- Has some familiarity with the module system of OCaml or Standard ML, from a user's perspective (not necessarily knowing their internals).
- Has familiarity with 1ML's design goals as outlined in the paper introduction.
- Has not necessarily read the prequel, F-ing modules, Rossberg's earlier paper on compiling module calculi to Fω.
By the end of this series of articles, the reader will have the tools they need to understand every single inference rule, including those of the type inference algorithm described in Section 7.
I also want to discuss some implementation concerns that are not mentioned in the paper (such as how to integrate type inference with levels).
A decent portion of the paper is dedicated to meta-theory (e.g. the proofs that the type system is sound, etc.). I have not read these sections in depth, and will not be discussing them here. It is reasonable for compiler authors to take these results as given and not focus on their details.
In some places, I will be criticizing the way the paper presents its ideas. Please do not take these as prescriptions for how it should have been written. I do not have the standing to make prescriptions like that; the paper was written with considerations other than the ones most important to me (such as brevity of the rules, ease of meta-theory, & an intended audience of other academics).
At a basic level, 1ML is really just a sophisticated syntactic sugar for a version of the lambda calculus called System Fω. If you're not familiar, this is basically the core type system of Haskell, including higher-kinded type constructors, with all polymorphism being explicit. 1ML terms (I'll use term and expression interchangeably) are converted into Fω terms and 1ML types are converted into Fω types.
The specific structure of this translation is best saved for a later article.
For now, I'll say that whenever a type
This may seem a little unfamiliar; I was used to seeing type systems that don't distinguish between the surface syntax for types and the internal representation of types. But of course, a routine like this exists in any real-world type checker, it's just that many papers would consider it "out of scope" and not discuss it.
Technical note: order of record fields
Whenever a type system defines a class of objects
Unfortunately, when it comes to how Fω's record field labels work, the paper isn't so clear. Fig. 3, which defines the syntax of Fω, has only this to say:
$$ \tau ::= \dots \mid { \overline{ l : \tau } } \qquad e ::= \dots \mid { \overline{ l = e } } $$
(See below for an explanation of the overline notation.)
However, Fig. 4, which provides typing rules, has this rule for typing field access:
$$ \frac{ \Gamma \vdash e : { l : \tau, \overline{ l' : \tau' } } }{ \Gamma \vdash e.l : \tau } $$
This rule only makes sense if we understand record fields to be intrinsically unordered.
I think we can also reasonably assume that
I want to linger on
the type system of Fω
for a bit longer.
We can think of Fω type expressions
The key rule is this one:
In words: "whenever an Fω term
The type equivalence rules are presented as two-directional equalities (
Figure 6 defines the grammar for semantic types. It's important, so I'll reproduce it here.
Note: the meaning of overline notation, as in $\overline{l : \Sigma}$
This is meta-theoretic syntax that just means "zero or more"
(sometimes including delimiting commas).
For those familiar with Rust's macro system,
it is comparable to something like { $($l:label : $sigma:large_typ),* }.
The first thing we should observe is that large and small types aren't really defining a new grammar,
but instead a subgrammar of Fω types
(so we have
The only wrinkle is the new type formers:
the singleton
The path grammar is a little wonky. It allows repetition of
I honestly find the whole idea of a path a little poorly motivated.
My best attempt to articulate the role paths play in the rest of the paper
is that they enforce that a semantic type is (roughly) in normal form
(since type-level function applications always have a variable at their head),
and that only small types can be applied as arguments to functions.
"Path" is really a misnomer from this perspective.
(The fact that a set of paths is an input to the subtyping relation
The abstracted type grammar T where type t = bool.
As an illustration of this, you can look at the desugaring rules
on page 25,
which say that the surface signature { type t; x : t } gets desugared
to
In my opinion, writing
-
Don't create a separate data type for Fω types
$\tau$ ; instead, create a data type that represents large types$\Sigma$ . -
Contexts
$\Gamma$ contain large types. -
Don't have a separate data type for small types; it will result in too much code duplication. Instead, smallness should be an invariant not tracked by the type system.
-
Do create a separate data type for
$\Xi$ , along the lines of:type signat = (type_variable * kind) list * large_type
-
Introduce a dedicated constructor for
$[= \Xi]$ in your large type ADT. -
Either have a dedicated constructor for
$\forall \overline\alpha. \Sigma \to_\eta \Xi$ , or separate it out based on$\eta$ to model the syntactic invariant for pure functions ($\forall \overline\alpha. \Sigma \to_{\texttt{I}} \Xi$ and$\forall \overline\alpha. \Sigma \to_{\texttt{P}} \Sigma$ ).
That's all I have time for today.
In the next entry, I'll cover the surface syntax of 1ML,
the desugaring judgement