Skip to content

[Feature] Implicit generic arguments #126

@faiface

Description

@faiface

Generics in Par are fairly powerful. There are three places that involve type variables:

  • Type aliases themselves: type List<a> = ...
  • Forall types: dec List.Map : [type a, b] ...
  • Exists types: type SetModule<a> = (type set) ...

The case of type aliases is different than the latter two. A generic type alias is not a first-class citizen, a List is nothing on its own. Instead, it's a meta-programming feature, allowing you to share a common template for many types. Unless we involve higher kinds, they are fine as they are.

The latter two, forall and exists types are something else. They are types of real values, and they are first-class. Unlike in many languages, it's perfectly fine to store a generic value in a variable, or pass it as an argument to a function:

let id = box [type a] [value: a] value

let str = id(type String)("Hey")  // = "Hey"
let num = id(type Int)(42)        // = 42

From a theory perspective, Par's forall and exists types are the famous System F with existential types. Which is really neat and powerful.

However, using them in practice often proves unwieldy and cluttered. Let's take the usual list map function as an example:

dec List.Map : [type a, b] [List<a>, box [a] b] List<b>

Using it becomes very wordy:

let numbers: List<Int> = *(1, 2, 3, 4, 5, 6, 7, 8, 9)
let strings = List.Map(type Int, String)(numbers, box Int.ToString)

And that's a mild example! Let's take the GrepFile function from examples/MiniGrep.par:

dec GrepFile : [String, Os.Reader] Iterator<Os.Error, (Nat) String>
def GrepFile = [needle, reader] do {
  let parser = String.ParserFromReader(type Os.Error)(reader)
  let lines = Lines(type Os.Error)(parser)
  let lines = Enumerate1(type Os.Error, String)(lines)
} in Filter(type Os.Error, (Nat) String)(lines, box [(_) line] Contains(needle, line))

Now let's take a look what it could look like without all the boilerplate:

dec GrepFile : [String, Os.Reader] Iterator<Os.Error, (Nat) String>
def GrepFile = [needle, reader] do {
  let parser = String.ParserFromReader(reader)
  let lines = Lines(parser)
  let lines = Enumerate1(lines)
} in Filter(lines, box [(_) line] Contains(needle, line))

So much more readable!

What's so hard? Hindley-Milner solved it decades ago!

Let's entertain the usual approach for a moment:

  • Generic types are attached to the definition.
  • They are resolved by unification at usage.

Instead of:

dec List.Map : [type a, b] [List<a>, box [a] b] List<b>

We'd have:

dec List.Map<a, b> : [List<a>, box [a] b] List<b>

The problem here is that this approach fundamentally contradicts how Par does type checking.

Par uses so called bi-directional type checking. It's a combination of type checking and inference that has four benefits and one downside.

The downside:

  • More type annotations are needed.

The benefits:

  • More local type errors.
  • It's always clear where an annotation is needed.
  • Easy to extend to advanced features, such as Par's first-class generics.
  • It's fast.

To be able to do bi-directional type checking, one invariant must be maintained: At any point in code, there may be at most one variable with an unknown type.

For example, this is fine:

let x = chan y {
  // OK, `y` is a single unknown here.
  // `x` is in the outer scope, and irrelevant here.
  y <> 5
}
// `x` inferred as `Nat`

But this isn't:

let f = [x] Int.ToString(x)
//         ^
// NOT OK. At this point, both `x` and the return type are
// unknown. Bi-directional type checking can't handle that.

To fix it, we need a type annotation:

let f = [x: Int] Int.ToString(x)
// OK now.

So, back to this:

dec List.Map<a, b> : [List<a>, box [a] b] List<b>

Here's what's wrong with that, from the point of the way Par does type checking:

let numbers: List<Int> = *(1, 2, 3, 4, 5, 6, 7, 8, 9)
let mapper = List.Map(numbers)

What's the type of mapper now?

[box [Int] b] List<b>

b is an unknown type here to be solved. When? Later, based on usage. Par's type system was happily handling a single unknown variable, but now it has to deal with an unknown type as well. Such an unknown type would necessitate inference across variables, with constraints placed on them.

That would contradict at least two of the main benefits of Par's type checking:

  • More local type errors.
  • It's always clear where an annotation is needed.

I'm sure you know violations of these from Rust. You remove a line and suddenly you need a type annotation somewhere. Maybe 100 lines above.

We want to avoid that in Par. We want to keep the errors local, and keep it clear where annotations are needed.

The proposal: Implicit type arguments bound to [...] and (...)

Instead of "spawning" unknown types to solve later, we can combine their declaration and inference into a single construct. I propose two new types:

<a, ...>[A] B
<a, ...>(A) B

They're just like function and pair types, except with implicit generic types attached.

Let's take a look at them separately.

Implicitly generic functions: <a, ...>[A] B

The type <a, ...>[A] B is functionally equivalent to [type a, ...] [A] B.

Note, that because it's a function, the generic type is automatically treated as a "forall", even though its the same syntax used for implicitly generic pairs.

It consists of three parts:

  • <a, ...> declares the type variables, which will be bound universally, as "forall".
  • [A] is the argument to the function, and may use the type variables. This is the part used to infer them!
  • B at the end is the result of the function, and also may use the type variables. It's not used to infer the type variables.

The main point is that the type variables will only be inferred from the argument.

An important difference from normal functions is that the [A, B] C syntax sugar is not valid for implicitly generic functions. <a>[A, B] C is not allowed. Instead, you must write <a>[A] [B] C. That is to highlight that the variable a will only be inferred from the type A.

Let's see some examples from examples/MiniGrep.par:

dec Concat     : [type a] [List<a>, List<a>] List<a>
dec Lines      : [type e] [String.Parser<e>] Iterator<e, String>
dec Filter     : [type e, a] [Iterator<e, box a>, box [a] Bool] Iterator<e, a>
dec Enumerate1 : [type e, a] [Iterator<e, a>] Iterator<e, (Nat) a>

With implicitly generic functions, their types can instead be:

dec Concat     : <a>[List<a>] [List<a>] List<a>
dec Lines      : <e>[String.Parser<e>] Iterator<e, String>
dec Filter     : <e, a>[Iterator<e, box a>] [box [a] Bool] Iterator<e, a>
dec Enumerate1 : <e, a>[Iterator<e, a>] Iterator<e, (Nat) a>

Note that we had to change [List<a>, List<a>] ... to <a>[List<a>] [List<a>] ..., breaking the single [...] apart. That now clearly highlights that the type variable a will be inferred from the first list only.

Construction (lambda)

Almost nothing changes about the construction of these functions compared to the explicit way, except for the syntax for the type variables, and needing to separate the first argument. What was previously:

def F = [type a] [arg1, arg2] ...

Now becomes:

def F = <a>[arg1] [arg2] ...

Note, that the type variables have to be mentioned on construction. This is to allow their usage inside the function's body.

For example, previously:

dec Concat : [type a] [List<a>, List<a>] List<a>
def Concat = [type a] [left, right] left.begin.case {
  .end! => right,
  .item(x) xs => .item(x) xs.loop,
}

Now:

dec Concat : <a>[List<a>] [List<a>] List<a>
def Concat = <a>[left] [right] left.begin.case {
  .end! => right,
  .item(x) xs => .item(x) xs.loop,
}

Destruction (calling)

With calling, there are two important distinctions. Let's take an example:

let left:  List<Int> = *(1, 2, 3)
let right: List<Int> = *(4, 5, 6)

let all = Concat(left, right)

Note that we can put all the arguments in the same parentheses when calling.

Obviously, the first distinction is that we don't mention any types when calling! Instead, the type system takes the type of the first argument, List<Int> and matches it against the declared type List<a>, inferring that a is Int.

In fact, there is no way to manually specify the type variables. The function type itself decides if the type should be always explicit, or always implicit.

The second distinction is that the argument type itself cannot be inferred. What that means is that the type of the argument you pass into the function must be fully deducible from the expression itself, with no help from the type of the function.

Let's see where this makes a difference, the infamous Map function:

dec List.Map : [type a, b] [List<a>, box [a] b] List<b>

Let's rewrite it to use implicit generics:

dec List.Map : <a>[List<a>] <b>[box [a] b] List<b>

Now after the first argument, all is good:

let numbers: List<Int> = *(1, 2, 3, 4, 5, 6, 7, 8, 9)
let mapper = List.Map(numbers)

The type of mapper is now simply <b>[box [Int] b] List<b>. That's a first-class, fully qualified type with no variables to solve.

But trouble arrives here:

let strings = mapper(box [num] Int.ToString(num))
// ERROR, type of `num` must be known

Unfortunately, this doesn't work. mapper is an implicitly generic function, and so the type of its argument must be fully deducible from its expression. But, box [num] Int.ToString(num) is not something Par's type checker can figure out.

Annotating inside the function works:

let strings = mapper(box [num: Int] Int.ToString(num))

Another solution to this problem is to make the second type variable explicit:

dec List.Map : [type b] <a>[List<a>] [box [a] b] List<b>

Yes, it is a little unusual that the second type argument comes first, but it calls quite naturally:

let strings = List.Map(type String)(numbers, box [num] Int.ToString(num))

Which isn't so bad at all!

Implicitly generic pairs: <a, ...>(A) B

Thanks to duality, implicitly generic pairs must automatically come with implicitly generic functions. I'll spend less time explaining them, but here are the basics.

A <a, ...>(A) B is functionally equivalent to (type a, ...) (A) B. Note, that the type variables are existentially quantified here.

For example:

type ImplicitAny = <a>(a)!

The construction is the same as normal pairs:

let any: ImplicitAny = (5)!

Of course, the overall type must be known, otherwise this would be inferred as a regular pair.

The destruction is just like for pairs, except it requires binding the type variables:

let <p>(x)! = any  // x: p

And the same in commands:

any<p>[x]
// x:   p
// any: !

Conclusion

We found a way to introduce implicit type arguments, while maintaining Par's bi-directional type checking and avoiding solving unknown types at arbitrary points in code.

The functions themselves decide if their generics are explicit or implicit. It's always clear how and when the type variables are inferred. Furthermore, the implicitly generic functions are just as first-class as their explicit counterparts, and maintain their genericity until usage. True System F style.

There are a couple of limitations, like separating the first argument, or needing to partially fall back to explicit generics like with the List.Map function. But, I feel like these are worth it, overall.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions