Auto-cancellation of `ordinal . unsafe_from_ordinal`

**Why**

Notionally, an `Ix a` instance is supposed to define a bijection between `a` and the first `size a` natural numbers.  In particular, on those natural numbers, `ordinal` and `unsafe_from_ordinal` are supposed to be inverses.  Do we want to start using that in the compiler by replacing that composition with `id` when it is known to occur?

To wit, by the time we get to the hardware, `for` loops are counting integers, and arrays are indexed by integers.  It seems nice to try to avoid round-tripping through the `a` value, if the main thing we need is its ordinal.

This should speed up both the generated code and the compiler, the former because it avoids running useless compositions of `ordinal . unsafe_from_ordinal`, and the latter because it avoids inlining and repeatedly compiling them.  It should also simplify the compiler by better controlling how far user-defined `Ix` instances can travel.

**What**

In detail, Dex `for` loops are implemented thus:  When the compiler sees `for idx:a. body` (where `body` has type `b`), it eventually generates imperative code that basically looks like this (with some abuse of notation):
```
dest : Ref (a => b) = <allocate or reuse a destination>
for_ i:(Fin (size a)).
  idx : a = unsafe_from_ordinal i  -- Except the instance is resolved and the method is inlined
  ithdest : Ref b = indexDest dest (ordinal idx)
  place ithdest <compile body>
```

Also, when the compiler sees array indexing (i.e., `xs[idx]`), it generates
```
i' = ordinal idx
indexValue xs i'
```

Seems like maybe we should just remember that `i` is in scope, and instead of constructing `ordinal idx`, just insert a reference to `i`.

Now, we do still need to generate `idx`, because the `body` may use it for something other than indexing as well (e.g., unpacking a tuple-typed index), but in the case of a pure map, dead code elimination should remove it.

(For what it's worth, something similar happens with table literals, except that we know the body doesn't read the index at all.)

**Why not**

The main problem I can imagine with doing this is that it assumes that `ordinal` really is a left inverse of `unsafe_from_ordinal` in all `Ix` instances, which is not something we have machinery to check.  The optimization is observable if `ordinal` has side-effects (e.g., via `unsafe_io`).

**How**

Probably the simplest way to do it would be to write a new pass that changes the types of all arrays to `Fin <something>`.
- The environment the pass carries would need to map every name in the input scope to an ADT describing how to reconstruct that value (at its original type) from names in the output scope.
- Going under a `for` binder <orig> would (i) create a Nat-typed new binder `<new>` (ii) create an originally-typed binder `<recon>` by emitting a call to the `unsafe_from_ordinal` method of the original type on `<new>`, and (iii) add a mapping for `<orig>` saying "if you really want the original type, use `<recon>`, but if the underlying Nat is fine, use `<new>`".
- Exiting a `for` would add a mapping for the array saying "coerce the array `<new binder>` to `<original array type>`" (because arrays are represented as flat buffers regardless of the type of their index).
- Array indexing (`TabApp`) would insert a call to `ordinal` at the array's original type in general, but would also check whether the index has a "the underlying Nat is `<new>`" binding and in that case just use it and skip `ordinal`.
- It would also be nice if we could pick up user-written calls to `ordinal`, but I think we currently can't because `Simplify.applyDictMethod` already inlines them.
- Other forms that need to consume an index can always just use the `<recon>` binder.
- If `<recon>` ends up being unused (as in a pure map that doesn't read the index except to index arrays with it), DCE should just clean it up.
- Attempts to return an index out of the scope in which `<new>` is valid could also fall back to just using the `<recon>` binder, or we could try to play some game where the form returns `<new>` and creates another binding that reconstructs the `<recon>` again.
- The same sort of thing needs to be decided for standalone functions that accept and return indices---either accept and return the Nat and reconstruct as needed, or accept and return the reconstruction and miss potential opportunities for `ordinal` elimination.
- There's also a choice of whether to create the `<recon>` binder "eagerly", as described, or "lazily", by inserting `unsafe_from_ordinal <new>` in the places where `<recon>` would otherwise be needed.  That tradeoff is entirely about how many times `unsafe_from_ordinal` gets inlined during this pass.

We would presumably only want to run this pass after we were done using semantic index information for occurrence analysis and any future loop splitting or such; and it has to run before (or within) Imp, because the Imp representation no longer has interesting index sets.

**Thoughts?**

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Auto-cancellation of `ordinal . unsafe_from_ordinal` #1283

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Auto-cancellation of ordinal . unsafe_from_ordinal #1283

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Auto-cancellation of `ordinal . unsafe_from_ordinal` #1283