Commit 0443ca7

Tuple projection draft
1 parent 308ebe9 commit 0443ca7

1 file changed

+251
-0
lines changed

rfcs/tuple_projections.md

Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
1+
# Tuple Projections
2+
3+
## Overview
4+
5+
This RFC proposes a quality-of-life improvement to OCaml's tuples, adding
6+
support for labeled and unlabeled tuple projections.
7+
8+
## Proposed change
9+
10+
The idea is to allow users to directly project elements from a tuple using
11+
labels or indices, as opposed to patterns:
12+
13+
```ocaml
14+
# let x = (~tuple:42, ~proj:1337, "is", 'c', 00, 1);;
15+
val x : (tuple:int * proj:int * string * char * int * int) =
16+
(~tuple:42, ~proj:1337, "is", 'c', 0, 1)
17+
# x.tuple;;
18+
- : int = 42
19+
# x.3;;
20+
- : char = 'c'
21+
```
22+
23+
Here, we're able to project out of a 6-tuple (containing both labeled and
24+
unlabeled components) simply by writing `x.j` (for an index `j`) or `x.l` (for
25+
a label `l`). Tuple indices are 0-indexed, to match the existing indexing
26+
convention for arrays, lists, etc.
27+
28+
This is useful for a few reasons:
29+
- Conciseness: avoids unnecessary boilerplate pattern matches / projection
30+
functions e.g. `Tuple2.fst, Tuple3.fst, ...`. Additionally, with the advent
31+
of labeled tuples, such functions are less useful[^1].
32+
- Clarity: `x.1` is more readable than `let _, x1, _ = x in ...`
33+
- Parity with records: complements record field projection and aligns tuple
34+
uses with other ML-like languages.
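
For reference, the boilerplate that the first two points describe looks roughly like this in today's OCaml. The helper names `fst3`, `snd3` and `thd3` are illustrative and not part of the standard library, which only provides `fst` and `snd` on pairs.

```ocaml
(* Projection helpers for triples: one function per position and arity.
   These names are hypothetical; Stdlib only provides fst and snd on pairs. *)
let fst3 (x, _, _) = x
let snd3 (_, x, _) = x
let thd3 (_, _, x) = x

let () =
  let t = (42, "is", 'c') in
  (* Inline pattern match, as in the Clarity point above. *)
  let _, s, _ = t in
  assert (s = snd3 t);
  assert (fst3 t = 42);
  assert (thd3 t = 'c');
  (* Pairs are covered by the standard library. *)
  assert (fst (1, "a") = 1 && snd (1, "a") = "a")
```

Each new arity needs its own set of helpers, which is the repetition that a built-in projection would remove.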
35+
36+
Pattern matches written only to project a tuple component, where a
37+
projection could be used instead, are reasonably frequent (estimated
38+
with a quick Sherlocode search[^2]):
39+
- ~6k uses of projections on pairs,
40+
- ~1.15k for triples,
41+
- ~210 for quadruples.
42+
43+
44+
## Previous work
45+
46+
Many other strongly-typed languages support built-in tuple projections.
47+
48+
### SML
49+
50+
SML models tuples as records with integer field names (1-indexed), so projection uses
51+
record selection syntax:
52+
```sml
53+
> val x = (1, "hi", 42);
54+
val x = (1, "hi", 42): int * string * int
55+
> val y = #1 x;
56+
val y = 1: int
57+
> val z = #3 x;
58+
val z = 42: int
59+
```
60+
61+
### Rust
62+
63+
Rust supports tuple projections (0-indexed) for ordinary tuples (and tuple structs):
64+
```rust
65+
let x = (42, "is", 'c');
66+
let y = x.0; // 42
67+
let z = x.1; // "is"
68+
69+
struct Point(i32, i32);
70+
let p = Point(3, 4);
71+
let x_coord = p.0; // 3
72+
```
73+
74+
Record structs also use the same syntax for projection:
75+
```rust
76+
struct Point { x: i32, y: i32 }
77+
let p = Point { x: 3, y: 4 };
78+
let x_coord = p.x; // 3
79+
```
80+
81+
82+
### Swift
83+
84+
Swift supports tuple projections via both positional indices and labels (as in this proposal):
85+
```swift
86+
import Foundation
87+
88+
let x = (tuple: 42, proj: 1337, "is", Character("c"), 00, 1)
89+
90+
print(x.tuple) // 42
91+
print(x.5) // 1
92+
```
93+
94+
## Implementation
95+
96+
An experimental implementation is available at [PR 14257](https://github.com/ocaml/ocaml/pull/14257).
97+
98+
### Parsetree changes
99+
100+
The syntax for labeled tuple projection is overloaded with record field
101+
projection, i.e. there is no syntactic distinction between the projections in:
102+
```ocaml
103+
let x = { foo = 1; bar = 2 } in x.foo;;
104+
```
105+
and
106+
```ocaml
107+
let x = ~foo:1, ~bar:2 in x.foo;;
108+
```
109+
110+
111+
The parser therefore cannot tell the two apart, so the proposed parsetree additions represent all projections using `Pexp_field`:
112+
```ocaml
113+
type field =
114+
| Pfield_record_or_tuple_label of Longident.t loc
115+
| Pfield_tuple_index of int loc
116+
117+
and expression_desc =
118+
...
119+
| Pexp_field of expression * field
120+
...
121+
```
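
As a rough illustration of how the two forms would be represented, the sketch below uses simplified stand-ins: plain `string` and `int` replace `Longident.t loc` and `int loc`, the `expression` type has only two cases, and `field_of_token` and `project` are hypothetical helpers rather than parser code from the PR. It assumes OCaml 4.13 or later for `String.for_all`.

```ocaml
(* Simplified stand-ins for the proposed parsetree nodes. *)
type field =
  | Pfield_record_or_tuple_label of string
  | Pfield_tuple_index of int

type expression =
  | Pexp_ident of string                 (* stand-in: just a variable name *)
  | Pexp_field of expression * field

(* True when the token consists only of decimal digits. *)
let is_decimal tok =
  tok <> "" && String.for_all (fun c -> '0' <= c && c <= '9') tok

(* Hypothetical helper: classify the token that follows the dot. *)
let field_of_token tok =
  match int_of_string_opt tok with
  | Some j when is_decimal tok -> Pfield_tuple_index j
  | _ -> Pfield_record_or_tuple_label tok

let project e tok = Pexp_field (e, field_of_token tok)

let () =
  (* x.foo: record field or labeled tuple component, decided later by typing. *)
  assert (project (Pexp_ident "x") "foo"
          = Pexp_field (Pexp_ident "x", Pfield_record_or_tuple_label "foo"));
  (* x.3: can only be a tuple index. *)
  assert (project (Pexp_ident "x") "3"
          = Pexp_field (Pexp_ident "x", Pfield_tuple_index 3))
```

Only the index form is unambiguous at parse time; the label form carries no information about whether it names a record field or a tuple label.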
122+
123+
124+
### Typechecking
125+
126+
When the typechecker encounters a field projection in an expression:
127+
128+
- If the field is a tuple index `j`, type it as an unlabeled tuple projection.
129+
130+
Check to see whether the expected type is known (principally known, if in `-principal` mode).
131+
Then:
132+
* If the type is not known: raise an error stating that the projection is ambiguous.
133+
* If the type is known to be `(?l0:ty0 * ... * tyj * ... * ?ln:tyn)`: type the projection as `tyj`
134+
135+
- If the field is a record or tuple label `l`, type it as a record field or labeled tuple projection.
136+
137+
Check to see whether the expected type is known:
138+
* If the type is not known: typecheck the projection `e.l` as a record projection.
139+
* If the type is known to be `(ty0, ..., tyn) t`: likewise, typecheck `e.l` as a record projection.
140+
* If the type is known to be `(?l0:ty0 * ... * l:tyl * ... * ?ln:tyn)`: type the projection
141+
as `tyl`.
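
The case analysis above can be sketched as a small decision function. This is a toy model, not the compiler's representation: the `ty`, `field` and `outcome` types and the function `type_projection` are all hypothetical. Cases that the rules do not spell out (an out-of-range index, an index that lands on a labeled component, a missing label, a non-tuple type) are reported as `Unspecified` rather than guessed.

```ocaml
(* Toy model of the expected type of the projected expression. *)
type ty =
  | Unknown                              (* type not (principally) known *)
  | Base of string                       (* int, string, ... *)
  | Nominal of string                    (* stands in for (ty0, ..., tyn) t *)
  | Tuple of (string option * ty) list   (* optional label, component type *)

type field = Label of string | Index of int

type outcome =
  | Component of ty      (* typed as a tuple projection with this type *)
  | Record_projection    (* handled by ordinary record field typing *)
  | Ambiguous            (* error: the projection is ambiguous *)
  | Unspecified          (* not covered by the rules as written *)

let type_projection expected field =
  match field, expected with
  | Index _, Unknown -> Ambiguous
  | Index j, Tuple comps when j >= 0 ->
    (match List.nth_opt comps j with
     | Some (None, ty) -> Component ty
     | Some (Some _, _) | None -> Unspecified)
  | Label _, (Unknown | Nominal _) -> Record_projection
  | Label l, Tuple comps ->
    (match List.assoc_opt (Some l) comps with
     | Some ty -> Component ty
     | None -> Unspecified)
  | _ -> Unspecified

(* The 6-tuple from the introductory example. *)
let x_ty =
  Tuple [ Some "tuple", Base "int"; Some "proj", Base "int";
          None, Base "string"; None, Base "char";
          None, Base "int"; None, Base "int" ]

let () =
  assert (type_projection x_ty (Label "tuple") = Component (Base "int"));
  assert (type_projection x_ty (Index 3) = Component (Base "char"));
  assert (type_projection Unknown (Index 1) = Ambiguous);
  assert (type_projection Unknown (Label "foo") = Record_projection)
```

The asymmetry between the two `Unknown` cases is the important part: an index with no known type is an error, whereas a label with no known type silently falls back to record typing.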
142+
143+
## Considerations
144+
145+
### Limitations of type-based disambiguation
146+
147+
OCaml's current type-based disambiguation mechanism is relatively weak. As a result,
148+
many of the patterns that tuple projections are intended to replace would be ill-typed under
149+
today's implementation. For instance:
150+
```ocaml
151+
# List.map (fun x -> x.1) [42, "Hello"; 1337, "World"];;
152+
Error: The type of the tuple expression is ambiguous.
153+
Could not determine the type of the tuple projection.
154+
```
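
For comparison, the pattern-based version of the same call is accepted today, because the tuple pattern itself fixes the shape of the argument. Under the rules in this RFC, annotating the binder (for example `fun (x : int * string) -> x.1`) would presumably make the type known; that reading is an assumption and has not been checked against the experimental implementation.

```ocaml
(* Typechecks in current OCaml: the pattern (_, s) determines that the
   argument is a pair, so nothing has to be propagated from the list. *)
let seconds = List.map (fun (_, s) -> s) [42, "Hello"; 1337, "World"]

let () = assert (seconds = ["Hello"; "World"])
```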
155+
156+
That said, this limitation does not arise from the feature itself, but from the
157+
weaknesses in OCaml's type propagation. Improving type propagation (separately)
158+
would benefit not only tuple projections, but other features that rely on
159+
type-based disambiguation (e.g. constructors and record fields). As such, we
160+
argue that tuple projections should not be rejected on this point alone, and
161+
that the broader issues of type propagation and disambiguation be addressed
162+
separately.
163+
164+
### Syntactic overloading
165+
166+
This proposal reuses the existing projection syntax `e.l` for both record
167+
fields and labeled tuples. The primary motivation is to avoid
168+
introducing new operators and to keep projection syntax uniform.
169+
170+
The downside is that it increases reliance on type-based disambiguation.
171+
172+
### Diagnostic quality of error messages
173+
174+
Type errors surrounding unknown fields will need to be refined.
175+
In particular, when the compiler defaults a labeled projection to a record
176+
field (even though it might also have been a labeled tuple projection),
177+
the diagnostic report ought to make this clear.
178+
179+
Otherwise, programs like the following may yield cryptic messages:
180+
```ocaml
181+
# let is_ill_typed_due_to_defaults x =
182+
let y = x.tuple_label_a in
183+
ignore (x : (tuple_label_a:int * string * bool));
184+
(y, x.2)
185+
Error: Unbound record field `tuple_label_a`
186+
```
187+
188+
A clearer diagnostic could be:
189+
```
190+
Error: The field `tuple_label_a` is unknown.
191+
The projection `x.tuple_label_a` was interpreted as a record field,
192+
but no such record field exists.
193+
194+
Hint: Did you mean to project from a labeled tuple instead?
195+
If so, add an annotation to disambiguate the projection.
196+
```
197+
198+
Other problematic examples include conflicts with existing records:
199+
```ocaml
200+
# type discombobulating_record = { tuple_label_a : int };;
201+
type discombobulating_record = { tuple_label_a : int }
202+
# let is_ill_typed_due_to_defaults x =
203+
let y = x.tuple_label_a in
204+
ignore (x : (tuple_label_a:int * string * bool));
205+
(y, x.2)
206+
Error: The value `x` has type `discombobulating_record` but an expression was
207+
expected of type `tuple_label_a:int * string * bool`
208+
```
209+
Here the error conflates record and tuple typing, which is misleading.
210+
A more informative report could combine a warning with the final error:
211+
```
212+
Warning: The projection `x.tuple_label_a` could refer either to a record field
213+
or a labeled tuple component. It was resolved as a record field of
214+
`discombobulating_record`.
215+
Please disambiguate if this is wrong.
216+
Error: The value `x` has type `discombobulating_record` but an expression
217+
was expected of type
218+
`tuple_label_a:int * string * bool`
219+
```
220+
221+
222+
### Row-polymorphic tuples
223+
224+
Unlabeled tuple projections can naturally (and efficiently) be typed using row polymorphism, in
225+
the same way object methods are typed today:
226+
```ocaml
227+
# let snd x = x.1;;
228+
val snd : 'a * 'b * .. -> 'b
229+
```
230+
231+
This generalises to tuples of arbitrary arity (and is strictly more powerful than this proposal).
232+
However, extending the same mechanism to labeled tuples is significantly more difficult (without incurring
233+
runtime overhead or using monomorphisation).
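
The object-side analogue mentioned above already works in current OCaml and may help make the comparison concrete. The names `get_snd`, `pair` and `triple` are illustrative only.

```ocaml
(* Inferred type: < snd : 'a; .. > -> 'a. The `..` row variable lets the
   function accept any object that has at least a `snd` method. *)
let get_snd o = o#snd

let pair = object method fst = 42 method snd = "is" end
let triple = object method fst = 42 method snd = "is" method thd = 'c' end

let () =
  (* The same function applies to objects with different sets of methods. *)
  assert (get_snd pair = "is");
  assert (get_snd triple = "is")
```

A row-polymorphic tuple projection would behave analogously, with the open row standing for the remaining components.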
234+
235+
From a language design perspective, though, we would ideally want the typing of projections
236+
to behave uniformly across both labeled and unlabeled tuples. Moreover, the typing behaviour
237+
discussed in this RFC is compatible with this use of row polymorphism.
238+
239+
240+
### Non-principal defaults
241+
242+
For type-based disambiguation, OCaml usually implements a 'default' behaviour when
243+
the type is unknown. For instance, when disambiguating record fields or variant constructors, if the type is
244+
not known to be a nominal type `(ty0, ..., tyn) t`, the lexically closest matching record field or constructor
245+
is used.
246+
247+
We could have a similar default rule for tuple projections `e.j` when the type is unknown: type `e` as a `max 2 (j + 1)`-ary tuple, the smallest arity for which index `j` exists.
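
A tuple has at least two components, and index `j` only exists in tuples with at least `j + 1` components, so the default arity for `e.j` is the larger of those two bounds. A minimal sketch of that computation, where the helper name `default_arity` is hypothetical:

```ocaml
(* Hypothetical helper: arity assumed for `e` when projecting index j from a
   tuple whose type is unknown. *)
let default_arity j =
  if j < 0 then invalid_arg "default_arity: negative index";
  max 2 (j + 1)

let () =
  assert (default_arity 0 = 2);   (* e.0 defaults to a pair *)
  assert (default_arity 1 = 2);   (* e.1 defaults to a pair *)
  assert (default_arity 5 = 6)    (* e.5 needs at least six components *)
```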
248+
249+
[^1]: Without relying on coercions between labeled and unlabeled tuples, as in [PR 14180](https://github.com/ocaml/ocaml/pull/14180). And even then, it would be shorter to write a pattern than the types for `:>`.
250+
251+
[^2]: The following family of patterns were used to derive estimates: [`fun (\w\+, _)`](https://sherlocode.com/?q=fun%20(%5Cw%5C%2B%2C%20_)), [`fun (\w\+, _, _)`](https://sherlocode.com/?q=fun%20(%5Cw%5C%2B%2C%20_%2C%20_)), [`fun (\w\+, _, _, _)`](https://sherlocode.com/?q=fun%20(%5Cw%5C%2B%2C%20_%2C%20_%2C%20_)), [`fun (_, \w\+)`](https://sherlocode.com/?q=fun%20(_%2C%20%5Cw%5C%2B)), [`fun (_, \w\+, _)`](https://sherlocode.com/?q=fun%20(_%2C%20%5Cw%5C%2B%2C%20_)), [`fun (_, \w\+, _, _)`](https://sherlocode.com/?q=fun%20(_%2C%20%5Cw%5C%2B%2C%20_%2C%20_)), etc.
