| 1 | +# Tuple Projections |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This RFC proposes a quality-of-life improvement to OCaml's tuples, adding |
| 6 | +support for labeled and unlabeled tuple projections. |
| 7 | + |
| 8 | +## Proposed change |
| 9 | + |
| 10 | +The idea is to allow users to directly project elements from a tuple using |
| 11 | +labels or indices, as opposed to patterns: |
| 12 | + |
| 13 | +```ocaml |
| 14 | +# let x = (~tuple:42, ~proj:1337, "is", 'c', 00, 1);; |
| 15 | +val x : (tuple: int * proj:int * string * char * int * int) = |
| 16 | + (~tuple:42, ~proj:1337, "is", 'c', 0, 1) |
| 17 | +# x.tuple;; |
| 18 | +- : int = 42 |
| 19 | +# x.3;; |
| 20 | +- : char = 'c' |
| 21 | +``` |
| 22 | + |
| 23 | +Here, we're able to project out of a 6-tuple (containing both labeled and |
| 24 | +unlabeled components) simply by writing `x.j` (for an index `j`) or `x.l` (for |
| 25 | +a label `l`). Tuple indices are 0-indexed, to match the existing indexing |
| 26 | +convention for arrays, lists, etc. |
| 27 | + |
| 28 | +This is useful for a few reasons: |
| 29 | +- Conciseness: avoids unnecessary boilerplate pattern matches / projection |
| 30 | + functions e.g. `Tuple2.fst, Tuple3.fst, ...`. Additionally, with the advent |
| 31 | + of labeled tuples, such functions are less useful[^1]. |
| 32 | +- Clarity: `x.1` is more readable than `let _, x1, _ = x in ...` |
| 33 | +- Parity with records: complements record field projection and aligns tuple |
| 34 | + uses with other ML-like languages. |
| 35 | + |
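For comparison, here is a minimal sketch of the boilerplate that the bullets above refer to, written in today's OCaml against an unlabeled version of the 6-tuple `x`. The helper names `proj6_0` and `proj6_3` are hypothetical; under this proposal they would be written `x.0` and `x.3`.

```ocaml
(* Today's OCaml: each arity and position needs its own pattern or helper. *)
let x = (42, 1337, "is", 'c', 0, 1)

(* Hypothetical helper names, one per (arity, position) pair. *)
let proj6_3 (_, _, _, c, _, _) = c
let proj6_0 (a, _, _, _, _, _) = a

let () =
  assert (proj6_3 x = 'c');
  assert (proj6_0 x = 42);
  (* Inline form: a one-off pattern match at the use site. *)
  let _, second, _, _, _, _ = x in
  assert (second = 1337)
```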
| 36 | +Occurrences of explicit pattern matching for tuple projection are |
| 37 | +reasonably frequent where tuple projections could otherwise be used |
| 38 | +(found via a quick Sherlocode search[^2]): |
| 39 | +- ~6k uses of projections on pairs, |
| 40 | +- ~1.15k for triples, |
| 41 | +- ~210 for quadruples. |
| 42 | + |
| 43 | + |
| 44 | +## Previous work |
| 45 | + |
| 46 | +Many other strongly-typed languages support built-in tuple projections. |
| 47 | + |
| 48 | +### SML |
| 49 | + |
| 50 | +SML models tuples as records with integer field names (1-indexed), so projection uses |
| 51 | +record selection syntax: |
| 52 | +```sml |
| 53 | +> val x = (1, "hi", 42);; |
| 54 | +val x = (1, "hi", 42): int * string * int |
| 55 | +> val y = #1 x;; |
| 56 | +val y = 1: int |
| 57 | +> val z = #3 x;; |
| 58 | +val z = 42: int |
| 59 | +``` |
| 60 | + |
| 61 | +### Rust |
| 62 | + |
| 63 | +Rust supports tuple projections (0-indexed) for ordinary tuples (and tuple structs): |
| 64 | +```rust |
| 65 | +let x = (42, "is", 'c'); |
| 66 | +let y = x.0; // 42 |
| 67 | +let z = x.1; // "is" |
| 68 | + |
| 69 | +struct Point(i32, i32); |
| 70 | +let p = Point(3, 4); |
| 71 | +let x_coord = p.0; // 3 |
| 72 | +``` |
| 73 | + |
| 74 | +Record structs also use the same syntax for projection: |
| 75 | +```rust |
| 76 | +struct Point { x: i32, y: i32 } |
| 77 | +let p = Point { x: 3, y: 4 }; |
| 78 | +let x_coord = p.x; // 3 |
| 79 | +``` |
| 80 | + |
| 81 | + |
| 82 | +### Swift |
| 83 | + |
| 84 | +Swift supports tuple projections via both positional indices and labels (as in this proposal): |
| 85 | +```swift |
| 86 | +import Foundation |
| 87 | + |
| 88 | +let x = (tuple: 42, proj: 1337, "is", "c", 00, 1); |
| 89 | + |
| 90 | +print(x.tuple) // 42 |
| 91 | +print(x.5) // 1 |
| 92 | +``` |
| 93 | + |
| 94 | +## Implementation |
| 95 | + |
| 96 | +An experimental implementation is available at [PR 14257](https://github.com/ocaml/ocaml/pull/14257). |
| 97 | + |
| 98 | +### Parsetree changes |
| 99 | + |
| 100 | +The syntax for labeled tuple projection is overloaded with record field |
| 101 | +projection, i.e. there is no syntactic distinction between the projections in: |
| 102 | +```ocaml |
| 103 | +let x = { foo = 1; bar = 2 } in x.foo;; |
| 104 | +``` |
| 105 | +and |
| 106 | +```ocaml |
| 107 | +let x = ~foo:1, ~bar:2 in x.foo;; |
| 108 | +``` |
| 109 | + |
| 110 | + |
| 111 | +The proposed parsetree additions therefore represent all projections using `Pexp_field`: |
| 112 | +```ocaml |
| 113 | +type field = |
| 114 | + | Pfield_record_or_tuple_label of Longident.t loc |
| 115 | + | Pfield_tuple_index of int loc |
| 116 | + |
| 117 | +and expression_desc = |
| 118 | + ... |
| 119 | + | Pexp_field of expression * field |
| 120 | + ... |
| 121 | +``` |
| 122 | + |
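As an illustration of the shape of these nodes, the following sketch uses simplified stand-ins for the real `Parsetree` types: locations are dropped and identifiers are plain strings, which are assumptions made only to keep the example self-contained. The names `by_label`, `by_index` and `to_string` are hypothetical.

```ocaml
(* Simplified stand-ins for the proposed Parsetree additions: no locations,
   and labels are plain strings rather than [Longident.t loc]. *)
type field =
  | Pfield_record_or_tuple_label of string
  | Pfield_tuple_index of int

type expression =
  | Pexp_ident of string
  | Pexp_field of expression * field

(* [x.foo]: the parser cannot tell a record field from a tuple label,
   so both share one constructor. *)
let by_label = Pexp_field (Pexp_ident "x", Pfield_record_or_tuple_label "foo")

(* [x.3]: under the proposal, an integer after the dot is a tuple index. *)
let by_index = Pexp_field (Pexp_ident "x", Pfield_tuple_index 3)

(* Print an expression back in the proposed surface syntax. *)
let rec to_string = function
  | Pexp_ident id -> id
  | Pexp_field (e, Pfield_record_or_tuple_label l) -> to_string e ^ "." ^ l
  | Pexp_field (e, Pfield_tuple_index j) -> to_string e ^ "." ^ string_of_int j
```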
| 123 | + |
| 124 | +### Typechecking |
| 125 | + |
| 126 | +While typechecking, when encountering a field projection in expressions, |
| 127 | + |
| 128 | +- If the field is a tuple index `j`, type as an unlabeled tuple projection. |
| 129 | + |
| 130 | + Check to see whether the expected type is known (principally known, if in `-principal` mode). |
| 131 | + Then: |
| 132 | + * If the type is not known: raise an error stating that the projection is ambiguous. |
| 133 | + * If the type is known to be `(?l0:ty0 * ... * tyj * ... * ?ln:tyn)`: type the projection as `tyj` |
| 134 | + |
| 135 | +- If the field is a record or tuple label `l`, type as a record or labeled tuple projection. |
| 136 | + |
| 137 | + Check to see whether the expected type is known: |
| 138 | + - If the type is not known: typecheck the projection `e.l` as a record projection |
| 139 | + - If the type is known to be `(ty0, ..., tyn) t`: also typecheck it as a record projection |
| 140 | + - If the type is known to be `(?l0:ty0 * ... * l:tyl * ... * ?ln:tyn)`: type the projection |
| 141 | + as `tyl`. |
| 142 | + |
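The case analysis above can be sketched as a small decision function over a toy model of types. This is not the compiler's typechecker: all names (`ty`, `outcome`, `type_projection`, `x_ty`) are hypothetical, component types are represented as strings, and the out-of-bounds, unknown-label and index-on-nominal-type cases are assumptions that the rules above do not spell out.

```ocaml
(* Toy model of the expected type of the projected expression. *)
type ty =
  | Unknown                                (* type not (principally) known *)
  | Nominal of string                      (* some [(ty0, ..., tyn) t] *)
  | Tuple of (string option * string) list (* optional label, component type *)

type field = Label of string | Index of int

type outcome =
  | Component of string  (* typed as a tuple projection at this type *)
  | As_record_field      (* handled by ordinary record field typing *)
  | Rejected of string   (* type error *)

let type_projection (expected : ty) (f : field) : outcome =
  match f, expected with
  | Index _, Unknown -> Rejected "ambiguous tuple projection"
  | Index j, Tuple comps ->
    (* [j] comes from a non-negative integer literal. *)
    (match List.nth_opt comps j with
     | Some (_, t) -> Component t
     | None -> Rejected "tuple index out of bounds")
  | Index _, Nominal _ -> Rejected "not a tuple type"
  | Label _, (Unknown | Nominal _) -> As_record_field
  | Label l, Tuple comps ->
    (match List.find_opt (fun (lbl, _) -> lbl = Some l) comps with
     | Some (_, t) -> Component t
     | None -> Rejected "no such tuple label")

(* The type of [x] from the opening example. *)
let x_ty =
  Tuple [ (Some "tuple", "int"); (Some "proj", "int"); (None, "string");
          (None, "char"); (None, "int"); (None, "int") ]
```

In this model, `type_projection x_ty (Index 3)` yields `Component "char"`, matching `x.3` in the opening example, while a labeled projection at an unknown type falls through to record field typing.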
| 143 | +## Considerations |
| 144 | + |
| 145 | +### Limitations of type-based disambiguation |
| 146 | + |
| 147 | +OCaml's current type-based disambiguation mechanism is relatively weak. As a result, |
| 148 | +many of the patterns that tuple projections are intended to replace would be ill-typed under |
| 149 | +today's implementation. For instance: |
| 150 | +```ocaml |
| 151 | +# List.map (fun x -> x.1) [42, "Hello"; 1337, "World"];; |
| 152 | +Error: The type of the tuple expression is ambiguous. |
| 153 | + Could not determine the type of the tuple projection. |
| 154 | +``` |
| 155 | + |
| 156 | +That said, this limitation does not arise from the feature itself, but from the |
| 157 | +weaknesses in OCaml's type propagation. Improving type propagation (separately) |
| 158 | +would benefit not only tuple projections, but other features that rely on |
| 159 | +type-based disambiguation (e.g. constructors and record fields). As such, we |
| 160 | +argue that tuple projections should not be rejected on this point alone, and |
| 161 | +that the broader issues of type propagation and disambiguation be addressed |
| 162 | +separately. |
| 163 | + |
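The same dependence on known type information already exists for record fields. The sketch below runs in today's OCaml and shows the default and the annotation-driven behaviour for two records sharing a field name; the type and function names are hypothetical. Presumably an annotation on the projected expression would play the same role for tuple projections.

```ocaml
type celsius = { value : int }
type label = { value : string }

(* No annotation: [value] defaults to the most recent definition, [label]. *)
let describe x = x.value ^ "!"

(* With an annotation the expected type is known, so [celsius]'s field
   is selected instead. *)
let succ_temp (x : celsius) = x.value + 1

let () =
  assert (describe { value = "hot" } = "hot!");
  assert (succ_temp ({ value = 20 } : celsius) = 21)
```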
| 164 | +### Syntactic overloading |
| 165 | + |
| 166 | +This proposal reuses the existing projection syntax `e.l` for both record |
| 167 | +fields and labeled tuples. The primary motivation is to avoid |
| 168 | +introducing new operators and to keep projection syntax uniform. |
| 169 | + |
| 170 | +The downside is that it increases reliance on type-based disambiguation. |
| 171 | + |
| 172 | +### Diagnostic quality of error messages |
| 173 | + |
| 174 | +Type errors surrounding unknown fields will need to be refined. |
| 175 | +In particular, when the compiler defaults a labeled projection to a record |
| 176 | +field (even though it might also have been a labeled tuple projection), |
| 177 | +the diagnostic report ought to make this clear. |
| 178 | + |
| 179 | +Otherwise, programs like the following may yield cryptic messages: |
| 180 | +```ocaml |
| 181 | +# let is_ill_typed_due_to_defaults x = |
| 182 | + let y = x.tuple_label_a in |
| 183 | + ignore (x : (tuple_label_a:int * string * bool)); |
| 184 | + (y, x.2) |
| 185 | +Error: Unbound record field `tuple_label_a` |
| 186 | +``` |
| 187 | + |
| 188 | +A clearer diagnostic could be: |
| 189 | +``` |
| 190 | +Error: The field `tuple_label_a` is unknown. |
| 191 | + The projection `x.tuple_label_a` was interpreted as a record field, |
| 192 | + but no such record field exists. |
| 193 | + |
| 194 | +Hint: Did you mean to project from a labeled tuple instead? |
| 195 | + If so, add an annotation to disambiguate the projection. |
| 196 | +``` |
| 197 | + |
| 198 | +Other problematic examples include conflicts with existing records: |
| 199 | +```ocaml |
| 200 | +# type discombobulating_record = { tuple_label_a : int };; |
| 201 | +type discombobulating_record = { tuple_label_a : int } |
| 202 | +# let is_ill_typed_due_to_defaults x = |
| 203 | + let y = x.tuple_label_a in |
| 204 | + ignore (x : (tuple_label_a:int * string * bool)); |
| 205 | + (y, x.2) |
| 206 | +Error: The value `x` has type `discombobulating_record` but an expression was |
| 207 | + expected of type `tuple_label_a:int * string * bool` |
| 208 | +``` |
| 209 | +Here the error conflates record and tuple typing, which is misleading. |
| 210 | +A more informative report could combine a warning with the final error: |
| 211 | +``` |
| 212 | +Warning: The projection `x.tuple_label_a` could refer either to a record field |
| 213 | + or a labeled tuple component. It was resolved as a record field of |
| 214 | + `discombobulating_record`. |
| 215 | + Please disambiguate if this is wrong. |
| 216 | +Error: The value `x` has type `discombobulating_record` but an expression |
| 217 | + was expected of type |
| 218 | + `tuple_label_a:int * string * bool` |
| 219 | +``` |
| 220 | + |
| 221 | + |
| 222 | +### Row-polymorphic tuples |
| 223 | + |
| 224 | +Unlabeled tuple projections can naturally (and efficiently) be typed using row polymorphism, in |
| 225 | +the same way object fields are typed today: |
| 226 | +```ocaml |
| 227 | +# let snd x = x.1;; |
| 228 | +val snd : 'a * 'b * .. -> 'b |
| 229 | +``` |
| 230 | + |
| 231 | +This generalises to tuples of arbitrary arity (and is strictly more powerful than this proposal). |
| 232 | +However, extending the same mechanism to labeled tuples is significantly more difficult (without incurring |
| 233 | +runtime overhead or using monomorphisation). |
| 234 | + |
| 235 | +From a language design perspective, though, we would ideally want the typing of projections |
| 236 | +to behave uniformly across both labeled and unlabeled tuples. Moreover, the typing behaviour |
| 237 | +discussed in this RFC is compatible with this use of row polymorphism. |
| 238 | + |
| 239 | + |
| 240 | +### Non-principal defaults |
| 241 | + |
| 242 | +For type-based disambiguation, OCaml usually implements a 'default' behaviour when |
| 243 | +the type is unknown. For instance, in record field / variant overloading, if the type is |
| 244 | +not known to be a nominal type `(ty0, ..., tyn) t`, the lexically closest matching record field / variant |
| 245 | +is used. |
| 246 | + |
| 247 | +We could have a similar default rule for tuple projections `e.j`, if the type is unknown: type `e` as a `max 2 (j + 1)`-ary tuple. |
| 248 | + |
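A minimal sketch of such a default, assuming the intended arity is the smallest one for which index `j` exists: tuples have at least two components, and a 0-indexed position `j` requires at least `j + 1` of them. The function name `default_arity` is hypothetical. Under this reading, `e.1` would default `e` to a pair and `e.3` to a 4-tuple.

```ocaml
(* Hypothetical helper: smallest tuple arity that admits the 0-indexed
   position [j]. Tuples have at least two components. *)
let default_arity j = max 2 (j + 1)

let () =
  assert (default_arity 0 = 2);
  assert (default_arity 1 = 2);
  assert (default_arity 3 = 4)
```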
| 249 | +[^1]: Without relying on coercions between labeled and unlabeled tuples, as in [PR 14180](https://github.com/ocaml/ocaml/pull/14180). And even then, it would be shorter to write a pattern than the types for `:>`. |
| 250 | + |
| 251 | +[^2]: The following family of patterns were used to derive estimates: [`fun (\w\+, _)`](https://sherlocode.com/?q=fun%20(%5Cw%5C%2B%2C%20_)), [`fun (\w\+, _, _)`](https://sherlocode.com/?q=fun%20(%5Cw%5C%2B%2C%20_%2C%20_)), [`fun (\w\+, _, _, _)`](https://sherlocode.com/?q=fun%20(%5Cw%5C%2B%2C%20_%2C%20_%2C%20_)), [`fun (_, \w\+)`](https://sherlocode.com/?q=fun%20(_%2C%20%5Cw%5C%2B)), [`fun (_, \w\+, _)`](https://sherlocode.com/?q=fun%20(_%2C%20%5Cw%5C%2B%2C%20_)), [`fun (_, \w\+, _, _)`](https://sherlocode.com/?q=fun%20(_%2C%20%5Cw%5C%2B%2C%20_%2C%20_)), etc. |