|
| 1 | +# DSLX: Type System |
| 2 | + |
| 3 | +The DSL (frontend) performs a few primary tasks: |
| 4 | + |
| 5 | +1. Parsing text files to an AST representation. |
| 6 | +2. Typechecking the AST representation. |
| 7 | +3. Conversion of the AST to bytecode that can be interpreted. |
| 8 | +4. Conversion of the AST to XLS IR (from which it can be interpreted or |
| 9 | + optimized or scheduled or code generated, etc.) |
| 10 | + |
| 11 | +Note that step #2 is an essential component for steps #3 and #4 -- the type |
| 12 | +information computed in the type checking process is used by the bytecode |
| 13 | +emission/evaluation and IR conversion processes. |
| 14 | + |
| 15 | +(You could imagine a bytecode interpreter that did not use any pre-computed type |
| 16 | +information and tried to derive it all dynamically, but that is not how the |
| 17 | +system is set up -- using the type information from typechecking time avoids |
| 18 | +redundant work and replication of similar code in a way that could get out of |
| 19 | +sync.) |
| 20 | + |
| 21 | +Aside: bytecode emission/interpretation may also be necessary for constexpr |
| 22 | +(compile-time constant) evaluation, and so #2 and #3 will be interleaved to some |
| 23 | +degree. |
| 24 | + |
| 25 | +## Parametric Instantiation |
| 26 | + |
| 27 | +The most interesting thing that happens in the type system, and one of the main |
| 28 | +things that the DSL provides as a useful abstraction on top of XLS IR, is |
| 29 | +*parametric instantiation*. This is where users write a parameterized function |
| 30 | +(or proc) and instantiate it with particular concrete parmeters; e.g. |
| 31 | + |
| 32 | +```dslx |
| 33 | +// A parametric (identity) function. |
| 34 | +fn p<N: u32>(x: bits[N]) -> bits[N] { x } |
| 35 | +
|
| 36 | +fn main() -> u8 { |
| 37 | + p(u8:42) // Instantiates p with N=8 |
| 38 | +} |
| 39 | +
|
| 40 | +#[test] |
| 41 | +fn test_main() { |
| 42 | + assert_eq(main(), u8:42) |
| 43 | +} |
| 44 | +``` |
| 45 | + |
| 46 | +This allows us to write more generic code as library-style functions, which |
| 47 | +dovetails nicely with the facilities that XLS core has to schedule and optimize |
| 48 | +across cycles. |
| 49 | + |
| 50 | +With parametric instantiation as a feature, several questions around the nature |
| 51 | +of the parameterized definition are raised; e.g. |
| 52 | + |
| 53 | +* When `p` is defined with a parametric `N`, should we check that the |
| 54 | + definition has no type errors "for all N"? (Note: we do not.) |
| 55 | +* If `p` is not instantiated anywhere, do we check that `p` has no type errors |
| 56 | + for "there exists some N"? (Note: we do not.) |
| 57 | + |
| 58 | +These relate to the "laziness" of parametric instantiation. As a historical |
| 59 | +example for comparison, C++ originally had template definitions as token |
| 60 | +substitutions, not even ensuring that the definition could parse, more akin to |
| 61 | +syntactic macros. |
| 62 | + |
| 63 | +## Typechecked on Instantiation |
| 64 | + |
| 65 | +In the DSL, as noted above, the definitions of parametric instances are parsed, |
| 66 | +but not typechecked until instantiation, and errors are raised if the concrete |
| 67 | +set of parameterized values cause a type error during instantiation. That is: |
| 68 | + |
| 69 | +```dslx |
| 70 | +fn p<N: u32>() -> bits[N] { |
| 71 | + N() // type error on instantiation: cannot invoke a number |
| 72 | +} |
| 73 | +``` |
| 74 | + |
| 75 | +If there is no instantiation of `p`, this definition will parse, but the type |
| 76 | +error will go unreported, because it is never instantiated, and thus never type |
| 77 | +checked. If another function were to instantiate it by calling `p`, however, a |
| 78 | +type error would occur due to that instantiation. |
| 79 | + |
| 80 | +Similarly, we can consider a definition that does not work "for all N", but |
| 81 | +works "for one N", and that's the only `N` we instantiate it with. |
| 82 | + |
| 83 | +```dslx |
| 84 | +fn p<N: u32>() -> bits[N] { |
| 85 | + const_assert!(N == u32:42); |
| 86 | + u42:64 |
| 87 | +} |
| 88 | +fn main() -> u42 { |
| 89 | + p<u32:42>() // this is fine |
| 90 | +} |
| 91 | +``` |
| 92 | + |
| 93 | +## Parametric Evaluation Ordering |
| 94 | + |
| 95 | +There are three components to parametric invocations. (Note: "binding" refers to |
| 96 | +the assignment of a value to each named parameter.) |
| 97 | + |
| 98 | +1. Binding explicit values (given in angle brackets, i.e. `<>`) given in the |
| 99 | + caller |
| 100 | +2. Binding actual arguments (passed by the caller) against the parametric |
| 101 | + bindings |
| 102 | +3. Filling in any "remaining holes" in the parametric bindings using *default |
| 103 | + expressions* in the parametric bindings |
| 104 | + |
| 105 | +The three components are performed in that order. |
| 106 | + |
| 107 | +### 1: Binding Explicit Values |
| 108 | + |
| 109 | +In this example: |
| 110 | + |
| 111 | +```dslx |
| 112 | +fn p<A: u32, B: u32>() -> (bits[A], bits[B]) { |
| 113 | + (bits[A]:42, bits[B]:64) |
| 114 | +} |
| 115 | +
|
| 116 | +fn main() -> (u8, u16) { |
| 117 | + p<u32:8, u32:16>() |
| 118 | +} |
| 119 | +``` |
| 120 | + |
| 121 | +The caller `main` explicitly binds the parametrics `A` and `B` by supplying |
| 122 | +arguments in the angle brackets. |
| 123 | + |
| 124 | +### 2: Binding Actual Arguments |
| 125 | + |
| 126 | +In this example: |
| 127 | + |
| 128 | +```dslx |
| 129 | +fn p<A: u32>(x: bits[A]) -> bits[A] { |
| 130 | + x + bits[A]:1 |
| 131 | +} |
| 132 | +fn main() -> u13 { |
| 133 | + p(u13:42) |
| 134 | +} |
| 135 | +
|
| 136 | +#[test] |
| 137 | +fn test_main() { |
| 138 | + assert_eq(main(), u13:43) |
| 139 | +} |
| 140 | +``` |
| 141 | + |
| 142 | +`main` is implicitly saying what `A` must be by passing a `u13` -- we know that |
| 143 | +the parameter to `p` is declared to be a `bits[A]`, so we know that `A` must be |
| 144 | +`13` since a `u13` was passed as the "actual" argument (i.e. argument value from |
| 145 | +the caller). |
| 146 | + |
| 147 | +Note that if you contradict an explicit binding and a binding from actual |
| 148 | +arguments, you will get a type error; e.g. this will cause a type error: |
| 149 | + |
| 150 | +```dslx-snippet |
| 151 | +fn main() -> u13 { |
| 152 | + p<u32:14>(u13:42) // explicit says 14 bits, actual arg is 13 bits |
| 153 | +} |
| 154 | +``` |
| 155 | + |
| 156 | +### 3: Default Expressions |
| 157 | + |
| 158 | +In this example: |
| 159 | + |
| 160 | +```dslx |
| 161 | +fn p<A: u32, B: u32 = {A+A}>(x: bits[A]) -> bits[B] { |
| 162 | + x as bits[B] |
| 163 | +} |
| 164 | +
|
| 165 | +fn main() -> u32 { |
| 166 | + p(u16:42) |
| 167 | +} |
| 168 | +
|
| 169 | +#[test] |
| 170 | +fn test_main() { |
| 171 | + assert_eq(main(), u32:42); |
| 172 | +} |
| 173 | +``` |
| 174 | + |
| 175 | +`main` is implicitly saying what `A` must be by passing a `u16`; however, `B` is |
| 176 | +not specified; neither by an explicit parametric value (i.e. in `<>` in the |
| 177 | +caller), nor implicitly by an actual arg that was passed. As a result, we go |
| 178 | +evaluate the *default expression* for the parametric, and populate `B` with the |
| 179 | +result of the expression `A+A`. Since `A` is `16`, `B` is `32`. |
| 180 | + |
| 181 | +### Aside: Earlier Bindings in Later Types |
| 182 | + |
| 183 | +Note: this is not generally necessary to know to use parametrics effectively, |
| 184 | +but is useful in thinking through the design and power of parametric |
| 185 | +instantiation. |
| 186 | + |
| 187 | +One consequence of the ordering defined is that earlier parametric bindings can |
| 188 | +be used to define the types of later parametric bindings; e.g. |
| 189 | + |
| 190 | +```dslx |
| 191 | +fn p<A: u32, B: bits[A] = {bits[A]:0}>() -> bits[A] { |
| 192 | + B |
| 193 | +} |
| 194 | +
|
| 195 | +fn main() -> u8 { p<u32:8>() } |
| 196 | +
|
| 197 | +#[test] |
| 198 | +fn test_main() { |
| 199 | + assert_eq(main(), u8:0) |
| 200 | +} |
| 201 | +``` |
| 202 | + |
| 203 | +Note that `main` uses an explicit parametric to define `A` as `8`. Then the |
| 204 | +*type* of `B` is defined based on the *value* of `A`; i.e. the type of B is |
| 205 | +defined to be `u8` as a result, and the value of `B` is defined to be a `u8:0`. |
| 206 | +This is interesting because we used an earlier parametric binding to define a |
| 207 | +later parametric binding's type. |
0 commit comments