|
| 1 | +.. _understanding_typevars: |
| 2 | + |
| 3 | +Understanding TypeVars |
| 4 | +====================== |
| 5 | + |
| 6 | +Static type checking happens before the code runs, which means anything that |
| 7 | +is only determined at runtime is invisible to the type checker. However, it |
| 8 | +is often necessary for the types the type checker infers to depend on how a |
| 9 | +piece of code is called. |
| 10 | + |
| 11 | +For example, it is common to write code that supports many different types |
| 12 | +but only operates on one of those types at a time. This is the situation |
| 13 | +where a TypeVar becomes very useful. |
| 14 | + |
| 15 | +Essentially, a TypeVar is an annotation where the concrete type it represents |
| 16 | +is determined by the caller rather than by the implementation. |
| 17 | + |
| 18 | +For example: |
| 19 | + |
| 20 | +.. code-block:: python |
| 21 | +
|
| 22 | + import attrs |
| 23 | +
|
| 24 | + @attrs.frozen |
| 25 | + class Container[T_Item]: |
| 26 | + item: T_Item |
| 27 | +
|
| 28 | + def get_item_twice(self) -> tuple[T_Item, T_Item]: |
| 29 | + return (self.item, self.item) |
| 30 | +
|
| 31 | + # Revealed type is "tuple[builtins.str, builtins.str]" |
| 32 | + reveal_type(Container("asdf").get_item_twice()) |
| 33 | +
|
| 34 | + # Revealed type is "tuple[builtins.int, builtins.int]" |
| 35 | + reveal_type(Container(1).get_item_twice()) |
| 36 | +
|
| 37 | + # Revealed type is "tuple[builtins.bool, builtins.bool]" |
| 38 | + reveal_type(Container(True).get_item_twice()) |
| 39 | +
|
| 40 | +Think of it like simple algebra where we need to "solve for X". |
| 41 | + |
| 42 | +.. note:: |
| 43 | + |
| 44 | + ❗ A TypeVar needs to be **bound**, which means any time there is a TypeVar in |
| 45 | + an output, there needs to be a chance for the caller to provide the specific |
| 46 | + type that the TypeVar represents in that context. |
| 47 | + |
| 48 | + This happens either as part of the definition of the enclosing class: |
| 49 | + |
| 50 | + .. code-block:: python |
| 51 | +
|
| 52 | + class MyProtocol[T_Item](Protocol): |
| 53 | + @property |
| 54 | + def item(self) -> T_Item: ... |
| 55 | + |
| 56 | + |
| 57 | + @attrs.frozen |
| 58 | + class MyBaseClass[T_Item]: |
| 59 | + item: T_Item |
| 60 | +
|
| 61 | + Or as an input to the function returning the TypeVar: |
| 62 | + |
| 63 | + .. code-block:: python |
| 64 | +
|
| 65 | + class MyProtocol(Protocol): |
| 66 | + def process[T_Item](self, item: T_Item) -> T_Item: ... |
| 67 | + |
| 68 | + |
| 69 | + def process[T_Item](item: T_Item) -> T_Item: |
| 70 | + raise NotImplementedError() |
| 71 | +
|
| 72 | + def extract[T_Item](container: Container[T_Item]) -> T_Item: |
| 73 | + raise NotImplementedError() |
| 74 | +
|
| 75 | +TypeVars also have a concept called "variance". This is relevant when you have |
| 76 | +a type variable with an upper bound (like ``T_Vehicle`` bound to ``Vehicle``) |
| 77 | +and some specific subtypes of that bound (like ``Car`` or ``Bicycle``) which, |
| 78 | +alongside the basic features guaranteed by ``Vehicle``, also have some "extra" |
| 79 | +attributes and methods; an extra "API surface", if you will. |
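As a rough sketch of what that "extra API surface" looks like, assume hypothetical ``Vehicle`` and ``Car`` dataclasses and a hypothetical ``service`` function (none of these are defined elsewhere in this document). A TypeVar bound to ``Vehicle`` lets the function rely on ``wheels`` while the caller keeps access to ``doors``:

```python
import dataclasses
from typing import TypeVar


@dataclasses.dataclass(frozen=True)
class Vehicle:
    wheels: int


@dataclasses.dataclass(frozen=True)
class Car(Vehicle):
    # "Extra" API surface that Vehicle does not guarantee
    doors: int


# Hypothetical TypeVar with Vehicle as its upper bound
T_Vehicle = TypeVar("T_Vehicle", bound=Vehicle)


def service(vehicle: T_Vehicle) -> T_Vehicle:
    # Only what Vehicle guarantees is statically available in here
    assert vehicle.wheels > 0
    return vehicle


serviced = service(Car(wheels=4, doors=5))
# The caller passed in a Car, so the type checker still sees a Car here
print(serviced.doors)
```

Inside ``service`` the extra attributes are not visible, but they are not lost for the caller either, because the return type is solved to whatever was passed in.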
| 80 | + |
| 81 | +There are three kinds of variance: |
| 82 | + |
| 83 | +- **covariant** - extra API surface is kept |
| 84 | +- **contravariant** - extra API surface is forgotten |
| 85 | +- **invariant** - neither covariant nor contravariant |
| 86 | + |
| 87 | +When you aren't using `PEP 695 <https://peps.python.org/pep-0695/>`_ |
| 88 | +(inline square brackets) syntax or ``infer_variance=True`` when creating the |
| 89 | +TypeVar, mypy will enforce the variance of the TypeVar based on two rules: |
| 90 | + |
| 91 | +- Inputs are always contravariant |
| 92 | +- Outputs are always covariant |
| 93 | + |
| 94 | +This means: |
| 95 | + |
| 96 | +- A type var that's only ever an output must be defined with ``covariant=True`` |
| 97 | +- A type var that's only ever an input must be defined with ``contravariant=True`` |
| 98 | +- A type var that appears as an input and as an output cannot be covariant |
| 99 | + or contravariant, so it must be invariant. |
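The three cases above can be sketched as follows. This is a minimal illustration using the older ``TypeVar(...)`` syntax. ``Source``, ``Sink``, ``Box`` and ``read_int`` are hypothetical names introduced only for this example:

```python
from typing import Generic, Protocol, TypeVar

T_CO_Item = TypeVar("T_CO_Item", covariant=True)
T_COT_Item = TypeVar("T_COT_Item", contravariant=True)
T_Item = TypeVar("T_Item")


class Source(Protocol[T_CO_Item]):
    # T_CO_Item only ever appears as an output
    def get(self) -> T_CO_Item: ...


class Sink(Protocol[T_COT_Item]):
    # T_COT_Item only ever appears as an input
    def put(self, item: T_COT_Item) -> None: ...


class Box(Generic[T_Item]):
    # T_Item appears as both an input and an output, so it is invariant
    def __init__(self, item: T_Item) -> None:
        self._item = item

    def get(self) -> T_Item:
        return self._item

    def put(self, item: T_Item) -> None:
        self._item = item


def read_int(source: Source[int]) -> int:
    # A Box[int] structurally satisfies Source[int]
    return source.get()


box = Box(3)
box.put(4)
value = read_int(box)
```

Swapping the flags (for example using ``T_CO_Item`` as the parameter of ``put``) is the kind of mismatch a type checker is expected to report.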
| 100 | + |
| 101 | +.. note:: |
| 102 | + |
| 103 | + 🤔 Note that a TypeVar is only necessary when the API surface of an object |
| 104 | + changes based on the implementation. If a different implementation doesn't |
| 105 | + change what attributes or methods are available, then it can be represented |
| 106 | + as a Protocol. |
| 107 | + |
| 108 | +Giving an upper bound to a ``TypeVar`` |
| 109 | +-------------------------------------- |
| 110 | + |
| 111 | +By default a TypeVar statically has no methods or attributes on it, beyond |
| 112 | +those every ``object`` has, at the point the type checker sees it as a TypeVar. |
| 113 | + |
| 114 | +.. code-block:: python |
| 115 | +
|
| 116 | + def pass_through[T_Item](item: T_Item) -> T_Item: |
| 117 | + # at this point item statically has no attributes or methods on it |
| 118 | + return item |
| 119 | + |
| 120 | + my_variable: int = 1 |
| 121 | + # my_variable has all the attributes/methods that an int has |
| 122 | +
|
| 123 | + after_pass_through = pass_through(my_variable) |
| 124 | + # after_pass_through is typed as the type used for the input, so it also |
| 125 | + # statically has all the attributes/methods that an int has |
| 126 | +
|
| 127 | +It's possible to give the TypeVar an upper bound so that code working with it |
| 128 | +has specific attributes and methods that are statically guaranteed to be available: |
| 129 | + |
| 130 | +.. code-block:: python |
| 131 | +
|
| 132 | + from typing import Protocol |
| 133 | +
|
| 134 | + class HasProcess(Protocol): |
| 135 | + def process(self) -> None: ... |
| 136 | + |
| 137 | + def pass_through[T_Item: HasProcess](item: T_Item) -> T_Item: |
| 138 | + # Our type var is bound to ``HasProcess`` |
| 139 | + # This means whilst the object passed in is allowed to have any number |
| 140 | + # of additional attributes and methods, we are guaranteed that it at least |
| 141 | + # has a "process" method that takes in no arguments and returns None |
| 142 | + item.process() |
| 143 | + return item |
| 144 | + |
| 145 | + # Error, int has no "process" on it! |
| 146 | + pass_through(1) |
| 147 | +
|
| 148 | + @attrs.define |
| 149 | + class MyProcessor: |
| 150 | + number: int |
| 151 | + |
| 152 | + def process(self) -> None: |
| 153 | + print(self.number) |
| 154 | + |
| 155 | + # Valid because an instance of MyProcessor has a method on it |
| 156 | + # called "process" that can be called with no parameters and returns None |
| 157 | + pass_through(MyProcessor(10)) |
| 158 | +
|
| 159 | +There is also the ability to constrain the TypeVar to exact types rather than |
| 160 | +subtypes of the upper bound (as long as you have at least two constraints): |
| 161 | + |
| 162 | +.. code-block:: python |
| 163 | +
|
| 164 | + from typing import TypeVar |
| 165 | +
|
| 166 | + T_Item = TypeVar("T_Item", str, int) |
| 167 | +
|
| 168 | + # Or with PEP 695 |
| 169 | + def item_twice[T_Item: (str, int)](item: T_Item) -> T_Item: |
| 170 | + return item + item |
| 171 | +
|
| 172 | +The reason to do this rather than just saying the function takes and returns |
| 173 | +``str | int`` is that this way we know that if we pass in, e.g., a ``str`` we |
| 174 | +definitely get a ``str`` back. |
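To make that difference concrete, here is a small sketch that puts the constrained TypeVar next to a plain union. ``item_twice_union`` is a hypothetical name added for comparison, and ``Union[str, int]`` is simply the spelled-out form of ``str | int``:

```python
from typing import TypeVar, Union

T_Item = TypeVar("T_Item", str, int)


def item_twice(item: T_Item) -> T_Item:
    # A str argument gives a str result, an int argument gives an int result
    return item + item


def item_twice_union(item: Union[str, int]) -> Union[str, int]:
    # The result is only known to be "str or int", whichever was passed in
    if isinstance(item, str):
        return item + item
    return item + item


text = item_twice("ab")
# Statically a str, so str methods are available without any narrowing
shouted = text.upper()

loose = item_twice_union("ab")
# Statically "str | int", so an isinstance check is needed before str methods
if isinstance(loose, str):
    shouted_loose = loose.upper()
```

Both functions behave the same at runtime. The only difference is how much the type checker knows about the result.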
| 175 | + |
| 176 | +Prefer to not use contravariant type vars |
| 177 | +----------------------------------------- |
| 178 | + |
| 179 | +As a general rule contravariant type vars impose restrictions when extending |
| 180 | +classes and can be avoided by replacing: |
| 181 | + |
| 182 | +.. code-block:: python |
| 183 | +
|
| 184 | + from typing import Protocol |
| 185 | +
|
| 186 | + class Thing[T_Item: Item](Protocol): |
| 187 | + def do_something(self, item: T_Item) -> None: ... |
| 188 | +
|
| 189 | +With: |
| 190 | + |
| 191 | +.. code-block:: python |
| 192 | +
|
| 193 | + from typing import Protocol |
| 194 | +
|
| 195 | + class Thing[T_Item: Item](Protocol): |
| 196 | + @property |
| 197 | + def item(self) -> T_Item: ... |
| 198 | +
|
| 199 | + def do_something(self) -> None: ... |
| 200 | +
|
| 201 | +Such that the changeable API surface being acted on is separate from the |
| 202 | +signature of the function doing the action. |
| 203 | + |
| 204 | +As mentioned in :ref:`understanding_annotations`, a contravariant type var is |
| 205 | +used to represent a value where extra API surface is always dropped: |
| 206 | + |
| 207 | +.. code-block:: python |
| 208 | +
|
| 209 | + import dataclasses |
| 210 | + from typing import TYPE_CHECKING, Protocol, TypeVar, cast |
| 211 | +
|
| 212 | + T_COT_Item = TypeVar("T_COT_Item", contravariant=True) |
| 213 | +
|
| 214 | +
|
| 215 | + class Recorder(Protocol[T_COT_Item]): |
| 216 | + def record(self, item: T_COT_Item) -> None: ... |
| 217 | +
|
| 218 | +
|
| 219 | + @dataclasses.dataclass(frozen=True) |
| 220 | + class ItemA: |
| 221 | + a: int |
| 222 | +
|
| 223 | +
|
| 224 | + @dataclasses.dataclass(frozen=True) |
| 225 | + class ItemB(ItemA): |
| 226 | + b: int |
| 227 | +
|
| 228 | +
|
| 229 | + class RecorderA: |
| 230 | + def record(self, item: ItemA) -> None: |
| 231 | + print(item.a) |
| 232 | +
|
| 233 | +
|
| 234 | + class RecorderB: |
| 235 | + def record(self, item: ItemB) -> None: |
| 236 | + print(item.a, item.b) |
| 237 | +
|
| 238 | +
|
| 239 | + def record_things(recorder: Recorder[ItemA]) -> None: |
| 240 | + recorder.record(ItemA(a=1)) |
| 241 | +
|
| 242 | +
|
| 243 | + # This fails because Recorder[ItemB] cannot be used where Recorder[ItemA] is required |
| 244 | + record_things(RecorderB()) |
| 245 | +
|
| 246 | + if TYPE_CHECKING: |
| 247 | + _RA: Recorder[ItemA] = cast(RecorderA, None) |
| 248 | + _RB: Recorder[ItemB] = cast(RecorderB, None) |
| 249 | +
|
| 250 | +An alternative pattern is to create an intermediary object that is specific to |
| 251 | +what is being operated on. This is a bit of a subtle distinction in this example, |
| 252 | +but it would look like this: |
| 253 | + |
| 254 | +.. code-block:: python |
| 255 | +
|
| 256 | + import dataclasses |
| 257 | + from typing import TYPE_CHECKING, Protocol, cast |
| 258 | +
|
| 259 | +
|
| 260 | + class Recorder(Protocol): |
| 261 | + def record(self) -> None: ... |
| 262 | +
|
| 263 | +
|
| 264 | + @dataclasses.dataclass(frozen=True) |
| 265 | + class ItemA: |
| 266 | + a: int |
| 267 | +
|
| 268 | + @dataclasses.dataclass(frozen=True) |
| 269 | + class ItemARecorder: |
| 270 | + item: ItemA |
| 271 | +
|
| 272 | + def record(self) -> None: |
| 273 | + print(self.item.a) |
| 274 | +
|
| 275 | +
|
| 276 | + @dataclasses.dataclass(frozen=True) |
| 277 | + class ItemB(ItemA): |
| 278 | + b: int |
| 279 | +
|
| 280 | + @dataclasses.dataclass(frozen=True) |
| 281 | + class ItemBRecorder: |
| 282 | + item: ItemB |
| 283 | +
|
| 284 | + def record(self) -> None: |
| 285 | + print(self.item.a, self.item.b) |
| 286 | +
|
| 287 | +
|
| 288 | + def record_things(recorder: Recorder) -> None: |
| 289 | + recorder.record() |
| 290 | +
|
| 291 | +
|
| 292 | + record_things(ItemARecorder(item=ItemA(a=1))) |
| 293 | + record_things(ItemBRecorder(item=ItemB(a=1, b=5))) |
| 294 | +
|
| 295 | + if TYPE_CHECKING: |
| 296 | + _RA: Recorder = cast(ItemARecorder, None) |
| 297 | + _RB: Recorder = cast(ItemBRecorder, None) |
| 298 | +
|
| 299 | +If this is the full extent of the requirements then a ``record`` method directly |
| 300 | +on the items themselves is likely enough. However, it's easy to imagine a |
| 301 | +scenario where there's a 1:n relationship between an item and its "recording" |
| 302 | +functionality. This pattern separates the act of calling "record" from what |
| 303 | +recording actually means, so that it's the caller that controls what it means |
| 304 | +rather than the orchestrator. |
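One way to picture that 1:n relationship is a single item type with more than one recorder. The sketch below is only an illustration. ``PrintingRecorder`` and ``ListRecorder`` are hypothetical names that do not appear elsewhere in this document:

```python
import dataclasses
from typing import List, Protocol


class Recorder(Protocol):
    def record(self) -> None: ...


@dataclasses.dataclass(frozen=True)
class ItemA:
    a: int


@dataclasses.dataclass(frozen=True)
class PrintingRecorder:
    item: ItemA

    def record(self) -> None:
        # "Recording" here means writing to stdout
        print(self.item.a)


@dataclasses.dataclass(frozen=True)
class ListRecorder:
    item: ItemA
    sink: List[int]

    def record(self) -> None:
        # "Recording" here means collecting the value in a list
        self.sink.append(self.item.a)


def record_things(recorder: Recorder) -> None:
    # The orchestrator only knows that something can be recorded
    recorder.record()


item = ItemA(a=1)
recorded: List[int] = []
record_things(PrintingRecorder(item=item))
record_things(ListRecorder(item=item, sink=recorded))
```

``record_things`` never changes when a new kind of recording is added, because the caller chooses which recorder wraps the item.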
| 305 | + |
| 306 | +.. note:: |
| 307 | + |
| 308 | + Note that in both these situations, we are able to represent the two sides |
| 309 | + of the design coin such that the implementation is generic and the usage is |
| 310 | + not: |
| 311 | + |
| 312 | + .. code-block:: python |
| 313 | +
|
| 314 | + import dataclasses |
| 315 | + from typing import TYPE_CHECKING, Protocol, TypeVar, cast |
| 316 | +
|
| 317 | + T_CO_Item = TypeVar("T_CO_Item", covariant=True) |
| 318 | +
|
| 319 | +
|
| 320 | + class ForImplementation(Protocol[T_CO_Item]): |
| 321 | + @property |
| 322 | + def item(self) -> T_CO_Item: ... |
| 323 | +
|
| 324 | + def do_something(self) -> None: ... |
| 325 | +
|
| 326 | +
|
| 327 | + class ForUse(Protocol): |
| 328 | + def do_something(self) -> None: ... |
| 329 | +
|
| 330 | +
|
| 331 | + @dataclasses.dataclass(frozen=True) |
| 332 | + class Implementation: |
| 333 | + item: MyItem |
| 334 | +
|
| 335 | + def do_something(self) -> None: |
| 336 | + self.item.take_over_the_world() |
| 337 | +
|
| 338 | +
|
| 339 | + if TYPE_CHECKING: |
| 340 | + _FI: ForImplementation[MyItem] = cast(Implementation, None) |
| 341 | + _FU: ForUse = cast(Implementation, None) |
| 342 | +
|
| 343 | +Sharing TypeVars |
| 344 | +---------------- |
| 345 | + |
| 346 | +There are many cases where it's not necessary to share TypeVars, but there are |
| 347 | +two scenarios where it can be useful: |
| 348 | + |
| 349 | +- If the TypeVar is bound to a particular type or has a default |
| 350 | +- When the TypeVar is used to create a class that is intended to be subclassed |
| 351 | + |
| 352 | +In both of these scenarios, using a common definition of the TypeVar can reduce |
| 353 | +problems around drift, because shared uses update when the shape of the |
| 354 | +TypeVar changes. |
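As a hedged sketch of what sharing might look like, assume a hypothetical ``Item`` protocol and a single ``T_Item`` definition that would normally live in a common module. The base class and its subclass then reuse that one definition instead of each declaring their own bound:

```python
import dataclasses
from typing import Generic, Protocol, TypeVar


class Item(Protocol):
    @property
    def name(self) -> str: ...


# Defined once so that the bound only needs to change in one place
T_Item = TypeVar("T_Item", bound=Item)


class BaseHandler(Generic[T_Item]):
    def __init__(self, item: T_Item) -> None:
        self.item = item

    def describe(self) -> str:
        return self.item.name


class LoudHandler(BaseHandler[T_Item]):
    # Reuses the shared TypeVar, so it cannot drift from the base class
    def describe(self) -> str:
        return super().describe().upper()


@dataclasses.dataclass(frozen=True)
class NamedItem:
    name: str


description = LoudHandler(NamedItem(name="car")).describe()
```

If ``Item`` later gains a new required attribute, every class using the shared ``T_Item`` picks up the stricter bound at once.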