Commit a1c7bfb

Add understanding TypeVars page
1 parent 9e032d8 commit a1c7bfb

2 files changed

+355
-0
lines changed

Lines changed: 354 additions & 0 deletions
@@ -0,0 +1,354 @@
1+
.. _understanding_typevars:
2+
3+
Understanding TypeVars
4+
======================
5+
6+
Static type checking is by definition a static system of checks, which means
7+
anything that is determined at runtime is irrelevant to the type checker!
8+
However, it is often necessary for the types inferred by the type checker to depend on
9+
how a piece of code is called.
10+
11+
For example, it is common to write code that supports many different
12+
types but only operates on one of those types at a time. This is the situation
13+
where a TypeVar becomes useful.
14+
15+
Essentially, a TypeVar is an annotation where the concrete type it represents
16+
is determined by the caller rather than by the implementation.
17+
18+
For example:
19+
20+
.. code-block:: python
21+
22+
import attrs
23+
24+
@attrs.frozen
25+
class Container[T_Item]:
26+
item: T_Item
27+
28+
def get_item_twice(self) -> tuple[T_Item, T_Item]:
29+
return (self.item, self.item)
30+
31+
# Revealed type is "tuple[builtins.str, builtins.str]"
32+
reveal_type(Container("asdf").get_item_twice())
33+
34+
# Revealed type is "tuple[builtins.int, builtins.int]"
35+
reveal_type(Container(1).get_item_twice())
36+
37+
# Revealed type is "tuple[builtins.bool, builtins.bool]"
38+
reveal_type(Container(True).get_item_twice())
39+
40+
Think of it like simple algebra where we need to "solve for X".
41+
42+
.. note::
43+
44+
❗ A TypeVar needs to be **bound**, which means any time there is a TypeVar in
45+
an output, there needs to be a chance for the caller to provide the specific
46+
type that the type var represents in that context.
47+
48+
That can happen either as part of the definition of the enclosing class:
49+
50+
.. code-block:: python
51+
52+
class MyProtocol[T_Item](Protocol):
53+
@property
54+
def item(self) -> T_Item: ...
55+
56+
57+
@attrs.frozen
58+
class MyBaseClass[T_Item]:
59+
item: T_Item
60+
61+
Or as an input to the function returning the TypeVar:
62+
63+
.. code-block:: python
64+
65+
class MyProtocol(Protocol):
66+
def process[T_Item](self, item: T_Item) -> T_Item: ...
67+
68+
69+
def process[T_Item](item: T_Item) -> T_Item:
70+
raise NotImplementedError()
71+
72+
def extract[T_Item](container: Container[T_Item]) -> T_Item:
73+
raise NotImplementedError()
74+
75+
TypeVars also have a concept called "variance". This is relevant when you have
76+
a bound type variable (like ``T_Vehicle``) and some specific implementations
77+
of it (like ``Car`` or ``Bicycle``), which, alongside the basic features
78+
guaranteed by ``Vehicle``, also have some "extra" attributes and methods;
79+
an extra "API surface", if you will.
80+
81+
There are three kinds of variance:
82+
83+
- **covariant** - extra API surface is kept
84+
- **contravariant** - extra API surface is forgotten
85+
- **invariant** - neither covariant nor contravariant
86+
87+
When you aren't using `PEP 695 <https://peps.python.org/pep-0695/>`_
88+
(inline square brackets) syntax or ``infer_variance=True`` when creating the
89+
TypeVar, mypy will enforce the variance of the TypeVar based on these rules:
90+
91+
- Inputs are always contravariant
92+
- Outputs are always covariant
93+
94+
This means:
95+
96+
- A type var that's only ever an output must be defined with ``covariant=True``
97+
- A type var that's only ever an input must be defined with ``contravariant=True``
98+
- A type var that appears as an input and as an output cannot be covariant
99+
or contravariant, so it must be invariant.
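As a minimal sketch of these rules using the pre-PEP 695 ``TypeVar`` spelling, where variance has to be declared by hand. The ``Source``, ``Sink`` and ``Box`` protocols are hypothetical names invented for this illustration:

```python
from typing import Protocol, TypeVar

T_CO_Item = TypeVar("T_CO_Item", covariant=True)
T_COT_Item = TypeVar("T_COT_Item", contravariant=True)
T_Item = TypeVar("T_Item")


class Source(Protocol[T_CO_Item]):
    # T_CO_Item only ever appears as an output, so it is declared covariant
    def get(self) -> T_CO_Item: ...


class Sink(Protocol[T_COT_Item]):
    # T_COT_Item only ever appears as an input, so it is declared contravariant
    def put(self, item: T_COT_Item) -> None: ...


class Box(Protocol[T_Item]):
    # T_Item appears as both an input and an output, so it stays invariant
    def get(self) -> T_Item: ...

    def put(self, item: T_Item) -> None: ...
```

If ``T_CO_Item`` were used as the parameter type in ``Sink.put``, mypy would be expected to reject the definition, because a covariant type var is not allowed in an input position.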
100+
101+
.. note::
102+
103+
🤔 Note that a TypeVar is only necessary when the API surface of an object
104+
changes based on the implementation. If a different implementation doesn't
105+
change what attributes or methods are available, then it can be represented
106+
as a Protocol.
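For instance, in the following sketch every implementation exposes exactly the same API surface, so a plain Protocol is enough and no TypeVar is involved. ``Greeter``, ``English``, ``French`` and ``greet_all`` are hypothetical names used only for illustration:

```python
from typing import Protocol, Sequence


class Greeter(Protocol):
    def greet(self) -> str: ...


class English:
    def greet(self) -> str:
        return "hello"


class French:
    def greet(self) -> str:
        return "bonjour"


def greet_all(greeters: Sequence[Greeter]) -> list[str]:
    # Nothing in the return type depends on which implementation was passed
    # in, so there is nothing for a TypeVar to "solve" here
    return [greeter.greet() for greeter in greeters]


greetings = greet_all([English(), French()])
```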
107+
108+
Giving an upper bound to a ``TypeVar``
109+
--------------------------------------
110+
111+
By default, a TypeVar has no upper bound, so at the point it appears to the type
112+
checker as a TypeVar it cannot be assumed to have any specific methods or attributes.
113+
114+
.. code-block:: python
115+
116+
def pass_through[T_Item](item: T_Item) -> T_Item:
117+
# at this point item statically has no attributes or methods on it
118+
return item
119+
120+
my_variable: int = 1
121+
# my_variable has all the attributes/methods that an int has
122+
123+
after_pass_through = pass_through(my_variable)
124+
# after_pass_through is typed as the type used for the input, so it also
125+
# statically has all the attributes/methods that an int has
126+
127+
It's possible to give the TypeVar an upper bound so that the code working with it has
128+
specific attributes and methods that are statically guaranteed to be available:
129+
130+
.. code-block:: python
131+
132+
from typing import Protocol
import attrs
133+
134+
class HasProcess(Protocol):
135+
def process(self) -> None: ...
136+
137+
def pass_through[T_Item: HasProcess](item: T_Item) -> T_Item:
138+
# Our type var is bound to ``HasProcess``
139+
# This means whilst the object passed in is allowed to have any number
140+
# of additional attributes and methods, we are guaranteed that it at least
141+
# has a "process" method that takes in no arguments and returns None
142+
item.process()
143+
return item
144+
145+
# Error, int has no "process" on it!
146+
pass_through(1)
147+
148+
@attrs.define
149+
class MyProcessor:
150+
number: int
151+
152+
def process(self) -> None:
153+
print(self.number)
154+
155+
# Valid because an instance of MyProcessor has a method on it
156+
# called "process" that can be called with no parameters and returns None
157+
pass_through(MyProcessor(10))
158+
159+
It is also possible to constrain the TypeVar to a fixed set of exact types rather than
160+
to subtypes of an upper bound (at least two constraints are required):
161+
162+
.. code-block:: python
163+
164+
from typing import TypeVar
165+
166+
T_Item = TypeVar("T_Item", str, int)
167+
168+
# Or with PEP 695
169+
def item_twice[T_Item: (str, int)](item: T_Item) -> T_Item:
170+
return item + item
171+
172+
The reason to do this rather than just saying the function takes and returns
173+
``str | int`` is that this way we know that if we pass in, e.g., a ``str`` we
174+
definitely get a ``str`` back.
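A small sketch of that difference, written with the older ``TypeVar`` spelling so that it also runs on Python versions before 3.12. The variable names are only for illustration:

```python
from typing import TypeVar

T_Item = TypeVar("T_Item", str, int)


def item_twice(item: T_Item) -> T_Item:
    # The body is checked once per constraint: str + str and int + int
    return item + item


# The type checker solves T_Item as str here, so the result is a str
doubled_text = item_twice("ab")

# And as int here, so the result is an int
doubled_number = item_twice(2)
```

With a ``str | int`` annotation instead, both results would be typed as ``str | int`` and the caller would have to narrow them again before using them.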
175+
176+
Prefer to not use contravariant type vars
177+
-----------------------------------------
178+
179+
As a general rule, contravariant type vars impose restrictions when extending
180+
classes and can be avoided by replacing:
181+
182+
.. code-block:: python
183+
184+
from typing import Protocol
185+
186+
class Thing[T_Item: Item](Protocol):
187+
def do_something(self, item: T_Item) -> None: ...
188+
189+
With:
190+
191+
.. code-block:: python
192+
193+
from typing import Protocol
194+
195+
class Thing[T_Item: Item](Protocol):
196+
@property
197+
def item(self) -> T_Item: ...
198+
199+
def do_something(self) -> None: ...
200+
201+
This way, the changeable API surface being acted on is separate from the
202+
signature of the function doing the action.
203+
204+
As mentioned in :ref:`understanding_annotations`, a contravariant type var is
205+
used to represent a value where extra API surface is always dropped:
206+
207+
.. code-block:: python
208+
209+
import dataclasses
210+
from typing import TYPE_CHECKING, Protocol, TypeVar, cast
211+
212+
T_COT_Item = TypeVar("T_COT_Item", contravariant=True)
213+
214+
215+
class Recorder(Protocol[T_COT_Item]):
216+
def record(self, item: T_COT_Item) -> None: ...
217+
218+
219+
@dataclasses.dataclass(frozen=True)
220+
class ItemA:
221+
a: int
222+
223+
224+
@dataclasses.dataclass(frozen=True)
225+
class ItemB(ItemA):
226+
b: int
227+
228+
229+
class RecorderA:
230+
def record(self, item: ItemA) -> None:
231+
print(item.a)
232+
233+
234+
class RecorderB:
235+
def record(self, item: ItemB) -> None:
236+
print(item.a, item.b)
237+
238+
239+
def record_things(recorder: Recorder[ItemA]) -> None:
240+
recorder.record(ItemA(a=1))
241+
242+
243+
# This fails type checking: RecorderB only accepts ItemB, so it cannot be used where Recorder[ItemA] is required
244+
record_things(RecorderB())
245+
246+
if TYPE_CHECKING:
247+
_RA: Recorder[ItemA] = cast(RecorderA, None)
248+
_RB: Recorder[ItemB] = cast(RecorderB, None)
249+
250+
An alternative pattern is to create an intermediary object that is specific to
251+
what is being operated on. This is a bit of a subtle distinction in this example,
252+
but it would look like this:
253+
254+
.. code-block:: python
255+
256+
import dataclasses
257+
from typing import TYPE_CHECKING, Protocol, cast
258+
259+
260+
class Recorder(Protocol):
261+
def record(self) -> None: ...
262+
263+
264+
@dataclasses.dataclass(frozen=True)
265+
class ItemA:
266+
a: int
267+
268+
269+
@dataclasses.dataclass(frozen=True)
class ItemARecorder:
270+
item: ItemA
271+
272+
def record(self) -> None:
273+
print(self.item.a)
274+
275+
276+
@dataclasses.dataclass(frozen=True)
277+
class ItemB(ItemA):
278+
b: int
279+
280+
281+
@dataclasses.dataclass(frozen=True)
class ItemBRecorder:
282+
item: ItemB
283+
284+
def record(self) -> None:
285+
print(self.item.a, self.item.b)
286+
287+
288+
def record_things(recorder: Recorder) -> None:
289+
recorder.record()
290+
291+
292+
record_things(ItemARecorder(item=ItemA(a=1)))
293+
record_things(ItemBRecorder(item=ItemB(a=1, b=5)))
294+
295+
if TYPE_CHECKING:
296+
_RA: Recorder = cast(ItemARecorder, None)
297+
_RB: Recorder = cast(ItemBRecorder, None)
298+
299+
If this is the full extent of the requirements, then it's likely enough to put a
300+
``record`` method directly on the items themselves. However, it's easy to imagine
301+
a scenario with a 1:n relationship between an item and its "recording" functionality.
302+
This pattern separates the act of calling ``record`` from what recording
303+
actually means, so that the caller controls that meaning rather
304+
than the orchestrator.
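As a rough sketch of such a 1:n relationship, here is one item type with two different ways of recording it. ``ItemAPrinter`` and ``ItemACollector`` are hypothetical names that do not appear elsewhere in this page:

```python
import dataclasses
from typing import Protocol


class Recorder(Protocol):
    def record(self) -> None: ...


@dataclasses.dataclass(frozen=True)
class ItemA:
    a: int


@dataclasses.dataclass(frozen=True)
class ItemAPrinter:
    item: ItemA

    def record(self) -> None:
        print(self.item.a)


@dataclasses.dataclass(frozen=True)
class ItemACollector:
    item: ItemA
    seen: list[int]

    def record(self) -> None:
        # Same item type, different meaning of "record"
        self.seen.append(self.item.a)


def record_things(recorder: Recorder) -> None:
    # The orchestrator only knows that something can be recorded
    recorder.record()


seen: list[int] = []
item = ItemA(a=1)
record_things(ItemAPrinter(item=item))
record_things(ItemACollector(item=item, seen=seen))
```

The caller decides which recorder wraps the item, and ``record_things`` stays unchanged whichever one is chosen.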
305+
306+
.. note::
307+
308+
Note that in both these situations, we are able to represent the two sides
309+
of the design coin such that the implementation is generic and the usage is
310+
not:
311+
312+
.. code-block:: python
313+
314+
import dataclasses
315+
from typing import TYPE_CHECKING, Protocol, TypeVar, cast
316+
317+
T_CO_Item = TypeVar("T_CO_Item", covariant=True)
318+
319+
320+
class ForImplementation(Protocol[T_CO_Item]):
321+
@property
322+
def item(self) -> T_CO_Item: ...
323+
324+
def do_something(self) -> None: ...
325+
326+
327+
class ForUse(Protocol):
328+
def do_something(self) -> None: ...
329+
330+
331+
@dataclasses.dataclass(frozen=True)
332+
class Implementation:
333+
item: MyItem
334+
335+
def do_something(self) -> None:
336+
self.item.take_over_the_world()
337+
338+
339+
if TYPE_CHECKING:
340+
_FI: ForImplementation[MyItem] = cast(Implementation, None)
341+
_FU: ForUse = cast(Implementation, None)
342+
343+
Sharing TypeVars
344+
----------------
345+
346+
There are many cases where it's not necessary to share TypeVars, but there are
347+
two scenarios where it can be useful:
348+
349+
- If the TypeVar is bound to a particular type or has a default
350+
- When the TypeVar is used to create a class that is intended to be subclassed
351+
352+
In both of these scenarios, using a common definition of the TypeVar reduces
353+
drift, because every shared use updates together when the shape of
354+
the TypeVar changes.
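A minimal sketch of the first scenario, assuming a hypothetical ``Item`` protocol as the bound and hypothetical ``first`` and ``with_name`` helpers that share one TypeVar definition. The older ``TypeVar`` spelling is used because it gives a single named object that several definitions can import:

```python
import dataclasses
from typing import Protocol, Sequence, TypeVar


class Item(Protocol):
    @property
    def name(self) -> str: ...


# Defined once, for example in a shared module, and imported wherever needed.
# If the bound ever changes, every use below picks up the change together.
T_Item = TypeVar("T_Item", bound=Item)


def first(items: Sequence[T_Item]) -> T_Item:
    return items[0]


def with_name(items: Sequence[T_Item], name: str) -> list[T_Item]:
    return [item for item in items if item.name == name]


@dataclasses.dataclass(frozen=True)
class Fruit:
    name: str
    colour: str


fruits = [Fruit(name="apple", colour="red"), Fruit(name="lime", colour="green")]

# Both helpers return Fruit, so the extra "colour" attribute is still available
first_fruit = first(fruits)
limes = with_name(fruits, "lime")
```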

docs/advice/index.rst

Lines changed: 1 addition & 0 deletions
@@ -5,3 +5,4 @@ Kraken Static Typing Advice
55
:maxdepth: 1
66

77
advice/understanding_annotations
8+
advice/understanding_typevars
