
Commit 820a31a

[red-knot] Attribute access and the descriptor protocol (#16416)
## Summary

* Attributes/methods are now properly looked up on metaclasses, when called on class objects
* We properly distinguish between data descriptors and non-data descriptors (but we do not yet support them in store-context, i.e. `obj.data_descr = …`)
* The descriptor protocol is now implemented in a single unified place for instances, classes and dunder-calls. Unions and possibly-unbound symbols are supported in all possible stages of the process by creating union types as results.
* In general, the handling of "possibly-unbound" symbols has been improved in a lot of places: meta-class attributes, attributes, descriptors with possibly-unbound `__get__` methods, instance attributes, …
* We keep track of type qualifiers in a lot more places. I anticipate that this will be useful if we import e.g. `Final` symbols from other modules (see relevant change to typing spec: python/typing#1937).
* Detection and special-casing of the `typing.Protocol` special form in order to avoid lots of changes in the test suite due to new `@Todo` types when looking up attributes on builtin types which have `Protocol` in their MRO. We previously looked up attributes in the wrong way, which is why this didn't come up before.

closes #16367
closes #15966

## Context

The way attribute lookup in `Type::member` worked before was simply wrong (mostly my own fault). The whole instance-attribute lookup should probably never have been integrated into `Type::member`. And the `Type::static_member` function that I introduced in my last descriptor PR was the wrong abstraction. It's kind of fascinating how far this approach took us, but I am pretty confident that the new approach proposed here is what we need to model this correctly.

There are three key pieces that are required to implement attribute lookups:

- **`Type::class_member`**/**`Type::find_in_mro`**: The `Type::find_in_mro` method can look up attributes on class bodies (and corresponding bases). This is a partial function on types, as it cannot be called on instance types like `Type::Instance(…)` or `Type::IntLiteral(…)`. For this reason, we usually call it through `Type::class_member`, which is essentially just `type.to_meta_type().find_in_mro(…)` plus union/intersection handling.
- **`Type::instance_member`**: This new function is basically the type-level equivalent to `obj.__dict__[name]` when called on `Type::Instance(…)`. We use this to discover instance attributes such as those that we see as declarations on class bodies or as (annotated) assignments to `self.attr` in methods of a class.
- The implementation of the descriptor protocol. It works slightly differently for instances and for class objects, but it can be described by the general framework:
  - Call `type.class_member("attribute")` to look up "attribute" in the MRO of the meta type of `type`. Call the resulting `Symbol` `meta_attr` (even if it's unbound).
  - Use `meta_attr.class_member("__get__")` to look up `__get__` on the *meta type* of `meta_attr`. Call it with `__get__(meta_attr, self, self.to_meta_type())`. If this fails (either the lookup or the call), just proceed with `meta_attr`. Otherwise, replace `meta_attr` in the following with the return type of `__get__`. In this step, we also probe if a `__set__` or `__delete__` method exists and store it in `meta_attr_kind` (can be either "data descriptor" or "normal attribute or non-data descriptor").
  - Compute a `fallback` type.
    - For instances, we use `self.instance_member("attribute")`
    - For class objects, we use `class_attr = self.find_in_mro("attribute")`, and then try to invoke the descriptor protocol on `class_attr`, i.e. we look up `__get__` on the meta type of `class_attr` and call it with `__get__(class_attr, None, self)`. This additional invocation of the descriptor protocol on the fallback type is one major asymmetry in the otherwise universal descriptor protocol implementation.
  - Finally, we look at `meta_attr`, `meta_attr_kind` and `fallback`, and handle various cases of (possible) unboundness of these symbols.
    - If `meta_attr` is bound and a data descriptor, just return `meta_attr`
    - If `meta_attr` is not a data descriptor, and `fallback` is bound, just return `fallback`
    - If `meta_attr` is not a data descriptor, and `fallback` is unbound, return `meta_attr`
    - Return unions of these three possibilities for partially-bound symbols.

This allows us to handle class objects and instances within the same framework. There is a minor additional detail where for instances, we do not allow the fallback type (the instance attribute) to completely shadow the non-data descriptor. We do this because we (currently) don't want to pretend that we can statically infer that an instance attribute is always set.

Dunder method calls can also be embedded into this framework. The only thing that changes is that *there is no fallback type*. If a dunder method is called on an instance, we do not fall back to instance variables. If a dunder method is called on a class object, we only look it up on the meta class, never on the class itself.

## Test Plan

New Markdown tests.
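
For readers less familiar with the runtime rules that the framework above models at the type level, here is a minimal Python sketch of attribute lookup on an instance. It is not red-knot's Rust implementation: the helper names `find_in_mro`, `is_data_descriptor` and `instance_getattr` are hypothetical, chosen only to echo the terminology of the description, and the sketch follows CPython's default lookup order under the assumption that no custom `__getattribute__` or `__getattr__` is involved.

```python
_MISSING = object()  # sentinel standing in for an "unbound" symbol


def find_in_mro(cls, name):
    """Look `name` up in the class bodies along the MRO, without invoking descriptors."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def is_data_descriptor(attr):
    """A data descriptor defines `__set__` or `__delete__` on its type."""
    attr_type = type(attr)
    return (
        find_in_mro(attr_type, "__set__") is not _MISSING
        or find_in_mro(attr_type, "__delete__") is not _MISSING
    )


def instance_getattr(obj, name):
    """Hypothetical re-implementation of default instance attribute lookup."""
    cls = type(obj)
    meta_attr = find_in_mro(cls, name)
    getter = _MISSING
    if meta_attr is not _MISSING:
        getter = find_in_mro(type(meta_attr), "__get__")

    # 1. A data descriptor on the class always wins.
    if getter is not _MISSING and is_data_descriptor(meta_attr):
        return getter(meta_attr, obj, cls)

    # 2. Otherwise the instance attribute (the "fallback") takes precedence.
    fallback = getattr(obj, "__dict__", {}).get(name, _MISSING)
    if fallback is not _MISSING:
        return fallback

    # 3. Non-data descriptor, then plain class attribute.
    if getter is not _MISSING:
        return getter(meta_attr, obj, cls)
    if meta_attr is not _MISSING:
        return meta_attr
    raise AttributeError(name)


class DataDescr:
    def __get__(self, instance, owner):
        return "from data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class NonDataDescr:
    def __get__(self, instance, owner):
        return "from non-data descriptor"


class C:
    data = DataDescr()
    non_data = NonDataDescr()
    plain = "class value"

    def method(self):
        return "called"


c = C()
# Write straight into the instance dictionary, bypassing `__set__`.
vars(c).update(data="instance value", non_data="instance value", plain="instance value")

for name in ("data", "non_data", "plain"):
    print(name, "->", instance_getattr(c, name))
```

Only `data` ignores the instance dictionary here. A static checker cannot know whether the instance attribute is actually set at a given point, which is why the description above mentions not letting the fallback completely shadow a non-data descriptor.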
1 parent a18d8bf commit 820a31a

26 files changed, +1835 -558 lines changed

crates/red_knot_python_semantic/resources/mdtest/annotations/literal_string.md

Lines changed: 2 additions & 2 deletions
@@ -73,12 +73,12 @@ qux = (foo, bar)
 reveal_type(qux) # revealed: tuple[Literal["foo"], Literal["bar"]]
 
 # TODO: Infer "LiteralString"
-reveal_type(foo.join(qux)) # revealed: @Todo(overloaded method)
+reveal_type(foo.join(qux)) # revealed: @Todo(return type of decorated function)
 
 template: LiteralString = "{}, {}"
 reveal_type(template) # revealed: Literal["{}, {}"]
 # TODO: Infer `LiteralString`
-reveal_type(template.format(foo, bar)) # revealed: @Todo(overloaded method)
+reveal_type(template.format(foo, bar)) # revealed: @Todo(return type of decorated function)
 ```
 
 ### Assignability

crates/red_knot_python_semantic/resources/mdtest/annotations/stdlib_typing_aliases.md

Lines changed: 1 addition & 2 deletions
@@ -70,8 +70,7 @@ import typing
 
 class ListSubclass(typing.List): ...
 
-# TODO: should have `Generic`, should not have `Unknown`
-# revealed: tuple[Literal[ListSubclass], Literal[list], Unknown, Literal[object]]
+# revealed: tuple[Literal[ListSubclass], Literal[list], Literal[MutableSequence], Literal[Sequence], Literal[Reversible], Literal[Collection], Literal[Iterable], Literal[Container], @Todo(protocol), Literal[object]]
 reveal_type(ListSubclass.__mro__)
 
 class DictSubclass(typing.Dict): ...

crates/red_knot_python_semantic/resources/mdtest/assignment/augmented.md

Lines changed: 1 addition & 2 deletions
@@ -75,8 +75,7 @@ def _(flag: bool):
 
     f = Foo()
 
-    # TODO: We should emit an `unsupported-operator` error here, possibly with the information
-    # that `Foo.__iadd__` may be unbound as additional context.
+    # error: [unsupported-operator] "Operator `+=` is unsupported between objects of type `Foo` and `Literal["Hello, world!"]`"
     f += "Hello, world!"
 
     reveal_type(f) # revealed: int | Unknown

crates/red_knot_python_semantic/resources/mdtest/attributes.md

Lines changed: 112 additions & 5 deletions
@@ -155,7 +155,9 @@ reveal_type(c_instance.declared_in_body_and_init) # revealed: str | None
 
 reveal_type(c_instance.declared_in_body_defined_in_init) # revealed: str | None
 
-reveal_type(c_instance.bound_in_body_declared_in_init) # revealed: str | None
+# TODO: This should be `str | None`. Fixing this requires an overhaul of the `Symbol` API,
+# which is planned in https://github.com/astral-sh/ruff/issues/14297
+reveal_type(c_instance.bound_in_body_declared_in_init) # revealed: Unknown | str | None
 
 reveal_type(c_instance.bound_in_body_and_init) # revealed: Unknown | None | Literal["a"]
 ```
@@ -704,8 +706,91 @@ reveal_type(Derived().declared_in_body) # revealed: int | None
 reveal_type(Derived().defined_in_init) # revealed: str | None
 ```
 
+## Accessing attributes on class objects
+
+When accessing attributes on class objects, they are always looked up on the type of the class
+object first, i.e. on the metaclass:
+
+```py
+from typing import Literal
+
+class Meta1:
+    attr: Literal["meta class value"] = "meta class value"
+
+class C1(metaclass=Meta1): ...
+
+reveal_type(C1.attr) # revealed: Literal["meta class value"]
+```
+
+However, the meta class attribute only takes precedence over a class-level attribute if it is a data
+descriptor. If it is a non-data descriptor or a normal attribute, the class-level attribute is used
+instead (see the [descriptor protocol tests] for data/non-data descriptor attributes):
+
+```py
+class Meta2:
+    attr: str = "meta class value"
+
+class C2(metaclass=Meta2):
+    attr: Literal["class value"] = "class value"
+
+reveal_type(C2.attr) # revealed: Literal["class value"]
+```
+
+If the class-level attribute is only partially defined, we union the meta class attribute with the
+class-level attribute:
+
+```py
+def _(flag: bool):
+    class Meta3:
+        attr1 = "meta class value"
+        attr2: Literal["meta class value"] = "meta class value"
+
+    class C3(metaclass=Meta3):
+        if flag:
+            attr1 = "class value"
+            # TODO: Neither mypy nor pyright show an error here, but we could consider emitting a conflicting-declaration diagnostic here.
+            attr2: Literal["class value"] = "class value"
+
+    reveal_type(C3.attr1) # revealed: Unknown | Literal["meta class value", "class value"]
+    reveal_type(C3.attr2) # revealed: Literal["meta class value", "class value"]
+```
+
+If the *meta class* attribute is only partially defined, we emit a `possibly-unbound-attribute`
+diagnostic:
+
+```py
+def _(flag: bool):
+    class Meta4:
+        if flag:
+            attr1: str = "meta class value"
+
+    class C4(metaclass=Meta4): ...
+    # error: [possibly-unbound-attribute]
+    reveal_type(C4.attr1) # revealed: str
+```
+
+Finally, if both the meta class attribute and the class-level attribute are only partially defined,
+we union them and emit a `possibly-unbound-attribute` diagnostic:
+
+```py
+def _(flag1: bool, flag2: bool):
+    class Meta5:
+        if flag1:
+            attr1 = "meta class value"
+
+    class C5(metaclass=Meta5):
+        if flag2:
+            attr1 = "class value"
+
+    # error: [possibly-unbound-attribute]
+    reveal_type(C5.attr1) # revealed: Unknown | Literal["meta class value", "class value"]
+```
+
 ## Union of attributes
 
+If the (meta)class is a union type or if the attribute on the (meta) class has a union type, we
+infer those union types accordingly:
+
 ```py
 def _(flag: bool):
     if flag:
@@ -716,14 +801,35 @@ def _(flag: bool):
         class C1:
             x = 2
 
+    reveal_type(C1.x) # revealed: Unknown | Literal[1, 2]
+
     class C2:
         if flag:
             x = 3
         else:
             x = 4
 
-    reveal_type(C1.x) # revealed: Unknown | Literal[1, 2]
     reveal_type(C2.x) # revealed: Unknown | Literal[3, 4]
+
+    if flag:
+        class Meta3(type):
+            x = 5
+
+    else:
+        class Meta3(type):
+            x = 6
+
+    class C3(metaclass=Meta3): ...
+    reveal_type(C3.x) # revealed: Unknown | Literal[5, 6]
+
+    class Meta4(type):
+        if flag:
+            x = 7
+        else:
+            x = 8
+
+    class C4(metaclass=Meta4): ...
+    reveal_type(C4.x) # revealed: Unknown | Literal[7, 8]
 ```
 
 ## Inherited class attributes
@@ -883,7 +989,7 @@ def _(flag: bool):
                 self.x = 1
 
     # error: [possibly-unbound-attribute]
-    reveal_type(Foo().x) # revealed: int
+    reveal_type(Foo().x) # revealed: int | Unknown
 ```
 
 #### Possibly unbound
@@ -1105,8 +1211,8 @@ Most attribute accesses on bool-literal types are delegated to `builtins.bool`,
 bools are instances of that class:
 
 ```py
-reveal_type(True.__and__) # revealed: @Todo(overloaded method)
-reveal_type(False.__or__) # revealed: @Todo(overloaded method)
+reveal_type(True.__and__) # revealed: <bound method `__and__` of `Literal[True]`>
+reveal_type(False.__or__) # revealed: <bound method `__or__` of `Literal[False]`>
 ```
 
 Some attributes are special-cased, however:
@@ -1262,6 +1368,7 @@ reveal_type(C.a_none) # revealed: None
 Some of the tests in the *Class and instance variables* section draw inspiration from
 [pyright's documentation] on this topic.
 
+[descriptor protocol tests]: descriptor_protocol.md
 [pyright's documentation]: https://microsoft.github.io/pyright/#/type-concepts-advanced?id=class-and-instance-variables
 [typing spec on `classvar`]: https://typing.readthedocs.io/en/latest/spec/class-compat.html#classvar
 [`typing.classvar`]: https://docs.python.org/3/library/typing.html#typing.ClassVar
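
As a runtime cross-check of the class-object lookup rules that the `attributes.md` tests above exercise, the following small Python sketch may help. The class and attribute names (`Descr`, `Meta`, `C`, `only_on_meta`, `shadowed`, `data`, `descr`) are made up for illustration, and the sketch shows CPython behaviour rather than red-knot's inference: a plain metaclass attribute is shadowed by a class-level attribute, a data descriptor on the metaclass (here a `property`) is not, and a descriptor found on the class itself is invoked with `None` as the instance.

```python
class Descr:
    """Non-data descriptor that reports how it was invoked."""

    def __get__(self, instance, owner):
        return ("Descr.__get__", instance, owner)


class Meta(type):
    only_on_meta = "meta value"
    shadowed = "meta value"

    @property  # `property` defines `__set__`, so it is a data descriptor
    def data(cls):
        return "meta data descriptor"


class C(metaclass=Meta):
    shadowed = "class value"
    data = "class value"
    descr = Descr()


print(C.only_on_meta)  # found on the metaclass
print(C.shadowed)      # class-level attribute shadows the plain metaclass attribute
print(C.data)          # data descriptor on the metaclass takes precedence
print(C.descr)         # class-level descriptor, called with instance=None, owner=C
print(C().data)        # instances never consult the metaclass
```

Accessing `C().only_on_meta` raises `AttributeError`, since lookup on an instance walks the MRO of `C` and never reaches `Meta`.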

crates/red_knot_python_semantic/resources/mdtest/binary/integers.md

Lines changed: 1 addition & 2 deletions
@@ -10,8 +10,7 @@ reveal_type(-3 // 3) # revealed: Literal[-1]
 reveal_type(-3 / 3) # revealed: float
 reveal_type(5 % 3) # revealed: Literal[2]
 
-# TODO: Should emit `unsupported-operator` but we don't understand the bases of `str`, so we think
-# it inherits `Unknown`, so we think `str.__radd__` is `Unknown` instead of nonexistent.
+# error: [unsupported-operator] "Operator `+` is unsupported between objects of type `Literal[2]` and `Literal["f"]`"
 reveal_type(2 + "f") # revealed: Unknown
 
 def lhs(x: int):

crates/red_knot_python_semantic/resources/mdtest/call/dunder.md

Lines changed: 95 additions & 4 deletions
@@ -46,6 +46,17 @@ class DunderOnMetaClass(metaclass=Meta):
 reveal_type(DunderOnMetaClass[0]) # revealed: str
 ```
 
+If the dunder method is only present on the class itself, it will not be called:
+
+```py
+class ClassWithNormalDunder:
+    def __getitem__(self, key: int) -> str:
+        return str(key)
+
+# error: [non-subscriptable]
+ClassWithNormalDunder[0]
+```
+
 ## Operating on instances
 
 When invoking a dunder method on an instance of a class, it is looked up on the class:
@@ -79,13 +90,32 @@ reveal_type(this_fails[0]) # revealed: Unknown
 However, the attached dunder method *can* be called if accessed directly:
 
 ```py
-# TODO: `this_fails.__getitem__` is incorrectly treated as a bound method. This
-# should be fixed with https://github.com/astral-sh/ruff/issues/16367
-# error: [too-many-positional-arguments]
-# error: [invalid-argument-type]
 reveal_type(this_fails.__getitem__(this_fails, 0)) # revealed: Unknown | str
 ```
 
+The instance-level method is also not called when the class-level method is present:
+
+```py
+def external_getitem1(instance, key) -> str:
+    return "a"
+
+def external_getitem2(key) -> int:
+    return 1
+
+def _(flag: bool):
+    class ThisFails:
+        if flag:
+            __getitem__ = external_getitem1
+
+        def __init__(self):
+            self.__getitem__ = external_getitem2
+
+    this_fails = ThisFails()
+
+    # error: [call-possibly-unbound-method]
+    reveal_type(this_fails[0]) # revealed: Unknown | str
+```
+
 ## When the dunder is not a method
 
 A dunder can also be a non-method callable:
@@ -126,3 +156,64 @@ class_with_descriptor_dunder = ClassWithDescriptorDunder()
 
 reveal_type(class_with_descriptor_dunder[0]) # revealed: str
 ```
+
+## Dunders can not be overwritten on instances
+
+If we attempt to overwrite a dunder method on an instance, it does not affect the behavior of
+implicit dunder calls:
+
+```py
+class C:
+    def __getitem__(self, key: int) -> str:
+        return str(key)
+
+    def f(self):
+        # TODO: This should emit an `invalid-assignment` diagnostic once we understand the type of `self`
+        self.__getitem__ = None
+
+# This is still fine, and simply calls the `__getitem__` method on the class
+reveal_type(C()[0]) # revealed: str
+```
+
+## Calling a union of dunder methods
+
+```py
+def _(flag: bool):
+    class C:
+        if flag:
+            def __getitem__(self, key: int) -> str:
+                return str(key)
+        else:
+            def __getitem__(self, key: int) -> bytes:
+                return key
+
+    c = C()
+    reveal_type(c[0]) # revealed: str | bytes
+
+    if flag:
+        class D:
+            def __getitem__(self, key: int) -> str:
+                return str(key)
+
+    else:
+        class D:
+            def __getitem__(self, key: int) -> bytes:
+                return key
+
+    d = D()
+    reveal_type(d[0]) # revealed: str | bytes
+```
+
+## Calling a possibly-unbound dunder method
+
+```py
+def _(flag: bool):
+    class C:
+        if flag:
+            def __getitem__(self, key: int) -> str:
+                return str(key)
+
+    c = C()
+    # error: [call-possibly-unbound-method]
+    reveal_type(c[0]) # revealed: str
+```
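
The `dunder.md` tests above rely on the rule that implicit dunder calls have no fallback: they are resolved on the type of the object only. A short runtime illustration in plain Python follows, with hypothetical class names (`WithGetitem`, `Meta`, `OnBoth`, `OnlyOnClass`). It demonstrates CPython's behaviour, which the type checker is modelling, and is not taken from the commit itself.

```python
class WithGetitem:
    def __getitem__(self, key):
        return "class-level"


obj = WithGetitem()
# Assigning a dunder on the instance affects explicit attribute access only.
obj.__getitem__ = lambda key: "instance-level"

print(obj[0])              # implicit call: resolved on the class
print(obj.__getitem__(0))  # explicit access: the instance attribute is found


class Meta(type):
    def __getitem__(cls, key):
        return "metaclass-level"


class OnBoth(metaclass=Meta):
    def __getitem__(self, key):
        return "class-level"


print(OnBoth[0])    # subscripting the class object uses the metaclass
print(OnBoth()[0])  # subscripting an instance uses the class


class OnlyOnClass:
    def __getitem__(self, key):
        return "class-level"


try:
    OnlyOnClass[0]  # the class's own `__getitem__` is not used for the class object
except TypeError as exc:
    print("TypeError:", exc)
```

The last case corresponds to the `non-subscriptable` diagnostic expected for `ClassWithNormalDunder[0]` in the tests.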

crates/red_knot_python_semantic/resources/mdtest/call/getattr_static.md

Lines changed: 19 additions & 2 deletions
@@ -12,7 +12,7 @@ import inspect
 
 class Descriptor:
     def __get__(self, instance, owner) -> str:
-        return 1
+        return "a"
 
 class C:
     normal: int = 1
@@ -59,7 +59,7 @@ import sys
 reveal_type(inspect.getattr_static(sys, "platform")) # revealed: LiteralString
 reveal_type(inspect.getattr_static(inspect, "getattr_static")) # revealed: Literal[getattr_static]
 
-reveal_type(inspect.getattr_static(1, "real")) # revealed: Literal[1]
+reveal_type(inspect.getattr_static(1, "real")) # revealed: Literal[real]
 ```
 
 (Implicit) instance attributes can also be accessed through `inspect.getattr_static`:
@@ -72,6 +72,23 @@ class D:
 reveal_type(inspect.getattr_static(D(), "instance_attr")) # revealed: int
 ```
 
+And attributes on metaclasses can be accessed when probing the class:
+
+```py
+class Meta(type):
+    attr: int = 1
+
+class E(metaclass=Meta): ...
+
+reveal_type(inspect.getattr_static(E, "attr")) # revealed: int
+```
+
+Metaclass attributes can not be added when probing an instance of the class:
+
+```py
+reveal_type(inspect.getattr_static(E(), "attr", "non_existent")) # revealed: Literal["non_existent"]
+```
+
 ## Error cases
 
 We can only infer precise types if the attribute is a literal string. In all other cases, we fall
