[red-knot] Attribute access and the descriptor protocol (#16416)
## Summary

* Attributes and methods are now properly looked up on metaclasses when
accessed on class objects
* We properly distinguish between data descriptors and non-data
descriptors (but we do not yet support them in store-context, i.e.
`obj.data_descr = …`)
* The descriptor protocol is now implemented in a single unified place
for instances, classes and dunder-calls. Unions and possibly-unbound
symbols are supported in all possible stages of the process by creating
union types as results.
* In general, the handling of "possibly-unbound" symbols has been
improved in a lot of places: meta-class attributes, attributes,
descriptors with possibly-unbound `__get__` methods, instance
attributes, …
* We keep track of type qualifiers in a lot more places. I anticipate
that this will be useful if we import e.g. `Final` symbols from other
modules (see relevant change to typing spec:
python/typing#1937).
* We now detect and special-case the `typing.Protocol` special form in
order to avoid lots of changes in the test suite due to new `@Todo`
types when looking up attributes on builtin types which have `Protocol`
in their MRO. We previously looked up attributes incorrectly, which is
why this didn't come up before.

closes #16367
closes #15966

## Context

The way attribute lookup in `Type::member` worked before was simply
wrong (mostly my own fault). The whole instance-attribute lookup should
probably never have been integrated into `Type::member`. And the
`Type::static_member` function that I introduced in my last descriptor
PR was the wrong abstraction. It's kind of fascinating how far this
approach took us, but I am pretty confident that the new approach
proposed here is what we need to model this correctly.

There are three key pieces that are required to implement attribute
lookups:

- **`Type::class_member`**/**`Type::find_in_mro`**: The
`Type::find_in_mro` method looks up attributes on class bodies
(and the corresponding bases). This is a partial function on types, as it
cannot be called on instance types like `Type::Instance(…)` or
`Type::IntLiteral(…)`. For this reason, we usually call it through
`Type::class_member`, which is essentially just
`type.to_meta_type().find_in_mro(…)` plus union/intersection handling.
- **`Type::instance_member`**: This new function is basically the
type-level equivalent to `obj.__dict__[name]` when called on
`Type::Instance(…)`. We use this to discover instance attributes such as
those that we see as declarations on class bodies or as (annotated)
assignments to `self.attr` in methods of a class.
- The implementation of the descriptor protocol. It works slightly
differently for instances and for class objects, but it can be described
by the following general framework:
  - Call `type.class_member("attribute")` to look up "attribute" in the
    MRO of the meta type of `type`. Call the resulting `Symbol` `meta_attr`
    (even if it's unbound).
  - Use `meta_attr.class_member("__get__")` to look up `__get__` on the
    *meta type* of `meta_attr`. Call it with `__get__(meta_attr, self,
    self.to_meta_type())`. If this fails (either the lookup or the call),
    just proceed with `meta_attr`. Otherwise, replace `meta_attr` in the
    following with the return type of `__get__`. In this step, we also probe
    if a `__set__` or `__delete__` method exists and store it in
    `meta_attr_kind` (can be either "data descriptor" or "normal attribute
    or non-data descriptor").
  - Compute a `fallback` type.
    - For instances, we use `self.instance_member("attribute")`
    - For class objects, we use `class_attr =
      self.find_in_mro("attribute")`, and then try to invoke the descriptor
      protocol on `class_attr`, i.e. we look up `__get__` on the meta type of
      `class_attr` and call it with `__get__(class_attr, None, self)`. This
      additional invocation of the descriptor protocol on the fallback type is
      one major asymmetry in the otherwise universal descriptor protocol
      implementation.
  - Finally, we look at `meta_attr`, `meta_attr_kind` and `fallback`, and
    handle various cases of (possible) unboundness of these symbols.
    - If `meta_attr` is bound and a data descriptor, just return `meta_attr`
    - If `meta_attr` is not a data descriptor, and `fallback` is bound, just
      return `fallback`
    - If `meta_attr` is not a data descriptor, and `fallback` is unbound,
      return `meta_attr`
    - Return unions of these three possibilities for partially-bound
      symbols.
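
The framework above closely follows what Python itself does at runtime, so it can be sketched in plain Python. This is only an illustrative runtime analogue, not red-knot's type-level implementation: the helper names `lookup` and `_describe` are hypothetical, `find_in_mro` merely borrows the name used above, and the sketch ignores unions, possibly-unbound symbols and `__getattr__`.

```python
_MISSING = object()  # sentinel standing in for an "unbound" symbol


def find_in_mro(cls, name):
    """Return the raw entry for `name` from the first class in the MRO that defines it."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def _describe(attr):
    """Hypothetical helper: return (`__get__` or _MISSING, is_data_descriptor) for `attr`."""
    if attr is _MISSING:
        return _MISSING, False
    attr_cls = type(attr)
    getter = find_in_mro(attr_cls, "__get__")
    is_data = (
        find_in_mro(attr_cls, "__set__") is not _MISSING
        or find_in_mro(attr_cls, "__delete__") is not _MISSING
    )
    return getter, is_data


def lookup(obj, name):
    """Hypothetical runtime analogue of the lookup for instances and class objects."""
    meta = type(obj)
    meta_attr = find_in_mro(meta, name)
    getter, is_data = _describe(meta_attr)

    # A data descriptor found on the meta type always wins.
    if is_data and getter is not _MISSING:
        return getter(meta_attr, obj, meta)

    # Otherwise, consult the fallback.
    if isinstance(obj, type):
        # Class objects: search the class's own MRO and invoke `__get__(attr, None, cls)`.
        class_attr = find_in_mro(obj, name)
        if class_attr is not _MISSING:
            class_getter, _ = _describe(class_attr)
            if class_getter is not _MISSING:
                return class_getter(class_attr, None, obj)
            return class_attr
    else:
        # Instances: use the instance dictionary, if there is one.
        instance_dict = getattr(obj, "__dict__", {})
        if name in instance_dict:
            return instance_dict[name]

    # No fallback: use the non-data descriptor or plain attribute from the meta type.
    if getter is not _MISSING:
        return getter(meta_attr, obj, meta)
    if meta_attr is not _MISSING:
        return meta_attr
    raise AttributeError(name)


class Meta(type):
    plain = "meta value"

    @property
    def data(cls):
        return "meta property"


class C(metaclass=Meta):
    plain = "class value"
    data = "class value"

    def method(self):
        return "called"


c = C()
c.__dict__["method"] = "instance value"  # shadows the non-data descriptor `C.method`
```

With these definitions, `lookup(C, "plain")` and `lookup(C, "data")` should agree with `C.plain` and `C.data`: the plain metaclass attribute loses to the class-level attribute, while the metaclass `property` (a data descriptor) wins.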

This allows us to handle class objects and instances within the same
framework. There is a minor additional detail where for instances, we do
not allow the fallback type (the instance attribute) to completely
shadow the non-data descriptor. We do this because we (currently) don't
want to pretend that we can statically infer that an instance attribute
is always set.
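
To illustrate why the instance attribute is not allowed to fully shadow a non-data descriptor, here is a small runtime example. The class and method names are made up for illustration. The same attribute access yields different results depending on whether the assignment has already run, which a static analysis generally cannot determine.

```python
class NonDataDescriptor:
    """Defines only `__get__`, so it is a non-data descriptor."""

    def __get__(self, instance, owner=None):
        return "from descriptor"


class C:
    attr = NonDataDescriptor()

    def initialize(self):
        # There is no `__set__`, so this stores the value in the instance dictionary.
        self.attr = "from instance"


c = C()
before = c.attr  # instance attribute not set yet: the descriptor is used
c.initialize()
after = c.attr  # the instance attribute now shadows the descriptor
```

Since both outcomes are possible for a value of type `C`, a union of the descriptor's return type and the instance attribute type is the conservative answer.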

Dunder method calls can also be embedded into this framework. The only
thing that changes is that *there is no fallback type*. If a dunder
method is called on an instance, we do not fall back to instance
variables. If a dunder method is called on a class object, we only look
it up on the meta class, never on the class itself.
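
The runtime behavior that motivates the missing fallback can be observed directly. The following example uses made-up class names: implicit dunder calls such as subscripting consult only the type of the object, never the object itself.

```python
class Meta(type):
    def __getitem__(cls, key):
        return f"meta:{key}"


class WithMeta(metaclass=Meta):
    def __getitem__(self, key):
        return f"instance:{key}"


class Plain:
    def __getitem__(self, key):
        return f"class:{key}"


# Subscripting a class object uses the metaclass, subscripting an instance uses the class.
class_result = WithMeta[0]
instance_result = WithMeta()[0]

# An instance attribute named `__getitem__` is ignored by the implicit call ...
p = Plain()
p.__getitem__ = lambda key: "instance attr"
implicit_result = p[0]
# ... but it is found by an explicit attribute access.
explicit_result = p.__getitem__(0)

# `Plain` only defines `__getitem__` for its instances, so the class is not subscriptable.
try:
    Plain[0]
    class_is_subscriptable = True
except TypeError:
    class_is_subscriptable = False
```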

## Test Plan

New Markdown tests.
sharkdp authored Mar 7, 2025
1 parent a18d8bf commit 820a31a
Showing 26 changed files with 1,835 additions and 558 deletions.
@@ -73,12 +73,12 @@ qux = (foo, bar)
reveal_type(qux) # revealed: tuple[Literal["foo"], Literal["bar"]]

# TODO: Infer "LiteralString"
reveal_type(foo.join(qux)) # revealed: @Todo(overloaded method)
reveal_type(foo.join(qux)) # revealed: @Todo(return type of decorated function)

template: LiteralString = "{}, {}"
reveal_type(template) # revealed: Literal["{}, {}"]
# TODO: Infer `LiteralString`
reveal_type(template.format(foo, bar)) # revealed: @Todo(overloaded method)
reveal_type(template.format(foo, bar)) # revealed: @Todo(return type of decorated function)
```

### Assignability
@@ -70,8 +70,7 @@ import typing

class ListSubclass(typing.List): ...

# TODO: should have `Generic`, should not have `Unknown`
# revealed: tuple[Literal[ListSubclass], Literal[list], Unknown, Literal[object]]
# revealed: tuple[Literal[ListSubclass], Literal[list], Literal[MutableSequence], Literal[Sequence], Literal[Reversible], Literal[Collection], Literal[Iterable], Literal[Container], @Todo(protocol), Literal[object]]
reveal_type(ListSubclass.__mro__)

class DictSubclass(typing.Dict): ...
@@ -75,8 +75,7 @@ def _(flag: bool):

f = Foo()

# TODO: We should emit an `unsupported-operator` error here, possibly with the information
# that `Foo.__iadd__` may be unbound as additional context.
# error: [unsupported-operator] "Operator `+=` is unsupported between objects of type `Foo` and `Literal["Hello, world!"]`"
f += "Hello, world!"

reveal_type(f) # revealed: int | Unknown
117 changes: 112 additions & 5 deletions crates/red_knot_python_semantic/resources/mdtest/attributes.md
@@ -155,7 +155,9 @@ reveal_type(c_instance.declared_in_body_and_init) # revealed: str | None

reveal_type(c_instance.declared_in_body_defined_in_init) # revealed: str | None

reveal_type(c_instance.bound_in_body_declared_in_init) # revealed: str | None
# TODO: This should be `str | None`. Fixing this requires an overhaul of the `Symbol` API,
# which is planned in https://github.com/astral-sh/ruff/issues/14297
reveal_type(c_instance.bound_in_body_declared_in_init) # revealed: Unknown | str | None

reveal_type(c_instance.bound_in_body_and_init) # revealed: Unknown | None | Literal["a"]
```
@@ -704,8 +706,91 @@ reveal_type(Derived().declared_in_body) # revealed: int | None
reveal_type(Derived().defined_in_init) # revealed: str | None
```

## Accessing attributes on class objects

When accessing attributes on class objects, they are always looked up on the type of the class
object first, i.e. on the metaclass:

```py
from typing import Literal

class Meta1:
    attr: Literal["meta class value"] = "meta class value"

class C1(metaclass=Meta1): ...

reveal_type(C1.attr) # revealed: Literal["meta class value"]
```

However, the meta class attribute only takes precedence over a class-level attribute if it is a data
descriptor. If it is a non-data descriptor or a normal attribute, the class-level attribute is used
instead (see the [descriptor protocol tests] for data/non-data descriptor attributes):

```py
class Meta2:
    attr: str = "meta class value"

class C2(metaclass=Meta2):
    attr: Literal["class value"] = "class value"

reveal_type(C2.attr) # revealed: Literal["class value"]
```

If the class-level attribute is only partially defined, we union the meta class attribute with the
class-level attribute:

```py
def _(flag: bool):
    class Meta3:
        attr1 = "meta class value"
        attr2: Literal["meta class value"] = "meta class value"

    class C3(metaclass=Meta3):
        if flag:
            attr1 = "class value"
            # TODO: Neither mypy nor pyright show an error here, but we could consider emitting a conflicting-declaration diagnostic here.
            attr2: Literal["class value"] = "class value"

    reveal_type(C3.attr1) # revealed: Unknown | Literal["meta class value", "class value"]
    reveal_type(C3.attr2) # revealed: Literal["meta class value", "class value"]
```

If the *meta class* attribute is only partially defined, we emit a `possibly-unbound-attribute`
diagnostic:

```py
def _(flag: bool):
    class Meta4:
        if flag:
            attr1: str = "meta class value"

    class C4(metaclass=Meta4): ...
    # error: [possibly-unbound-attribute]
    reveal_type(C4.attr1) # revealed: str
```

Finally, if both the meta class attribute and the class-level attribute are only partially defined,
we union them and emit a `possibly-unbound-attribute` diagnostic:

```py
def _(flag1: bool, flag2: bool):
    class Meta5:
        if flag1:
            attr1 = "meta class value"

    class C5(metaclass=Meta5):
        if flag2:
            attr1 = "class value"

    # error: [possibly-unbound-attribute]
    reveal_type(C5.attr1) # revealed: Unknown | Literal["meta class value", "class value"]
```

## Union of attributes

If the (meta)class is a union type or if the attribute on the (meta) class has a union type, we
infer those union types accordingly:

```py
def _(flag: bool):
    if flag:
@@ -716,14 +801,35 @@ def _(flag: bool):
        class C1:
            x = 2

    reveal_type(C1.x) # revealed: Unknown | Literal[1, 2]

    class C2:
        if flag:
            x = 3
        else:
            x = 4

    reveal_type(C1.x) # revealed: Unknown | Literal[1, 2]
    reveal_type(C2.x) # revealed: Unknown | Literal[3, 4]

    if flag:
        class Meta3(type):
            x = 5

    else:
        class Meta3(type):
            x = 6

    class C3(metaclass=Meta3): ...
    reveal_type(C3.x) # revealed: Unknown | Literal[5, 6]

    class Meta4(type):
        if flag:
            x = 7
        else:
            x = 8

    class C4(metaclass=Meta4): ...
    reveal_type(C4.x) # revealed: Unknown | Literal[7, 8]
```

## Inherited class attributes
@@ -883,7 +989,7 @@ def _(flag: bool):
self.x = 1

# error: [possibly-unbound-attribute]
reveal_type(Foo().x) # revealed: int
reveal_type(Foo().x) # revealed: int | Unknown
```

#### Possibly unbound
@@ -1105,8 +1211,8 @@ Most attribute accesses on bool-literal types are delegated to `builtins.bool`,
bools are instances of that class:

```py
reveal_type(True.__and__) # revealed: @Todo(overloaded method)
reveal_type(False.__or__) # revealed: @Todo(overloaded method)
reveal_type(True.__and__) # revealed: <bound method `__and__` of `Literal[True]`>
reveal_type(False.__or__) # revealed: <bound method `__or__` of `Literal[False]`>
```

Some attributes are special-cased, however:
@@ -1262,6 +1368,7 @@ reveal_type(C.a_none) # revealed: None
Some of the tests in the *Class and instance variables* section draw inspiration from
[pyright's documentation] on this topic.

[descriptor protocol tests]: descriptor_protocol.md
[pyright's documentation]: https://microsoft.github.io/pyright/#/type-concepts-advanced?id=class-and-instance-variables
[typing spec on `classvar`]: https://typing.readthedocs.io/en/latest/spec/class-compat.html#classvar
[`typing.classvar`]: https://docs.python.org/3/library/typing.html#typing.ClassVar
@@ -10,8 +10,7 @@ reveal_type(-3 // 3) # revealed: Literal[-1]
reveal_type(-3 / 3) # revealed: float
reveal_type(5 % 3) # revealed: Literal[2]

# TODO: Should emit `unsupported-operator` but we don't understand the bases of `str`, so we think
# it inherits `Unknown`, so we think `str.__radd__` is `Unknown` instead of nonexistent.
# error: [unsupported-operator] "Operator `+` is unsupported between objects of type `Literal[2]` and `Literal["f"]`"
reveal_type(2 + "f") # revealed: Unknown

def lhs(x: int):
99 changes: 95 additions & 4 deletions crates/red_knot_python_semantic/resources/mdtest/call/dunder.md
@@ -46,6 +46,17 @@ class DunderOnMetaClass(metaclass=Meta):
reveal_type(DunderOnMetaClass[0]) # revealed: str
```

If the dunder method is only present on the class itself, it will not be called:

```py
class ClassWithNormalDunder:
    def __getitem__(self, key: int) -> str:
        return str(key)

# error: [non-subscriptable]
ClassWithNormalDunder[0]
```

## Operating on instances

When invoking a dunder method on an instance of a class, it is looked up on the class:
@@ -79,13 +90,32 @@ reveal_type(this_fails[0]) # revealed: Unknown
However, the attached dunder method *can* be called if accessed directly:

```py
# TODO: `this_fails.__getitem__` is incorrectly treated as a bound method. This
# should be fixed with https://github.com/astral-sh/ruff/issues/16367
# error: [too-many-positional-arguments]
# error: [invalid-argument-type]
reveal_type(this_fails.__getitem__(this_fails, 0)) # revealed: Unknown | str
```

The instance-level method is also not called when the class-level method is present:

```py
def external_getitem1(instance, key) -> str:
    return "a"

def external_getitem2(key) -> int:
    return 1

def _(flag: bool):
    class ThisFails:
        if flag:
            __getitem__ = external_getitem1

        def __init__(self):
            self.__getitem__ = external_getitem2

    this_fails = ThisFails()

    # error: [call-possibly-unbound-method]
    reveal_type(this_fails[0]) # revealed: Unknown | str
```

## When the dunder is not a method

A dunder can also be a non-method callable:
@@ -126,3 +156,64 @@ class_with_descriptor_dunder = ClassWithDescriptorDunder()

reveal_type(class_with_descriptor_dunder[0]) # revealed: str
```

## Dunders can not be overwritten on instances

If we attempt to overwrite a dunder method on an instance, it does not affect the behavior of
implicit dunder calls:

```py
class C:
    def __getitem__(self, key: int) -> str:
        return str(key)

    def f(self):
        # TODO: This should emit an `invalid-assignment` diagnostic once we understand the type of `self`
        self.__getitem__ = None

# This is still fine, and simply calls the `__getitem__` method on the class
reveal_type(C()[0]) # revealed: str
```

## Calling a union of dunder methods

```py
def _(flag: bool):
    class C:
        if flag:
            def __getitem__(self, key: int) -> str:
                return str(key)
        else:
            def __getitem__(self, key: int) -> bytes:
                return key

    c = C()
    reveal_type(c[0]) # revealed: str | bytes

    if flag:
        class D:
            def __getitem__(self, key: int) -> str:
                return str(key)

    else:
        class D:
            def __getitem__(self, key: int) -> bytes:
                return key

    d = D()
    reveal_type(d[0]) # revealed: str | bytes
```

## Calling a possibly-unbound dunder method

```py
def _(flag: bool):
    class C:
        if flag:
            def __getitem__(self, key: int) -> str:
                return str(key)

    c = C()
    # error: [call-possibly-unbound-method]
    reveal_type(c[0]) # revealed: str
```
@@ -12,7 +12,7 @@ import inspect

class Descriptor:
    def __get__(self, instance, owner) -> str:
        return 1
        return "a"

class C:
    normal: int = 1
@@ -59,7 +59,7 @@ import sys
reveal_type(inspect.getattr_static(sys, "platform")) # revealed: LiteralString
reveal_type(inspect.getattr_static(inspect, "getattr_static")) # revealed: Literal[getattr_static]

reveal_type(inspect.getattr_static(1, "real")) # revealed: Literal[1]
reveal_type(inspect.getattr_static(1, "real")) # revealed: Literal[real]
```

(Implicit) instance attributes can also be accessed through `inspect.getattr_static`:
@@ -72,6 +72,23 @@ class D:
reveal_type(inspect.getattr_static(D(), "instance_attr")) # revealed: int
```

And attributes on metaclasses can be accessed when probing the class:

```py
class Meta(type):
    attr: int = 1

class E(metaclass=Meta): ...

reveal_type(inspect.getattr_static(E, "attr")) # revealed: int
```

Metaclass attributes can not be added when probing an instance of the class:

```py
reveal_type(inspect.getattr_static(E(), "attr", "non_existent")) # revealed: Literal["non_existent"]
```

## Error cases

We can only infer precise types if the attribute is a literal string. In all other cases, we fall