Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Attribute access and the descriptor protocol #16416

Merged
merged 8 commits into from
Mar 7, 2025

Conversation

sharkdp
Copy link
Contributor

@sharkdp sharkdp commented Feb 27, 2025

Summary

  • Attributes/method are now properly looked up on metaclasses, when called on class objects
  • We properly distinguish between data descriptors and non-data descriptors (but we do not yet support them in store-context, i.e. obj.data_descr = …)
  • The descriptor protocol is now implemented in a single unified place for instances, classes and dunder-calls. Unions and possibly-unbound symbols are supported in all possible stages of the process by creating union types as results.
  • In general, the handling of "possibly-unbound" symbols has been improved in a lot of places: meta-class attributes, attributes, descriptors with possibly-unbound __get__ methods, instance attributes, …
  • We keep track of type qualifiers in a lot more places. I anticipate that this will be useful if we import e.g. Final symbols from other modules (see relevant change to typing spec: Added "Importing Final Variables" section python/typing#1937).
  • Detection and special-casing of the typing.Protocol special form in order to avoid lots of changes in the test suite due to new @Todo types when looking up attributes on builtin types which have Protocol in their MRO. We previously
    looked up attributes in a wrong way, which is why this didn't come up before.

closes #16367
closes #15966

Context

The way attribute lookup in Type::member worked before was simply wrong (mostly my own fault). The whole instance-attribute lookup should probably never have been integrated into Type::member. And the Type::static_member function that I introduced in my last descriptor PR was the wrong abstraction. It's kind of fascinating how far this approach took us, but I am pretty confident that the new approach proposed here is what we need to model this correctly.

There are three key pieces that are required to implement attribute lookups:

  • Type::class_member/Type::find_in_mro: The Type::find_in_mro method that can look up attributes on class bodies (and corresponding bases). This is a partial function on types, as it can not be called on instance types likeType::Instance(…) or Type::IntLiteral(…). For this reason, we usually call it through Type::class_member, which is essentially just type.to_meta_type().find_in_mro(…) plus union/intersection handling.
  • Type::instance_member: This new function is basically the type-level equivalent to obj.__dict__[name] when called on Type::Instance(…). We use this to discover instance attributes such as those that we see as declarations on class bodies or as (annotated) assignments to self.attr in methods of a class.
  • The implementation of the descriptor protocol. It works slightly different for instances and for class objects, but it can be described by the general framework:
    • Call type.class_member("attribute") to look up "attribute" in the MRO of the meta type of type. Call the resulting Symbol meta_attr (even if it's unbound).
    • Use meta_attr.class_member("__get__") to look up __get__ on the meta type of meta_attr. Call it with __get__(meta_attr, self, self.to_meta_type()). If this fails (either the lookup or the call), just proceed with meta_attr. Otherwise, replace meta_attr in the following with the return type of __get__. In this step, we also probe if a __set__ or __delete__ method exists and store it in meta_attr_kind (can be either "data descriptor" or "normal attribute or non-data descriptor").
    • Compute a fallback type.
      • For instances, we use self.instance_member("attribute")
      • For class objects, we use class_attr = self.find_in_mro("attribute"), and then try to invoke the descriptor protocol on class_attr, i.e. we look up __get__ on the meta type of class_attr and call it with __get__(class_attr, None, self). This additional invocation of the descriptor protocol on the fallback type is one major asymmetry in the otherwise universal descriptor protocol implementation.
    • Finally, we look at meta_attr, meta_attr_kind and fallback, and handle various cases of (possible) unboundness of these symbols.
      • If meta_attr is bound and a data descriptor, just return meta_attr
      • If meta_attr is not a data descriptor, and fallback is bound, just return fallback
      • If meta_attr is not a data descriptor, and fallback is unbound, return meta_attr
      • Return unions of these three possibilities for partially-bound symbols.

This allows us to handle class objects and instances within the same framework. There is a minor additional detail where for instances, we do not allow the fallback type (the instance attribute) to completely shadow the non-data descriptor. We do this because we (currently) don't want to pretend that we can statically infer that an instance attribute is always set.

Dunder method calls can also be embedded into this framework. The only thing that changes is that there is no fallback type. If a dunder method is called on an instance, we do not fall back to instance variables. If a dunder method is called on a class object, we only look it up on the meta class, never on the class itself.

Test Plan

New Markdown tests.

@sharkdp sharkdp added the red-knot Multi-file analysis & type inference label Feb 27, 2025
Copy link

codspeed-hq bot commented Feb 28, 2025

CodSpeed Performance Report

Merging #16416 will degrade performances by 4.29%

Comparing david/descriptor-protocol-pt2 (689053e) with main (a18d8bf)

Summary

❌ 1 (👁 1) regressions
✅ 31 untouched benchmarks

Benchmarks breakdown

Benchmark BASE HEAD Change
👁 red_knot_check_file[incremental] 5.2 ms 5.5 ms -4.29%

@sharkdp sharkdp force-pushed the david/descriptor-protocol-pt2 branch 3 times, most recently from cad7a7e to 76aa5c3 Compare March 3, 2025 10:22
@sharkdp sharkdp force-pushed the david/descriptor-protocol-pt2 branch from 746234b to a54cf50 Compare March 3, 2025 13:47
@sharkdp sharkdp changed the title [red-knot] Descriptor protocol: Data descriptors [red-knot] Attribute access and the Descriptor protocol Mar 3, 2025
@sharkdp sharkdp force-pushed the david/descriptor-protocol-pt2 branch from 67334a5 to e3bfa45 Compare March 4, 2025 12:43
@sharkdp sharkdp changed the title [red-knot] Attribute access and the Descriptor protocol [red-knot] Attribute access and the descriptor protocol Mar 4, 2025
@sharkdp sharkdp force-pushed the david/descriptor-protocol-pt2 branch 3 times, most recently from 6bc0b25 to 35349fe Compare March 5, 2025 20:24
@sharkdp sharkdp force-pushed the david/descriptor-protocol-pt2 branch 2 times, most recently from ec50795 to 7abdb56 Compare March 6, 2025 10:43
sharkdp added a commit that referenced this pull request Mar 6, 2025
…e accessed. (#16533)

## Summary

- `Never` is callable
- `Never` is iterable
- Arbitrary attributes can be accessed on `Never`

Split out from #16416 that is going to be required.

## Test Plan

Tests for all properties above.
@sharkdp sharkdp force-pushed the david/descriptor-protocol-pt2 branch from de158fa to f3909de Compare March 6, 2025 16:25
@sharkdp sharkdp marked this pull request as ready for review March 6, 2025 16:29
@sharkdp sharkdp requested a review from carljm as a code owner March 6, 2025 16:29
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is excellent! Left some comments but IMO nothing that can't be left as a TODO and addressed as follow up; probably better to land this sooner rather than later.

Comment on lines +581 to +587
// The symbol is declared and bound in the class body,
// but we did not find any attribute assignments in
// methods of the class. This means that the attribute
// has a class-level default value, but it would not be
// found in a `__dict__` lookup.

Symbol::Unbound.into()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this case. If it's not explicitly marked as ClassVar, shouldn't we consider this a possible instance attribute, which might be found in __dict__? (We would allow assignment of it on an instance.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is the case I talked about yesterday. It's definitely not straightforward to just return declared.with_qualifiers(qualifiers) here. If we do so, we treat every function on a class (which is declared and bound) into an instance attribute. Which does not just feel wrong, but also produces a lot of test failures. Instance attributes take precedence over non-data descriptors, so that would mean that we do not generate bound methods for something like C().some_method, but a normal Literal function type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to merge this PR before signing off for the weekend, but curious to hear your opinion here. I think what I implemented here (w.r.t. this case) mostly keeps the behavior as it was before, but that doesn't necessarily mean that it's the best possible behavior. It's certainly somewhat inconsistent with how we treat declared-but-unbound attributes on the class body (for which we also see no attribute assignments). Another way to make it consistent would be to make those also return Unbound (maybe except for stubs), unless we see at least one attribute binding in a method(?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inconsistency that I was thinking about here is that if we see e.g. foo: int = 1 in a class body (without a ClassVar annotation), we generally consider that to be an instance attribute with class fallback value. This means we will allow it to be set on an instance, which means it could exist in the instance dictionary, which means it could shadow a non-data descriptor. But I think you're right that we have to assume it does not, otherwise non-data descriptors would never work at all. It's not just methods, it's anytime you do d: NonDataDescriptor = NonDataDescriptor() on a class, we would think C().d might give you back NonDataDescriptor, because there might be one of those set in the instance dict. This is definitely not useful behavior.

So I think what you've done here is right! We will see if any issues come up on real codebases and can tweak accordingly.

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a proper review as I'm on my phone, but a couple of minor things I spotted skimming through the diff. Nothing blocking!

@sharkdp sharkdp merged commit 820a31a into main Mar 7, 2025
21 checks passed
@sharkdp sharkdp deleted the david/descriptor-protocol-pt2 branch March 7, 2025 21:03
dcreager added a commit that referenced this pull request Mar 8, 2025
* main:
  [red-knot] Understand `typing.Callable` (#16493)
  [red-knot] Support unpacking `with` target (#16469)
  [red-knot] Attribute access and the descriptor protocol (#16416)
  [`pep8-naming`] Add links to `ignore-names` options in various rules' documentation (#16557)
  [red-knot] avoid inferring types if unpacking fails (#16530)
AlexWaygood added a commit that referenced this pull request Mar 9, 2025
## Summary

Fixes #16566, fixes #16575

The semantics of `Type::class_member` changed in
#16416, but the property-test
infrastructure was not updated. That means that the property tests were
panicking on the second `expect_type` call here:


https://github.com/astral-sh/ruff/blob/036102186392f1263774bfd42d78acc65c0641b8/crates/red_knot_python_semantic/src/types/property_tests.rs#L151-L158

With the somewhat unhelpful message:

```
Expected a (possibly unbound) type, not an unbound symbol
```

Applying this patch, and then running `QUICKCHECK_TESTS=1000000 cargo
test --release -p red_knot_python_semantic -- --ignored
types::property_tests::stable::equivalent_to_is_reflexive` showed
clearly that it was no longer able to find _any_ methods on _any_
classes due to the change in semantics of `Type::class_member`:

```diff
--- a/crates/red_knot_python_semantic/src/types/property_tests.rs
+++ b/crates/red_knot_python_semantic/src/types/property_tests.rs
@@ -27,7 +27,7 @@
 use std::sync::{Arc, Mutex, MutexGuard, OnceLock};
 
 use crate::db::tests::{setup_db, TestDb};
-use crate::symbol::{builtins_symbol, known_module_symbol};
+use crate::symbol::{builtins_symbol, known_module_symbol, Symbol};
 use crate::types::{
     BoundMethodType, CallableType, IntersectionBuilder, KnownClass, KnownInstanceType,
     SubclassOfType, TupleType, Type, UnionType,
@@ -150,10 +150,11 @@ impl Ty {
             Ty::BuiltinsFunction(name) => builtins_symbol(db, name).symbol.expect_type(),
             Ty::BuiltinsBoundMethod { class, method } => {
                 let builtins_class = builtins_symbol(db, class).symbol.expect_type();
-                let function = builtins_class
-                    .class_member(db, method.into())
-                    .symbol
-                    .expect_type();
+                let Symbol::Type(function, ..) =
+                    builtins_class.class_member(db, method.into()).symbol
+                else {
+                    panic!("no method `{method}` on class `{class}`");
+                };
 
                 create_bound_method(db, function, builtins_class)
             }
```

This PR updates the property-test infrastructure to use `Type::member`
rather than `Type::class_member`.

## Test Plan

- Ran `QUICKCHECK_TESTS=1000000 cargo test --release -p
red_knot_python_semantic -- --ignored types::property_tests::stable`
successfully
- Checked that there were no remaining uses of `Type::class_member` in
`property_tests.rs`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
4 participants