Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flag to allow more flexible variable redefinition #18727

Open
wants to merge 84 commits into
base: master
Choose a base branch
from

Conversation

JukkaL
Copy link
Collaborator

@JukkaL JukkaL commented Feb 24, 2025

Infer union types for simple variables from multiple assignments, if the variable isn't annotated. The feature is enabled via --allow-redefinition-new. --local-partial-types must also be enabled.

This is still experimental and has known issues, so it's not documented anywhere. It works well enough that it can be used for non-trivial experimentation, however.

Closes #6233. Closes #6232. Closes #18568. Fixes #18619.

In this example, the type of x is inferred as int | str when using the new behavior:

def f(i: int, s : str) -> int | str:
    if i > 5:
        x = i
    else:
        x = s  # No longer an error
    reveal_type(x)  # int | str
    return s

Here is a summary of how it works:

  • Assignment widens the inferred type of a variable and always narrows (when there is no annotation).
  • Simple variable lvalues are put into the binder on initial assignment when using the new feature. We need to be able to track whether a variable is defined or not to infer correct types (see Binder loses narrowed type of a variable if variable may be uninitialized #18619).
  • Assignment of None values are no longer special, and we don't use partial None if the feature is enabled for simple variables.
  • Lvalues other than simple variables (e.g. self.x) continue to work as in the past. Attribute types can't be widened, since they are externally visible and widening could cause confusion, but this is something we might relax in the future. Globals can be widened, however. This seems necessary for consistency.
  • If a loop body widens a variable type, we have to analyze the body again. However, we only do one extra pass, since the inferred type could be expanded without bound (consider x = 0 outside loop and x = [x] within the loop body).
  • We first infer the type of an rvalue without using the lvalue type as context, as otherwise the type context would often prevent redefinition. If the rvalue type isn't valid for inference (e.g. list item type can't be inferred), we fall back to the lvalue type context.

There are some other known bugs and limitations:

  • Annotated variables can't be freely redefined (but they can still be narrowed, of course). I may want to relax this in the future, but I'm not sure yet.
  • If there is a function definition between assignments to a variable, the inferred types may be incorrect.
  • There are few tests for nonlocal and some other features. We don't have good test coverage for deferrals, mypy daemon, and disabling strict optional.
  • Imported names can't be redefined in a consistent way. This needs further analysis.

In self check the feature generates 6 additional errors, which all seem correct -- we infer more precise types, which will generate additional errors due to invariant containers and fixing false negatives.

When type checking the largest internal codebase at Dropbox, this generated about 700 new errors, the vast majority of which seemed legitimate. Mostly they were due to inferring more precise types for variables that used to have Any types. I used a recent but not the latest version of the feature to type check the internal codebase.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Mar 12, 2025

I fixed issues related to the three examples mentioned above by @cdce8p and @A5rocks. It took me a while since I've been traveling.

This one came up quite a lot actually. Yes, it would be possible to resolve it by annotating l with list[Any] | None. However, if possible, I think it would be better if mypy was able to resolve it to list[Any] itself.

I didn't change this to silently infer list[Any], since it would be inconsistent with how type inference works in general. Not it also asks for a type annotation for the variable. The old behavior was actually buggy -- it didn't infer an optional type at all, so it could result in false negatives. The new behavior seems more correct, even if it requires some extra annotations.

This comment has been minimized.

This comment has been minimized.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Mar 12, 2025

Temporarily enabled --allow-redefine-new by default. I also had to enable --local-partial-types. Note that both of these are expected to generate new errors.

This comment has been minimized.

This comment has been minimized.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Mar 14, 2025

There are still some false positives in the mypy_primer output that I want to fix before merging, but I won't attempt to fix every single issue in this PR, since the new behavior is behind a flag.

@ilevkivskyi
Copy link
Member

@JukkaL

There are still some false positives in the mypy_primer output that I want to fix before merging, but I won't attempt to fix every single issue in this PR, since the new behavior is behind a flag.

Sure, there is no point in turning this into a mega-PR. Let me know if/when you want me to review this PR again.

@hauntsaninja
Copy link
Collaborator

Btw if you want a clean mypy_primer diff on the local partial types front, you can try something like:

diff --git a/.github/workflows/mypy_primer.yml b/.github/workflows/mypy_primer.yml
index ee8684847..a8220e1b6 100644
--- a/.github/workflows/mypy_primer.yml
+++ b/.github/workflows/mypy_primer.yml
@@ -66,6 +66,7 @@ jobs:
             --num-shards 5 --shard-index ${{ matrix.shard-index }} \
             --debug \
             --additional-flags="--debug-serialize" \
+            --additional-flags="--local-partial-types" \
             --output concise \
             | tee diff_${{ matrix.shard-index }}.txt
           ) || [ $? -eq 1 ]

This comment has been minimized.

This comment has been minimized.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Mar 17, 2025

@hauntsaninja

Btw if you want a clean mypy_primer diff on the local partial types front, you can try something like: ...

Since the PR also changes how None types are inferred, this actually generates a larger diff. I'm not sure if there is a good way to isolate the changes caused by the flag in mypy primer.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Mar 17, 2025

@ilevkivskyi I'm not planning any further changes before merging. I'd like to merge this in a few days. It would be great if you can review my recent changes.

Copy link
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants