Description
My company is currently using 0.971 internally. I was looking into upgrading to 0.991 and discovered a significant performance regression. Encouraged by the recent commit log, I tried the master branch as well. It is definitely better than 0.991, but still roughly 4x slower than 0.971 on our codebase: a whole-codebase run without the incremental cache goes from ~300s to ~1200s.
In the course of narrowing down the slowdown (via per-file and per-line timing stats), I discovered a surprisingly large performance difference for seemingly very similar code. Consider:
from typing import Any, Dict
class Foo(object):
    def __init__(self) -> None:
        self.field = {
'XXX': {
'XXX': 'XXX',
'XXX': {
'XXX': [
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': None,
'XXX': None
},
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': None,
'XXX': None
},
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': None,
'XXX': None
},
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': None,
'XXX': None
},
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': None,
'XXX': None
}
],
},
'XXX': {
'XXX': [
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': {
'XXX': [
{
'XXX': {
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
}
},
{
'XXX': {
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
}
},
]
},
'XXX': {
'XXX': [
{
'XXX': {
'XXX': {
'XXX': 2,
'XXX': [
{
'XXX': 'XXX',
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
},
{
'XXX': 'XXX',
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
}
]
},
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': [
{
'XXX': 'XXX',
'XXX': 'XXX',
}
]
}
}
},
]
}
},
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': None,
'XXX': {
'XXX': [
{
'XXX': {
'XXX': {
'XXX': 2,
'XXX': [
{
'XXX': 'XXX',
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
},
{
'XXX': 'XXX',
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
}
]
},
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': [
{
'XXX': 'XXX',
'XXX': 'XXX',
}
]
}
}
},
]
}
},
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': None,
'XXX': {
'XXX': [
{
'XXX': {
'XXX': {
'XXX': 2,
'XXX': [
{
'XXX': 'XXX',
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
},
{
'XXX': 'XXX',
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
}
]
},
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': [
{
'XXX': 'XXX',
'XXX': 'XXX',
}
]
}
}
},
]
}
},
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': None,
'XXX': {
'XXX': [
{
'XXX': {
'XXX': {
'XXX': 2,
'XXX': [
{
'XXX': 'XXX',
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
},
{
'XXX': 'XXX',
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX'
}
}
]
},
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': [
{
'XXX': 'XXX',
'XXX': 'XXX',
}
]
}
}
},
]
}
},
{
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': None,
'XXX': None
}
]
},
'XXX': {
'XXX': 'XXX',
'XXX': 'XXX',
'XXX': [
{
'XXX': 'XXX',
'XXX': 'XXX'
},
{
'XXX': 'XXX',
'XXX': 'XXX'
},
{
'XXX': 'XXX',
'XXX': 'XXX'
},
{
'XXX': 'XXX',
'XXX': 'XXX'
},
{
'XXX': 'XXX',
'XXX': 'XXX'
},
{
'XXX': 'XXX',
'XXX': 'XXX'
},
{
'XXX': 'XXX',
'XXX': 'XXX'
},
{
'XXX': 'XXX',
'XXX': 'XXX'
},
{
'XXX': 'XXX',
'XXX': 'XXX'
}
]
}
}
}
reveal_type(Foo.field)
Results in:
test-san.py:286: note: Revealed type is "builtins.dict[builtins.str, builtins.dict[builtins.str, typing.Collection[builtins.str]]]"
Success: no issues found in 1 source file
real 0m1.238s
user 0m1.140s
sys 0m0.089s
But changing the code to:
from typing import Any, Dict
class Foo(object):
    def __init__(self) -> None:
        self.field: Dict[Any, Any] = NotImplemented
    def reset(self) -> None:
        self.field = {
            # same contents as above...
        }
raises the runtime roughly 8x, from about 1.2s to about 10s:
test-san.py:289: note: Revealed type is "builtins.dict[Any, Any]"
Success: no issues found in 1 source file
real 0m10.080s
user 0m9.790s
sys 0m0.256s
and a further code change as follows:
from typing import Any, Dict
class Foo(object):
    def __init__(self) -> None:
        self.field = NotImplemented
    def reset(self) -> None:
        self.field = {
            # same contents as above...
        }
roughly doubles the runtime again, to about 22s (and also triggers an assignment error):
$ time mypy test-san.py
test-san.py:9: error: Incompatible types in assignment (expression has type "Dict[str, Dict[str, Collection[str]]]", variable has type "_NotImplementedType") [assignment]
test-san.py:289: note: Revealed type is "builtins._NotImplementedType"
Found 1 error in 1 file (checked 1 source file)
real 0m22.067s
user 0m21.666s
sys 0m0.319s
I haven't had time to profile this yet, but I suspect it is related to #9477 and #12707, and that somehow one version of the code hits the optimization while the others don't.
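For anyone who wants to compare the three variants without the original literal, here is a minimal sketch of a generator and timing harness. The helper names (`nested_literal`, `variant_source`, `time_mypy`) and the variant labels are made up for illustration, and the generated literal is only a synthetic stand-in for the dict above, so it may not reproduce the same timings. It assumes mypy is installed for the interpreter running the script.

```python
import subprocess
import sys
import time

HEADER = "from typing import Any, Dict\n\n\nclass Foo(object):\n"


def nested_literal(depth: int, width: int) -> str:
    """Return source text for a nested dict/list literal (hypothetical stand-in)."""
    if depth == 0:
        return "'XXX'"
    inner = nested_literal(depth - 1, width)
    # Each key maps to a two-element list of the next nesting level.
    items = ", ".join(f"'k{i}': [{inner}, {inner}]" for i in range(width))
    return "{" + items + "}"


def variant_source(kind: str, literal: str) -> str:
    """Build one of the three class layouts from the report around `literal`."""
    if kind == "inferred":
        # Literal assigned directly in __init__, type inferred from it.
        lines = [
            "    def __init__(self) -> None:",
            f"        self.field = {literal}",
        ]
    elif kind == "annotated":
        # Field declared as Dict[Any, Any], literal assigned in reset().
        lines = [
            "    def __init__(self) -> None:",
            "        self.field: Dict[Any, Any] = NotImplemented",
            "    def reset(self) -> None:",
            f"        self.field = {literal}",
        ]
    elif kind == "unannotated":
        # Field initialised to NotImplemented without an annotation.
        lines = [
            "    def __init__(self) -> None:",
            "        self.field = NotImplemented",
            "    def reset(self) -> None:",
            f"        self.field = {literal}",
        ]
    else:
        raise ValueError(f"unknown variant: {kind}")
    return HEADER + "\n".join(lines) + "\n\n\nreveal_type(Foo.field)\n"


def time_mypy(path: str) -> float:
    """Wall-clock seconds for one mypy run with the incremental cache disabled."""
    start = time.perf_counter()
    subprocess.run(
        [sys.executable, "-m", "mypy", "--no-incremental", path],
        check=False,  # the "unannotated" variant is expected to report an error
    )
    return time.perf_counter() - start
```

To use it, write `variant_source(kind, nested_literal(4, 2))` to a file for each of `"inferred"`, `"annotated"`, and `"unannotated"`, then call `time_mypy` on each file and compare the results. Increasing `depth` or `width` grows the literal quickly, since the leaf count is `(2 * width) ** depth`.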
Your Environment
- Mypy version used: master as of 695ea3017fee084c9d2ec17d9b28f8af905e3b63, mypyc-compiled
- Mypy command-line flags: N/A
- Mypy configuration options from mypy.ini (and other config files): N/A
- Python version used: 3.7.8
- OS: macOS 12.5.1 (x86_64)