-
Notifications
You must be signed in to change notification settings - Fork 71
lazy loading #1101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
lazy loading #1101
Conversation
|
We could make it even lazyer but maybe that would be overkill @apoorvalal? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements lazy loading to improve package import performance by replacing eager imports with on-demand module loading. The change reduces initial import time from ~2.5 seconds by deferring submodule loading until they are actually accessed.
Key Changes:
- Replaced direct imports with a
__getattr__hook that loads modules and functions on first access - Created mapping dictionaries (
_submodulesand_lazy_imports) to define which modules contain which functions - Moved version handling to the top while keeping it eager since it's inexpensive
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pyfixest/__init__.py
Outdated
|
|
||
| def __getattr__(name: str): | ||
| if name in _submodules: | ||
| return _importlib.import_module(f"pyfixest.{name}") | ||
| if name in _lazy_imports: | ||
| module = _importlib.import_module(_lazy_imports[name]) |
Copilot
AI
Dec 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The __getattr__ function will repeatedly import the same module for each attribute access. Consider caching the imported modules in a dictionary (e.g., sys.modules already does this, but you could add explicit memoization) to avoid redundant import_module calls when accessing the same attribute multiple times.
| def __getattr__(name: str): | |
| if name in _submodules: | |
| return _importlib.import_module(f"pyfixest.{name}") | |
| if name in _lazy_imports: | |
| module = _importlib.import_module(_lazy_imports[name]) | |
| # Cache for lazily imported submodules to avoid repeated import_module calls | |
| _import_cache = {} | |
| def _load_module(module_name: str): | |
| """Load a module once and reuse it for subsequent accesses.""" | |
| module = _import_cache.get(module_name) | |
| if module is None: | |
| module = _importlib.import_module(module_name) | |
| _import_cache[module_name] = module | |
| return module | |
| def __getattr__(name: str): | |
| if name in _submodules: | |
| return _load_module(f"pyfixest.{name}") | |
| if name in _lazy_imports: | |
| module = _load_module(_lazy_imports[name]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is required because of Python's sys.modules caching
| def __dir__(): | ||
| return __all__ |
Copilot
AI
Dec 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The __dir__ function returns __all__ but __all__ is defined after the function definition. While Python's module-level code executes sequentially, this could fail if __dir__() is called during module initialization before line 57 executes. Consider moving the __all__ definition before the __dir__ function, or make __dir__ return a combination of _submodules and _lazy_imports.keys() to be independent of __all__.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably is cleaner to move __all__ to the top?
Codecov Report❌ Patch coverage is
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 6 files with indirect coverage changes 🚀 New features to boost your workflow:
|
pyfixest/__init__.py
Outdated
| if name in _submodules: | ||
| return _importlib.import_module(f"pyfixest.{name}") | ||
| if name in _lazy_imports: | ||
| module = _importlib.import_module(_lazy_imports[name]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we load the whole parent module rather than name? Couldn't we use _importlib.import_module(f"{_lazy_imports[name]}.{name}")?
| def __dir__(): | ||
| return __all__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably is cleaner to move __all__ to the top?
|
The API change is breaking the |
Problem: import pyfixest took ~2.5 seconds because all submodules were eagerly loaded at package import, even when users only needed a subset of functionality.
Solution: Replace eager imports with lazy loading.
Modules are now loaded on-demand when accessed (e.g., pf.feols loads estimation, pf.did2s loads did).
To enable this, moves estimation APIs -
feols,fepois,feglm,quantregto standalone scripts inpyfixest.estimation.api.Closes #1095 .