feat(analyzer): introduce experimental static analyzer #230
Merged
0ec0171 to
b3efd9f
Compare
Pull Request Overview
Introduces the new static analysis infrastructure by adding three core crates and removing legacy dependencies.
- Adds `mago-analyzer`, `mago-codex`, and `mago-algebra` crates with core types, clauses, and analysis logic
- Implements debugging helpers `Mago\inspect()` and `Mago\confirm()`
- Removes legacy crates (`mago-project`, `mago-reflection`, etc.) replaced by the new cores
Reviewed Changes
Copilot reviewed 63 out of 613 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| crates/analyzer/src/expression/yield.rs | Implements generator yield analysis with key/value type checks |
| crates/analyzer/src/expression/variable.rs | Reads and reports on variable usages, including global variables |
| crates/analyzer/src/expression/unary.rs | Handles unary operators and type casts with detailed dataflow logic |
| crates/analyzer/src/expression/throw.rs | Ensures thrown expressions implement Throwable |
| crates/analyzer/src/expression/mod.rs | Dispatches expression types and integrates clause-based checks |
| crates/analyzer/src/expression/match.rs | Analyzes match expressions with exhaustiveness and reachability QA |
| crates/analyzer/src/expression/magic_constant.rs | Sets types for PHP magic constants |
| crates/analyzer/src/expression/literal.rs | Infers literal types (int, float, string, bool, null) |
2be7c5c to
d0493ae
Compare
d82eb37 to
2a90e20
Compare
Signed-off-by: azjezz <[email protected]>
…ble test Signed-off-by: azjezz <[email protected]>
…urns

Refactors `FunctionLikeMetadata` to distinguish between native type declarations and docblock return types.

- `return_type_metadata` is now `return_type_declaration_metadata`.
- The new `return_type_metadata` holds the docblock type if present, otherwise it mirrors the declaration.

This enables a fix in the analyzer to correctly allow empty `return;` statements from functions that are untyped or return `mixed`/`null` via docblock.

Also removes several now-unused getters/setters from `FunctionLikeMetadata` in favor of direct field access.

Signed-off-by: azjezz <[email protected]>
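A minimal sketch of how the declaration/docblock split could be modeled. Only the two field names `return_type_declaration_metadata` and `return_type_metadata` come from the commit message; `TypeMetadata`, the constructor, and `allows_empty_return` are hypothetical stand-ins for illustration, not the crate's actual API.

```rust
/// Hypothetical stand-in for a resolved type; the real crate uses richer types.
#[derive(Debug, Clone, PartialEq)]
pub struct TypeMetadata {
    pub type_name: String,
    pub from_docblock: bool,
}

#[derive(Debug, Default)]
pub struct FunctionLikeMetadata {
    /// The native declaration only (e.g. `: int`), if any.
    pub return_type_declaration_metadata: Option<TypeMetadata>,
    /// The docblock type if present, otherwise a mirror of the declaration.
    pub return_type_metadata: Option<TypeMetadata>,
}

impl FunctionLikeMetadata {
    /// Hypothetical constructor: computes the effective return type once.
    pub fn new(declaration: Option<TypeMetadata>, docblock: Option<TypeMetadata>) -> Self {
        let effective = docblock.or_else(|| declaration.clone());
        Self {
            return_type_declaration_metadata: declaration,
            return_type_metadata: effective,
        }
    }

    /// Assumed rule: a bare `return;` is fine when the function is untyped,
    /// or when its effective type is `mixed`, `null`, or `void`.
    pub fn allows_empty_return(&self) -> bool {
        match &self.return_type_metadata {
            None => true,
            Some(t) => matches!(t.type_name.as_str(), "mixed" | "null" | "void"),
        }
    }
}
```

Keeping both fields lets a check consult the effective type while still being able to tell what was natively declared.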
Fixes an issue where method calls on generic parameter types were failing.

Previously, the method resolver would not correctly traverse the generic parameter's constraints. It now recursively unpacks generic constraints, allowing it to correctly find methods on the underlying union types.

Tests have been added to prevent regressions.

Signed-off-by: azjezz <[email protected]>
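The recursive unpacking described in this commit might look roughly like the sketch below. The `Atomic` enum, `unpack_constraints`, and `resolve_method` are hypothetical names invented for illustration; the real resolver works on the codex's own type representation.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified atomic type.
#[derive(Debug, Clone)]
pub enum Atomic {
    /// A named class or interface.
    Named(String),
    /// A template parameter whose constraint is itself a union of atomics.
    GenericParameter { name: String, constraint: Vec<Atomic> },
}

/// Collects the named types reachable through (possibly nested) generic constraints.
pub fn unpack_constraints(union: &[Atomic], out: &mut Vec<String>) {
    for atomic in union {
        match atomic {
            Atomic::Named(name) => {
                if !out.contains(name) {
                    out.push(name.clone());
                }
            }
            // Recurse: a constraint may itself mention another generic parameter.
            Atomic::GenericParameter { constraint, .. } => unpack_constraints(constraint, out),
        }
    }
}

/// Returns the classes in the unpacked union that declare `method`.
/// `classes` maps a class name to its method names (an assumed lookup table).
pub fn resolve_method(
    union: &[Atomic],
    method: &str,
    classes: &HashMap<String, Vec<String>>,
) -> Vec<String> {
    let mut names = Vec::new();
    unpack_constraints(union, &mut names);
    names
        .into_iter()
        .filter(|class| {
            classes
                .get(class)
                .map_or(false, |methods| methods.iter().any(|m| m == method))
        })
        .collect()
}
```

With `T of U` and `U of Foo|Bar`, resolving a method on `T` ends up looking at `Foo` and `Bar` rather than stopping at the parameter itself.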
This refactoring overhauls how the analyzer scrapes assertions from integer comparisons (`>`, `<`, `>=`, `<=`).

The previous implementation had several limitations:
- It would only generate assertions when comparing a variable against a literal integer.
- When comparing two variables (`$a > $b`), even if one had a known integer range type, no assertions were created.
- It only ever created an assertion for one of the two variables in the comparison.

The new implementation is significantly smarter:
- Two-Way Analysis: When comparing two variables with known integer ranges (e.g., `$a` is `int<20, 60>` and `$b` is `int<50, 100>`), it now correctly creates assertions for both variables based on the comparison. In `if ($a > $b)`, it will assert that `$a` is `int<51, 60>` and `$b` is `int<50, 59>`.
- Range-Based Assertions: It can now derive assertions from all forms of integer ranges (`int<X, Y>`, `int<X, max>`, `int<min, Y>`), not just literals.
- Redundancy-Free: Assertions are now checked for redundancy. If a variable `$a` is already `int<30, 100>`, an assertion like `$a > 10` is correctly identified as redundant and skipped.

This leads to much more precise type inference within conditional blocks, catching a wider range of potential bugs and improving the overall accuracy of the static analysis.

Signed-off-by: azjezz <[email protected]>
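As a rough illustration of two-way narrowing for a strict `>` comparison, here is a small sketch. `IntRange`, `narrow_greater_than`, and `is_redundant` are hypothetical names, with `None` standing for an open `min` or `max` bound; this is not the analyzer's actual code. For `$a` in `int<20, 60>` and `$b` in `int<50, 100>`, the sketch narrows `$a` to `int<51, 60>` and `$b` to `int<50, 59>`.

```rust
/// Hypothetical integer range; `None` means an open bound (`min` / `max`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IntRange {
    pub min: Option<i64>,
    pub max: Option<i64>,
}

impl IntRange {
    pub fn new(min: Option<i64>, max: Option<i64>) -> Self {
        Self { min, max }
    }

    fn is_empty(&self) -> bool {
        matches!((self.min, self.max), (Some(lo), Some(hi)) if lo > hi)
    }
}

/// Picks the larger of two optional lower bounds.
fn tighter_min(current: Option<i64>, candidate: Option<i64>) -> Option<i64> {
    match (current, candidate) {
        (Some(c), Some(n)) => Some(c.max(n)),
        (c, None) => c,
        (None, n) => n,
    }
}

/// Picks the smaller of two optional upper bounds.
fn tighter_max(current: Option<i64>, candidate: Option<i64>) -> Option<i64> {
    match (current, candidate) {
        (Some(c), Some(n)) => Some(c.min(n)),
        (c, None) => c,
        (None, n) => n,
    }
}

/// Narrows both operands under the assumption that `a > b` holds.
/// Returns `None` when the comparison can never be true.
pub fn narrow_greater_than(a: IntRange, b: IntRange) -> Option<(IntRange, IntRange)> {
    // `a > b` implies `a >= min(b) + 1` and `b <= max(a) - 1`.
    let a_lower = match b.min {
        Some(v) => Some(v.checked_add(1)?),
        None => None,
    };
    let b_upper = match a.max {
        Some(v) => Some(v.checked_sub(1)?),
        None => None,
    };
    let new_a = IntRange::new(tighter_min(a.min, a_lower), a.max);
    let new_b = IntRange::new(b.min, tighter_max(b.max, b_upper));
    if new_a.is_empty() || new_b.is_empty() {
        None
    } else {
        Some((new_a, new_b))
    }
}

/// An assertion is redundant when narrowing leaves the range unchanged.
pub fn is_redundant(original: IntRange, narrowed: IntRange) -> bool {
    original == narrowed
}
```

The redundancy check falls out of the same computation: narrowing `int<30, 100>` by `> 10` leaves the range untouched, so no assertion needs to be recorded.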
…thy-string` Signed-off-by: azjezz <[email protected]>
…ters Signed-off-by: azjezz <[email protected]>
Signed-off-by: azjezz <[email protected]>
This commit completely removes the dataflow analysis graph and all related logic from the codebase.

The dataflow graph was originally introduced to track the flow of variables through the program, primarily for two main features: detecting unused variables and unused parameters. However, this component is being removed for several reasons:
- High Complexity for Limited Use: The dataflow graph is a large and complex component, and its maintenance overhead is not justified for its limited use cases.
- Redundancy: Unused parameter detection is already handled more efficiently by the linter. Unused variable detection can also be implemented with a much simpler analysis, without requiring a full dataflow graph.
- Divergence from Project Goals: The dataflow engine was inspired by tools like Psalm and Hakana, which use it for taint and security analysis. Since we do not plan to support security analysis, this component is out of scope for our project's goals.

By removing the dataflow component, we achieve:
- Massive Code Simplification: A significant amount of complex code is deleted from both `codex` (the graph implementation) and `analyzer` (the logic to add nodes and paths).
- Reduced State: The `TUnion` type is simplified by the removal of `parent_nodes`, which tracked dataflow origins.
- Improved Focus: The analyzer is now more focused on its core responsibility of type checking and inference.

If security analysis is reconsidered in the future, it can be implemented in a dedicated crate as a separate stage that runs after the main analysis, rather than being tightly integrated into the type checker.

Signed-off-by: azjezz <[email protected]>
This commit enhances the precision of string type analysis by introducing first-class support for `lowercase-string` types and adding specialized handlers for the `Psl\Str` library.

The `lowercase-string` and `non-empty-lowercase-string` types are now fully supported throughout the type system. Previously, `lowercase-string` was only treated as an alias for `string`.
- The internal `TString` atomic type now tracks an `is_lowercase` property.
- This property is correctly inferred from literal values and propagated through string operations like concatenation.
- The type reconciler now understands and preserves the `lowercase-string` constraint.
- Stubs for native PHP functions (e.g., `strtolower`, `mb_strtolower`) have been updated to return this new, more precise type.

A new special function handler has been added for the `Psl\Str` component from the `azjezz/psl` library. This provides much more accurate return types for its functions by leveraging the new string properties. For example:
- `Psl\Str\slice()` now correctly returns `lowercase-string` if the input is a `lowercase-string`.
- Functions like `Psl\Str\lowercase()`, `Psl\Str\trim()`, and `Psl\Str\after()` now preserve the `lowercase` and `non-empty` properties of the input string in their return types.

These changes enable the analyzer to catch more subtle bugs related to string casing and formatting, leading to more powerful and precise type inference.

Signed-off-by: azjezz <[email protected]>
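To make the propagation rule concrete, here is a hedged sketch of a string atomic that carries an `is_lowercase` flag. Only the names `TString` and `is_lowercase` appear in the commit message; the other fields, the methods, and the "no ASCII uppercase characters" definition of lowercase are assumptions made for this example.

```rust
/// Simplified, hypothetical version of a string atomic type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TString {
    /// Known literal value, if any (assumed field).
    pub literal: Option<String>,
    /// Whether the string is known to be non-empty (assumed field).
    pub is_non_empty: bool,
    /// Whether the string is known to contain no uppercase characters.
    pub is_lowercase: bool,
}

impl TString {
    /// A plain `string` about which nothing is known.
    pub fn general() -> Self {
        Self { literal: None, is_non_empty: false, is_lowercase: false }
    }

    /// Infers both properties from a literal value.
    /// Assumption: "lowercase" means no ASCII uppercase letters.
    pub fn from_literal(value: &str) -> Self {
        Self {
            literal: Some(value.to_string()),
            is_non_empty: !value.is_empty(),
            is_lowercase: !value.chars().any(|c| c.is_ascii_uppercase()),
        }
    }

    /// Concatenation: the result is lowercase only if both sides are,
    /// and non-empty if either side is.
    pub fn concat(&self, other: &TString) -> TString {
        let literal = match (&self.literal, &other.literal) {
            (Some(a), Some(b)) => Some(format!("{}{}", a, b)),
            _ => None,
        };
        TString {
            literal,
            is_non_empty: self.is_non_empty || other.is_non_empty,
            is_lowercase: self.is_lowercase && other.is_lowercase,
        }
    }

    /// Renders the type name implied by the tracked properties.
    pub fn describe(&self) -> &'static str {
        match (self.is_non_empty, self.is_lowercase) {
            (true, true) => "non-empty-lowercase-string",
            (false, true) => "lowercase-string",
            (true, false) => "non-empty-string",
            (false, false) => "string",
        }
    }
}
```

Under these assumptions, concatenating two lowercase literals keeps the lowercase property, while mixing in any string of unknown casing drops it.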
This commit introduces a new standalone crate, `mago-collector`, to centralize issue collection and suppression logic.
Previously, this functionality was part of the linter, but it has been extracted and enhanced to be used by both the linter and the analyzer, ensuring consistent behavior across the toolchain.
The most significant change is the introduction of "categories" for pragmas. Suppression comments like `@mago-ignore` and `@mago-expect` must now specify which tool they target.
- New crate `mago-collector`: A new crate has been created to handle all diagnostic collection. It is responsible for parsing pragmas, associating them with their correct AST scope, and managing the collection of issues.
- Pragma Categories: Pragmas now require a category prefix (e.g., `lint:`, `analysis:`). This allows a single comment to suppress issues from different tools without conflict.
- Migration Note: Users should update their existing pragmas to include the `lint:` prefix (e.g., `@mago-ignore security/rule` becomes `@mago-ignore lint:security/rule`) to ensure they continue to work with the linter.
- Linter & Analyzer Integration: Both `mago-linter` and `mago-analyzer` have been refactored to use the new `mago-collector`, removing their internal issue-handling logic.
- New Rule `comment/no-uncategorized-pragma`: A new linter rule has been added to detect and suggest fixes for old pragmas that are missing a category, helping users migrate to the new format.
Signed-off-by: azjezz <[email protected]>
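The category-prefix format described in this commit could be parsed along the following lines. This is a sketch only: `Pragma`, `PragmaKind`, `parse_pragma`, and `with_category` are hypothetical names and do not reflect the actual `mago-collector` API; the pragma strings themselves are taken from the migration note.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum PragmaKind {
    Ignore,
    Expect,
}

/// Hypothetical parsed pragma borrowed from the comment text.
#[derive(Debug, PartialEq, Eq)]
pub struct Pragma<'a> {
    pub kind: PragmaKind,
    /// `Some("lint")` or `Some("analysis")`; `None` for an old-style pragma.
    pub category: Option<&'a str>,
    pub code: &'a str,
}

/// Extracts a pragma from a comment such as `// @mago-ignore lint:security/rule`.
pub fn parse_pragma(comment: &str) -> Option<Pragma<'_>> {
    const IGNORE: &str = "@mago-ignore";
    const EXPECT: &str = "@mago-expect";

    let (kind, rest) = if let Some(idx) = comment.find(IGNORE) {
        (PragmaKind::Ignore, &comment[idx + IGNORE.len()..])
    } else if let Some(idx) = comment.find(EXPECT) {
        (PragmaKind::Expect, &comment[idx + EXPECT.len()..])
    } else {
        return None;
    };

    // The first whitespace-separated token after the tag is the target.
    let target = rest.split_whitespace().next()?;
    match target.split_once(':') {
        Some((category, code)) if !category.is_empty() && !code.is_empty() => {
            Some(Pragma { kind, category: Some(category), code })
        }
        // No category prefix: an old-style pragma that needs migrating.
        _ => Some(Pragma { kind, category: None, code: target }),
    }
}

/// Renders the pragma with a category, defaulting to `default_category`
/// when none was given (the kind of fix a migration rule might suggest).
pub fn with_category(pragma: &Pragma<'_>, default_category: &str) -> String {
    let tag = match pragma.kind {
        PragmaKind::Ignore => "@mago-ignore",
        PragmaKind::Expect => "@mago-expect",
    };
    let category = pragma.category.unwrap_or(default_category);
    format!("{} {}:{}", tag, category, pragma.code)
}
```

Returning `category: None` instead of failing lets a caller distinguish "not a pragma" from "pragma that needs a category", which is the case a rule like `comment/no-uncategorized-pragma` would report.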
…plate inference

This commit significantly enhances the template type system by introducing the `@where` tag for advanced constraints and improving template type inference from function arguments.

Support for the Psalm-specific `@psalm-if-this-is` annotation has been removed. This tag is non-standard, and its functionality can now be achieved in a more explicit and powerful way using the new `@where` tag.

Introduces support for the `@where` docblock tag. This tag allows a method to assert a constraint on one of the **class's template parameters**. The method is only considered valid and callable if the template argument for that specific class instance meets this constraint.

This is extremely powerful for adding methods to generic classes that should only be available for a subset of the possible template types. Inside the method's body, the template parameter is treated as being narrowed to the constrained type.

**Syntax**: `@where T is <type>`, where `T` is a template parameter on the class.

**Example**:

```php
/**
 * @template T
 */
class Box
{
    /** @param T $value */
    public function __construct(public mixed $value) {}

    /**
     * This method is only available if T is a stringable type.
     *
     * @where T is string|Stringable
     */
    public function print(): void
    {
        // Inside here, we know $this->value is string|Stringable.
        echo (string) $this->value;
    }
}

(new Box("hello"))->print(); // OK
(new Box(new StringableObject()))->print(); // OK
(new Box(['an', 'array']))->print(); // ERROR: array does not satisfy string|Stringable
```

The type inference engine is now more effective at deducing template types from the arguments passed to a function or method. This leads to more accurate type resolution for generic code without requiring explicit `@var` annotations in many cases.

Note: Support for `@this-out` annotations has not been added in this commit but is planned for a subsequent change.

Signed-off-by: azjezz <[email protected]>
azjezz added a commit that referenced this pull request on Jul 30, 2025
Labels
Experimental
This feature or issue is experimental and may change or be removed in future versions.
Priority: Critical
This should be dealt with ASAP. Not fixing would be a serious error.
Status: In Progress
This issue is being worked on and has someone assigned.
Subject: Analyzer
An issue or PR related to the static analyzer/type checker.
Subject: Dependencies
Pull requests that update a dependency file.
Subject: Linter
An issue or PR related to the linter.
Subject: Parser
An issue or PR related to the parser, lexer, or ast.
Type: BC Break
A change that introduces backward compatibility breaks in the public API.
Type: Enhancement
Request for additions or changes that improve existing functionality.
This pull request introduces the foundational infrastructure for the new mago analyzer, along with its core dependencies codex and algebra.
This feature is being merged in an experimental state. The analyzer is functional but incomplete. The full development roadmap, including stabilization tasks, future features, and bug verification, is tracked in issue #[NEW_ISSUE_NUMBER].
Key Changes:
- Added crates: `mago-analyzer`, `mago-codex`, `mago-algebra`.
- Removed crates: `mago-project`, `mago-reflection`, `mago-typing`, `mago-trinary`.
- Added debugging helpers: `Mago\inspect()` and `Mago\confirm()`.