@azjezz azjezz commented Jul 10, 2025

This pull request introduces the foundational infrastructure for the new mago analyzer, along with its core dependencies codex and algebra.

This feature is being merged in an experimental state. The analyzer is functional but incomplete. The full development roadmap, including stabilization tasks, future features, and bug verification, is tracked in issue #[NEW_ISSUE_NUMBER].

Key Changes:

  • New Crates: mago-analyzer, mago-codex, mago-algebra.
  • Removed Crates: mago-project, mago-reflection, mago-typing, mago-trinary.
  • New Debugging Functions: Mago\inspect() and Mago\confirm().

@azjezz azjezz force-pushed the analyzer-2 branch 3 times, most recently from 0ec0171 to b3efd9f Compare July 10, 2025 07:41
@azjezz azjezz marked this pull request as ready for review July 10, 2025 11:53
@azjezz azjezz requested a review from Copilot July 10, 2025 11:53

This comment was marked as outdated.

@azjezz azjezz self-assigned this Jul 11, 2025
@azjezz azjezz added labels Jul 11, 2025:

  • Experimental (This feature or issue is experimental and may change or be removed in future versions.)
  • Priority: Critical (This should be dealt with ASAP. Not fixing would be a serious error.)
  • Status: In Progress (This issue is being worked on and has someone assigned.)
  • Subject: Dependencies (Pull requests that update a dependency file.)
  • Type: BC Break (A change that introduces backward compatibility breaks in the public API.)
  • Type: Enhancement (Request for additions or changes that improve existing functionality.)
  • Subject: Linter (An issue or PR related to the linter.)
  • Subject: Parser (An issue or PR related to the parser, lexer, or ast.)
  • Subject: Analyzer (An issue or PR related to the static analyzer/type checker.)
@azjezz azjezz marked this pull request as draft July 11, 2025 00:26
@azjezz azjezz requested a review from Copilot July 11, 2025 01:21
Copilot AI left a comment

Pull Request Overview

Introduces the new static analysis infrastructure by adding three core crates and removing legacy dependencies.

  • Adds mago-analyzer, mago-codex, and mago-algebra crates with core types, clauses, and analysis logic
  • Implements debugging helpers Mago\inspect() and Mago\confirm()
  • Removes legacy crates (mago-project, mago-reflection, etc.) replaced by the new cores

Reviewed Changes

Copilot reviewed 63 out of 613 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `crates/analyzer/src/expression/yield.rs` | Implements generator yield analysis with key/value type checks |
| `crates/analyzer/src/expression/variable.rs` | Reads and reports on variable usages, including global variables |
| `crates/analyzer/src/expression/unary.rs` | Handles unary operators and type casts with detailed dataflow logic |
| `crates/analyzer/src/expression/throw.rs` | Ensures thrown expressions implement Throwable |
| `crates/analyzer/src/expression/mod.rs` | Dispatches expression types and integrates clause-based checks |
| `crates/analyzer/src/expression/match.rs` | Analyzes match expressions with exhaustiveness and reachability QA |
| `crates/analyzer/src/expression/magic_constant.rs` | Sets types for PHP magic constants |
| `crates/analyzer/src/expression/literal.rs` | Infers literal types (int, float, string, bool, null) |

@azjezz azjezz force-pushed the analyzer-2 branch 3 times, most recently from 2be7c5c to d0493ae Compare July 11, 2025 08:47
@azjezz azjezz force-pushed the analyzer-2 branch 3 times, most recently from d82eb37 to 2a90e20 Compare July 27, 2025 22:44
azjezz added 5 commits July 28, 2025 00:16
…urns

Refactors `FunctionLikeMetadata` to distinguish between native type declarations and docblock return types.
- `return_type_metadata` is now `return_type_declaration_metadata`.
- The new `return_type_metadata` holds the docblock type if present, otherwise it mirrors the declaration.

This enables a fix in the analyzer to correctly allow empty `return;` statements from functions that are untyped or return `mixed`/`null` via docblock.
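
The fallback rule above can be sketched as follows. This is a minimal illustration, not the actual `FunctionLikeMetadata` definition: the `ReturnTypes` struct, its constructor, and `allows_empty_return` are hypothetical, and types are modelled as plain strings. The `void` case is included because PHP already permits `return;` there.

```rust
/// Hypothetical, simplified stand-in for the two return-type fields.
/// Types are plain strings here; the real metadata is richer.
#[derive(Debug, Clone, PartialEq)]
struct ReturnTypes {
    /// The native declaration, e.g. the `int` in `function f(): int`.
    return_type_declaration_metadata: Option<String>,
    /// The docblock type if present, otherwise a copy of the declaration.
    return_type_metadata: Option<String>,
}

impl ReturnTypes {
    fn new(declaration: Option<&str>, docblock: Option<&str>) -> Self {
        let declaration = declaration.map(str::to_owned);
        // Prefer the docblock type; fall back to mirroring the declaration.
        let effective = docblock.map(str::to_owned).or_else(|| declaration.clone());
        ReturnTypes {
            return_type_declaration_metadata: declaration,
            return_type_metadata: effective,
        }
    }

    /// An empty `return;` is accepted when the function is untyped, or when
    /// the effective type is `void`, `mixed`, or `null` (assumed rule).
    fn allows_empty_return(&self) -> bool {
        match self.return_type_metadata.as_deref() {
            None => true,
            Some(ty) => ty == "void" || ty == "mixed" || ty == "null",
        }
    }
}
```

With this shape, a function declared `: array` and documented as `list<int>` keeps both pieces of information, and the empty-return check only consults the effective type.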

Also removes several now-unused getters/setters from `FunctionLikeMetadata` in favor of direct field access.

Signed-off-by: azjezz <[email protected]>
Fixes an issue where method calls on generic parameter types were failing.

Previously, the method resolver would not correctly traverse the generic parameter's constraints. It now recursively unpacks generic constraints, allowing it to correctly find methods on the underlying union types.
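
A rough sketch of the recursive unpacking, under the assumption of a heavily reduced type model: `Ty` and `method_receivers` are hypothetical names for illustration and do not reflect the resolver's real types.

```rust
/// Hypothetical, simplified type model; not the analyzer's real representation.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    /// A concrete class or interface name.
    Named(String),
    /// A union such as `A|B`.
    Union(Vec<Ty>),
    /// A template parameter together with its constraint (`@template T of ...`).
    GenericParam { name: String, constraint: Box<Ty> },
}

/// Collects the concrete class names a method call could be dispatched on,
/// recursively unwrapping unions and generic parameter constraints.
fn method_receivers(ty: &Ty, out: &mut Vec<String>) {
    match ty {
        Ty::Named(name) => {
            // Avoid listing the same class twice.
            if !out.contains(name) {
                out.push(name.clone());
            }
        }
        Ty::Union(members) => {
            for member in members {
                method_receivers(member, out);
            }
        }
        // A generic parameter contributes whatever its constraint contributes,
        // even when that constraint is itself a union or another parameter.
        Ty::GenericParam { constraint, .. } => method_receivers(constraint, out),
    }
}
```

For a parameter `T of A|U` where `U of B`, the sketch yields the receivers `A` and `B`, which is the set of classes a method lookup would then need to inspect.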

Tests have been added to prevent regressions.

Signed-off-by: azjezz <[email protected]>
azjezz added 19 commits July 28, 2025 00:16
This refactoring overhauls how the analyzer scrapes assertions from integer comparisons (`>`, `<`, `>=`, `<=`).

The previous implementation had several limitations:

- It would only generate assertions when comparing a variable against a literal integer.
- When comparing two variables (`$a > $b`), even if one had a known integer range type, no assertions were created.
- It only ever created an assertion for one of the two variables in the comparison.

The new implementation is significantly smarter:

- Two-Way Analysis: When comparing two variables with known integer ranges (e.g., `$a` is `int<20, 60>` and `$b` is `int<50, 100>`), it now correctly creates assertions for both variables based on the comparison. In `if ($a > $b)`, `$a` must exceed the smallest possible `$b` and `$b` must stay below the largest possible `$a`, so it will assert that `$a` is `int<51, 60>` and `$b` is `int<50, 59>`.
- Range-Based Assertions: It can now derive assertions from all forms of integer ranges (`int<X, Y>`, `int<X, max>`, `int<min, Y>`), not just literals.
- Redundancy-Free: Assertions are now checked for redundancy. If a variable `$a` is already `int<30, 100>`, an assertion like `$a > 10` is correctly identified as redundant and skipped.

This leads to much more precise type inference within conditional blocks, catching a wider range of potential bugs and improving the overall accuracy of the static analysis.
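
The two-way narrowing and the redundancy check can be sketched with plain interval arithmetic. This is an illustrative model only: `IntRange`, `narrow_greater_than`, and `is_redundant_greater_than` are hypothetical names, and `None` stands for an open `min` or `max` bound.

```rust
/// Hypothetical model of `int<min, max>`; `None` means the bound is open.
#[derive(Debug, Clone, Copy, PartialEq)]
struct IntRange {
    min: Option<i64>,
    max: Option<i64>,
}

impl IntRange {
    fn new(min: Option<i64>, max: Option<i64>) -> Self {
        IntRange { min, max }
    }

    /// A range with both bounds known and `min > max` contains no integer.
    fn is_empty(&self) -> bool {
        matches!((self.min, self.max), (Some(lo), Some(hi)) if lo > hi)
    }
}

/// Larger of two optional lower bounds.
fn tighter_min(a: Option<i64>, b: Option<i64>) -> Option<i64> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x.max(y)),
        (x, None) => x,
        (None, y) => y,
    }
}

/// Smaller of two optional upper bounds.
fn tighter_max(a: Option<i64>, b: Option<i64>) -> Option<i64> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x.min(y)),
        (x, None) => x,
        (None, y) => y,
    }
}

/// Narrows both operands under the assumption `a > b`.
/// Returns `None` when the comparison can never be true.
fn narrow_greater_than(a: IntRange, b: IntRange) -> Option<(IntRange, IntRange)> {
    // `a > b` implies `a >= b.min + 1`; overflow means no such `a` exists.
    let a_floor = match b.min {
        Some(m) => Some(m.checked_add(1)?),
        None => None,
    };
    // `a > b` implies `b <= a.max - 1`; overflow means no such `b` exists.
    let b_ceil = match a.max {
        Some(m) => Some(m.checked_sub(1)?),
        None => None,
    };
    let narrowed_a = IntRange::new(tighter_min(a.min, a_floor), a.max);
    let narrowed_b = IntRange::new(b.min, tighter_max(b.max, b_ceil));
    if narrowed_a.is_empty() || narrowed_b.is_empty() {
        None
    } else {
        Some((narrowed_a, narrowed_b))
    }
}

/// `value > literal` adds no information when the whole range already exceeds it.
fn is_redundant_greater_than(range: IntRange, literal: i64) -> bool {
    matches!(range.min, Some(lo) if lo > literal)
}
```

In this sketch, narrowing `int<20, 60> > int<50, 100>` produces `int<51, 60>` for the left operand and `int<50, 59>` for the right one, while `int<30, 100> > 10` is reported as redundant. The other comparison operators follow by swapping operands or dropping the `+ 1` / `- 1` adjustment.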

Signed-off-by: azjezz <[email protected]>
This commit completely removes the dataflow analysis graph and all related logic from the codebase.

The dataflow graph was originally introduced to track the flow of variables through the program, primarily for two main features: detecting unused variables and unused parameters.

However, this component is being removed for several reasons:

- High Complexity for Limited Use: The dataflow graph is a large and complex component, and its maintenance overhead is not justified for its limited use cases.
- Redundancy: Unused parameter detection is already handled more efficiently by the linter. Unused variable detection can also be implemented with a much simpler analysis, without requiring a full dataflow graph.
- Divergence from Project Goals: The dataflow engine was inspired by tools like Psalm and Hakana, which use it for taint and security analysis. Since we do not plan to support security analysis, this component is out of scope for our project's goals.

By removing the dataflow component, we achieve:

- Massive Code Simplification: A significant amount of complex code is deleted from both `codex` (the graph implementation) and `analyzer` (the logic to add nodes and paths).
- Reduced State: The `TUnion` type is simplified by the removal of `parent_nodes`, which tracked dataflow origins.
- Improved Focus: The analyzer is now more focused on its core responsibility of type checking and inference.

If security analysis is reconsidered in the future, it can be implemented in a dedicated crate as a separate stage that runs after the main analysis, rather than being tightly integrated into the type checker.

Signed-off-by: azjezz <[email protected]>
This commit enhances the precision of string type analysis by introducing first-class support for `lowercase-string` types and adding specialized handlers for the `Psl\Str` library.

The `lowercase-string` and `non-empty-lowercase-string` types are now fully supported throughout the type system. Previously, `lowercase-string` was only treated as an alias for `string`.

- The internal `TString` atomic type now tracks an `is_lowercase` property.
- This property is correctly inferred from literal values and propagated through string operations like concatenation.
- The type reconciler now understands and preserves the `lowercase-string` constraint.
- Stubs for native PHP functions (e.g., `strtolower`, `mb_strtolower`) have been updated to return this new, more precise type.
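
How such a property might be inferred and propagated can be sketched as below. `StrProps` and its methods are hypothetical and far simpler than the real `TString`; treating casing as ASCII-only is an assumption of this sketch, not a statement about the analyzer.

```rust
/// Hypothetical, reduced model of a string type with two tracked properties.
#[derive(Debug, Clone, Copy, PartialEq)]
struct StrProps {
    is_lowercase: bool,
    is_non_empty: bool,
}

impl StrProps {
    /// Infers the properties of a literal value.
    /// Assumption: only ASCII uppercase letters break the lowercase property.
    fn from_literal(value: &str) -> Self {
        StrProps {
            is_lowercase: !value.bytes().any(|b| b.is_ascii_uppercase()),
            is_non_empty: !value.is_empty(),
        }
    }

    /// Concatenation is lowercase only if both operands are,
    /// and non-empty if either operand is.
    fn concat(self, other: StrProps) -> StrProps {
        StrProps {
            is_lowercase: self.is_lowercase && other.is_lowercase,
            is_non_empty: self.is_non_empty || other.is_non_empty,
        }
    }

    /// Result of a lowercasing call such as `strtolower`:
    /// always lowercase, emptiness unchanged.
    fn lowercased(self) -> StrProps {
        StrProps { is_lowercase: true, ..self }
    }

    /// The docblock type name that corresponds to the tracked properties.
    fn type_name(self) -> &'static str {
        match (self.is_non_empty, self.is_lowercase) {
            (true, true) => "non-empty-lowercase-string",
            (false, true) => "lowercase-string",
            (true, false) => "non-empty-string",
            (false, false) => "string",
        }
    }
}
```

Under these rules, `"foo" . "Bar"` degrades to `non-empty-string`, while passing that result through a lowercasing function restores `non-empty-lowercase-string`.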

A new special function handler has been added for the `Psl\Str` component from the `azjezz/psl` library. This provides much more accurate return types for its functions by leveraging the new string properties.

For example:
- `Psl\Str\slice()` now correctly returns `lowercase-string` if the input is a `lowercase-string`.
- Functions like `Psl\Str\lowercase()`, `Psl\Str\trim()`, and `Psl\Str\after()` now preserve the `lowercase` and `non-empty` properties of the input string in their return types.

These changes enable the analyzer to catch more subtle bugs related to string casing and formatting, leading to more powerful and precise type inference.

Signed-off-by: azjezz <[email protected]>
@azjezz azjezz marked this pull request as ready for review July 28, 2025 06:33
This commit introduces a new standalone crate, `mago-collector`, to centralize issue collection and suppression logic.
Previously, this functionality was part of the linter, but it has been extracted and enhanced to be used by both the linter and the analyzer, ensuring consistent behavior across the toolchain.

The most significant change is the introduction of "categories" for pragmas. Suppression comments like `@mago-ignore` and `@mago-expect` must now specify which tool they target.

- New crate `mago-collector`: A new crate has been created to handle all diagnostic collection. It is responsible for parsing pragmas, associating them with their correct AST scope, and managing the collection of issues.
    - Pragma Categories: Pragmas now require a category prefix (e.g., lint:, analysis:). This allows a single comment to suppress issues from different tools without conflict.
    - Migration Note: Users should update their existing pragmas to include the `lint:` prefix (e.g., `@mago-ignore security/rule` becomes `@mago-ignore lint:security/rule`) to ensure they continue to work with the linter.
- Linter & Analyzer Integration: Both `mago-linter` and `mago-analyzer` have been refactored to use the new `mago-collector`, removing their internal issue-handling logic.
- New Rule `comment/no-uncategorized-pragma`: A new linter rule has been added to detect and suggest fixes for old pragmas that are missing a category, helping users migrate to the new format.
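
To make the category format concrete, here is a small parsing sketch. It is not the `mago-collector` implementation: `Pragma`, `parse_pragma`, and `migrate_target` are hypothetical names, and the parser only handles the single-target form shown in the migration note.

```rust
/// Hypothetical parsed form of a suppression pragma; names are illustrative only.
#[derive(Debug, PartialEq)]
struct Pragma<'a> {
    /// `ignore` or `expect`.
    kind: &'a str,
    /// Tool category such as `lint` or `analysis`; `None` for the old format.
    category: Option<&'a str>,
    /// Issue code, e.g. `security/rule`.
    code: &'a str,
}

/// Extracts a `@mago-ignore` / `@mago-expect` pragma from a comment, if any.
fn parse_pragma(comment: &str) -> Option<Pragma<'_>> {
    let start = comment.find("@mago-")?;
    let rest = &comment[start + "@mago-".len()..];
    let mut words = rest.split_whitespace();

    let kind = words.next()?;
    if kind != "ignore" && kind != "expect" {
        return None;
    }

    // The target is either `category:code` (new format) or just `code` (old format).
    let target = words.next()?;
    let (category, code) = match target.split_once(':') {
        Some((category, code)) => (Some(category), code),
        None => (None, target),
    };
    Some(Pragma { kind, category, code })
}

/// Adds the `lint:` category to an uncategorized target, as the migration note suggests.
fn migrate_target(target: &str) -> String {
    if target.contains(':') {
        target.to_owned()
    } else {
        format!("lint:{}", target)
    }
}
```

A pragma whose `category` comes back as `None` is the case the new `comment/no-uncategorized-pragma` rule is meant to flag; `migrate_target("security/rule")` shows the rewritten form `lint:security/rule`.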

Signed-off-by: azjezz <[email protected]>
…plate inference

This commit significantly enhances the template type system by introducing the `@where` tag for advanced constraints and improving template type inference from function arguments.

Support for the Psalm-specific `@psalm-if-this-is` annotation has been removed. This tag is non-standard, and its functionality can now be achieved in a more explicit and powerful way using the new `@where` tag.

Introduces support for the `@where` docblock tag. This tag allows a method to assert a constraint on one of the **class's template parameters**. The method is only considered valid and callable if the template argument for that specific class instance meets this constraint.

This is extremely powerful for adding methods to generic classes that should only be available for a subset of the possible template types. Inside the method's body, the template parameter is treated as being narrowed to the constrained type.

**Syntax**: `@where T is <type>`, where `T` is a template parameter on the class.

**Example**:

```php
/**
 * @template T
 */
class Box {
    /** @param T $value */
    public function __construct(public mixed $value) {}

    /**
     * This method is only available if T is a stringable type.
     *
     * @where T is string|Stringable
     */
    public function print(): void {
        // Inside here, we know $this->value is string|Stringable.

        echo (string) $this->value;
    }
}

(new Box("hello"))->print(); // OK
(new Box(new StringableObject()))->print(); // OK (StringableObject is assumed to implement Stringable)
(new Box(['an', 'array']))->print(); // ERROR: array does not satisfy string|Stringable
```
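
The availability check behind `@where` can be thought of as a subtype test on the class's template argument. The Rust sketch below is only a model of that idea: `Atomic`, `atomic_satisfies`, and `where_holds` are hypothetical, and the `implements` set stands in for the class-hierarchy knowledge the analyzer would actually consult.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified atomic types for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Atomic {
    String,
    Array,
    Object(String),
}

/// Checks one atomic argument type against one atomic constraint type.
/// `implements` holds (class, interface) pairs known to the sketch.
fn atomic_satisfies(
    arg: &Atomic,
    constraint: &Atomic,
    implements: &HashSet<(String, String)>,
) -> bool {
    match (arg, constraint) {
        (Atomic::String, Atomic::String) | (Atomic::Array, Atomic::Array) => true,
        (Atomic::Object(class), Atomic::Object(target)) => {
            class == target || implements.contains(&(class.clone(), target.clone()))
        }
        _ => false,
    }
}

/// `@where T is <constraint>` holds when every member of the template
/// argument's union satisfies at least one member of the constraint union.
fn where_holds(
    template_arg: &[Atomic],
    constraint: &[Atomic],
    implements: &HashSet<(String, String)>,
) -> bool {
    template_arg
        .iter()
        .all(|arg| constraint.iter().any(|c| atomic_satisfies(arg, c, implements)))
}
```

Applied to the `Box` example, a `string` argument and an object implementing `Stringable` both pass the `string|Stringable` constraint, while an `array` argument fails it, which matches the three calls shown above.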

The type inference engine is now more effective at deducing template types from the arguments passed to a function or method.
This leads to more accurate type resolution for generic code without requiring explicit `@var` annotations in many cases.

Note: Support for `@this-out` annotations has not been added in this commit but is planned for a subsequent change.

Signed-off-by: azjezz <[email protected]>
@azjezz azjezz mentioned this pull request Jul 30, 2025
15 tasks
@azjezz azjezz changed the title feat: introduce codex, algebra, and analyzer feat(analyzer)!: Introduce experimental static analyzer Jul 30, 2025
@azjezz azjezz changed the title feat(analyzer)!: Introduce experimental static analyzer feat(analyzer): introduce experimental static analyzer Jul 30, 2025
@azjezz azjezz merged commit b2f2c24 into main Jul 30, 2025
29 checks passed
azjezz added a commit that referenced this pull request Jul 30, 2025
azjezz added a commit that referenced this pull request Jul 30, 2025
feat(analyzer): introduce experimental static analyzer
2 participants