refactor: reorganize expression parser with proper precedence hierarchy #254
Merged
LunaStev merged 1 commit into wavefnd:master on Dec 24, 2025
Restructure expression parsing to follow standard C operator precedence
with dedicated functions for each precedence level.
Changes:
- Implement complete operator precedence hierarchy:
  1. Assignment (=, +=, -=, *=, /=)
  2. Logical OR (||)
  3. Logical AND (&&)
  4. Bitwise OR (|)
  5. Bitwise XOR (^)
  6. Bitwise AND (&)
  7. Equality (==, !=)
  8. Relational (<, <=, >, >=)
  9. Shift (<<, >>)
  10. Additive (+, -)
  11. Multiplicative (*, /, %)
  12. Unary (!, ~, &, deref)
  13. Primary (literals, identifiers, function calls)
- Add binary literal support in lexer:
  - Parse 0b prefix for binary numbers (0b1010, 0b0101)
  - Convert binary strings to i64 using from_str_radix
  - Format lexeme as "0b{binary_digits}"
- Move shift operators (<<, >>) to character-level parsing:
  - Parse << and >> directly in lexer char matching
  - Remove from identifier-based keyword matching
  - Check for << before <= in '<' handler
  - Check for >> before >= in '>' handler
- Refactor unary expression parsing:
  - Move all prefix operators to parse_unary_expression
  - Handle !, ~, &, deref in single dedicated function
  - Parse unary operators recursively (e.g., !!x, ~!x)
- Fix logical operator code generation:
  - Add to_bool() helper for boolean coercion
  - Convert integer values to i1 before logical AND/OR
  - Handle i1 types without unnecessary conversions
- Improve bitwise operator codegen:
  - Add missing BitwiseAnd, BitwiseOr, BitwiseXor implementations
  - Generate and, or, xor LLVM instructions
  - Properly handle operator in binary expression match
- Fix shift operation type casting:
  - Cast shift amount to match shifted value type
  - Prevent type mismatch errors in build_left_shift/build_right_shift
  - Use build_int_cast for explicit type conversion
- Enhance unary NOT operators:
  - LogicalNot (!): Compare with zero for multi-bit integers
  - BitwiseNot (~): Use LLVM's build_not instruction
  - Handle i1 boolean types specially in logical NOT
- Add Operator::Not variant to AST for consistency
- Add Expression::Grouped for parenthesized expressions
- Simplify parser function signatures with std::iter::Peekable
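The lexer items in the list above (binary literals and character-level shift tokens) can be sketched roughly as follows. This is a minimal illustration, not the PR's lexer: the Token enum and the lex function are hypothetical names, and only numbers and the '<' / '>' operator family are handled.

```rust
// Hypothetical token type for this sketch; the real lexer's token names may differ.
#[derive(Debug, PartialEq)]
enum Token {
    Number { value: i64, lexeme: String },
    Shl,       // <<
    Shr,       // >>
    LessEq,    // <=
    GreaterEq, // >=
    Less,      // <
    Greater,   // >
}

// Tokenizes integers and the `<` / `>` family; returns None on anything else.
fn lex(src: &str) -> Option<Vec<Token>> {
    let mut chars = src.chars().peekable();
    let mut out = Vec::new();
    while let Some(c) = chars.next() {
        match c {
            ' ' | '\t' | '\n' => {}
            '<' => {
                // Try `<<` first, then `<=`, then fall back to plain `<`.
                if chars.next_if_eq(&'<').is_some() {
                    out.push(Token::Shl);
                } else if chars.next_if_eq(&'=').is_some() {
                    out.push(Token::LessEq);
                } else {
                    out.push(Token::Less);
                }
            }
            '>' => {
                // Same order for `>>`, `>=`, `>`.
                if chars.next_if_eq(&'>').is_some() {
                    out.push(Token::Shr);
                } else if chars.next_if_eq(&'=').is_some() {
                    out.push(Token::GreaterEq);
                } else {
                    out.push(Token::Greater);
                }
            }
            '0' if chars.peek() == Some(&'b') => {
                chars.next(); // consume the 'b' of the "0b" prefix
                let mut digits = String::new();
                while let Some(d) = chars.next_if(|d| *d == '0' || *d == '1') {
                    digits.push(d);
                }
                // An empty digit string ("0b" alone) is rejected here.
                let value = i64::from_str_radix(&digits, 2).ok()?;
                out.push(Token::Number { value, lexeme: format!("0b{}", digits) });
            }
            '0'..='9' => {
                let mut digits = String::from(c);
                while let Some(d) = chars.next_if(|d| d.is_ascii_digit()) {
                    digits.push(d);
                }
                let value = digits.parse::<i64>().ok()?;
                out.push(Token::Number { value, lexeme: digits });
            }
            _ => return None,
        }
    }
    Some(out)
}
```

In this sketch, lex("0b1010") yields a single Number token with value 10 and lexeme "0b1010", and lex("1 << 2") yields Number, Shl, Number rather than two separate '<' tokens.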
Benefits:
- Correct operator precedence matching C/C++ standards
- Clear separation of concerns in parsing logic
- Easier to maintain and extend with new operators
- Proper type handling in all binary/unary operations
Example precedence:
a + b * c // * before +
a << 2 + 1 // + before <<
a & b == c // == before &
!a || b && c // ! > && > ||
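The groupings in these examples can be checked with a small recursive-descent sketch that uses one function per precedence level over a std::iter::Peekable stream. It is an illustration under simplifying assumptions, not the PR's parser: tokens are whitespace-separated (so "!a" is written "! a"), the output is a fully parenthesized string instead of an AST, the assignment level and the & / deref prefix operators are left out, and Tokens, binary_level, and parse are hypothetical names.

```rust
use std::iter::Peekable;
use std::str::SplitWhitespace;

// Hypothetical token stream: whitespace-separated tokens, so no lexer is needed.
type Tokens<'a> = Peekable<SplitWhitespace<'a>>;

// Shared loop for one left-associative level: `next (op next)*`.
fn binary_level(t: &mut Tokens<'_>, ops: &[&str], next: fn(&mut Tokens<'_>) -> String) -> String {
    let mut lhs = next(t);
    while let Some(op) = t.next_if(|tok| ops.contains(tok)) {
        let rhs = next(t);
        lhs = format!("({} {} {})", lhs, op, rhs);
    }
    lhs
}

// One dedicated function per level, from lowest to highest precedence.
fn parse_logical_or(t: &mut Tokens<'_>) -> String { binary_level(t, &["||"], parse_logical_and) }
fn parse_logical_and(t: &mut Tokens<'_>) -> String { binary_level(t, &["&&"], parse_bitwise_or) }
fn parse_bitwise_or(t: &mut Tokens<'_>) -> String { binary_level(t, &["|"], parse_bitwise_xor) }
fn parse_bitwise_xor(t: &mut Tokens<'_>) -> String { binary_level(t, &["^"], parse_bitwise_and) }
fn parse_bitwise_and(t: &mut Tokens<'_>) -> String { binary_level(t, &["&"], parse_equality) }
fn parse_equality(t: &mut Tokens<'_>) -> String { binary_level(t, &["==", "!="], parse_relational) }
fn parse_relational(t: &mut Tokens<'_>) -> String { binary_level(t, &["<", "<=", ">", ">="], parse_shift) }
fn parse_shift(t: &mut Tokens<'_>) -> String { binary_level(t, &["<<", ">>"], parse_additive) }
fn parse_additive(t: &mut Tokens<'_>) -> String { binary_level(t, &["+", "-"], parse_multiplicative) }
fn parse_multiplicative(t: &mut Tokens<'_>) -> String { binary_level(t, &["*", "/", "%"], parse_unary) }

// Prefix operators recurse into themselves, so `! ! x` nests naturally.
fn parse_unary(t: &mut Tokens<'_>) -> String {
    if let Some(op) = t.next_if(|tok| ["!", "~"].contains(tok)) {
        return format!("({}{})", op, parse_unary(t));
    }
    parse_primary(t)
}

// Identifiers, literals, and parenthesized sub-expressions.
fn parse_primary(t: &mut Tokens<'_>) -> String {
    match t.next() {
        Some("(") => {
            let inner = parse_logical_or(t);
            t.next(); // consume ")"; error handling is omitted in this sketch
            inner
        }
        Some(tok) => tok.to_string(),
        None => String::from("<missing>"),
    }
}

// Entry point: parse a whole expression and return its explicit grouping.
fn parse(src: &str) -> String {
    let mut tokens: Tokens<'_> = src.split_whitespace().peekable();
    parse_logical_or(&mut tokens)
}
```

With this sketch, parse("a << 2 + 1") returns "(a << (2 + 1))", parse("a & b == c") returns "(a & (b == c))", and parse("! a || b && c") returns "((!a) || (b && c))". Each level only calls the next-higher level for its operands, which is what makes tighter-binding operators group first.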
Signed-off-by: LunaStev <luna@lunastev.org>
This PR significantly refactors the expression parsing logic to implement a standard C-style operator precedence hierarchy. It replaces the previous parsing structure with a dedicated recursive descent approach, ensuring that complex expressions are evaluated in the correct order. Additionally, it introduces support for binary literals and fixes several type-related issues in the LLVM code generation for logical and bitwise operations.
Key Changes
1. Expression Parsing & Precedence
- Moved all prefix operators (!, ~, &, deref) into a dedicated recursive function, allowing for nested unary expressions (e.g., !!x).
- Added Expression::Grouped to properly handle and preserve parenthesized sub-expressions.
2. Lexer Improvements
- Added binary literal support with the 0b prefix (e.g., 0b1010), using from_str_radix for accurate conversion to i64.
- Moved << and >> from keyword-based matching to character-level matching in the lexer. Fixed the collision where < was matched before <<.
3. LLVM Code Generation & Type Safety
- Added a to_bool() helper to handle integer-to-boolean coercion, ensuring logical AND/OR operations work correctly with multi-bit integers.
- Implemented BitwiseAnd, BitwiseOr, and BitwiseXor using native LLVM instructions.
- Updated LogicalNot (!) to correctly compare integers with zero.
- Implemented BitwiseNot (~) using the LLVM not (XOR with -1) logic.
4. Code Quality & Consistency
- Added Operator::Not for consistency across unary operations.
- Simplified parser function signatures with std::iter::Peekable for better readability and performance.
Operator Precedence (Highest to Lowest)
1. Grouping: ()
2. Unary: !, ~, &, deref
3. Multiplicative: *, /, %
4. Additive: +, -
5. Shift: <<, >>
6. Relational: <, <=, >, >=
7. Equality: ==, !=
8. Bitwise AND: &
9. Bitwise XOR: ^
10. Bitwise OR: |
11. Logical AND: &&
12. Logical OR: ||
13. Assignment: =, +=, -=, *=, /=, %=
Examples of Improved Behavior
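One behavior worth illustrating is the boolean coercion described under LLVM Code Generation & Type Safety. The following is a plain-Rust model of those rules, assuming i64 operands; it does not use LLVM or reproduce the PR's codegen, and the function names (other than to_bool, which the PR mentions) are hypothetical.

```rust
// Models to_bool(): any non-zero integer counts as true.
fn to_bool(value: i64) -> bool {
    value != 0
}

// Logical AND coerces both operands to booleans before combining them.
fn logical_and(lhs: i64, rhs: i64) -> bool {
    to_bool(lhs) && to_bool(rhs)
}

// Logical OR follows the same coercion rule.
fn logical_or(lhs: i64, rhs: i64) -> bool {
    to_bool(lhs) || to_bool(rhs)
}

// Logical NOT on a multi-bit integer is a comparison with zero.
fn logical_not(value: i64) -> bool {
    value == 0
}

// Bitwise NOT flips every bit, which is the same as XOR with -1.
fn bitwise_not(value: i64) -> i64 {
    value ^ -1
}
```

In this model logical_and(2, 1) is true even though the bitwise result 2 & 1 is 0, which is why coercing to a boolean before a logical AND/OR matters. Likewise logical_not(2) is false while bitwise_not(2) is -3, so the two NOT operators cannot share one implementation.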
Benefits