Commit d3f97de

committed
inprogress
1 parent a49dbdd commit d3f97de

12 files changed

+3486
-67
lines changed

CLEANUP_SUMMARY.md

Lines changed: 225 additions & 0 deletions
@@ -0,0 +1,225 @@
# Compiler Recognition Refactoring Summary

This document summarizes the refactoring that separates compiler recognition from argument parsing in the Bear compiler recognition system.

## Overview

The refactoring separates compiler recognition from argument parsing by:

1. **Removing compiler name checking** from `GccInterpreter` and `ClangInterpreter`
2. **Creating a unified `CompilerInterpreter`** that handles compiler recognition and delegates to specific parsers
3. **Eliminating the `executables` attribute** from individual interpreters
4. **Keeping recognition and parsing in separate components**

## Architecture Before vs After

### Before: Mixed Concerns
```rust
// Each interpreter mixed recognition with parsing
struct GccInterpreter {
    executables: HashSet<PathBuf>,  // Recognition concern
    recognizer: CompilerRecognizer, // Recognition concern
    matcher: ArgumentMatcher,       // Parsing concern
}

impl GccInterpreter {
    fn recognize(&self, execution: &Execution) -> Option<Command> {
        // 1. Check if compiler name matches (RECOGNITION)
        if !self.is_gcc_executable(&execution.executable) {
            return None;
        }

        // 2. Parse arguments (PARSING)
        let arguments = self.parse_arguments(&execution.arguments);
        // ...
    }
}
```

### After: Separated Concerns
```rust
// CompilerInterpreter: ONLY handles recognition
struct CompilerInterpreter {
    recognizer: CompilerRecognizer,
    gcc_interpreter: GccInterpreter,
    clang_interpreter: ClangInterpreter,
}

impl CompilerInterpreter {
    fn recognize(&self, execution: &Execution) -> Option<Command> {
        // Fortran variants are omitted here for brevity
        match self.recognizer.recognize(&execution.executable) {
            Some(CompilerType::Gcc) => self.gcc_interpreter.recognize(execution),
            Some(CompilerType::Clang) => self.clang_interpreter.recognize(execution),
            None => None,
        }
    }
}

// GccInterpreter: ONLY handles argument parsing
struct GccInterpreter {
    matcher: ArgumentMatcher, // Only parsing concern
}

impl GccInterpreter {
    fn recognize(&self, execution: &Execution) -> Option<Command> {
        // Assumes compiler already recognized - just parse arguments
        let arguments = self.parse_arguments(&execution.arguments);
        // ...
    }
}
```
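
The snippets above are abbreviated. As a self-contained illustration of the same split, the following sketch compiles on its own. All types here (`CompilerType`, `Execution`, `Command`) are hypothetical, reduced stand-ins rather than Bear's actual definitions, and the name matching is deliberately simplistic.

```rust
use std::path::Path;

// Hypothetical stand-ins for the project's types, reduced to what the sketch needs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompilerType {
    Gcc,
    Clang,
}

pub struct Execution {
    pub executable: String,
    pub arguments: Vec<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Command {
    pub compiler: CompilerType,
    pub arguments: Vec<String>,
}

/// Recognition only: maps an executable path to a compiler type.
pub fn recognize_compiler(executable: &str) -> Option<CompilerType> {
    // Only the file name matters; the installation directory is ignored.
    let name = Path::new(executable).file_name()?.to_str()?;
    match name {
        "gcc" | "g++" | "cc" | "c++" => Some(CompilerType::Gcc),
        "clang" | "clang++" => Some(CompilerType::Clang),
        _ => None,
    }
}

/// Parsing only: never inspects the executable name.
pub fn parse_arguments(compiler: CompilerType, execution: &Execution) -> Command {
    Command {
        compiler,
        arguments: execution.arguments.clone(),
    }
}

/// The dispatcher: recognize first, then delegate to the parser.
pub fn interpret(execution: &Execution) -> Option<Command> {
    let compiler = recognize_compiler(&execution.executable)?;
    Some(parse_arguments(compiler, execution))
}
```

Because `parse_arguments` never looks at the executable, it can be tested with any name, which matches the updated interpreter tests described in this summary.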

## Changes Made

### 1. **Refactored GccInterpreter**

**Removed:**
- `executables: HashSet<PathBuf>` field
- `recognizer: CompilerRecognizer` field
- `is_gcc_executable()` method
- `with_common_names()` constructor
- Compiler name checking from `recognize()` method

**Simplified:**
- Constructor now takes no parameters: `GccInterpreter::new()`
- `recognize()` method focuses purely on argument parsing
- Tests updated to reflect new behavior (always parses arguments regardless of executable name)

### 2. **Refactored ClangInterpreter**

**Same changes as GccInterpreter:**
- Removed compiler recognition logic
- Simplified constructor and interface
- Focus purely on Clang-specific argument parsing

### 3. **Created CompilerInterpreter**

**New unified interpreter that:**
- Uses `CompilerRecognizer` for compiler type identification
- Delegates to the appropriate specialized interpreter based on compiler type
- Handles all recognized compiler types (GCC, Clang, Fortran, Intel Fortran, Cray Fortran)
- Keeps recognition separate from parsing

**Delegation Strategy:**
```rust
match self.recognizer.recognize(&execution.executable) {
    Some(CompilerType::Gcc) => self.gcc_interpreter.recognize(execution),
    Some(CompilerType::Clang) => self.clang_interpreter.recognize(execution),
    // The Fortran variants are treated as GCC-compatible
    Some(CompilerType::Fortran) => self.gcc_interpreter.recognize(execution),
    Some(CompilerType::IntelFortran) => self.gcc_interpreter.recognize(execution),
    Some(CompilerType::CrayFortran) => self.gcc_interpreter.recognize(execution),
    None => None,
}
```
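
One way to keep this mapping easy to audit is to isolate it in a small function with an exhaustive `match`. The sketch below assumes a hypothetical `ParserKind` enum that does not exist in the commit; the `CompilerType` variants mirror the ones listed above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompilerType {
    Gcc,
    Clang,
    Fortran,
    IntelFortran,
    CrayFortran,
}

// Hypothetical: names the parser family that handles a compiler's arguments.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ParserKind {
    Gcc,
    Clang,
}

/// Maps each recognized compiler to the parser that handles its arguments.
/// There is no wildcard arm, so adding a `CompilerType` variant is a compile
/// error until a parser is chosen for it.
pub fn parser_for(compiler: CompilerType) -> ParserKind {
    match compiler {
        CompilerType::Clang => ParserKind::Clang,
        // The Fortran variants are treated as GCC-compatible.
        CompilerType::Gcc
        | CompilerType::Fortran
        | CompilerType::IntelFortran
        | CompilerType::CrayFortran => ParserKind::Gcc,
    }
}
```

The design choice here is to let the compiler enforce completeness instead of relying on a catch-all arm that would silently route new variants somewhere.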

### 4. **Updated Main Integration**

**Before:**
```rust
let gcc_tool = OutputLogger::new(
    gcc::GccInterpreter::new(compilers_to_include.clone()),
    "gcc_to_recognize",
);
```

**After:**
```rust
let compiler_tool = OutputLogger::new(
    CompilerInterpreter::new(),
    "compiler_to_recognize",
);
```

## Benefits Achieved

### **Separation of Concerns**
- **Recognition**: Only handled by `CompilerInterpreter`
- **Parsing**: Only handled by `GccInterpreter`, `ClangInterpreter`, etc.
- **No overlap**: Each component has a single, well-defined responsibility

### **Simplified Architecture**
- **Eliminated Duplication**: Recognition logic is no longer duplicated across interpreters
- **Cleaner Interfaces**: Specialized interpreters have simpler, focused APIs
- **Reduced Complexity**: Each interpreter is easier to understand and test

### **Enhanced Maintainability**
- **Single Point of Recognition**: All compiler identification logic in one place
- **Easier Testing**: Recognition and parsing can be tested independently
- **Clear Extension Path**: Adding a GCC-compatible compiler needs a new `CompilerType` variant, a recognition pattern, and a delegation case in `CompilerInterpreter`; the existing parsers stay untouched

### **Preserved Functionality**
- **All Tests Pass**: 58 of 58 interpreter tests pass
- **Same Recognition Capability**: All supported compiler types still work
- **Path Independence**: Recognition still ignores installation directories
149+
## Test Results
150+
151+
**Full test suite passes:**
152+
```
153+
running 58 tests
154+
test semantic::interpreters::compiler_interpreter::tests::test_gcc_recognition_and_delegation ... ok
155+
test semantic::interpreters::compiler_interpreter::tests::test_clang_recognition_and_delegation ... ok
156+
test semantic::interpreters::compiler_interpreter::tests::test_fortran_recognition_and_delegation ... ok
157+
test semantic::interpreters::gcc::tests::test_argument_parsing_with_any_executable ... ok
158+
test semantic::interpreters::clang::tests::test_argument_parsing_with_any_executable ... ok
159+
[... 53 more tests ...]
160+
161+
test result: ok. 58 passed; 0 failed; 0 ignored; 0 measured
162+
```

## Design Principles Applied

### 1. **Single Responsibility Principle**
- `CompilerInterpreter`: Compiler recognition only
- `GccInterpreter`: GCC argument parsing only
- `ClangInterpreter`: Clang argument parsing only

### 2. **Open/Closed Principle**
- Adding new compilers: Extend `CompilerInterpreter` delegation logic
- Adding new argument parsing: Create new specialized interpreter
- The existing specialized interpreters need no modification

### 3. **Dependency Inversion**
- High-level `CompilerInterpreter` depends on abstraction (`Interpreter` trait)
- Low-level parsers implement the abstraction
- Clean inversion of dependencies
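
The code excerpts in this summary show `CompilerInterpreter` holding concrete `GccInterpreter` and `ClangInterpreter` fields. If the dependency on the trait were made explicit, it might look like the sketch below. The trait signature, the `Dispatcher` type, and the string results are simplified assumptions for illustration, not the project's actual API.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum CompilerType {
    Gcc,
    Clang,
}

// Hypothetical, reduced trait: the real one works on executions and commands.
pub trait Interpreter {
    fn recognize(&self, arguments: &[String]) -> Option<String>;
}

pub struct GccParser;
pub struct ClangParser;

impl Interpreter for GccParser {
    fn recognize(&self, arguments: &[String]) -> Option<String> {
        Some(format!("gcc:{}", arguments.join(" ")))
    }
}

impl Interpreter for ClangParser {
    fn recognize(&self, arguments: &[String]) -> Option<String> {
        Some(format!("clang:{}", arguments.join(" ")))
    }
}

/// Depends only on the `Interpreter` trait; concrete parsers are wired in `new`.
pub struct Dispatcher {
    parsers: HashMap<CompilerType, Box<dyn Interpreter>>,
}

impl Dispatcher {
    pub fn new() -> Self {
        let mut parsers: HashMap<CompilerType, Box<dyn Interpreter>> = HashMap::new();
        parsers.insert(CompilerType::Gcc, Box::new(GccParser));
        parsers.insert(CompilerType::Clang, Box::new(ClangParser));
        Self { parsers }
    }

    /// Looks up the parser for an already recognized compiler and delegates to it.
    pub fn dispatch(&self, compiler: CompilerType, arguments: &[String]) -> Option<String> {
        self.parsers.get(&compiler)?.recognize(arguments)
    }
}
```

With this shape, a test can substitute a stub parser for any compiler type without touching the dispatch logic. The cost is dynamic dispatch and a lookup that can miss at runtime, which concrete fields avoid.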

### 4. **Composition Over Inheritance**
- `CompilerInterpreter` composes specialized interpreters
- No complex inheritance hierarchies
- Clear, predictable behavior

## Future Extensibility

Adding new compiler support follows one of two patterns:

### Adding New Compiler Type
1. **Add to `CompilerType` enum** in `compiler_recognition.rs`
2. **Add regex pattern** to `CompilerRecognizer::new()`
3. **Add delegation case** in `CompilerInterpreter::delegate_to_interpreter()`

### Adding New Specialized Parser
1. **Create new interpreter** (e.g., `IntelInterpreter`)
2. **Implement `Interpreter` trait** with parsing logic
3. **Add field to `CompilerInterpreter`**
4. **Update delegation logic**

Example (steps 3 and 4):
```rust
// Step 3: Add field to CompilerInterpreter
intel_interpreter: IntelInterpreter,

// Step 4: Add delegation case
Some(CompilerType::IntelCpp) => self.intel_interpreter.recognize(execution),
```

## Conclusion

The refactoring separates recognition from parsing while keeping existing functionality intact. The system is now:

- **More maintainable**: Clear separation between recognition and parsing
- **More testable**: Each component can be tested in isolation
- **More extensible**: New compilers and parsers can be added by following the steps above
- **More understandable**: Each component has a single, clear purpose

**Key Metrics:**
- **Test results**: 58/58 interpreter tests passing
- **Architecture complexity**: Recognition logic now lives in a single component
- **Code duplication**: Eliminated between interpreters
- **Extension effort**: For GCC-compatible compilers, reduced from "implement full interpreter" to "add delegation case"

This refactoring provides a foundation for supporting additional compilers while keeping the code testable and maintainable.

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -32,6 +32,7 @@ directories = "6.0"
 shell-words = "1.1"
 tempfile = { version = "3.19", default-features = false }
 signal-hook = { version = "0.3", default-features = false }
+regex = "1.0"
 libc = "0.2"
 which = "8.0"
 cc = "1.2"

GCC_INTERPRETER_SUMMARY.md

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
# Compiler-Specific Interpreter Implementation Summary

## Overview

This document summarizes the implementation of compiler-specific interpreters for the Bear Rust project. The implementation extends the existing semantic analysis module with more detailed parsing of compiler invocations, starting with GCC and demonstrating extensibility with a Clang interpreter.

## Architecture

The implementation consists of several key components:

### 1. Pattern Matching System (`patterns.rs`)

- **`ArgumentPattern`**: Defines how compiler flags can be matched, including support for combined (`-Ipath`) and separate (`-I path`) forms
- **`ArgumentSemantic`**: Categorizes arguments by their semantic meaning (include directories, libraries, defines, etc.)
- **`ArgumentMatcher`**: Provides pattern matching for GCC-specific argument patterns
- **`MatchResult`**: Contains the results of pattern matching, including consumed argument count

Key features:
- Handles GCC's flexible argument syntax (combined vs. separate flags)
- Recognizes semantic meaning of different flag types
- Supports pattern-based matching for flag families (`-W*`, `-f*`, `-m*`, `-O*`)
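
To illustrate the combined and separate forms and why a consumed-argument count is useful, here is a minimal sketch. The `MatchResult` fields and the `match_flag_with_value` function are assumptions made for this example and may differ from what `patterns.rs` actually defines.

```rust
// Hypothetical result type: which flag matched, its value, and how many
// command-line arguments the match used up.
#[derive(Debug, PartialEq, Eq)]
pub struct MatchResult {
    pub flag: String,
    pub value: String,
    pub consumed: usize,
}

/// Tries to match `flag` at the start of `args`, accepting both the
/// combined form (`-Ipath`) and the separate form (`-I path`).
pub fn match_flag_with_value(flag: &str, args: &[&str]) -> Option<MatchResult> {
    let first = *args.first()?;
    if first == flag {
        // Separate form: the value is the next argument, so two are consumed.
        let value = *args.get(1)?;
        return Some(MatchResult {
            flag: flag.to_string(),
            value: value.to_string(),
            consumed: 2,
        });
    }
    // Combined form: the value follows the flag within the same argument.
    let value = first.strip_prefix(flag)?;
    Some(MatchResult {
        flag: flag.to_string(),
        value: value.to_string(),
        consumed: 1,
    })
}
```

A caller walking the argument list would advance its index by `consumed` after each successful match, which keeps the value of a separate-form flag from being misread as a source file.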

### 2. Specialized Arguments (`arguments.rs`)

Provides concrete implementations of the `Arguments` trait for different argument types:

- **`FlagArgument`**: Simple flags with optional values
- **`CombinedFlagArgument`**: Flags combined with values in a single string (`-Ipath`)
- **`SourceArgument`**: Source file arguments
- **`OutputArgument`**: Output file arguments (`-o file`)
- **`CompilerArgument`**: The compiler executable itself
- **`ResponseFileArgument`**: Response file arguments (`@file`)

Each implementation:
- Properly categorizes arguments by `ArgumentKind`
- Associates arguments with appropriate `CompilerPass`
- Handles path transformation for compilation database generation

### 3. GCC Interpreter (`gcc.rs`)

The main GCC-specific interpreter provides:

- **Executable Recognition**: Identifies GCC executables by name patterns:
  - Standard names: `gcc`, `g++`, `cc`, `c++`
  - Cross-compilation prefixes: `*-gcc`, `*-g++`
  - Version suffixes: `gcc-*`, `g++-*`

- **Advanced Argument Parsing**: Uses the pattern matching system to:
  - Handle complex flag combinations
  - Recognize semantic meaning of arguments
  - Properly parse combined and separate flag forms
  - Associate arguments with appropriate compiler passes

- **Comprehensive Testing**: Includes tests for:
  - Executable recognition patterns
  - Simple and complex compilation commands
  - Combined and separate flag formats
  - Response file handling
  - Cross-compiler naming patterns
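
The executable name patterns listed under Executable Recognition can be approximated with plain string checks. This is a sketch under that assumption, using only the standard library; it is not the matching logic in `gcc.rs`, and the function is written from scratch for illustration.

```rust
use std::path::Path;

/// Returns true when the file name of `executable` looks like a GCC driver.
/// Illustrative approximation of the patterns listed above.
pub fn is_gcc_executable(executable: &str) -> bool {
    // Compare only the file name, so the installation directory is irrelevant.
    let name = match Path::new(executable).file_name().and_then(|n| n.to_str()) {
        Some(name) => name,
        None => return false,
    };
    // Standard names: gcc, g++, cc, c++
    let standard = ["gcc", "g++", "cc", "c++"].contains(&name);
    // Cross-compilation prefixes: *-gcc, *-g++
    let cross = name.ends_with("-gcc") || name.ends_with("-g++");
    // Version suffixes: gcc-*, g++-*
    let versioned = name.starts_with("gcc-") || name.starts_with("g++-");
    standard || cross || versioned
}
```

Checks this loose also accept helper tools such as `gcc-ar`, and they miss names that combine a prefix with a version suffix, so a real implementation would need stricter patterns.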

## Integration

Both GCC and Clang interpreters can be integrated into the main interpreter chain in `mod.rs`:

```rust
// Add GCC-specific interpreter
let gcc_tool = OutputLogger::new(
    gcc::GccInterpreter::new(compilers_to_include.clone()),
    "gcc_to_recognize",
);
interpreters.push(Box::new(gcc_tool));

// Add Clang-specific interpreter (example)
let clang_tool = OutputLogger::new(
    clang::ClangInterpreter::new(compilers_to_include.clone()),
    "clang_to_recognize",
);
interpreters.push(Box::new(clang_tool));

// Add generic fallback interpreter
let tool = OutputLogger::new(
    Generic::from(&compilers_to_include),
    "compilers_to_recognize",
);
interpreters.push(Box::new(tool));
```

This ensures that compiler-specific parsing is attempted first (GCC, Clang, etc.), with the generic interpreter as a fallback.
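
The ordering matters because the first interpreter that returns a result wins. A minimal sketch of such a chain follows; the trait is reduced to returning a label, and `Specific`, `Generic`, and `first_match` are hypothetical names, not the types used in `mod.rs`.

```rust
// Hypothetical, reduced interpreter: returns a label instead of a parsed command.
pub trait Interpreter {
    fn recognize(&self, executable: &str) -> Option<String>;
}

/// Stand-in for a compiler-specific interpreter that claims one executable name.
pub struct Specific {
    pub name: &'static str,
}

impl Interpreter for Specific {
    fn recognize(&self, executable: &str) -> Option<String> {
        if executable == self.name {
            Some(format!("{}-specific", self.name))
        } else {
            None
        }
    }
}

/// Stand-in for the generic fallback, configured with a list of compiler names.
pub struct Generic {
    pub known: Vec<&'static str>,
}

impl Interpreter for Generic {
    fn recognize(&self, executable: &str) -> Option<String> {
        if self.known.iter().any(|k| *k == executable) {
            Some("generic".to_string())
        } else {
            None
        }
    }
}

/// Walks the chain in order and returns the first successful recognition.
pub fn first_match(chain: &[Box<dyn Interpreter>], executable: &str) -> Option<String> {
    chain.iter().find_map(|interpreter| interpreter.recognize(executable))
}
```

With the chain ordered as `[Specific("gcc"), Specific("clang"), Generic]`, a `gcc` invocation is handled by the specific interpreter even if the generic one would also accept it, and names known only to the generic interpreter still get a result.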

## Key Improvements Over Generic Interpreter

1. **Semantic Awareness**: Arguments are categorized by their semantic meaning and associated compiler pass
2. **Flag Flexibility**: Handles both combined (`-Ipath`) and separate (`-I path`) flag forms
3. **Executable Recognition**: Pattern matching for GCC executable names, including cross-compilation prefixes and version suffixes
4. **Compiler Pass Association**: Arguments are tagged with the compiler pass they affect (preprocessing, compiling, linking)
5. **Comprehensive Coverage**: Supports a wide range of GCC-specific flags and options

## Example Usage

The interpreter parses complex GCC command lines such as:

```bash
g++ -std=c++17 -Wall -Werror -O2 -g -I/usr/include -I. -DDEBUG=1 -DVERSION="1.0" \
    main.cpp utils.cpp -L/usr/lib -lmath -o program
```

into structured arguments, each categorized by its semantic meaning:
- `-std=c++17` → Combined flag (Compiling pass)
- `-I/usr/include` → Combined include directory (Preprocessing pass)
- `-DDEBUG=1` → Combined define (Preprocessing pass)
- `main.cpp` → Source file
- `-o program` → Output argument

The remaining arguments are categorized in the same way.

## Extensibility Demonstrated

The pattern-based design makes it easy to extend with new compilers. We've included a **Clang interpreter** (`clang.rs`) that demonstrates this extensibility:

- Reuses the pattern matching infrastructure
- Extends with Clang-specific flags like `-target` and `-mllvm`
- Follows the same architectural patterns as the GCC interpreter
- Shows how to handle compiler-specific executable naming patterns

Additional compilers can be added following the same pattern:
- Add support for new compiler flags
- Create interpreters for Intel, MSVC, Fortran compilers, etc.
- Extend semantic categorization
- Add compiler-specific optimizations

## Testing

The test suite covers:
- **Pattern matching functionality**: 5 tests for argument pattern matching
- **Argument type implementations**: 8 tests for specialized argument types
- **GCC interpreter**: 7 tests covering executable recognition, simple/complex compilation scenarios, combined flags, response files, and edge cases
- **Clang interpreter**: 5 tests demonstrating extensibility with Clang-specific patterns
- **Integration tests**: All existing Bear functionality remains compatible

**Total**: 43 tests passing.

## Summary of Deliverables

- **Pattern Matching System** (`patterns.rs`): Flexible argument pattern matching supporting combined/separate flag forms
- **Specialized Arguments** (`arguments.rs`): Type-safe argument representations with semantic awareness
- **GCC Interpreter** (`gcc.rs`): Comprehensive GCC-specific parsing with executable recognition and advanced flag handling
- **Clang Interpreter** (`clang.rs`): Demonstrates extensibility pattern for adding new compilers
- **Integration**: Seamlessly integrated into existing Bear architecture
- **Testing**: 43 passing tests
- **Documentation**: Complete implementation summary and architectural guidance

The implementation addresses the original requirements and provides a foundation for extending Bear with additional compiler-specific interpreters.

bear/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ shell-words.workspace = true
 tempfile.workspace = true
 signal-hook.workspace = true
 crossbeam-channel.workspace = true
+regex.workspace = true
 
 [dev-dependencies]
 tempfile.workspace = true