-
Notifications
You must be signed in to change notification settings - Fork 487
Open
Labels
Description
Issue: symbolic_file.py Test Failure - Incompatibility Between Symbolic Files and libc Functions
Summary
The symbolic_file.py test consistently fails because Manticore cannot properly handle symbolic file content when it's read through libc functions like getline(). The test expects multiple execution paths but only generates one.
Test Details
- Location:
examples/script/symbolic_file.py - Purpose: Demonstrate symbolic file analysis by having Manticore explore both branches of a password check
- Expected: Multiple execution paths (password matches and doesn't match)
- Actual: Only one path explored, assertion fails
Root Cause Analysis
The Core Problem
Manticore's symbolic file implementation is incompatible with libc's buffered I/O functions:
- Type Mismatch: When a symbolic file is opened, Manticore returns a
SymbolicFilePython object, butgetline()expects a CFILE*structure - No Function Modeling:
getline()is not modeled/hooked by Manticore, so it executes natively without understanding symbolic data - Early Termination: The program exits with code 127 (command/library failure) when trying to use the incompatible file handle
Technical Details
How Symbolic Files Work in Manticore
- Files marked with
--fileflag oradd_symbolic_file()become symbolic - '+' characters in the file are treated as symbolic bytes
- Manticore creates a
SymbolicFileobject (Python) with a symbolic array - Works at the syscall level (read/write/open)
Why getline() Fails
// In fileio.c
FILE *infile = fopen(fname, "r"); // Returns SymbolicFile, not FILE*
getline(&line, &line_size, infile); // Expects FILE*, crashesEvidence from Investigation
- State lifecycle: States are killed rather than terminated normally
- No branching: Fork callbacks never trigger
- Hook analysis: File operation hooks don't execute, indicating early failure
- Exit code: 127 indicates library/command failure
- Workspace: Only generates 1 testcase instead of expected 2+
Code Flow
1. Program starts with symbolic file argument
2. fopen() called -> Manticore returns SymbolicFile object
3. getline() called -> Expects FILE* structure
4. Type mismatch causes crash/undefined behavior
5. State is killed with exit code 127
6. No symbolic data reaches strcmp()
7. No branching occurs
Reproduction Steps
cd manticore/examples/script
python symbolic_file.py
# AssertionError: Should have found more than 1 path through the programThree Options for Fixing
Option 1: Model getline() in Manticore (Recommended Long-term)
Effort: High
Impact: Would fix all similar issues
Implementation would require:
- Hook
getline()at the binary level - Detect when file handle is a SymbolicFile
- Manually implement getline's behavior with symbolic data
- Return symbolic bytes to the buffer
# Pseudocode for manticore/native/models.py
def getline(state, lineptr, n, stream):
if isinstance(stream, SymbolicFile):
# Read symbolic bytes until newline or EOF
symbolic_data = stream.read_until_newline()
state.cpu.write_bytes(lineptr, symbolic_data)
return len(symbolic_data)
else:
# Fall back to native execution
return state.invoke_model(getline_native, lineptr, n, stream)Option 2: Rewrite Test to Use Compatible Functions (Recommended Short-term)
Effort: Low
Impact: Fixes this specific test
Replace getline() with direct read() syscall:
// Instead of:
getline(&line, &line_size, infile);
// Use:
int fd = open(fname, O_RDONLY);
read(fd, buffer, 11);This works because Manticore properly handles read() syscalls with symbolic files.
Option 3: Document as Known Limitation (Immediate)
Effort: Minimal
Impact: Prevents CI failures, sets expectations
Mark the test as expected to fail with clear documentation:
- Add to known limitations in docs
- Skip test in CI with explanation
- Document that symbolic files only work with syscalls, not libc buffered I/O
Impact
- CI/CD: Test causes false failures in continuous integration
- User Experience: Users may think symbolic file analysis is broken
- Documentation: Feature appears to work but has undocumented limitations
Related Code
manticore/platforms/linux.py:540-663- SymbolicFile classmanticore/platforms/linux.py:3847-3852- _sys_open_get_file()manticore/platforms/linux.py:4155-4160- generate_workspace_files()manticore/native/cli.py:33-34- add_symbolic_file initialization
Workaround
Users can work around this by:
- Using programs that read files with syscalls (read/write) instead of libc functions
- Writing custom models for specific libc functions they need
- Using concrete files instead of symbolic for programs using buffered I/O
Test History
- Test was already marked as "flaky" in CI
- Has been a known intermittent issue
- Original
fileio.chad uninitialized variable bug (fixed in ef8677a)
Recommendation
- Immediate: Disable test in CI with skip marker and explanation
- Short-term: Create alternative test using syscall-based file reading
- Long-term: Implement proper modeling for common libc file functions
References
- Original issue with uninitialized variables: Fixed in commit ef8677a
- Similar limitations exist for other libc functions that use FILE* structures
- Symbolic file feature works correctly with syscall-level operations