
Testing: Bytecode Equivalence Infrastructure #704

@gakonst

Description


Summary

Build infrastructure to compare Solar-generated bytecode against solc for behavioral and structural equivalence.

Parent issue: #687

Context

To achieve bytecode parity with solc, we need automated testing that:

  1. Compiles the same Solidity code with both compilers
  2. Compares the results (behavior and/or structure)
  3. Reports differences and tracks progress over time

This is Phase 1 of the roadmap and provides continuous feedback for all other work.

Tasks

Compilation harness

  • CLI or test harness that:
    • Takes Solidity files/projects
    • Compiles with solc (specific version)
    • Compiles with Solar (same language version)
    • Extracts creation + runtime bytecode
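The solc half of this harness can be sketched in Python. The `--combined-json bin,bin-runtime` flag and its output shape are real solc behavior; the Solar invocation is omitted because its CLI surface is still settling, and `parse_combined_json` is a hypothetical helper name.

```python
import json
import subprocess

def solc_bytecode(sol_path: str, solc_bin: str = "solc") -> dict[str, tuple[str, str]]:
    """Compile with solc and return {contract: (creation_hex, runtime_hex)}."""
    out = subprocess.run(
        [solc_bin, "--combined-json", "bin,bin-runtime", sol_path],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_combined_json(out)

def parse_combined_json(raw: str) -> dict[str, tuple[str, str]]:
    """Parse solc's combined-json output into a name -> bytecode map."""
    data = json.loads(raw)
    result = {}
    for qualified_name, info in data["contracts"].items():
        # Keys look like "contracts/Token.sol:Token"; keep the short name.
        name = qualified_name.split(":")[-1]
        result[name] = (info["bin"], info["bin-runtime"])
    return result
```

A matching `solar_bytecode` would sit next to it once Solar emits comparable artifacts.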

Behavioral equivalence (Level 1)

  • Deploy both bytecodes to Anvil (or revm directly)
  • Run identical transactions/calls against both
  • Compare:
    • Return values
    • State changes
    • Revert behavior
    • Gas usage (within tolerance)
  • Fuzz testing: random inputs, compare outputs
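The comparison step above might be structured like this minimal Python sketch; `ExecResult`, `behaviorally_equal`, and the field layout are assumptions, to be filled in from whatever executor (Anvil or revm) the harness drives.

```python
from dataclasses import dataclass

@dataclass
class ExecResult:
    """Outcome of one call against a deployed contract."""
    reverted: bool
    return_data: bytes
    gas_used: int
    state_root: bytes  # or a map of touched storage slots

def behaviorally_equal(solar: ExecResult, solc: ExecResult,
                       gas_tolerance: float = 0.05) -> list[str]:
    """Return a list of human-readable differences (empty means equivalent)."""
    diffs = []
    if solar.reverted != solc.reverted:
        diffs.append(f"revert mismatch: Solar={solar.reverted}, solc={solc.reverted}")
    elif solar.return_data != solc.return_data:
        diffs.append("different return value")
    if solar.state_root != solc.state_root:
        diffs.append("state divergence")
    # Gas is compared within a relative tolerance, not exactly.
    if solc.gas_used and abs(solar.gas_used - solc.gas_used) / solc.gas_used > gas_tolerance:
        diffs.append(f"gas outside tolerance: {solar.gas_used} vs {solc.gas_used}")
    return diffs
```

The fuzz loop would call both deployments with the same random calldata and feed the two results through this function.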

Structural equivalence (Level 2, optional)

  • Ignore metadata and compiler version markers
  • Normalize trivial differences (e.g., jump label ordering)
  • Compare opcode sequences
  • Track bytecode size differences
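The metadata step has a well-known shape: solc appends a CBOR blob to runtime bytecode, with the blob's length encoded big-endian in the final two bytes. A minimal Python sketch of that plus a PUSH-aware opcode walk (helper names are illustrative):

```python
def strip_metadata(runtime: bytes) -> bytes:
    """Drop the CBOR metadata trailer solc appends to runtime bytecode.

    The last two bytes encode the CBOR blob's length (big-endian),
    not counting the length field itself.
    """
    if len(runtime) < 2:
        return runtime
    cbor_len = int.from_bytes(runtime[-2:], "big")
    if cbor_len + 2 > len(runtime):
        return runtime  # no plausible metadata trailer
    return runtime[: -(cbor_len + 2)]

def opcodes(code: bytes) -> list[int]:
    """Opcode sequence, skipping PUSH immediates (0x60..0x7f push 1..32 bytes)."""
    ops, i = [], 0
    while i < len(code):
        op = code[i]
        ops.append(op)
        i += 1 + (op - 0x5F if 0x60 <= op <= 0x7F else 0)
    return ops
```

Comparing `opcodes(strip_metadata(a))` against `opcodes(strip_metadata(b))` gives a first cut at structural equivalence before any jump-label normalization.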

Test corpus

  • Start with handcrafted tests for each feature
  • Import Solidity official tests (if license-compatible)
  • Generate random contracts for differential testing
  • Track which features pass/fail
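For the random-contract bullet, a toy generator shows the flavor: emit tiny, self-contained arithmetic kernels and feed the same source to both compilers. Everything here (`random_contract`, the `Fuzz` contract) is illustrative, not an existing tool.

```python
import random

def random_contract(seed: int) -> str:
    """Deterministically generate a small Solidity contract for differential testing."""
    rng = random.Random(seed)
    ops = ["+", "*", "|", "&", "^"]  # kept non-negative so literals stay uint-typed
    expr = str(rng.randint(0, 255))
    for _ in range(rng.randint(1, 6)):
        expr = f"({expr} {rng.choice(ops)} {rng.randint(1, 255)})"
    return (
        "// SPDX-License-Identifier: MIT\n"
        "pragma solidity ^0.8.0;\n"
        "contract Fuzz {\n"
        "    function f(uint256 x) public pure returns (uint256) {\n"
        f"        unchecked {{ return x + {expr}; }}\n"
        "    }\n"
        "}\n"
    )
```

Seeding makes every generated case reproducible, so a failing corpus entry can be re-run and minimized.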

CI integration

  • Run equivalence tests on every PR
  • Track metrics: % of tests passing
  • Store known deltas, track reductions over time
  • Gate features on maintaining or improving metrics
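The known-deltas gate could be as small as this Python sketch: store the known-failing test names as a JSON list, fail CI on any regression, and prompt shrinking the baseline when deltas are fixed (`gate` and the baseline layout are assumptions).

```python
import json

def gate(results: dict[str, bool], baseline_json: str) -> tuple[bool, str]:
    """CI gate: fail on regressions against the stored known-deltas baseline.

    `results` maps test name -> passed; `baseline_json` is the stored
    JSON list of known-failing tests.
    """
    known_failing = set(json.loads(baseline_json))
    regressions = sorted(t for t, ok in results.items()
                         if not ok and t not in known_failing)
    fixed = sorted(t for t in known_failing if results.get(t, False))
    if regressions:
        return False, f"regressions: {regressions}"
    if fixed:
        return True, f"now passing, remove from baseline: {fixed}"
    return True, "no change"
```

A monotonically shrinking baseline file doubles as the progress metric over time.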

Example workflow

```shell
# Run equivalence test on a contract
solar-test-equiv contracts/Token.sol

# Output:
# Token.sol:
#   Compile: ✓ Solar, ✓ solc
#   Bytecode size: Solar 1234, solc 1256 (-22 bytes)
#   Behavioral tests: 45/50 passing
#   Failures:
#     - transfer(address,uint256): different return value
#     - approve: Solar reverts, solc succeeds
```

Patterns to follow

From Venom & Sonatina:

  • Heavy reliance on differential testing against reference compilers
  • Corpus-based testing + fuzzers feeding random programs

Acceptance Criteria

  • Command like solar-test-equiv path/to/contracts works
  • Reports the number of tests passed vs. failed for both structural and behavioral checks
  • Clear metrics: % of contracts fully equivalent
  • New features can be gated on improving/preserving metrics

Estimated Complexity

Medium: primarily tooling work, not algorithmically complex

Dependencies

  • Solar can emit some bytecode (even if incomplete)
  • Foundry/Anvil/revm for execution

Metadata



Labels

  • A-codegen: Area: code generation and MIR
  • C-enhancement: Category: an issue proposing an enhancement or a PR with one
  • C-test: Category: a change that impacts how or what we test
