Skip to content

Implement hybrid inline PPU/APU stepping for cycle-level accuracy#562

Open
bfirsh wants to merge 1 commit intomainfrom
bfirsh/hybrid-cycle-accuracy
Open

Implement hybrid inline PPU/APU stepping for cycle-level accuracy#562
bfirsh wants to merge 1 commit intomainfrom
bfirsh/hybrid-cycle-accuracy

Conversation

@bfirsh
Copy link
Owner

@bfirsh bfirsh commented Feb 15, 2026

Summary

Redesigns PPU synchronization to advance the PPU by 3 dots after every CPU bus operation, keeping it in sync at the bus-cycle level. This eliminates the old lazy catch-up approach and enables precise NMI timing, VBlank edge detection, and sprite 0 hit accuracy.

Key Changes

  • PPU step(dots) method: Encapsulates dot-by-dot PPU logic with fast path (~99% of calls) and handles VBlank, sprite 0 hits, and scanline boundaries.
  • Inline PPU stepping: All CPU bus operations (load/write/push/pull) call ppu.step(3) to keep PPU state always current.
  • NMI cycle-level detection: Computes remaining PPU dots after the VBlank edge to determine 0-delay vs 1-delay, matching real 6502 behavior.
  • Simplified frame loop: Removes complex PPU catch-up math and dot-by-dot processing.
  • Critical NMI bug fix: Update instrBusCycles before missing-cycles step to prevent double-counting in delay formula.

Results

AccuracyCoin: 98/134 pass (+1 from baseline), including newly passing SHY (0x0449) and DMA+open-bus (0x046c). All 445 unit tests pass. Performance: ~1621 fps (10% slower, acceptable).

Fixes: NMI Control and NMI Timing now accurate, inline stepping enables correct open bus behavior during DMA cycles.

🤖 Generated with Claude Code

Redesign PPU synchronization to advance the PPU by 3 dots after every CPU bus
operation (load/write/push/pull), keeping it in sync at the bus-cycle level.
This replaces the old lazy catch-up approach where the PPU was only advanced
when PPU registers were accessed. PPU and APU state are now always current,
enabling accurate NMI timing, VBlank edge detection, and sprite 0 hit behavior.

Key changes:
- Add ppu.step(dots) method: encapsulates dot-by-dot PPU logic with fast path
  for common cases and handles VBlank set/clear, sprite 0 hits, scanline boundaries
- Wire inline PPU stepping into all CPU bus operations (load, write, push, pull)
- Simplify frame loop: removes PPU catch-up math and dot-by-dot processing
- Implement precise 0-delay vs 1-delay NMI detection using remaining PPU dots
  after the edge, matching real 6502 behavior
- Add NMI handler dummy reads (7 cycles matching real hardware)
- Fix critical bug: update instrBusCycles before missing-cycles PPU step to
  prevent double-counting in NMI delay formula

Results:
- AccuracyCoin: 98/134 pass (up from 97), including newly passing SHY and DMA+open-bus
- All 445 unit tests pass
- Performance: ~1621 fps (10% slower than baseline, acceptable for accuracy)

Fixes:
- NMI Control (0x0422): proper 1-delay when edge occurs late in instruction
- NMI Timing (0x0453): cycle-level PPU sync enables exact NMI edge detection
- DMA + open bus (0x046c): now passing with inline stepping
- SHY timing (0x0449): now passing after NMI/interrupt fixes

See https://www.nesdev.org/wiki/Catch-up and https://www.nesdev.org/wiki/CPU_interrupts

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant