Implement hybrid inline PPU/APU stepping for cycle-level accuracy #562
Open
Redesign PPU synchronization to advance the PPU by 3 dots after every CPU bus operation (load/write/push/pull), keeping it in sync at the bus-cycle level. This replaces the old lazy catch-up approach where the PPU was only advanced when PPU registers were accessed. PPU and APU state are now always current, enabling accurate NMI timing, VBlank edge detection, and sprite 0 hit behavior.

Key changes:
- Add ppu.step(dots) method: encapsulates dot-by-dot PPU logic with a fast path for common cases and handles VBlank set/clear, sprite 0 hits, and scanline boundaries
- Wire inline PPU stepping into all CPU bus operations (load, write, push, pull)
- Simplify frame loop: removes PPU catch-up math and dot-by-dot processing
- Implement precise 0-delay vs 1-delay NMI detection using remaining PPU dots after the edge, matching real 6502 behavior
- Add NMI handler dummy reads (7 cycles matching real hardware)
- Fix critical bug: update instrBusCycles before missing-cycles PPU step to prevent double-counting in NMI delay formula

Results:
- AccuracyCoin: 98/134 pass (up from 97), including newly passing SHY and DMA+open-bus
- All 445 unit tests pass
- Performance: ~1621 fps (10% slower than baseline, acceptable for accuracy)

Fixes:
- NMI Control (0x0422): proper 1-delay when edge occurs late in instruction
- NMI Timing (0x0453): cycle-level PPU sync enables exact NMI edge detection
- DMA + open bus (0x046c): now passing with inline stepping
- SHY timing (0x0449): now passing after NMI/interrupt fixes

See https://www.nesdev.org/wiki/Catch-up and https://www.nesdev.org/wiki/CPU_interrupts

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
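The fast path mentioned for ppu.step(dots) could look roughly like the sketch below. It is not the code from this PR: the class name FastPathPpu, its fields, and the slowSteps counter are hypothetical, sprite 0 hits and the odd-frame skipped dot are left out, and only the standard NTSC timing (341 dots per scanline, 262 scanlines, VBlank set at scanline 241 dot 1 and cleared at scanline 261 dot 1) is assumed.

```typescript
// Hypothetical sketch of a PPU step with a fast path; not the PR's implementation.
class FastPathPpu {
  static readonly DOTS_PER_SCANLINE = 341; // NTSC
  static readonly SCANLINES_PER_FRAME = 262; // NTSC, pre-render line is 261
  static readonly VBLANK_SET_LINE = 241;
  static readonly VBLANK_CLEAR_LINE = 261;

  dot = 0;
  scanline = 0;
  vblank = false;
  slowSteps = 0; // illustration only: counts dots that took the slow path

  step(dots: number): void {
    // Fast path: no flag event can happen on this scanline and the step
    // does not reach the end of the scanline, so just move the dot counter.
    const eventLine =
      this.scanline === FastPathPpu.VBLANK_SET_LINE ||
      this.scanline === FastPathPpu.VBLANK_CLEAR_LINE;
    if (!eventLine && this.dot + dots < FastPathPpu.DOTS_PER_SCANLINE) {
      this.dot += dots;
      return;
    }
    // Slow path: advance one dot at a time and check every event.
    for (let i = 0; i < dots; i++) {
      this.stepOneDot();
    }
  }

  private stepOneDot(): void {
    this.slowSteps++;
    this.dot++;
    if (this.dot === FastPathPpu.DOTS_PER_SCANLINE) {
      this.dot = 0;
      this.scanline = (this.scanline + 1) % FastPathPpu.SCANLINES_PER_FRAME;
    }
    if (this.dot === 1 && this.scanline === FastPathPpu.VBLANK_SET_LINE) {
      this.vblank = true;
    }
    if (this.dot === 1 && this.scanline === FastPathPpu.VBLANK_CLEAR_LINE) {
      this.vblank = false;
    }
  }
}
```

In this sketch only the two event scanlines and the steps that cross a scanline boundary fall through to the per-dot loop, which is one way a large majority of step(3) calls could stay cheap.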
Summary
Redesigns PPU synchronization to advance the PPU by 3 dots after every CPU bus operation, keeping it in sync at the bus-cycle level. This eliminates the old lazy catch-up approach and enables precise NMI timing, VBlank edge detection, and sprite 0 hit accuracy.
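As a rough illustration of inline stepping, the sketch below ticks a stand-in PPU by 3 dots on every bus access. The names SketchPpu and SketchBus, the 2 KiB RAM mirror mask, and the busCycles counter are assumptions made for this example and are not taken from the PR; push and pull would follow the same pattern as load and write.

```typescript
// Hypothetical stand-in PPU: tracks only its dot/scanline position (NTSC).
class SketchPpu {
  static readonly DOTS_PER_SCANLINE = 341;
  static readonly SCANLINES_PER_FRAME = 262;

  dot = 0;
  scanline = 0;

  step(dots: number): void {
    for (let i = 0; i < dots; i++) {
      this.dot++;
      if (this.dot === SketchPpu.DOTS_PER_SCANLINE) {
        this.dot = 0;
        this.scanline = (this.scanline + 1) % SketchPpu.SCANLINES_PER_FRAME;
      }
    }
  }
}

// Hypothetical CPU bus: every access costs one CPU cycle and steps the PPU.
class SketchBus {
  static readonly DOTS_PER_CPU_CYCLE = 3; // NTSC ratio

  busCycles = 0;
  private ram: number[] = new Array<number>(0x800).fill(0);
  private ppu: SketchPpu;

  constructor(ppu: SketchPpu) {
    this.ppu = ppu;
  }

  load(addr: number): number {
    const value = this.ram[addr & 0x7ff] ?? 0;
    this.tick();
    return value;
  }

  write(addr: number, value: number): void {
    this.ram[addr & 0x7ff] = value & 0xff;
    this.tick();
  }

  // One bus cycle: count it, then bring the PPU up to date immediately.
  private tick(): void {
    this.busCycles++;
    this.ppu.step(SketchBus.DOTS_PER_CPU_CYCLE);
  }
}
```

Because the PPU is advanced inside tick(), any code that inspects PPU state between two bus accesses sees it at most one CPU cycle behind, which is the property the lazy catch-up design could not offer.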
Key Changes
- ppu.step(dots) method: Encapsulates dot-by-dot PPU logic with a fast path (~99% of calls) and handles VBlank, sprite 0 hits, and scanline boundaries.
- Inline stepping: all CPU bus operations (load, write, push, pull) call ppu.step(3) to keep PPU state always current.
- Bug fix: update instrBusCycles before the missing-cycles step to prevent double-counting in the NMI delay formula.
Results
AccuracyCoin: 98/134 pass (+1 from baseline), including newly passing SHY (0x0449) and DMA+open-bus (0x046c). All 445 unit tests pass. Performance: ~1621 fps (10% slower than baseline, acceptable for the accuracy gain).
Fixes: NMI Control and NMI Timing are now accurate, and inline stepping enables correct open bus behavior during DMA cycles.
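The 0-delay vs 1-delay NMI decision could be expressed along the lines of the sketch below. The PR does not spell out its formula, so everything here is an assumption: the function names, the idea of measuring the edge as a dot offset from the start of the instruction, and in particular the default threshold of one full CPU cycle (3 dots) remaining after the edge.

```typescript
// NTSC: the PPU runs 3 dots per CPU cycle.
const DOTS_PER_CPU_CYCLE = 3;

// Dots left in the current instruction after the NMI edge.
// edgeDotOffset is the (hypothetical) dot offset of the edge, counted
// from the first dot of the instruction.
function remainingDotsAfterEdge(instrCycles: number, edgeDotOffset: number): number {
  return instrCycles * DOTS_PER_CPU_CYCLE - edgeDotOffset;
}

// ASSUMPTION: the NMI is serviced right after the current instruction
// (0-delay) only if at least thresholdDots remain after the edge;
// otherwise one more instruction runs first (1-delay). The real
// threshold used by the emulator may differ.
function nmiDelay(remainingDots: number, thresholdDots: number = DOTS_PER_CPU_CYCLE): 0 | 1 {
  return remainingDots >= thresholdDots ? 0 : 1;
}
```

Under these assumptions, a 4-cycle instruction with the edge at dot offset 2 leaves 10 dots and yields a 0-delay NMI, while an edge at dot offset 11 leaves 1 dot and yields a 1-delay NMI.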
🤖 Generated with Claude Code