Summary
The runtime always uses the full __go_swapcontext path, which saves and restores FPU state on every goroutine context switch even when neither goroutine uses floating-point operations.
Why
- Integer-only goroutines currently pay avoidable context-switch cost.
- runtime_sh4_minimal.S already contains __go_swapcontext_lazy and __go_swapcontext_nofpu, but the scheduler never selects them.
- On a single-core Dreamcast runtime, context-switch overhead directly affects frame budget and goroutine-heavy workloads.
Evidence
- runtime/runtime_sh4_minimal.S documents the current full path at about 88 cycles with FPU save/restore included.
- The same file documents __go_swapcontext_lazy as saving about 19-25 cycles per skipped direction and __go_swapcontext_nofpu as saving about 50 cycles versus the full path.
- The scheduler currently uses only __go_swapcontext and does not track per-goroutine FPU usage or pass any FPU flags.
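To make the documented numbers concrete, here is a rough back-of-the-envelope sketch in C. The constants are the approximate figures quoted from runtime_sh4_minimal.S above. The function names are hypothetical and exist only for this illustration, and the results are estimates, not measurements.

```c
/* Approximate figures quoted from runtime_sh4_minimal.S (not measured here). */
enum {
    FULL_CYCLES      = 88, /* __go_swapcontext, FPU save/restore included */
    NOFPU_SAVING     = 50, /* __go_swapcontext_nofpu, versus the full path */
    LAZY_SAVING_LOW  = 19, /* __go_swapcontext_lazy, per skipped direction */
    LAZY_SAVING_HIGH = 25
};

/* Estimated cost of the no-FPU path. */
static int est_nofpu_cycles(void) {
    return FULL_CYCLES - NOFPU_SAVING;
}

/* Estimated lazy-path cost when `skipped` directions (0, 1 or 2) are skipped,
 * using the optimistic end of the documented saving. */
static int est_lazy_cycles_best(int skipped) {
    return FULL_CYCLES - skipped * LAZY_SAVING_HIGH;
}

/* Same estimate using the pessimistic end of the documented saving. */
static int est_lazy_cycles_worst(int skipped) {
    return FULL_CYCLES - skipped * LAZY_SAVING_LOW;
}
```

Under these figures the no-FPU path would land at roughly 38 cycles, and a lazy switch that skips one direction at roughly 63 to 69 cycles. Real numbers still need to come from hardware benchmarks.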
Direction
- Investigate a sound way to track whether a goroutine has used floating point.
- Teach the scheduler to select among __go_swapcontext, __go_swapcontext_lazy, and __go_swapcontext_nofpu when it is safe to do so.
- Prefer correctness first: any optimization must preserve FPU state across goroutines that do use floating point.
- Benchmark the real savings on hardware for integer-heavy workloads.
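One possible shape for the selection logic, as a minimal C sketch. Everything here is an assumption for discussion: the sketch_g struct, the fpu_used flag, sketch_note_fpu_use, and choose_swap do not exist in the runtime today, and how the flag gets set soundly is exactly the open question in the first bullet. The sketch only decides which variant applies. It does not call the assembly routines, whose actual calling conventions are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-goroutine record; the real runtime's G struct differs. */
typedef struct {
    bool fpu_used; /* assumed: sticky flag, set once the goroutine touches the FPU */
} sketch_g;

/* Which context-switch variant the scheduler would pick. */
typedef enum {
    SWAP_FULL,  /* __go_swapcontext: save and restore FPU state */
    SWAP_LAZY,  /* __go_swapcontext_lazy: skip one direction */
    SWAP_NOFPU  /* __go_swapcontext_nofpu: skip FPU state entirely */
} swap_kind;

/* Hypothetical hook: record that a goroutine has used floating point.
 * The flag is never cleared, which errs on the side of correctness. */
static void sketch_note_fpu_use(sketch_g *g) {
    assert(g != NULL);
    g->fpu_used = true;
}

/* Pick the cheapest variant that still preserves FPU state for every
 * goroutine that has used it. */
static swap_kind choose_swap(const sketch_g *from, const sketch_g *to) {
    assert(from != NULL && to != NULL);
    if (from->fpu_used && to->fpu_used) {
        return SWAP_FULL;  /* both sides need their FPU state */
    }
    if (!from->fpu_used && !to->fpu_used) {
        return SWAP_NOFPU; /* neither side has FPU state to preserve */
    }
    return SWAP_LAZY;      /* only one direction needs FPU work */
}
```

The sticky flag means a goroutine that used floating point once keeps paying for FPU save/restore, which gives up some savings in exchange for a simpler correctness argument. Whether the no-FPU path is really safe for a goroutine marked as integer-only depends on the flag never missing an FPU use, so that part needs review before any of this is wired into the scheduler.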