Open
Description
Code invalidation today is costly and takes too much time. This shows up dramatically in Dark Souls: Remastered, where FEX invalidates code ~700 times per second, which leaves the game's FPS utterly garbage.
perf top shows what's going wrong:
10.02% FEX [.] FEXCore::GuestToHostMap::Erase(FEXCore::Core::CpuStateFrame*, unsigned long, FEXCore::LookupCacheWriteLockToken const&)
7.39% FEX [.] FEXCore::Context::ContextImpl::InvalidateGuestCodeRange(FEXCore::Core::InternalThreadState*, std::vector<std::vector<unsigned long, fextl::FEXAlloc<unsigned long> >, fextl::FEXAlloc<std::vector<unsigned long, fextl::FEXAlloc<unsigned long> > > >&, unsigned long, unsigned long)
4.78% [kernel] [k] unmap_page_range
3.99% [kernel] [k] __wake_up_common_lock
3.23% [kernel] [k] el0_svc
2.07% [kernel] [k] mas_walk
1.48% FEX [.] FEXCore::IR::ConstrainedRAPass::Run(FEXCore::IR::IREmitter*)
0.99% [kernel] [k] get_random_u16
0.95% [kernel] [k] try_to_wake_up
0.79% libgcc_s.so.1 [.] 0x0000000000009a14
0.76% [kernel] [k] __fget_light
0.74% FEX [.] unsigned long FEX::HLE::SyscallPassthrough3<212>(FEXCore::Core::CpuStateFrame*, unsigned long, unsigned long, unsigned long) requires (212)!=(-(1))
And FEX stats show it clearly:
Top 12 threads executing
[ ]: 0.03% (0 ms/S, 317600 cycles)
[ ]: 0.40% (4 ms/S, 4051390 cycles)
[ ]: 0.42% (4 ms/S, 4187150 cycles)
[ ]: 0.49% (4 ms/S, 4900050 cycles)
[ ]: 0.55% (5 ms/S, 5556570 cycles)
[ ]: 0.61% (6 ms/S, 6092310 cycles)
[ ]: 0.65% (6 ms/S, 6529080 cycles)
[ ]: 0.69% (6 ms/S, 6887220 cycles)
[ ]: 0.76% (7 ms/S, 7631600 cycles)
[▁ ]: 2.59% (25 ms/S, 25927000 cycles)
[▂ ]: 3.64% (36 ms/S, 36478180 cycles)
[█████████▅ ]: 26.69% (267 ms/S, 267222900 cycles)
Total (1000 millisecond sample period):
JIT Time: 192.448310 ms/second (1.60 percent)
Signal Time: 184.736040 ms/second (1.54 percent)
SIGBUS Cnt: 1 (1.001366 per second)
SMC Cnt: 670
Softfloat Cnt: 0
CacheMiss Cnt: 12524 (12541.103626 per second)
$RDLck Time: 1.824620 ms/second (0.02 percent)
$WRLck Time: 0.907340 ms/second (0.01 percent)
JIT Cnt: 2669 (2672.644968 percent)
FEX JIT Load: 3.138916 (cycles: 377184350)
Total FEX Anon memory resident: 479 MiB
JIT resident: 78 MiB
OpDispatcher resident: 78 MiB
Frontend resident: 18 MiB
CPUBackend resident: 884 KiB
Lookup cache resident: 0 KiB
Lookup L1 cache resident: 42 MiB
ThreadStates resident: 544 KiB
BlockLinks resident: 13 MiB
Misc resident: 23 MiB
JEMalloc resident: 0 KiB
Unaccounted resident: 223 MiB
To repro, just run Dark Souls: Remastered, or create a bench that causes ~700 invalidations a second.