Race condition in gameLoopInitialized causes server hang when starting multiple instances concurrently #1177

@uzapolsky

Description

When starting multiple CS2 server instances simultaneously from the same installation directory, some servers hang indefinitely during map loading. This is caused by a race condition in the global variable gameLoopInitialized.

Environment

  • OS: Ubuntu Linux
  • CS2 Server: Dedicated server, 5 instances
  • CounterStrikeSharp Version: 1.0.355
  • Metamod Version: Latest
  • Hardware: 96 CPU cores, 251GB RAM, RAID1 storage

Steps to Reproduce

  1. Install CounterStrikeSharp in a single CS2 server directory
  2. Start 5+ server instances simultaneously using the same game files:
    for port in 27015 27016 27017 27018 27019; do
      ./cs2 -dedicated -hostport $port +map de_dust2 &
    done
  3. Observe that some servers hang during initialization

Expected Behavior

All servers should initialize successfully and load the map, regardless of concurrent startup.

Actual Behavior

  • 1-2 servers hang indefinitely (tested with 5 concurrent starts)
  • Hung servers show CPU usage (~28%) but make no progress
  • Last log entry before hang: "CNavGenParams - Nav mesh requests project default generat" (truncated mid-line), run together with "Total connected players: 2"
  • Process spins in ThreadNanoSleep() loop waiting for an event that never arrives

Root Cause Analysis

Problem Location

File: src/core/globals.cpp:94

bool gameLoopInitialized = false;  // Global variable shared across all processes

File: src/mm_plugin.cpp:292

void CounterStrikeSharpMMPlugin::Hook_RegisterLoopMode(const char* pszLoopModeName, ...)
{
    if (strcmp(pszLoopModeName, "game") == 0)
    {
        if (!globals::gameLoopInitialized) globals::gameLoopInitialized = true;
        CALL_GLOBAL_LISTENER(OnGameLoopInitialized());
    }
}

File: src/core/managers/event_manager.cpp:82

bool EventManager::HookEvent(const char* szName, CallbackT fnCallback, bool bPost)
{
    if (!globals::gameLoopInitialized)
    {
        const PendingEventHook pendingHook{ szName, fnCallback, bPost };
        m_PendingHooks.push(pendingHook);
        return true;
    }
    // ...
}
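
The deferral logic in HookEvent can be illustrated with a minimal, self-contained sketch. It is not the CounterStrikeSharp implementation: DeferredHookQueue, MarkReady, and the std::function callback type are hypothetical stand-ins for EventManager, the OnGameLoopInitialized listener, and CallbackT. It only shows why a hook queued before the flag is set depends on a later flush to become active.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for EventManager: hooks registered before the
// "ready" signal are queued and only become active once MarkReady() runs.
class DeferredHookQueue {
public:
    using Callback = std::function<void()>;

    // Returns true in both cases, mirroring HookEvent's early return.
    bool Hook(const std::string& name, Callback cb) {
        if (!ready_) {
            pending_.push(Entry{name, std::move(cb)});  // deferred until MarkReady()
            return true;
        }
        active_.push_back(Entry{name, std::move(cb)});
        return true;
    }

    // Plays the role of the OnGameLoopInitialized listener: set the flag,
    // then move every queued hook into the active list.
    void MarkReady() {
        ready_ = true;
        while (!pending_.empty()) {
            active_.push_back(std::move(pending_.front()));
            pending_.pop();
        }
        assert(pending_.empty());  // nothing may remain deferred after a flush
    }

    std::size_t PendingCount() const { return pending_.size(); }
    std::size_t ActiveCount() const { return active_.size(); }

private:
    struct Entry {
        std::string name;
        Callback cb;
    };

    bool ready_ = false;
    std::queue<Entry> pending_;
    std::vector<Entry> active_;
};
```

If MarkReady() is never called for a given instance, PendingCount() stays non-zero and none of the queued callbacks ever become active, which is the state the hung servers are suspected to be in.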

Race Condition Sequence

  1. Multiple server processes share the same .so library memory
  2. gameLoopInitialized is a single global variable shared by all processes
  3. Race condition occurs:
    Process A: checks gameLoopInitialized == false
    Process B: checks gameLoopInitialized == false
    Process A: sets gameLoopInitialized = true, calls OnGameLoopInitialized()
    Process B: sets gameLoopInitialized = true, but listener ALREADY fired
    Process B: EventManager waits for OnGameLoopInitialized() that never comes
    Process B: m_PendingHooks never processed → server hangs forever
    

Technical Evidence

GDB Stack Trace of Hung Server

Thread 1 (main):
#0  __GI___clock_nanosleep
#1  __GI___nanosleep
#2  ThreadNanoSleep() from libtier0.so
#3  libresourcesystem.so  (spin-wait loop)

System Call Analysis

The hung server makes 30,130 futex calls and 320 clock_nanosleep calls in 5 seconds (roughly 6,000 futex calls and 64 clock_nanosleep calls per second), which points to a spin-wait loop.

Perf Profile

10.28% of CPU time is spent in counterstrikesharp::__SourceHook_FHCls_IServerGameDLLGameFrame0::Func, which indicates the hook is still being called while the server waits.

Proposed Solutions

Option 1: Process-Local Variable (Recommended)

Make gameLoopInitialized process-local instead of global, using either thread-local storage or an instance variable:

// In mm_plugin.h
class CounterStrikeSharpMMPlugin {
private:
    bool m_gameLoopInitialized = false;  // Instance variable
    // ...
};
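
As a rough illustration of how the hook might use such a member, here is a minimal sketch. LoopModePlugin, OnRegisterLoopMode, and the listener counter are hypothetical names, and the real hook's extra parameters are omitted. The sketch also fires the listener only on the first transition, which is a choice made for this example and not something the current code does.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, trimmed-down plugin class: the flag lives in the instance
// rather than in a namespace-scope global.
class LoopModePlugin {
public:
    // Mirrors the shape of Hook_RegisterLoopMode (extra parameters omitted).
    void OnRegisterLoopMode(const char* loopModeName) {
        assert(loopModeName != nullptr);
        if (std::strcmp(loopModeName, "game") != 0) return;
        if (!m_gameLoopInitialized) {
            m_gameLoopInitialized = true;
            // Stands in for CALL_GLOBAL_LISTENER(OnGameLoopInitialized()).
            ++m_listenerCalls;
        }
    }

    bool IsGameLoopInitialized() const { return m_gameLoopInitialized; }
    int ListenerCalls() const { return m_listenerCalls; }

private:
    bool m_gameLoopInitialized = false;  // per-instance state
    int m_listenerCalls = 0;             // counts simulated listener calls
};
```

Two instances of this class track their initialization independently, so setting the flag on one has no effect on the other.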

Option 2: Atomic Protection

Protect the variable with atomic operations:

// In globals.h
#include <atomic>
extern std::atomic<bool> gameLoopInitialized;

// In mm_plugin.cpp
bool expected = false;
if (globals::gameLoopInitialized.compare_exchange_strong(expected, true)) {
    CALL_GLOBAL_LISTENER(OnGameLoopInitialized());
}
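
The compare-exchange pattern above can be exercised in isolation with the sketch below. The names g_gameLoopInitialized, g_listenerCalls, RegisterGameLoopOnce, and RunConcurrentRegistrations are hypothetical and exist only for this example. Note the assumption: std::atomic orders accesses between threads of one process, so the sketch models concurrent callers inside a single process. Whether the same guarantee would hold between separate server processes depends on the flag actually residing in memory they share, which this example does not model.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-ins for globals::gameLoopInitialized and the listener.
std::atomic<bool> g_gameLoopInitialized{false};
std::atomic<int> g_listenerCalls{0};

// Only the caller that flips the flag from false to true runs the listener.
void RegisterGameLoopOnce() {
    bool expected = false;
    if (g_gameLoopInitialized.compare_exchange_strong(expected, true)) {
        // Stands in for CALL_GLOBAL_LISTENER(OnGameLoopInitialized()).
        g_listenerCalls.fetch_add(1);
    }
}

// Calls RegisterGameLoopOnce() from threadCount threads and returns how
// many times the simulated listener ran in total.
int RunConcurrentRegistrations(int threadCount) {
    assert(threadCount > 0);
    g_gameLoopInitialized.store(false);
    g_listenerCalls.store(0);

    std::vector<std::thread> threads;
    threads.reserve(static_cast<std::size_t>(threadCount));
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back(RegisterGameLoopOnce);
    }
    for (auto& t : threads) {
        t.join();
    }
    return g_listenerCalls.load();
}
```

However many threads race on the flag, exactly one compare_exchange_strong call succeeds, so the listener runs once per reset of the flag.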

Option 3: Process Synchronization

Use an inter-process mutex or semaphore. This is more complex and may impact performance.

Additional Notes

  • Issue occurs when servers share the same CounterStrikeSharp installation directory

Reproduction Rate

Out of 5 concurrent server starts:

  • 3-4 servers: initialize successfully (5-8 seconds)
  • 1-2 servers: hang indefinitely
