Slow startup and code re-execution in multiprocessing contexts #5684

@falkoschindler

This umbrella issue consolidates multiple reports about NiceGUI's behavior when Python spawns subprocesses. The related tickets below will be closed as duplicates, pointing here for tracking and discussion.

Type        Link   Summary
Issue       #3356  Native mode startup 2x slower due to webview subprocess
PR          #3365  Proposes separate nicegui_webview package
Discussion  #4353  multiprocessing.Pool 4-5x slower with NiceGUI
Issue       #4412  Multiprocessing causes code reload (Windows)
PR          #4429  Profile imports - 70% is Matplotlib, mocking experiments
Discussion  #4465  Better FAQ for "code executed twice"
PR          #4471  PoC: fast cpu_bound with lazy dummy modules
Issue       #4542  run.cpu_bound causes code reload on first call
Issue       #4891  Manager() causes delay
PR          #5303  Lazy imports via lazy-imports library
Discussion  #5574  Proposal: use loky executor to avoid slow first call

Problem

When Python spawns subprocesses (for native mode, run.cpu_bound(), multiprocessing.Pool, Manager(), etc.), each subprocess re-imports the main module. Because NiceGUI performs significant initialization at import time (creating the App, importing FastAPI, Matplotlib, and ~200 element classes), this causes:

  1. Slow startup - especially in native mode where the webview runs in a subprocess
  2. Code executed multiple times - user code outside if __name__ == '__main__' runs in each subprocess
  3. Slow first cpu_bound call - worker processes must import NiceGUI before executing the task
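
The re-import can be reproduced with the standard library alone. The following is a minimal sketch, not NiceGUI code: it writes a throwaway script (the file name demo_main.py and the helper run_demo are made up for illustration), forces the "spawn" start method, and collects what the module-level print reports in each process.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical demo script: module-level code plus one spawned child process.
SCRIPT = textwrap.dedent('''
    import multiprocessing as mp

    print(f'module level: {__name__}', flush=True)  # runs once per process

    def work():
        pass

    if __name__ == '__main__':
        mp.set_start_method('spawn')  # the child starts a fresh interpreter
        process = mp.Process(target=work)
        process.start()
        process.join()
''')


def run_demo() -> list[str]:
    """Run the demo script in a separate interpreter and return its output lines."""
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / 'demo_main.py'
        path.write_text(SCRIPT)
        result = subprocess.run([sys.executable, str(path)],
                                capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


if __name__ == '__main__':
    for line in run_demo():
        print(line)
```

The script should report the module-level line twice, once with `__main__` and once with `__mp_main__`. Anything expensive placed at module level, such as a heavy import, is paid again in the child.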

Affected Scenarios

  1. Code executed twice with reload=True

    The reload mechanism spawns a subprocess, causing module-level code to run twice.

    from nicegui import ui
    
    print('Running main', f'{__name__=}')
    
    ui.label('Hello')
    ui.run()

    Output:

    Running main __name__='__main__'
    Running main __name__='__mp_main__'
    NiceGUI ready to go on http://localhost:8080, http://192.168.7.151:8080, and http://192.168.7.168:8080
    
  2. Native mode (native=True) causes double initialization despite reload=False

    The webview subprocess re-imports NiceGUI, causing all import-time initialization to run twice.

    import time
    
    start = time.perf_counter()
    from nicegui import ui
    print(f'Import took {time.perf_counter() - start:.2f} seconds')
    
    @ui.page('/')
    def root():
        ui.label('Hello!')
    
    ui.run(native=True, reload=False)

    Output:

    Import took 0.52 seconds
    Import took 0.50 seconds   # <-- again in webview subprocess
    

    With native=False, the import message only prints once. The doubled import time causes a noticeable delay before the window appears.

  3. Manager() is slow

    Creating a multiprocessing Manager spawns a subprocess that imports NiceGUI.

    import time
    from multiprocessing import Manager
    
    from nicegui import ui
    
    
    def root() -> None:
        @ui.button('Create a manager').on_click
        def _():
            start = time.perf_counter()
            Manager()
            print(f'Manager() took {time.perf_counter() - start:.2f}s')
    
    
    ui.run(root)

    Output:

    Manager() took 0.62s
    
  4. run.cpu_bound() is slow on first call

    Worker processes import NiceGUI before executing the task. Subsequent calls are fast because the process pool is reused.

    import asyncio
    import time
    from nicegui import run, ui
    
    def wait():
        time.sleep(1.0)
    
    def root() -> None:
        @ui.button('Run 10 CPU-bound processes').on_click
        async def _():
            start = time.perf_counter()
            await asyncio.gather(*[run.cpu_bound(wait) for _ in range(10)])
            print(f'All done in {time.perf_counter() - start:.2f} s')
    
    ui.run(root)

    Output:

    All done in 1.63 s
    All done in 1.01 s
    All done in 1.01 s
    ...
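
Since only the first call pays the worker startup cost, one conceivable mitigation is to pay it at application startup instead of on the first user action. The sketch below shows the idea against the generic concurrent.futures.Executor interface. The helpers warm_up and _noop are hypothetical and not part of NiceGUI, and whether NiceGUI's own pool can be warmed this way is an assumption that is not confirmed here.

```python
from concurrent.futures import Executor, ThreadPoolExecutor, wait


def _noop() -> None:
    """Do nothing; running it only forces a worker to start and finish its imports."""


def warm_up(executor: Executor, tasks: int) -> int:
    """Submit `tasks` no-ops, block until they finish, and return how many completed."""
    futures = [executor.submit(_noop) for _ in range(tasks)]
    done, _pending = wait(futures)
    return len(done)


if __name__ == '__main__':
    # A thread pool keeps the demo cheap; ProcessPoolExecutor offers the same interface.
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(warm_up(pool, tasks=2))
```

Submitting as many no-ops as there are workers does not strictly guarantee that every worker has been started, because a fast worker may pick up more than one task.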
    

Solution Direction

Two approaches are being explored:

  1. Lazy imports (#5303): Uses the lazy-imports library to defer heavy imports until actually needed. Achieves ~85% reduction in import time without API changes.

  2. Alternative executor (#5574): Proposes using loky instead of ProcessPoolExecutor. Avoids slow first call and removes need for main guards by using cloudpickle for serialization.
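
To illustrate what deferring an import means, here is a minimal sketch based on the standard library's importlib.util.LazyLoader. It is not the lazy-imports library used in #5303 and not how NiceGUI implements it; lazy_import is a hypothetical helper, and colorsys merely stands in for a heavy module.

```python
import importlib.util
import sys
from types import ModuleType


def lazy_import(name: str) -> ModuleType:
    """Return `name` as a module whose body runs on first attribute access."""
    if name in sys.modules:
        return sys.modules[name]  # already imported, nothing to defer
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f'No module named {name!r}')
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # only prepares the module; execution is deferred
    return module


colorsys = lazy_import('colorsys')  # cheap: the module body has not run yet
print(colorsys.rgb_to_hls(1.0, 0.0, 0.0))  # first attribute access triggers the real import
```

A subprocess that never touches the deferred module never pays for it, which is the effect the lazy-import approach aims for.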

Key findings from profiling:

  • ~70% of import time is Matplotlib (optional via NICEGUI_INCLUDE_MATPLOTLIB=false)
  • FastAPI import is expensive but required for the main process
  • Subprocesses don't need most of NiceGUI's functionality
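
Numbers like these can be checked locally. The sketch below times an import in a fresh interpreter so that modules already cached in the current process do not distort the result. The helper import_seconds is hypothetical, and the standard-library modules in the loop are placeholders for nicegui or matplotlib where those are installed.

```python
import subprocess
import sys


def import_seconds(module: str) -> float:
    """Time `import module` in a fresh interpreter, excluding interpreter startup."""
    code = (
        'import time\n'
        'start = time.perf_counter()\n'
        f'import {module}\n'
        'print(time.perf_counter() - start)\n'
    )
    result = subprocess.run([sys.executable, '-c', code],
                            capture_output=True, text=True, check=True)
    # Use the last line in case the imported module prints something itself.
    return float(result.stdout.strip().splitlines()[-1])


if __name__ == '__main__':
    for name in ('json', 'decimal'):  # placeholders; swap in the module of interest
        print(f'{name}: {import_seconds(name):.3f} s')
```

For a per-module breakdown, running the interpreter with `python -X importtime` prints the cumulative import time of every module to stderr.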

Workarounds (for users)

Until this is fixed, users can mitigate the issue as follows:

  1. Use reload=False and guard imports:

    if __name__ == '__main__':
        from nicegui import ui
        ui.run(reload=False)
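  2. Set an environment variable before spawning worker processes, which inherit it, and skip the NiceGUI import when it is set:

    import os
    os.environ['IS_WORKER'] = '1'  # inherited by subprocesses started after this point
    # ... spawn subprocess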
    
    # In main module:
    if not os.environ.get('IS_WORKER'):
        from nicegui import ui
        # ...
  3. Disable Matplotlib if not needed:

    export NICEGUI_INCLUDE_MATPLOTLIB=false
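
The environment-variable workarounds rely on child processes inheriting the parent's environment. A minimal sketch to confirm that behavior follows, using a plain Python child process instead of a NiceGUI worker. The flag name IS_WORKER comes from the workaround; the helper flag_seen_by_child is hypothetical.

```python
import os
import subprocess
import sys

FLAG = 'IS_WORKER'  # flag name taken from the workaround


def flag_seen_by_child(value: str) -> str:
    """Start a fresh interpreter with FLAG set and return the value it observes."""
    env = {**os.environ, FLAG: value}  # copy of the parent environment plus the flag
    code = f'import os; print(os.environ.get({FLAG!r}))'
    result = subprocess.run([sys.executable, '-c', code],
                            env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == '__main__':
    print(flag_seen_by_child('1'))
```

The same mechanism applies to NICEGUI_INCLUDE_MATPLOTLIB: it has to be present in the environment before the process that imports NiceGUI starts, or at least before the import statement runs.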

Metadata

Labels: analysis (Status: Requires team/community input), bug (Type/scope: Incorrect behavior in existing functionality), 🟡 medium (Priority: Relevant, but not essential)

Status: In Progress