Skip to content

Add notebook testing support and developer testing guide#4075

Open
chesterxgchen wants to merge 4 commits intoNVIDIA:mainfrom
chesterxgchen:note_book_test
Open

Add notebook testing support and developer testing guide#4075
chesterxgchen wants to merge 4 commits intoNVIDIA:mainfrom
chesterxgchen:note_book_test

Conversation

@chesterxgchen
Copy link
Collaborator

@chesterxgchen chesterxgchen commented Jan 30, 2026

Summary

This PR enhances runtest.sh with new testing capabilities, transparent dependency caching for faster development iteration, and comprehensive documentation for developers.

Features

Developer Testing Improvements

  • Dependency caching: First run installs dependencies; subsequent runs skip (auto-expires after 7 days or 50 runs)
  • Verbose mode (-v): Detailed pytest output for all test types
  • --fresh-deps: Force dependency reinstall when needed
  • Command printing: Shows exact command being executed for debugging

Notebook Testing (-n option)

  • Run Jupyter notebooks as tests using nbmake
  • Default kernel detection (prefers python3 if available)
  • Configurable timeout, kernel, and output cleaning
  • Cell skipping via tags (skip-execution, skip, colab)
  • Automatic notebook backup and restore during tests

Usage

# Full test suite (default behavior)
./runtest.sh              # License check + style check + unit tests with coverage

# Individual commands
./runtest.sh -u           # Unit tests only (faster iteration)
./runtest.sh -n           # Notebook tests
./runtest.sh -s           # Check code formatting
./runtest.sh -f           # Fix code formatting

# Options (work with any command)
./runtest.sh -u -v        # Unit tests with verbose output
./runtest.sh -n -v        # Notebook tests with verbose output
./runtest.sh --fresh-deps # Force dependency reinstall

Dependency Caching

The script automatically caches dependency installation:

  • First run: Installs dependencies, creates cache marker
  • Subsequent runs: Skips installation (displays cache status)
  • Auto-refresh: After 7 days or 50 runs
  • Manual refresh: --fresh-deps or --clean

Example output:

Dependencies cached (5/50 runs, 2/7 days) - skipping install

Documentation

  • developer_testing.rst: General runtest.sh usage guide covering all test commands
  • notebook_testing.rst: Notebook-specific testing guide

Note: Not all notebooks are ready for automated testing yet. Some require external infrastructure (running FLARE servers, provisioned environments). Notebook test coverage is being improved over time.

Files Changed

  • runtest.sh: Added notebook testing, dependency caching, verbose mode
  • docs/programming_guide/developer_testing.rst: New general testing guide
  • docs/programming_guide/notebook_testing.rst: New notebook testing guide
  • docs/programming_guide.rst: Updated toctree

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Jan 30, 2026

Greptile Overview

Greptile Summary

Enhanced runtest.sh with notebook testing capabilities, transparent dependency caching (7-day/50-run expiry), and comprehensive developer documentation.

Key Changes:

  • Added -n option for notebook testing using nbmake with configurable timeout, kernel, and output cleaning
  • Implemented dependency caching mechanism with atomic file operations to speed up repeated test runs
  • Added -v verbose mode and --fresh-deps option for development flexibility
  • Created two new RST documentation guides covering general testing and notebook-specific workflows
  • Fixed typo in error message (runtests.sh → runtest.sh)

Implementation Quality:

  • Previous review comments have been addressed (race conditions, kernel validation, empty kernel handling, runtime detection)
  • Atomic file writes using temp files prevent cache corruption
  • Proper numeric validation for marker file contents with fallback to 0
  • Kernel detection deferred to runtime allowing jupyter installation after script load
  • Empty kernel properly handled by not adding --kernel= argument when kernel is empty string

Confidence Score: 4.5/5

  • This PR is safe to merge with very low risk - it adds testing infrastructure and documentation without modifying core functionality
  • Score reflects solid implementation with all previous review concerns addressed, comprehensive documentation, and defensive coding practices. Minor deduction for complexity of the caching mechanism which could benefit from additional integration testing
  • No files require special attention - all previous review comments have been addressed and implementation is clean

Important Files Changed

Filename Overview
runtest.sh Added notebook testing support, dependency caching mechanism, and verbose mode - implementation is solid with previous review comments addressed
docs/programming_guide/developer_testing.rst Comprehensive documentation covering runtest.sh usage, dependency caching, and all testing options
docs/programming_guide/notebook_testing.rst Detailed guide for notebook testing with nbmake, including cell tagging, troubleshooting, and best practices
docs/programming_guide.rst Added two new documentation entries to the toctree for developer and notebook testing guides

Sequence Diagram

sequenceDiagram
    participant User
    participant runtest.sh
    participant maybe_install_deps
    participant should_install_deps
    participant install_deps
    participant notebook_test
    participant get_current_kernel
    participant validate_kernel
    participant pytest

    User->>runtest.sh: ./runtest.sh -n [options]
    runtest.sh->>runtest.sh: Parse arguments (--timeout, --kernel, --nb-clean, -v)
    runtest.sh->>maybe_install_deps: Check dependencies
    
    alt fresh_deps flag set
        maybe_install_deps->>install_deps: Force install
        install_deps->>install_deps: pip install -e .[dev]
        install_deps->>install_deps: Create .deps_marker (timestamp, 0 runs)
    else check cache
        maybe_install_deps->>should_install_deps: Check cache status
        should_install_deps->>should_install_deps: Read .deps_marker file
        
        alt marker missing
            should_install_deps-->>maybe_install_deps: return 0 (install needed)
            maybe_install_deps->>install_deps: Install dependencies
            install_deps->>install_deps: Create .deps_marker
        else cache expired (7 days or 50 runs)
            should_install_deps-->>maybe_install_deps: return 0 (install needed)
            maybe_install_deps->>install_deps: Install dependencies
            install_deps->>install_deps: Reset .deps_marker
        else cache valid
            should_install_deps->>should_install_deps: Increment run count
            should_install_deps-->>maybe_install_deps: return 1 (skip install)
        end
    end

    runtest.sh->>notebook_test: Execute notebook_test(target)
    
    alt kernel not specified
        notebook_test->>get_current_kernel: Auto-detect kernel
        get_current_kernel->>get_current_kernel: Check for python3 kernel
        get_current_kernel-->>notebook_test: Return kernel name or empty
    end
    
    alt kernel specified/detected
        notebook_test->>validate_kernel: Validate kernel exists
        validate_kernel->>validate_kernel: jupyter kernelspec list
        alt kernel not found
            validate_kernel-->>notebook_test: return 1 (error)
            notebook_test->>User: Error: kernel not found
        else kernel valid
            validate_kernel-->>notebook_test: return 0 (valid)
        end
    end
    
    notebook_test->>notebook_test: Build pytest command array
    notebook_test->>notebook_test: Add --nbmake, --nbmake-timeout, --nbmake-clean
    notebook_test->>notebook_test: Add --kernel if specified
    notebook_test->>notebook_test: Add -v if verbose_flag=true
    notebook_test->>pytest: Execute pytest --nbmake [options]
    
    Note over pytest: conftest.py hooks execute:<br/>- filter_notebook()<br/>- Create .backup file<br/>- Remove skip-execution cells<br/>- Update kernel spec<br/>- Execute notebook<br/>- Restore from backup<br/>- Clean outputs per --nbmake-clean
    
    pytest-->>notebook_test: Return exit status
    notebook_test->>notebook_test: report_status()
    notebook_test-->>User: Test results
Loading

greptile-apps[bot]

This comment was marked as outdated.

@greptile-apps

This comment was marked as outdated.

greptile-apps[bot]

This comment was marked as outdated.

@greptile-apps

This comment was marked as outdated.

greptile-apps[bot]

This comment was marked as outdated.

@greptile-apps

This comment was marked as outdated.

@chesterxgchen
Copy link
Collaborator Author

/build

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

…eloper guide

Summary:
This PR enhances runtest.sh with new testing capabilities, transparent dependency
caching for faster development iteration, and comprehensive documentation.

Developer Testing Improvements:
- Dependency caching: First run installs; subsequent runs skip (auto-expires
  after 7 days or 50 runs)
- Verbose mode (-v): Detailed pytest output for all test types
- --fresh-deps: Force dependency reinstall when needed
- Command printing: Shows exact command being executed for debugging

Notebook Testing (-n option):
- Run Jupyter notebooks as tests using nbmake
- Default kernel detection (prefers python3 if available)
- Configurable timeout, kernel, and output cleaning
- Cell skipping via tags (skip-execution, skip, colab)
- Automatic notebook backup and restore during tests

Usage:
  ./runtest.sh              # Full suite (license, style, unit tests)
  ./runtest.sh -u           # Unit tests only (faster iteration)
  ./runtest.sh -n           # Notebook tests
  ./runtest.sh -n -v        # Notebook tests with verbose output
  ./runtest.sh --fresh-deps # Force dependency reinstall

Documentation:
- developer_testing.rst: General runtest.sh usage guide
- notebook_testing.rst: Notebook-specific testing guide

Note: Not all notebooks are ready for automated testing yet.
- Fix script name typo: runtests.sh -> runtest.sh
- Fix missing $ in color variable: {noColor} -> ${noColor}
- Fix race condition: use atomic file operations (temp file + mv) for marker file
- Add kernel validation: validate kernel exists before running notebook tests
- Add numeric validation for marker file values to handle corruption
- Move get_current_kernel() call from script load to notebook_test() runtime
- Remove intermediate kernel_opt variable, use kernel directly
- Allows jupyter to be installed after script loads
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@chesterxgchen
Copy link
Collaborator Author

/build

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant