Automated test suite that runs all notebooks in Docker and checks for errors and empty SELECT queries.
Important: The test suite automatically runs notebooks in the correct order:
- Setup.ipynb runs FIRST (prerequisite - downloads data and initializes environment)
- All other notebooks run independently after Setup completes
Each video notebook (Module1, Module2, Module3) can run completely independently after Setup.
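This ordering can be enforced with a pytest collection hook. The sketch below reorders collected tests so the Setup test runs first; matching on "setup" in the test name is an assumption about how the tests are named, not necessarily the suite's actual implementation:

```python
# conftest.py -- a minimal sketch of forcing the Setup notebook's test to
# run before the module notebook tests. Matching on "setup" in the test
# name is an assumption about the naming convention.

def pytest_collection_modifyitems(session, config, items):
    """Stable-sort collected tests: Setup-related tests first,
    everything else kept in its original order afterwards."""
    items.sort(key=lambda item: 0 if "setup" in item.name.lower() else 1)
```

Because `list.sort` is stable, the relative order of the module notebook tests is preserved.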
```bash
# Install dependencies
pip install -r requirements-test.txt

# Run all notebook tests (Setup runs first automatically)
pytest test_notebooks.py -v

# Test a specific notebook
pytest test_notebooks.py -k "Module1"

# Run in parallel (faster)
pytest test_notebooks.py -n auto

# Run as a standalone Python script
python test_notebooks.py
```

The suite verifies that:
- ✅ All cells execute without errors
- ✅ SELECT queries return data rows (not empty)
- ✅ No Python exceptions or broken cells
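A minimal sketch of the kind of checks the suite performs, assuming each notebook has already been executed (for example with nbconvert's `ExecutePreprocessor`). The helper below only inspects an executed notebook's cell outputs; the field names follow the nbformat JSON schema, but the SELECT heuristic is an illustrative assumption:

```python
def find_failures(nb):
    """Scan an executed notebook (an nbformat-style dict) for cells that
    raised an exception and for SELECT cells that produced no visible
    output. Returns a list of human-readable problem descriptions."""
    problems = []
    for i, cell in enumerate(nb["cells"]):
        if cell.get("cell_type") != "code":
            continue
        outputs = cell.get("outputs", [])
        # Any "error" output means the cell raised an exception.
        if any(o.get("output_type") == "error" for o in outputs):
            problems.append(f"cell {i}: raised an exception")
        # Heuristic: a cell containing a SELECT should print some rows.
        if "select" in cell.get("source", "").lower():
            text = "".join(
                o.get("text", "")
                for o in outputs
                if o.get("output_type") == "stream"
            )
            if not text.strip():
                problems.append(f"cell {i}: SELECT query returned no rows")
    return problems
```

A real check would also need to handle `execute_result` outputs and list-valued stream text; this sketch keeps only the core idea.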
The offline JAR mode allows Spark to use pre-downloaded JARs instead of fetching them from Maven Central at runtime. This is tested by a dedicated CI workflow (.github/workflows/test-offline-jars.yml) and can also be run locally.
```bash
# 1. Download JARs (use --insecure if behind a corporate proxy)
./manual-download-dependencies.sh

# 2. Start services (JARs are auto-mounted from ./jars/)
docker compose up -d

# 3. Verify the config uses local JARs
docker exec jupyter-spark grep "^spark.jars=" /home/jovyan/.sparkconf/spark-defaults.conf

# 4. Run a notebook test to confirm everything works
pytest test_notebooks.py -v -k "test_01_setup or E1_2_DataModeling" --tb=short

# 5. Clean up
docker compose down -v
rm -rf jars/
```

The CI workflow (and a local run of the steps above) verifies that:
- `manual-download-dependencies.sh` downloads all expected JARs
- The generated `spark-defaults.conf` uses `spark.jars=` (local paths) instead of `spark.jars.packages=`
- A notebook runs successfully with Spark loading from local JARs
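In `spark-defaults.conf`, the difference between the two modes looks roughly like this (the dependency coordinate and path below are illustrative placeholders, not the repo's actual JAR list):

```
# Online mode: Spark resolves artifacts from Maven Central at startup
spark.jars.packages=org.example:example-dependency:1.0.0

# Offline mode: the generated config points at pre-downloaded local JARs
spark.jars=/home/jovyan/jars/example-dependency-1.0.0.jar
```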
Prerequisites:
- Docker running with the `jupyter-spark` container
- Python packages: pytest, nbconvert, nbformat (installed via requirements-test.txt)