Testing Infrastructure

This repository includes comprehensive testing infrastructure for validating Jupyter notebooks.

Overview

The tests are organized into two main categories:

Validation Tests (tests/validation/): Repository-wide tests that validate all notebooks automatically
Smoke Tests (tests/examples/): Example-specific tests that verify specific example workflows

Quick Start

Install Test Dependencies

# Install the package with test dependencies
pip install -e ".[test]"

Run All Tests

# Run all tests
pytest

# Run with verbose output
pytest -v

# Run tests in parallel (faster)
pytest -n auto

Run Specific Test Suites

# Run only validation tests (structure, content, syntax)
pytest tests/validation/

# Run only smoke tests (example-specific)
pytest tests/examples/

Get Test Coverage

# Run tests with coverage report
pytest --cov=tests --cov-report=term

# Generate HTML coverage report
pytest --cov=tests --cov-report=html

# Generate both terminal and HTML reports
pytest --cov=tests --cov-report=term --cov-report=html

Directory Structure

tests/
├── conftest.py                              # Shared fixtures and configuration
├── validation/                              # Repository-wide validation tests
│   ├── test_notebook_structure.py          # Basic structure validation
│   ├── test_notebook_content.py            # Content cleanliness validation
│   ├── test_notebook_syntax.py             # Python syntax validation
│   └── test_pyproject_toml.py              # Configuration file validation
│
└── examples/                                # Example-specific smoke tests
    └── knowledge_tuning/
        ├── conftest.py                      # Knowledge-tuning fixtures
        ├── test_smoke.py                    # Structure and consistency tests
        ├── test_knowledge_utils.py          # Utility function tests
        └── mocks/
            └── transformers_mock.py         # Mock transformers for testing

Test Types Explained

Validation Tests

Repository-wide tests in tests/validation/ that run on all notebooks automatically:

Structure Tests (`test_notebook_structure.py`)

✅ JSON Validity: Ensures notebooks are valid JSON
✅ Structure Validation: Validates nbformat schema
✅ Cell Validation: Checks cells have valid types
✅ Metadata Validation: Ensures proper metadata exists
✅ Error Detection: Identifies execution errors in outputs
✅ Type Checking: Validates cell types (code, markdown, raw)

Content Tests (`test_notebook_content.py`)

✅ No Execution Counts: Ensures notebooks have cleared execution counts
✅ No Stored Outputs: Ensures notebooks have cleared outputs
✅ No Empty Code Cells: Prevents empty code cells

Syntax Tests (`test_notebook_syntax.py`)

✅ Import Parseability: Ensures all import statements are parseable
✅ Well-formed Imports: Validates import statements have correct structure
✅ Code Validation: All code cells have valid Python syntax
✅ Shell Commands: Skips shell commands and magic commands appropriately

Metadata Tests (`test_notebook_metadata.py`)

✅ Kernelspec Consistency: Ensures all notebooks use the same kernel
✅ No Environment Metadata: Prevents environment-specific metadata (vscode, colab, etc.)
✅ Standardized Schema: Validates consistent metadata structure across notebooks
✅ Required Sections: Ensures notebooks have setup/import documentation
✅ Cell Ordering: Validates logical cell type ordering

PyProject Tests (`test_pyproject_toml.py`)

✅ Valid TOML: Ensures pyproject.toml files are valid TOML
✅ Required Sections: Validates [project] or [tool] sections exist
✅ Dependencies: Checks dependencies are well-formed
✅ Python Version: Validates requires-python field
✅ No GPU Packages: Ensures GPU-specific packages aren't in required dependencies
✅ Version Consistency: Validates Python version consistency across projects
✅ Venv Build Validation: Uses pip install --dry-run to verify dependencies can be resolved
✅ Conflict Detection: Checks for dependency conflicts without installing packages

Smoke Tests (Knowledge-Tuning)

Example-specific tests in tests/examples/knowledge_tuning/ that verify the knowledge-tuning workflow:

Structure Tests (`test_smoke.py`)

✅ Directory structure validation
✅ Required files present (README, .env.example, pyproject.toml)
✅ All step directories exist (00_Setup through 06_Evaluation)
✅ Notebook structure validation
✅ Import validation (transformers, torch, etc.)
✅ Environment variable usage
✅ Documentation completeness (overview sections)
✅ Consistency across notebooks (Python version)

Utility Function Tests (`test_knowledge_utils.py`)

✅ get_avg_summaries_per_raw_doc: Tests average summary calculation logic
✅ sample_doc_qa: Tests Q&A pair sampling with proper column validation
✅ generate_knowledge_qa_dataset: Tests dataset generation in chat format
✅ count_len_in_tokens: Tests token counting with mocked tokenizers
✅ Data Contract Validation: Ensures all functions validate required columns
✅ JSON Schema Validation: Verifies output follows expected structure
✅ Reasoning Support: Tests functions with and without reasoning columns
✅ Pre-training Flags: Validates unmask column generation

Shared Fixtures

The tests/conftest.py file provides shared fixtures for all tests:

repo_root: Repository root path
all_notebooks: List of all notebooks in examples/
all_pyproject_files: List of all pyproject.toml files
knowledge_tuning_path: Path to knowledge-tuning example
toml_parser: TOML parser (tomllib or tomli)

Continuous Integration

Tests run automatically in GitHub Actions:

Trigger: On push to main, pull requests, or manual dispatch
Workflow: .github/workflows/notebook-tests.yml
Jobs:
- validation-tests: Runs validation tests (structure, content, syntax, metadata, pyproject)
- smoke-tests: Runs example-specific smoke tests
- test-coverage: Generates coverage report (PRs only)

Running Tests Locally

Basic Usage

# Run all tests
pytest

# Run all tests with verbose output
pytest -v

# Run specific test file
pytest tests/validation/test_notebook_structure.py

# Run specific test function
pytest tests/validation/test_notebook_structure.py::test_all_notebooks_found

# Run tests matching a pattern
pytest -k "validation"
pytest -k "structure"
pytest -k "knowledge_tuning"

Selective Test Execution

# Run only validation tests
pytest tests/validation/

# Run only smoke tests
pytest tests/examples/

# Run tests for specific example
pytest tests/examples/knowledge_tuning/

# Run specific validation category
pytest tests/validation/test_notebook_content.py

Get Coverage Locally

# Run tests with coverage (terminal output)
pytest --cov=tests --cov-report=term

# Run with HTML coverage report
pytest --cov=tests --cov-report=html

# Run with both terminal and HTML reports
pytest --cov=tests --cov-report=term --cov-report=html

# Show which lines are missing coverage
pytest --cov=tests --cov-report=term-missing

Test Markers

Mark tests with categories for selective execution:

@pytest.mark.smoke
def test_quick_check():
    pass

@pytest.mark.slow
def test_comprehensive():
    pass

Run specific markers:

pytest -m smoke        # Run only smoke tests
pytest -m "not slow"   # Skip slow tests

Testing Requirements for New Examples

When adding new examples to the repository, they must pass all validation tests automatically. Additionally, consider adding example-specific smoke tests for comprehensive validation.

Automatic Validation (Required)

All examples automatically undergo validation testing without any additional setup:

Notebook Structure Validation
Notebook Content Validation
Notebook Syntax Validation
PyProject.toml Validation

Example-Specific Smoke Tests (Recommended)

For complex examples with multiple steps or critical workflows, add smoke tests in tests/examples/your-example-name/:

Example Structure:

tests/examples/your-example-name/
├── __init__.py
├── conftest.py              # Fixtures for your example
├── test_smoke.py            # Structure and consistency tests
├── test_utils.py            # Utility function tests (if applicable)
└── mocks/                   # Mock heavy dependencies
    └── __init__.py

Reference Implementation:

See tests/examples/knowledge_tuning/ for a complete example of smoke tests including:

File structure validation
Notebook import verification
Documentation completeness checks
Utility function testing with mocked transformers/torch
Configuration consistency validation

Running Tests for New Examples

Verify your example passes all tests:

# Run all validation tests
pytest tests/validation/ -v

# Run validation tests for specific notebooks
pytest tests/validation/ -v -k "your_notebook_name"

# Verify PyProject.toml
pytest tests/validation/test_pyproject_toml.py -v

# Run your smoke tests (if added)
pytest tests/examples/your-example-name/ -v

# Run all tests together
pytest -v

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Testing Infrastructure

Overview

Quick Start

Install Test Dependencies

Run All Tests

Run Specific Test Suites

Get Test Coverage

Directory Structure

Test Types Explained

Validation Tests

Structure Tests (`test_notebook_structure.py`)

Content Tests (`test_notebook_content.py`)

Syntax Tests (`test_notebook_syntax.py`)

Metadata Tests (`test_notebook_metadata.py`)

PyProject Tests (`test_pyproject_toml.py`)

Smoke Tests (Knowledge-Tuning)

Structure Tests (`test_smoke.py`)

Utility Function Tests (`test_knowledge_utils.py`)

Shared Fixtures

Continuous Integration

Running Tests Locally

Basic Usage

Selective Test Execution

Get Coverage Locally

Test Markers

Testing Requirements for New Examples

Automatic Validation (Required)

Example-Specific Smoke Tests (Recommended)

Running Tests for New Examples

FilesExpand file tree

TESTING.md

Latest commit

History

TESTING.md

File metadata and controls

Testing Infrastructure

Overview

Quick Start

Install Test Dependencies

Run All Tests

Run Specific Test Suites

Get Test Coverage

Directory Structure

Test Types Explained

Validation Tests

Structure Tests (test_notebook_structure.py)

Content Tests (test_notebook_content.py)

Syntax Tests (test_notebook_syntax.py)

Metadata Tests (test_notebook_metadata.py)

PyProject Tests (test_pyproject_toml.py)

Smoke Tests (Knowledge-Tuning)

Structure Tests (test_smoke.py)

Utility Function Tests (test_knowledge_utils.py)

Shared Fixtures

Continuous Integration

Running Tests Locally

Basic Usage

Selective Test Execution

Get Coverage Locally

Test Markers

Testing Requirements for New Examples

Automatic Validation (Required)

Example-Specific Smoke Tests (Recommended)

Running Tests for New Examples

Structure Tests (`test_notebook_structure.py`)

Content Tests (`test_notebook_content.py`)

Syntax Tests (`test_notebook_syntax.py`)

Metadata Tests (`test_notebook_metadata.py`)

PyProject Tests (`test_pyproject_toml.py`)

Structure Tests (`test_smoke.py`)

Utility Function Tests (`test_knowledge_utils.py`)