| 1 | +# Testing Infrastructure |
| 2 | + |
| 3 | +This repository includes comprehensive testing infrastructure for validating Jupyter notebooks. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The tests are organized into two main categories: |
| 8 | + |
| 9 | +1. **Validation Tests** (`tests/validation/`): Repository-wide tests that validate **all** notebooks automatically |
| 10 | +2. **Smoke Tests** (`tests/examples/`): Tests that verify the workflow of an individual example |
| 11 | + |
| 12 | +## Quick Start |
| 13 | + |
| 14 | +### Install Test Dependencies |
| 15 | + |
| 16 | +```bash |
| 17 | +# Install the package with test dependencies |
| 18 | +pip install -e ".[test]" |
| 19 | +``` |
| 20 | + |
| 21 | +### Run All Tests |
| 22 | + |
| 23 | +```bash |
| 24 | +# Run all tests |
| 25 | +pytest |
| 26 | + |
| 27 | +# Run with verbose output |
| 28 | +pytest -v |
| 29 | + |
| 30 | +# Run tests in parallel (faster) |
| 31 | +pytest -n auto |
| 32 | +``` |
| 33 | + |
| 34 | +### Run Specific Test Suites |
| 35 | + |
| 36 | +```bash |
| 37 | +# Run only validation tests (structure, content, syntax) |
| 38 | +pytest tests/validation/ |
| 39 | + |
| 40 | +# Run only smoke tests (example-specific) |
| 41 | +pytest tests/examples/ |
| 42 | +``` |
| 43 | + |
| 44 | +### Get Test Coverage |
| 45 | + |
| 46 | +```bash |
| 47 | +# Run tests with coverage report |
| 48 | +pytest --cov=tests --cov-report=term |
| 49 | + |
| 50 | +# Generate HTML coverage report |
| 51 | +pytest --cov=tests --cov-report=html |
| 52 | + |
| 53 | +# Generate both terminal and HTML reports |
| 54 | +pytest --cov=tests --cov-report=term --cov-report=html |
| 55 | +``` |
| 56 | + |
| 57 | +## Directory Structure |
| 58 | + |
| 59 | +```text |
| 60 | +tests/ |
| 61 | +├── conftest.py # Shared fixtures and configuration |
| 62 | +├── validation/ # Repository-wide validation tests |
| 63 | +│ ├── test_notebook_structure.py # Basic structure validation |
| 64 | +│ ├── test_notebook_content.py # Content cleanliness validation |
| 65 | +│ ├── test_notebook_syntax.py # Python syntax validation |
| 66 | +│ └── test_pyproject_toml.py # Configuration file validation |
| 67 | +│ |
| 68 | +└── examples/ # Example-specific smoke tests |
| 69 | +    └── knowledge_tuning/ |
| 70 | +        ├── conftest.py # Knowledge-tuning fixtures |
| 71 | +        ├── test_smoke.py # Structure and consistency tests |
| 72 | +        ├── test_knowledge_utils.py # Utility function tests |
| 73 | +        └── mocks/ |
| 74 | +            └── transformers_mock.py # Mock transformers for testing |
| 75 | +``` |
| 76 | + |
| 77 | +## Test Types Explained |
| 78 | + |
| 79 | +### Validation Tests |
| 80 | + |
| 81 | +Repository-wide tests in `tests/validation/` that run on **all notebooks** automatically: |
| 82 | + |
| 83 | +#### Structure Tests (`test_notebook_structure.py`) |
| 84 | + |
| 85 | +- ✅ **JSON Validity**: Ensures notebooks are valid JSON |
| 86 | +- ✅ **Structure Validation**: Validates nbformat schema |
| 87 | +- ✅ **Cell Validation**: Checks cells have valid types |
| 88 | +- ✅ **Metadata Validation**: Ensures proper metadata exists |
| 89 | +- ✅ **Error Detection**: Identifies execution errors in outputs |
| 90 | +- ✅ **Type Checking**: Validates cell types (code, markdown, raw) |
| 91 | + |
| 92 | +#### Content Tests (`test_notebook_content.py`) |
| 93 | + |
| 94 | +- ✅ **No Execution Counts**: Ensures notebooks have cleared execution counts |
| 95 | +- ✅ **No Stored Outputs**: Ensures notebooks have cleared outputs |
| 96 | +- ✅ **No Empty Code Cells**: Prevents empty code cells |
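The three content checks above can be sketched with the standard library alone. This is a minimal illustration, not the code in `test_notebook_content.py`; the helper name `find_content_issues` and the message wording are hypothetical.

```python
import json


def find_content_issues(notebook_json: str) -> list[str]:
    """Return a list of content problems for one notebook given as JSON text.

    Hypothetical helper for illustration; the real tests may be organized
    differently.
    """
    notebook = json.loads(notebook_json)
    issues = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # only code cells carry execution counts and outputs
        if cell.get("execution_count") is not None:
            issues.append(f"cell {index}: execution_count is not cleared")
        if cell.get("outputs"):
            issues.append(f"cell {index}: stored outputs present")
        # The notebook format stores source as a string or a list of lines.
        source = cell.get("source", "")
        text = "".join(source) if isinstance(source, list) else source
        if not text.strip():
            issues.append(f"cell {index}: empty code cell")
    return issues
```

A test could read each notebook file as text, pass it to this helper, and assert that the returned list is empty.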
| 97 | + |
| 98 | +#### Syntax Tests (`test_notebook_syntax.py`) |
| 99 | + |
| 100 | +- ✅ **Import Parseability**: Ensures all import statements are parseable |
| 101 | +- ✅ **Well-formed Imports**: Validates import statements have correct structure |
| 102 | +- ✅ **Code Validation**: All code cells have valid Python syntax |
| 103 | +- ✅ **Shell Commands**: Skips shell commands and magic commands appropriately |
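One way to validate syntax while skipping shell and magic commands is to neutralize those lines before handing the cell to `ast.parse`. The sketch below assumes that shell commands start with `!` and magics with `%`, and that a cell starting with `%%` is not Python at all; the function names are hypothetical and the real tests may apply different rules.

```python
import ast


def strip_ipython_lines(source: str) -> str:
    """Replace `!shell` and `%magic` lines with `pass`, keeping indentation."""
    kept = []
    for line in source.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(("!", "%")):
            indent = line[: len(line) - len(stripped)]
            kept.append(f"{indent}pass")  # keeps enclosing blocks non-empty
        else:
            kept.append(line)
    return "\n".join(kept)


def has_valid_syntax(source: str) -> bool:
    """Return True if a code cell parses as Python after skipping magics."""
    first_line = source.lstrip().splitlines()[0] if source.strip() else ""
    if first_line.startswith("%%"):
        return True  # cell magic: the body is not necessarily Python
    try:
        ast.parse(strip_ipython_lines(source))
    except SyntaxError:
        return False
    return True
```

A known limitation of this simple filter: a continuation line that begins with the `%` operator would be mistaken for a magic command.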
| 104 | + |
| 105 | +#### Metadata Tests (`test_notebook_metadata.py`) |
| 106 | + |
| 107 | +- ✅ **Kernelspec Consistency**: Ensures all notebooks use the same kernel |
| 108 | +- ✅ **No Environment Metadata**: Prevents environment-specific metadata (vscode, colab, etc.) |
| 109 | +- ✅ **Standardized Schema**: Validates consistent metadata structure across notebooks |
| 110 | +- ✅ **Required Sections**: Ensures notebooks have setup/import documentation |
| 111 | +- ✅ **Cell Ordering**: Validates logical cell type ordering |
| 112 | + |
| 113 | +#### PyProject Tests (`test_pyproject_toml.py`) |
| 114 | + |
| 115 | +- ✅ **Valid TOML**: Ensures pyproject.toml files are valid TOML |
| 116 | +- ✅ **Required Sections**: Validates [project] or [tool] sections exist |
| 117 | +- ✅ **Dependencies**: Checks dependencies are well-formed |
| 118 | +- ✅ **Python Version**: Validates requires-python field |
| 119 | +- ✅ **No GPU Packages**: Ensures GPU-specific packages aren't in required dependencies |
| 120 | +- ✅ **Version Consistency**: Validates Python version consistency across projects |
| 121 | +- ✅ **Venv Build Validation**: Uses `pip install --dry-run` to verify dependencies can be resolved |
| 122 | +- ✅ **Conflict Detection**: Checks for dependency conflicts without installing packages |
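A few of the static checks above (valid TOML, required sections, `requires-python`, well-formed dependencies) could look roughly like this. It is a sketch under assumptions: the function name `check_pyproject` and the exact rules are hypothetical, and the `tomli` fallback is only needed on Python versions before 3.11.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # assumed fallback for older interpreters


def check_pyproject(text: str) -> list[str]:
    """Return a list of problems found in the text of a pyproject.toml file."""
    try:
        data = tomllib.loads(text)
    except tomllib.TOMLDecodeError as exc:
        return [f"invalid TOML: {exc}"]

    problems = []
    if "project" not in data and "tool" not in data:
        problems.append("missing [project] or [tool] section")

    project = data.get("project", {})
    # Assumption: requires-python is expected whenever [project] exists.
    if "project" in data and "requires-python" not in project:
        problems.append("missing requires-python")

    deps = project.get("dependencies", [])
    if not isinstance(deps, list) or not all(isinstance(d, str) for d in deps):
        problems.append("dependencies must be a list of strings")
    return problems
```

The dependency-resolution checks (`pip install --dry-run`) need a subprocess call and network access, so they are left out of this sketch.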
| 123 | + |
| 124 | +### Smoke Tests (Knowledge-Tuning) |
| 125 | + |
| 126 | +Example-specific tests in `tests/examples/knowledge_tuning/` that verify the knowledge-tuning workflow: |
| 127 | + |
| 128 | +#### Structure Tests (`test_smoke.py`) |
| 129 | + |
| 130 | +- ✅ Directory structure validation |
| 131 | +- ✅ Required files present (README, .env.example, pyproject.toml) |
| 132 | +- ✅ All step directories exist (00_Setup through 06_Evaluation) |
| 133 | +- ✅ Notebook structure validation |
| 134 | +- ✅ Import validation (transformers, torch, etc.) |
| 135 | +- ✅ Environment variable usage |
| 136 | +- ✅ Documentation completeness (overview sections) |
| 137 | +- ✅ Consistency across notebooks (Python version) |
| 138 | + |
| 139 | +#### Utility Function Tests (`test_knowledge_utils.py`) |
| 140 | + |
| 141 | +- ✅ **get_avg_summaries_per_raw_doc**: Tests average summary calculation logic |
| 142 | +- ✅ **sample_doc_qa**: Tests Q&A pair sampling with proper column validation |
| 143 | +- ✅ **generate_knowledge_qa_dataset**: Tests dataset generation in chat format |
| 144 | +- ✅ **count_len_in_tokens**: Tests token counting with mocked tokenizers |
| 145 | +- ✅ **Data Contract Validation**: Ensures all functions validate required columns |
| 146 | +- ✅ **JSON Schema Validation**: Verifies output follows expected structure |
| 147 | +- ✅ **Reasoning Support**: Tests functions with and without reasoning columns |
| 148 | +- ✅ **Pre-training Flags**: Validates unmask column generation |
| 149 | + |
| 150 | +## Shared Fixtures |
| 151 | + |
| 152 | +The [tests/conftest.py](tests/conftest.py) file provides shared fixtures for all tests: |
| 153 | + |
| 154 | +- `repo_root`: Repository root path |
| 155 | +- `all_notebooks`: List of all notebooks in examples/ |
| 156 | +- `all_pyproject_files`: List of all pyproject.toml files |
| 157 | +- `knowledge_tuning_path`: Path to knowledge-tuning example |
| 158 | +- `toml_parser`: TOML parser (tomllib or tomli) |
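As an illustration of what a fixture such as `all_notebooks` might wrap, the sketch below collects notebooks with `pathlib`. The helper name `find_notebooks` and the decision to skip `.ipynb_checkpoints` directories are assumptions, not details confirmed by `tests/conftest.py`.

```python
from pathlib import Path


def find_notebooks(examples_dir: Path) -> list[Path]:
    """Return all .ipynb files under examples_dir in a stable order.

    Hypothetical helper: checkpoint copies created by Jupyter are skipped
    so each notebook is validated only once.
    """
    return sorted(
        path
        for path in examples_dir.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )
```

In a `conftest.py`, a function like this would typically be wrapped in a `@pytest.fixture` (often session-scoped) that passes in the repository's `examples/` directory.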
| 159 | + |
| 160 | +## Continuous Integration |
| 161 | + |
| 162 | +Tests run automatically in GitHub Actions: |
| 163 | + |
| 164 | +- **Trigger**: On push to main, pull requests, or manual dispatch |
| 165 | +- **Workflow**: [.github/workflows/notebook-tests.yml](.github/workflows/notebook-tests.yml) |
| 166 | +- **Jobs**: |
| 167 | + - **validation-tests**: Runs validation tests (structure, content, syntax, metadata, pyproject) |
| 168 | + - **smoke-tests**: Runs example-specific smoke tests |
| 169 | + - **test-coverage**: Generates coverage report (PRs only) |
| 170 | + |
| 171 | +## Running Tests Locally |
| 172 | + |
| 173 | +### Basic Usage |
| 174 | + |
| 175 | +```bash |
| 176 | +# Run all tests |
| 177 | +pytest |
| 178 | + |
| 179 | +# Run all tests with verbose output |
| 180 | +pytest -v |
| 181 | + |
| 182 | +# Run specific test file |
| 183 | +pytest tests/validation/test_notebook_structure.py |
| 184 | + |
| 185 | +# Run specific test function |
| 186 | +pytest tests/validation/test_notebook_structure.py::test_all_notebooks_found |
| 187 | + |
| 188 | +# Run tests matching a pattern |
| 189 | +pytest -k "validation" |
| 190 | +pytest -k "structure" |
| 191 | +pytest -k "knowledge_tuning" |
| 192 | +``` |
| 193 | + |
| 194 | +### Selective Test Execution |
| 195 | + |
| 196 | +```bash |
| 197 | +# Run only validation tests |
| 198 | +pytest tests/validation/ |
| 199 | + |
| 200 | +# Run only smoke tests |
| 201 | +pytest tests/examples/ |
| 202 | + |
| 203 | +# Run tests for specific example |
| 204 | +pytest tests/examples/knowledge_tuning/ |
| 205 | + |
| 206 | +# Run specific validation category |
| 207 | +pytest tests/validation/test_notebook_content.py |
| 208 | +``` |
| 209 | + |
| 210 | +### Get Coverage Locally |
| 211 | + |
| 212 | +```bash |
| 213 | +# Run tests with coverage (terminal output) |
| 214 | +pytest --cov=tests --cov-report=term |
| 215 | + |
| 216 | +# Run with HTML coverage report |
| 217 | +pytest --cov=tests --cov-report=html |
| 218 | + |
| 219 | +# Run with both terminal and HTML reports |
| 220 | +pytest --cov=tests --cov-report=term --cov-report=html |
| 221 | + |
| 222 | +# Show which lines are missing coverage |
| 223 | +pytest --cov=tests --cov-report=term-missing |
| 224 | +``` |
| 225 | + |
| 226 | +## Test Markers |
| 227 | + |
| 228 | +Mark tests with categories for selective execution: |
| 229 | + |
| 230 | +```python |
| 231 | +@pytest.mark.smoke |
| 232 | +def test_quick_check(): |
| 233 | +    pass |
| 234 | + |
| 235 | +@pytest.mark.slow |
| 236 | +def test_comprehensive(): |
| 237 | +    pass |
| 238 | +``` |
| 239 | + |
| 240 | +Run specific markers: |
| 241 | + |
| 242 | +```bash |
| 243 | +pytest -m smoke # Run only tests marked smoke |
| 244 | +pytest -m "not slow" # Skip slow tests |
| 245 | +``` |
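Custom markers such as `smoke` and `slow` should be registered, otherwise pytest warns about unknown marks. One option is the `pytest_configure` hook in a `conftest.py`, sketched below; the marker descriptions are illustrative, and this repository may already register its markers elsewhere (for example in `pyproject.toml`).

```python
# conftest.py (sketch): register custom markers so pytest recognizes them.

# Illustrative descriptions; adjust to match how the markers are actually used.
MARKERS = {
    "smoke": "quick checks that verify basic example structure",
    "slow": "long-running tests that can be skipped with -m 'not slow'",
}


def pytest_configure(config):
    """pytest hook: add each marker to the `markers` ini setting."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

After registration, `pytest --markers` lists both markers together with their descriptions.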
| 246 | + |
| 247 | +## Testing Requirements for New Examples |
| 248 | + |
| 249 | +New examples added to the repository must pass all validation tests, which run automatically. Also consider adding example-specific smoke tests for more thorough coverage. |
| 250 | + |
| 251 | +### Automatic Validation (Required) |
| 252 | + |
| 253 | +All examples automatically undergo validation testing without any additional setup: |
| 254 | + |
| 255 | +- **Notebook Structure Validation** |
| 256 | +- **Notebook Content Validation** |
| 257 | +- **Notebook Syntax Validation** |
| 258 | +- **PyProject.toml Validation** |
| 259 | + |
| 260 | +### Example-Specific Smoke Tests (Recommended) |
| 261 | + |
| 262 | +For complex examples with multiple steps or critical workflows, add smoke tests in `tests/examples/your-example-name/`: |
| 263 | + |
| 264 | +**Example Structure:** |
| 265 | + |
| 266 | +```text |
| 267 | +tests/examples/your-example-name/ |
| 268 | +├── __init__.py |
| 269 | +├── conftest.py # Fixtures for your example |
| 270 | +├── test_smoke.py # Structure and consistency tests |
| 271 | +├── test_utils.py # Utility function tests (if applicable) |
| 272 | +└── mocks/ # Mock heavy dependencies |
| 273 | +    └── __init__.py |
| 274 | +``` |
| 275 | + |
| 276 | +**Reference Implementation:** |
| 277 | + |
| 278 | +See [tests/examples/knowledge_tuning/](tests/examples/knowledge_tuning/) for a complete example of smoke tests including: |
| 279 | + |
| 280 | +- File structure validation |
| 281 | +- Notebook import verification |
| 282 | +- Documentation completeness checks |
| 283 | +- Utility function testing with mocked transformers/torch |
| 284 | +- Configuration consistency validation |
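As a starting point for a new `test_smoke.py`, a file-structure check can be a small helper plus one assertion. The sketch below is hypothetical: the helper name, the `README.md` filename, and the default file list are assumptions to adapt to your example.

```python
from pathlib import Path

# Assumed defaults; adjust to the files your example actually requires.
REQUIRED_FILES = ("README.md", ".env.example", "pyproject.toml")


def missing_required_files(example_dir: Path, required=REQUIRED_FILES) -> list[str]:
    """Return the names of required files that are absent from example_dir."""
    return [name for name in required if not (example_dir / name).is_file()]
```

A smoke test would then assert that `missing_required_files(example_path)` returns an empty list, so a failure message names exactly which files are missing.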
| 285 | + |
| 286 | +### Running Tests for New Examples |
| 287 | + |
| 288 | +Verify your example passes all tests: |
| 289 | + |
| 290 | +```bash |
| 291 | +# Run all validation tests |
| 292 | +pytest tests/validation/ -v |
| 293 | + |
| 294 | +# Run validation tests for specific notebooks |
| 295 | +pytest tests/validation/ -v -k "your_notebook_name" |
| 296 | + |
| 297 | +# Verify PyProject.toml |
| 298 | +pytest tests/validation/test_pyproject_toml.py -v |
| 299 | + |
| 300 | +# Run your smoke tests (if added) |
| 301 | +pytest tests/examples/your-example-name/ -v |
| 302 | + |
| 303 | +# Run all tests together |
| 304 | +pytest -v |
| 305 | +``` |