Purpose: Comprehensive test suite execution and validation for the GNN processing pipeline
Pipeline Step: Step 2: Test suite execution (2_tests.py)
Category: Testing / Quality Assurance
Status: ✅ Production Ready
Version: 1.0.0
Last Updated: 2026-03-24
- Comprehensive test suite execution
- Test result collection and analysis
- Coverage analysis and reporting
- Performance testing and benchmarking
- Test environment management and validation
- Multi-level test execution (unit, integration, performance)
- Comprehensive test reporting and analysis
- Coverage analysis and optimization
- Performance benchmarking and profiling
- Test environment validation and setup
run_tests(logger, output_dir, verbose=False, include_slow=False, fast_only=True, comprehensive=False, generate_coverage=False, auto_fallback=True) -> bool
Description: Main test execution function called by orchestrator (2_tests.py). Routes to appropriate test execution mode based on parameters.
Parameters:
- `logger` (logging.Logger): Logger instance for progress reporting
- `output_dir` (Path): Output directory for test results
- `verbose` (bool): Enable verbose output (default: False)
- `include_slow` (bool): Include slow tests (deprecated, use `comprehensive` instead) (default: False)
- `fast_only` (bool): Run only fast tests (default: True)
- `comprehensive` (bool): Run comprehensive test suite - all tests (default: False)
- `generate_coverage` (bool): Generate coverage reports (default: False)
- `auto_fallback` (bool): If fast mode collects zero tests, retry with comprehensive mode (default: True)
Returns: True if tests passed, False otherwise
Behavior:
- If `comprehensive=True`: Runs all tests via `run_comprehensive_tests()`
- If `fast_only=True` and `comprehensive=False`: Runs fast tests via `run_fast_pipeline_tests()`. If that fails, `auto_fallback=True`, and the execution report shows zero tests collected, falls back to `run_comprehensive_tests()`
- Otherwise: Runs reliable fast tests via `run_fast_reliable_tests()`
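The routing above can be sketched as follows. This is a simplified illustration with the three runner functions injected as parameters, not the actual implementation (the real `run_tests()` also writes an execution report):

```python
from pathlib import Path
import logging

def route_tests(logger: logging.Logger, output_dir: Path,
                fast_only: bool = True, comprehensive: bool = False,
                auto_fallback: bool = True,
                fast_runner=None, comprehensive_runner=None,
                reliable_runner=None, zero_collected=lambda: False) -> bool:
    """Dispatch to a test-execution mode, mirroring the run_tests() routing."""
    if comprehensive:
        return comprehensive_runner(logger, output_dir)
    if fast_only:
        ok = fast_runner(logger, output_dir)
        # Fall back to the full suite when nothing was collected.
        if not ok and auto_fallback and zero_collected():
            return comprehensive_runner(logger, output_dir)
        return ok
    return reliable_runner(logger, output_dir)
```

Injecting the runners keeps the routing logic testable without running any real test suite.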
Strict markers: pyproject.toml enables --strict-markers. Unregistered markers (for example anyio without pytest-anyio) break collection. Prefer sync tests that call asyncio.run() for short async checks when the environment may omit dev extras; with uv sync --extra dev, pytest-asyncio and @pytest.mark.asyncio are available.
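A sketch of the sync-wrapper pattern described above, assuming no async pytest plugin is installed (the coroutine is a hypothetical example):

```python
import asyncio

async def fetch_answer() -> int:
    """A short async check (hypothetical coroutine under test)."""
    await asyncio.sleep(0)
    return 42

def test_fetch_answer_sync():
    # Plain sync test: no @pytest.mark.asyncio marker needed, so
    # --strict-markers cannot break collection when dev extras are absent.
    assert asyncio.run(fetch_answer()) == 42
```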
Example:
from tests import run_tests
success = run_tests(
logger=logger,
output_dir=Path("output/2_tests_output"),
verbose=True,
fast_only=True,
comprehensive=False
)

Description: Run fast test suite for quick pipeline validation
Parameters:
- `logger` (logging.Logger): Logger instance for progress reporting
- `output_dir` (Path): Output directory for test results
- `verbose` (bool): Enable verbose output
Returns: True if tests passed or collection errors detected and reported
Features:
- Automatic detection of collection errors (import errors, syntax errors)
- Clear error messages with actionable suggestions
- Fast test execution (skips slow tests)
- Comprehensive error reporting
Description: Run comprehensive test suite with all tests enabled. Includes slow tests, performance tests, and full coverage analysis.
Parameters:
- `logger` (logging.Logger): Logger instance for progress reporting
- `output_dir` (Path): Output directory for test results
- `verbose` (bool): Enable verbose output (default: False)
- `generate_coverage` (bool): Generate coverage reports (default: False)
Returns: True if tests passed, False otherwise
Features:
- Executes all test categories from `MODULAR_TEST_CATEGORIES`
- Includes slow and performance tests
- Generates comprehensive coverage reports if enabled
- Uses category-based execution with resource monitoring
Description: Run a reliable subset of fast tests with improved error handling. Focuses on essential tests that should always pass.
Parameters:
- `logger` (logging.Logger): Logger instance for progress reporting
- `output_dir` (Path): Output directory for test results
- `verbose` (bool): Enable verbose output (default: False)
Returns: True if tests passed, False otherwise
Features:
- Runs only essential test files: `test_core_modules.py`, `test_fast_suite.py`, `test_main_orchestrator.py`
- 90-second timeout for reliability
- Improved error handling and reporting
- Used as recovery when fast pipeline tests are not suitable
Description: Extract and parse collection errors from pytest output. Detects import errors, syntax errors, and other collection failures.
Parameters:
- `stdout` (str): Standard output from pytest
- `stderr` (str): Standard error from pytest
Returns: List of unique error messages (strings)
Error Types Detected:
- `ERROR collecting` - Test file collection failures
- `NameError` - Missing variable/import names
- `ImportError` - Module import failures
- `SyntaxError` - Code syntax issues
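A hedged sketch of how such detection might work. This is illustrative logic only; the real `_extract_collection_errors` implementation may differ:

```python
def extract_collection_errors(stdout: str, stderr: str) -> list[str]:
    """Collect unique lines that look like pytest collection failures."""
    patterns = ("ERROR collecting", "NameError", "ImportError", "SyntaxError")
    seen: list[str] = []
    for line in (stdout + "\n" + stderr).splitlines():
        # Keep order of first appearance; deduplicate repeated errors.
        if any(p in line for p in patterns) and line.strip() not in seen:
            seen.append(line.strip())
    return seen
```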
Example:
errors = _extract_collection_errors(pytest_stdout, pytest_stderr)
# Returns: ["test_file.py: ImportError: No module named 'missing_module'"]

Dependencies:
- `pytest` - Test framework
- `pytest-cov` - Coverage analysis
- `pathlib` - Path manipulation
- `pytest-xdist` - Parallel test execution
- `pytest-benchmark` - Performance benchmarking
- `pytest-html` - HTML test reports
- `utils.pipeline_template` - Pipeline utilities
TEST_CONFIG = {
'fast_tests': True,
'standard_tests': True,
'slow_tests': False,
'performance_tests': False,
'coverage_analysis': True,
'parallel_execution': True,
'timeout': 300
}

TEST_CATEGORIES = {
'unit': ['test_*unit*.py'],
'integration': ['test_*integration*.py'],
'performance': ['test_*performance*.py'],
'slow': ['test_*slow*.py']
}from tests.runner import run_tests
success = run_tests(
logger=logger,
output_dir=Path("output/2_tests_output"),
verbose=True,
comprehensive=True
)

from tests import run_tests
from pathlib import Path
import logging
logger = logging.getLogger(__name__)
success = run_tests(
logger=logger,
output_dir=Path("output/2_tests_output"),
verbose=True,
fast_only=True,
comprehensive=False
)

from tests import run_tests
from pathlib import Path
import logging
logger = logging.getLogger(__name__)
success = run_tests(
logger=logger,
output_dir=Path("output/2_tests_output"),
verbose=True,
comprehensive=True,
generate_coverage=True
)

from tests.runner import run_fast_reliable_tests
from pathlib import Path
import logging
logger = logging.getLogger(__name__)
success = run_fast_reliable_tests(
logger=logger,
output_dir=Path("output/2_tests_output"),
verbose=True
)

from tests.runner import run_comprehensive_tests
from pathlib import Path
import logging
logger = logging.getLogger(__name__)
success = run_comprehensive_tests(
logger=logger,
output_dir=Path("output/2_tests_output"),
verbose=True,
generate_coverage=True
)

Output files:
- `test_results.json` - Test execution results
- `coverage.xml` - Coverage analysis report
- `test_report.html` - HTML test report
- `performance_report.json` - Performance analysis
- `test_summary.md` - Human-readable test summary
output/2_tests_output/
├── test_results.json
├── coverage.xml
├── test_report.html
├── performance_report.json
├── test_summary.md
└── test_details/
├── unit_tests/
├── integration_tests/
└── performance_tests/
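A small sketch for validating that the expected artifacts exist after a run. The file names are taken from the layout above; the helper itself is hypothetical, not part of the module:

```python
from pathlib import Path

EXPECTED = ("test_results.json", "coverage.xml", "test_report.html",
            "performance_report.json", "test_summary.md")

def missing_outputs(output_dir: Path) -> list[str]:
    """Return the expected artifact names that are absent from output_dir."""
    return [name for name in EXPECTED if not (output_dir / name).exists()]
```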
- Duration: ~5-15 minutes for comprehensive suite
- Memory: ~100-300MB during test execution
- Status: ✅ Production Ready
- Fast Tests: 1-3 minutes
- Standard Tests: 3-8 minutes
- Slow Tests: 5-15 minutes
- Performance Tests: 10-30 minutes
- Test Failures: Individual test case failures
- Collection Errors: Import errors, syntax errors, or missing dependencies during test collection
- Setup Errors: Test environment setup failures
- Dependency Errors: Missing test dependencies
- Timeout Errors: Test execution timeouts
- Coverage Errors: Coverage analysis failures
- Collection Error Detection: Automatic detection and reporting of import/syntax errors during test collection
- Test Isolation: Run tests in isolation on failure
- Environment Reset: Reset test environment
- Dependency Installation: Install missing dependencies
- Timeout Adjustment: Adjust test timeouts
- Error Reporting: Comprehensive error documentation with actionable suggestions
- Script: `2_tests.py` (Step 2)
- Function: `run_tests()`
- `utils.pipeline_template` - Pipeline utilities
- `main.py` - Pipeline orchestration
- `tests.test_*` - Individual test modules
The test infrastructure follows the thin orchestrator pattern:
flowchart TD
A["2_tests.py<br/>(Thin Orchestrator)"] -->|"Calls run_tests()"| B["runner.run_tests()"]
B -->|"fast_only=True"| C["run_fast_pipeline_tests()"]
B -->|"comprehensive=True"| D["run_comprehensive_tests()"]
B -->|"recovery"| E["run_fast_reliable_tests()"]
C --> F["ModularTestRunner"]
D --> F
E --> F
F -->|"Category execution"| G["pytest execution"]
G -->|"Test discovery"| H["Test files"]
G -->|"Test execution"| I["Test results"]
I -->|"Result collection"| J["Reporting & Analysis"]
J --> K["JSON reports"]
J --> L["Markdown summaries"]
J --> M["Coverage reports"]
style A fill:#e1f5ff
style B fill:#fff4e1
style F fill:#e8f5e9
style G fill:#f3e5f5
style J fill:#fce4ec
Component Responsibilities:
- 2_tests.py: Thin orchestrator that handles CLI arguments, logging setup, and delegates to test runner
- runner.run_tests(): Main entry point that routes to appropriate test execution mode
- run_fast_pipeline_tests(): Default mode - fast tests for quick pipeline validation
- run_comprehensive_tests(): Comprehensive mode - all tests with full coverage
- run_fast_reliable_tests(): Recovery mode - essential tests only
- ModularTestRunner: Category-based test execution with resource monitoring
- pytest: Test framework for actual test discovery and execution
2_tests.py (Thin Orchestrator):
- Handles command-line arguments
- Sets up logging and output directories
- Delegates to `tests.run_tests()` from `tests/__init__.py`
- Returns standardized exit codes
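The thin-orchestrator pattern can be sketched as follows. This is a simplified illustration: argument parsing is omitted and `run_tests` is injected as a parameter rather than imported, so the sketch is not the actual `2_tests.py`:

```python
import logging
from pathlib import Path

def orchestrate(run_tests, output_dir: Path) -> int:
    """Set up logging and the output directory, delegate, map to an exit code."""
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("2_tests")
    output_dir.mkdir(parents=True, exist_ok=True)
    success = run_tests(logger=logger, output_dir=output_dir)
    # Standardized exit codes: 0 on success, 1 on failure.
    return 0 if success else 1
```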
runner.py (Core Implementation):
- Contains all test execution logic
- Provides `run_tests()`, `run_fast_pipeline_tests()`, `run_comprehensive_tests()`, etc.
- Implements `ModularTestRunner` for category-based execution
- Handles resource monitoring and error recovery
test_utils.py (Shared Utilities):
- Provides test fixtures and helper functions
- Defines test categories and markers
- Provides test data creation utilities
- Used by both test files and runner
conftest.py (Pytest Fixtures):
- Defines pytest fixtures for all tests
- Configures pytest markers
- Provides shared test setup/teardown
- Handles test environment configuration
Test Discovery → Environment Setup → Test Execution → Result Collection → Report Generation
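The five stages above compose linearly; a minimal sketch (illustrative only, with each stage injected as a callable):

```python
def run_flow(discover, setup_env, execute, collect, report):
    """Chain the five pipeline stages: each output feeds the next stage."""
    tests = discover()
    env = setup_env()
    raw = execute(tests, env)
    results = collect(raw)
    return report(results)
```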
To add a new test category to MODULAR_TEST_CATEGORIES in runner.py:
MODULAR_TEST_CATEGORIES["new_module"] = {
"name": "New Module Tests",
"description": "Tests for the new module",
"files": [
"test_template_overall.py",
"test_new_module_integration.py"
],
"markers": ["new_module"], # Optional pytest markers
"timeout_seconds": 120, # Category timeout
"max_failures": 8, # Max failures before stopping
"parallel": True # Allow parallel execution
}

Follow the naming convention:
- `test_MODULENAME_overall.py` - Comprehensive module tests
- `test_MODULENAME_area.py` - Specific area tests (e.g., `test_gnn_parsing.py`)
- `test_MODULENAME_integration.py` - Integration tests
Example:
# src/tests/test_template_overall.py
import pytest
from pathlib import Path
@pytest.mark.fast
def test_new_module_basic():
"""Test basic functionality."""
# Test implementation
pass
@pytest.mark.slow
def test_new_module_complex():
"""Test complex scenarios."""
# Test implementation
pass

- 120+ `test_*.py` modules under `src/tests/` (exact count drifts; use `find src/tests -maxdepth 1 -name 'test_*.py' | wc -l`)
- ~1,906 tests in a typical full run with standard Ollama ignores (see root `CLAUDE.md` / `README.md` for the exact `pytest` command)
- 20+ test categories for organized execution
- 25+ test markers for selective execution
- Scale: Treat `pytest --collect-only -q` as ground truth for item counts; marker filters (`-m "not slow"`, ignores) change what runs in the pipeline fast suite.
- Fast Tests: Many functions use `@pytest.mark.fast`
- Integration Tests: `@pytest.mark.integration`
- Unit Tests: `@pytest.mark.unit`
- Performance Tests: `@pytest.mark.performance`
- Safe-to-Fail Tests: `@pytest.mark.safe_to_fail`
- Fast Tests (`--fast-only`): 1-3 minutes, essential validation
- Comprehensive Tests (`--comprehensive`): 5-15 minutes, all tests including slow/performance
- Reliable Tests (recovery): Essential tests only, 90-second timeout
- Module Import Validation: All modules can be imported and have expected structure
- Core Functionality: All core functions execute correctly with real data
- Integration Testing: Cross-module integration with real data flow
- Error Handling: Comprehensive error scenario testing with real failure modes
- Performance Benchmarking: Performance regression detection
- Coverage Analysis: Code coverage tracking and reporting
- Test Suite Execution: Test runner functionality and management
- No Simulated Usage: All tests use real implementations per testing policy
- Real Data: All tests use real, representative data (no synthetic/placeholder data)
- Real Dependencies: Tests use real dependencies (skip if unavailable, never simulated)
- File-Based Assertions: Tests assert on real file outputs and artifacts
- Error Recovery: Tests validate error handling with real failure modes
- Performance Monitoring: Built-in timing and resource usage tracking
- `tests.run_suite` - Run test suite
- `tests.run_fast` - Run fast tests
- `tests.get_coverage` - Get coverage report
- `tests.get_performance` - Get performance metrics
@mcp_tool("tests.run_suite")
def run_test_suite_tool(output_dir):
"""Run comprehensive test suite"""
# Implementation