# GitHub Copilot Instructions for stochastic-benchmark

## Repository Overview

The `stochastic-benchmark` repository is a Python package for benchmarking and analyzing stochastic optimization algorithms. It provides comprehensive tools for bootstrap sampling, statistical analysis, and result visualization.

## Directory Structure

```
stochastic-benchmark/
├── .github/
│   ├── workflows/
│   │   └── ci.yml                   # GitHub Actions CI/CD pipeline
│   ├── copilot-instructions.md      # This file
│   └── copilot-setup-steps.yml      # Copilot setup configuration
├── src/                             # Main source code directory
│   ├── __init__.py
│   ├── bootstrap.py                 # Bootstrap sampling and resampling methods
│   ├── cross_validation.py          # Cross-validation utilities
│   ├── df_utils.py                  # DataFrame manipulation utilities
│   ├── interpolate.py               # Data interpolation and resource generation
│   ├── names.py                     # Path management and filename utilities
│   ├── plotting.py                  # Visualization and plotting functions
│   ├── random_exploration.py        # Random exploration algorithms
│   ├── sequential_exploration.py    # Sequential exploration algorithms
│   ├── stats.py                     # Statistical analysis and metrics
│   ├── stochastic_benchmark.py      # Main benchmarking framework
│   ├── success_metrics.py           # Success metric calculations
│   ├── training.py                  # Training algorithms and optimization
│   └── utils_ws.py                  # Workspace utilities
├── tests/                           # Comprehensive test suite
│   ├── integration/                 # Integration tests
│   │   └── test_module_integration.py
│   ├── test_bootstrap.py            # Bootstrap module tests
│   ├── test_df_utils.py             # DataFrame utilities tests
│   ├── test_interpolate.py          # Interpolation tests
│   ├── test_names.py                # Names/path utilities tests
│   ├── test_stats.py                # Statistics tests
│   ├── test_success_metrics.py      # Success metrics tests
│   └── test_training.py             # Training algorithms tests
├── examples/                        # Example usage and tutorials
├── requirements.txt                 # Python dependencies
├── pyproject.toml                   # Project configuration and build settings
├── run_tests.py                     # Test runner script
├── TESTING.md                       # Testing guidelines and documentation
└── README.md                        # Project documentation
```

## Code Standards and Guidelines

### Python Version Compatibility
- **Minimum Python version**: 3.9
- **Tested versions**: 3.9, 3.10, 3.11, 3.12
- **Type hints**: Use `from typing import List, Dict, DefaultDict, ...` for compatibility
- **Avoid**: Modern syntax like `list[str]` or `dict[str, int]`; use `List[str]` and `Dict[str, int]` instead (see the sketch below)
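
A minimal sketch of the expected annotation style; the function and its names are hypothetical, not part of the package:

```python
from typing import Dict, List, Optional, Union

# Hypothetical helper, shown only to illustrate typing-module annotations
def weighted_mean(values: List[float],
                  weights: Optional[Dict[int, float]] = None) -> Union[float, None]:
    """Return the (optionally weighted) mean, or None for empty input."""
    if not values:
        return None
    if weights is None:
        return sum(values) / len(values)
    total = sum(weights.get(i, 1.0) * v for i, v in enumerate(values))
    return total / sum(weights.get(i, 1.0) for i in range(len(values)))
```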

### Code Quality Standards
- **Linting**: Code is linted with flake8 (max line length: 120 characters)
- **Type annotations**: All function parameters and return types should be annotated
- **Docstrings**: Use NumPy-style docstrings for all classes and functions
- **Error handling**: Implement proper exception handling with meaningful error messages
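
Taken together, these standards look roughly like the following (a hypothetical function, not part of the package):

```python
import numpy as np

def perf_ratio(response: np.ndarray, best_found: float) -> float:
    """Compute the ratio of the mean response to the best known value.

    Parameters
    ----------
    response : np.ndarray
        Observed objective values from a solver run.
    best_found : float
        Best known objective value for the instance.

    Returns
    -------
    float
        Mean response divided by ``best_found``.

    Raises
    ------
    ValueError
        If ``best_found`` is zero.
    """
    if best_found == 0:
        raise ValueError("best_found must be non-zero")
    return float(np.mean(response)) / best_found
```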

### Testing Requirements

#### Test Coverage Expectations
- **Unit tests**: Cover all public methods and functions
- **Integration tests**: Test cross-module functionality
- **Edge cases**: Handle empty inputs, boundary conditions, and error states
- **Mocking**: Use `unittest.mock` for external dependencies and multiprocessing

#### Test Patterns
```python
# Standard test class structure (the names below are placeholders)
class TestModuleName:
    """Test class for module functionality."""

    def test_function_basic(self):
        """Test basic functionality."""
        # Setup
        input_data = create_test_data()
        expected = expected_result()

        # Execute
        result = module_function(input_data)

        # Assert
        assert result == expected
        assert isinstance(result, expected_type)
```

#### Mocking Guidelines
```python
from unittest.mock import patch

# For multiprocessing functions: patch Pool in the module where it is *used*
@patch('module.Pool')
def test_multiprocess_function(self, mock_pool):
    # The pool is used as a context manager, so stub via __enter__
    mock_pool.return_value.__enter__.return_value.map.return_value = [expected_result]
    result = function_using_pool()
    assert result == expected_output

# For success metrics evaluation: have the mock populate the expected columns
@patch.object(SuccessMetric, 'evaluate')
def test_bootstrap_with_metrics(self, mock_evaluate):
    def mock_evaluate_func(df, responses, resources):
        df['Key=Metric'] = [test_value]
        df['ConfInt=lower_Key=Metric'] = [lower_value]
        df['ConfInt=upper_Key=Metric'] = [upper_value]
    mock_evaluate.side_effect = mock_evaluate_func

    result = function_under_test()  # placeholder for the code that calls evaluate
    assert 'Key=Metric' in result.columns
```

### Module-Specific Guidelines

#### Bootstrap Module (`bootstrap.py`)
- **Key classes**: `BootstrapParameters`, `BSParams_iter`, `BSParams_range_iter`
- **Main functions**: `BootstrapSingle`, `Bootstrap`, `Bootstrap_reduce_mem`
- **Testing notes**: Mock `initBootstrap` and success metrics; use proper column naming
- **Multiprocessing**: These functions rely on locally defined functions, which cannot be pickled and therefore require careful mocking
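
As background, this is the statistical technique the module is built around; a minimal, self-contained sketch of a percentile-bootstrap confidence interval that deliberately avoids the module's own API:

```python
import numpy as np

def bootstrap_ci(data: np.ndarray, n_boots: int = 1000,
                 alpha: float = 0.05, seed: int = 42) -> np.ndarray:
    """Percentile bootstrap confidence interval for the mean (illustration only)."""
    rng = np.random.default_rng(seed)
    # Resample with replacement n_boots times and record each resample's mean
    means = np.array([
        rng.choice(data, size=len(data), replace=True).mean()
        for _ in range(n_boots)
    ])
    return np.quantile(means, [alpha / 2, 1 - alpha / 2])

lower, upper = bootstrap_ci(np.array([1.2, 0.9, 1.1, 1.4, 0.8]))
```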

#### Statistics Module (`stats.py`)
- **Key classes**: `StatsParameters`, `Mean`, `Median`
- **Main functions**: `StatsSingle`, `Stats`, `applyBounds`
- **Testing notes**: Requires multiple rows of data (single row returns empty DataFrame)
- **Column naming**: Uses the `names.param2filename` convention

#### Success Metrics (`success_metrics.py`)
- **Key classes**: `Response`, `PerfRatio`, `SuccessProb`, `Resource`
- **Testing notes**: Mock `evaluate` methods to populate DataFrames with correct column names (see the example below)
- **Column format**: `Key=MetricName`, `ConfInt=lower_Key=MetricName`, `ConfInt=upper_Key=MetricName`
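
For instance, a test DataFrame that follows the documented column format (with a made-up `PerfRatio` value) looks like:

```python
import pandas as pd

# Column names follow the documented Key=/ConfInt= convention;
# the numeric values are fabricated test data
df = pd.DataFrame({
    'Key=PerfRatio': [0.95],
    'ConfInt=lower_Key=PerfRatio': [0.90],
    'ConfInt=upper_Key=PerfRatio': [0.99],
})
assert df['ConfInt=lower_Key=PerfRatio'].iloc[0] <= df['Key=PerfRatio'].iloc[0]
```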

#### Names Module (`names.py`)
- **Main functions**: `param2filename`, `filename2param`, `parseDir`
- **Testing notes**: Test parameter-to-filename conversion and directory parsing (a round-trip check is sketched below)
- **File paths**: Handle both relative and absolute paths correctly
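
A possible shape for a round-trip test. The signatures here are assumptions made for illustration (a parameter dict plus a suffix); verify the actual API in `names.py` before copying this:

```python
import names  # assumes src/ is on PYTHONPATH

def test_param_filename_roundtrip():
    # Assumed signatures: param2filename(params, suffix) and its inverse
    params = {'Key': 'PerfRatio'}
    fname = names.param2filename(params, '')
    assert names.filename2param(fname) == params
```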

## Common Development Tasks

### Adding New Tests
1. **Create test file**: Follow the naming convention `test_module_name.py`
2. **Test structure**: Use class-based organization with descriptive method names
3. **Setup/teardown**: Use pytest fixtures for complex setup (see the sketch below)
4. **Assertions**: Prefer specific assertions over generic `assert True`
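
A minimal fixture-based test with hypothetical data:

```python
import pandas as pd
import pytest

@pytest.fixture
def sample_df() -> pd.DataFrame:
    """Multi-row frame: the stats functions return empty results for single-row input."""
    return pd.DataFrame({'response': [1.0, 2.0, 3.0], 'resource': [10, 20, 30]})

class TestExample:
    def test_mean_response(self, sample_df):
        assert sample_df['response'].mean() == pytest.approx(2.0)
```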

### Working with DataFrames
- **Empty checks**: Always check `len(df) > 0` before processing (see the helper below)
- **Column validation**: Verify expected columns exist before accessing
- **Mock data**: Create realistic test DataFrames with appropriate dtypes
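
A defensive pattern combining the first two points; the helper itself is hypothetical:

```python
from typing import Optional
import pandas as pd

def safe_mean(df: pd.DataFrame, column: str) -> Optional[float]:
    """Return the column mean, or None for an empty frame; raise on a missing column."""
    if len(df) == 0:              # empty check before any processing
        return None
    if column not in df.columns:  # validate expected columns up front
        raise KeyError(f"Expected column '{column}' not found in DataFrame")
    return float(df[column].mean())
```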

### Multiprocessing Code
- **Testing**: Always mock `Pool` at the module level (`@patch('module.Pool')`)
- **Local functions**: Avoid local functions in multiprocessing contexts; they cannot be pickled (see the sketch below)
- **Error handling**: Implement proper exception handling for process failures
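
The pickling constraint in practice: workers passed to `Pool.map` should live at module level. A minimal sketch:

```python
from multiprocessing import Pool
from typing import List

def _square(x: int) -> int:
    """Module-level worker: picklable, unlike a function defined inside another."""
    return x * x

def run_parallel(values: List[int]) -> List[int]:
    with Pool(processes=2) as pool:
        return pool.map(_square, values)

if __name__ == '__main__':
    print(run_parallel([1, 2, 3]))  # [1, 4, 9]
```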

### Performance Considerations
- **Large datasets**: Use sampling or mocking for performance-critical tests
- **Memory usage**: Monitor memory consumption in tests with large data
- **Timeouts**: Set appropriate timeouts for long-running operations (one option is shown below)
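
One way to enforce a per-test timeout, assuming the `pytest-timeout` plugin is available (an assumption; the repository may use a different mechanism):

```python
import pytest

# Assumes the pytest-timeout plugin is installed
@pytest.mark.timeout(30)
def test_long_running_bootstrap():
    run_expensive_operation()  # placeholder for the actual long-running call
```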

## Debugging Common Issues

### Test Failures
1. **Import errors**: Check PYTHONPATH and module imports
2. **Mock issues**: Verify mock targets use correct module paths
3. **Empty DataFrames**: Ensure test data has multiple rows for statistics
4. **Column name errors**: Use `names.param2filename` for consistent naming

### Type Annotation Issues
- **Import errors**: Add missing imports from the `typing` module
- **Compatibility**: Use `List`, `Dict`, `DefaultDict` instead of built-in generics
- **Union types**: Use `Union[type1, type2]` for multiple possible types

### CI/CD Issues
- **Dependency conflicts**: Update `requirements.txt` and CI configuration
- **Platform differences**: Test on an Ubuntu environment matching CI
- **Coverage failures**: Ensure tests cover all code paths

## Best Practices Summary

1. **Follow existing code patterns** in the repository
2. **Write comprehensive tests** before implementing features
3. **Use proper type annotations** for all new code
4. **Mock external dependencies** appropriately in tests
5. **Handle edge cases** and error conditions
6. **Maintain backward compatibility** with existing APIs
7. **Document complex algorithms** with clear comments
8. **Test on multiple Python versions** (3.9-3.12)
9. **Keep functions focused** with single responsibilities
10. **Use descriptive variable names** and function signatures

## Getting Help

- **Test execution**: Use `python run_tests.py` for comprehensive testing
- **Documentation**: Refer to `TESTING.md` for detailed testing guidelines
- **Examples**: Check the `examples/` directory for usage patterns
- **CI logs**: Review GitHub Actions output for build failures

This repository maintains high standards for code quality, test coverage, and documentation. When contributing, ensure all tests pass and follow the established patterns for consistency and maintainability.