|
| 1 | +# NNV 3.0 CodeOcean Capsule |
| 2 | + |
| 3 | +This capsule runs the NNV 3.0 verification test suite, including: |
| 4 | + |
| 5 | +1. **FairNNV** - Fairness verification on Adult Census dataset |
| 6 | +2. **ProbVer** - Probabilistic verification using CP-Star on YOLO benchmark |
| 7 | +3. **ModelStar** - Weight perturbation verification on MNIST |
| 8 | +4. **VideoStar** - Video classification verification |
| 9 | +5. **GNNV** - Graph Neural Network verification on IEEE 24-bus Power Flow |
| 10 | + |
| 11 | +## Model Loading |
| 12 | + |
| 13 | +Different tests use different model loading approaches: |
| 14 | + |
| 15 | +| Test | Model Format | Loading Method | |
| 16 | +|------|-------------|----------------| |
| 17 | +| **ProbVer** | ONNX (`.onnx`) | `importNetworkFromONNX()` at runtime | |
| 18 | +| **FairNNV** | Pre-converted (`.mat`) | Direct `load()` of NNV network | |
| 19 | +| **VideoStar** | Pre-converted (`.mat`) | Direct `load()` of NNV network | |
| 20 | +| **ModelStar** | Native MATLAB (`.mat`) | `matlab2nnv()` conversion | |
| 21 | +| **GNNV** | Native MATLAB (`.mat`) | Direct `load()` of model weights | |
| 22 | + |
| 23 | +### ONNX Runtime Loading (ProbVer) |
| 24 | + |
| 25 | +ProbVer loads ONNX models directly using `importNetworkFromONNX()`. Due to ONNX layer serialization limitations: |
| 26 | +- The parallel pool is disabled (`numCores = 1`) |
| 27 | +- ONNX-specific layers (`nnet.onnx.layer.*`) are removed from the network |
| 28 | +- Falls back to `importONNXNetwork()` if primary loader fails |
| 29 | + |
| 30 | +**ProbVer Python Note:** ProbVer uses CP-Star probabilistic reachability which requires Python (PyTorch, NumPy, SciPy). The Python scripts run via `system()` calls to a virtual environment created during setup. Data is exchanged between MATLAB and Python using `.mat` files (via `scipy.io`). |
| 31 | + |
| 32 | +### Pre-converted Models (FairNNV, VideoStar) |
| 33 | + |
| 34 | +FairNNV and VideoStar use pre-converted `.mat` files containing serialized NNV networks. This approach: |
| 35 | +- Avoids runtime ONNX parsing |
| 36 | +- Works without the ONNX support package (stub classes provided in `/code/+nnet/+onnx/+layer/`) |
| 37 | +- Enables faster startup |
| 38 | + |
| 39 | +## Setup Instructions |
| 40 | + |
| 41 | +### 1. Create CodeOcean Capsule |
| 42 | + |
| 43 | +1. Go to [CodeOcean](https://codeocean.com) and create a new capsule |
| 44 | +2. Select **MATLAB R2024b** as the base environment |
| 45 | + |
| 46 | +### 2. Upload Environment Files |
| 47 | + |
| 48 | +Upload the contents of `environment/` to the CodeOcean environment section: |
| 49 | + |
| 50 | +- `Dockerfile` - Custom Docker configuration |
| 51 | +- `postInstall` - Post-installation script |
| 52 | + |
| 53 | +### 3. Upload Code Files |
| 54 | + |
| 55 | +Upload all files from `code/` to the CodeOcean `/code` directory: |
| 56 | + |
| 57 | +- `run.m` - Main entry point |
| 58 | +- `run_fairnnv.m` - FairNNV test runner |
| 59 | +- `run_probver.m` - ProbVer test runner |
| 60 | +- `run_modelstar.m` - ModelStar test runner |
| 61 | +- `run_videostar.m` - VideoStar test runner |
| 62 | +- `run_gnnv.m` - GNNV test runner |
| 63 | + |
| 64 | +### 4. Upload Data Files |
| 65 | + |
| 66 | +Upload the following data files to `/data` on CodeOcean: |
| 67 | + |
| 68 | +``` |
| 69 | +/data/ |
| 70 | +├── ICAIF24/ # FairNNV data |
| 71 | +│ ├── adult_onnx/ # ONNX models |
| 72 | +│ │ ├── AC-1.onnx |
| 73 | +│ │ └── AC-3.onnx |
| 74 | +│ └── data/ |
| 75 | +│ └── adult_data.mat |
| 76 | +├── ProbVer/ # ProbVer data |
| 77 | +│ └── yolo_2023/ |
| 78 | +│ ├── onnx/ # ONNX model |
| 79 | +│ │ └── TinyYOLO.onnx |
| 80 | +│ ├── vnnlib/ |
| 81 | +│ │ └── *.vnnlib.gz |
| 82 | +│ └── instances.csv |
| 83 | +├── FORMALISE2025/ # VideoStar data |
| 84 | +│ ├── models/ # ONNX model |
| 85 | +│ │ └── zoomin_4f.onnx |
| 86 | +│ └── data/ |
| 87 | +│ └── ZoomIn/ |
| 88 | +│ └── *.npy |
| 89 | +├── GNNV/ # GNNV data |
| 90 | +│ └── models/ |
| 91 | +│ ├── gcn_ieee24.mat |
| 92 | +│ └── gine_ieee24.mat |
| 93 | +└── MNIST/ # ModelStar data |
| 94 | + └── mnist_model_fc.mat |
| 95 | +``` |
| 96 | + |
| 97 | +**Source locations in this repository:** |
| 98 | + |
| 99 | +| CodeOcean Path | Source in Repository | |
| 100 | +|---------------|---------------------| |
| 101 | +| `/data/ICAIF24/adult_onnx/` | `examples/Submission/ICAIF24/onnx/` | |
| 102 | +| `/data/ICAIF24/data/` | `examples/Submission/ICAIF24/data/` | |
| 103 | +| `/data/ProbVer/yolo_2023/onnx/` | `examples/NNV3.0/ProbVer/yolo_2023/onnx/` | |
| 104 | +| `/data/ProbVer/yolo_2023/vnnlib/` | `examples/NNV3.0/ProbVer/yolo_2023/vnnlib/` | |
| 105 | +| `/data/FORMALISE2025/models/` | `examples/NNV3.0/VideoStar/models/` | |
| 106 | +| `/data/FORMALISE2025/data/` | `examples/NNV3.0/VideoStar/data/` | |
| 107 | +| `/data/GNNV/models/` | `codeocean/data/GNNV/models/` | |
| 108 | +| `/data/MNIST/mnist_model_fc.mat` | `examples/Tutorial/NN/MNIST/weightPerturb/mnist_model_fc.mat` | |
| 109 | + |
| 110 | +### 5. Configure MATLAB Toolboxes |
| 111 | + |
| 112 | +The following MATLAB toolboxes are required (installed via postInstall): |
| 113 | + |
| 114 | +- Computer Vision Toolbox |
| 115 | +- Control System Toolbox |
| 116 | +- Deep Learning Toolbox |
| 117 | +- Deep Learning Toolbox Converter for ONNX Model Format |
| 118 | +- Image Processing Toolbox |
| 119 | +- Optimization Toolbox |
| 120 | +- Parallel Computing Toolbox |
| 121 | +- Statistics and Machine Learning Toolbox |
| 122 | +- Symbolic Math Toolbox |
| 123 | +- System Identification Toolbox |
| 124 | + |
| 125 | +### 6. Run the Capsule |
| 126 | + |
| 127 | +Click **Reproducible Run** in CodeOcean. |
| 128 | + |
| 129 | +## Expected Output |
| 130 | + |
| 131 | +Results are saved to `/results/`: |
| 132 | + |
| 133 | +``` |
| 134 | +/results/ |
| 135 | +├── test_summary.txt # Overall test summary |
| 136 | +├── FairNNV/ |
| 137 | +│ ├── fm26_counterfactual_*.csv |
| 138 | +│ ├── fm26_individual_*.csv |
| 139 | +│ └── fm26_timing_*.csv |
| 140 | +├── ProbVer/ |
| 141 | +│ └── results_summary.csv |
| 142 | +├── ModelStar/ |
| 143 | +│ ├── modelstar_results.csv |
| 144 | +│ └── modelstar_summary.txt |
| 145 | +├── VideoStar/ |
| 146 | +│ ├── eps=1_255.csv |
| 147 | +│ ├── eps=2_255.csv |
| 148 | +│ └── eps=3_255.csv |
| 149 | +└── GNNV/ |
| 150 | + └── gnnv_results_*.csv |
| 151 | +``` |
| 152 | + |
| 153 | +## Test Descriptions |
| 154 | + |
| 155 | +### FairNNV |
| 156 | +Tests individual and counterfactual fairness on neural networks trained on the Adult Census dataset. Verifies that protected attributes (e.g., sex) don't unfairly influence predictions. |
| 157 | + |
| 158 | +### ProbVer (CP-Star) |
| 159 | +Probabilistic verification using the CP-Star reachability method on TinyYOLO object detection benchmark. Tests property satisfaction with coverage and confidence guarantees. |
| 160 | + |
| 161 | +### ModelStar |
| 162 | +Weight perturbation verification on MNIST digit classification. Tests robustness to weight perturbations at various magnitudes. |
| 163 | + |
| 164 | +### VideoStar |
| 165 | +Video classification verification on ZoomIn dataset. Tests robustness to input perturbations across multiple frames. |
| 166 | + |
| 167 | +### GNNV |
| 168 | +Graph Neural Network verification on IEEE 24-bus Power Flow task. Tests GCN and GINE architectures with various epsilon perturbations. |
| 169 | + |
| 170 | +## Notes |
| 171 | + |
| 172 | +- **ProbVer requires GPU**: Select a GPU-enabled machine in CodeOcean for best performance |
| 173 | +- **Runtime**: Full test suite may take 30+ minutes depending on configuration |
| 174 | +- **Reduced samples**: Some scripts use reduced sample sizes for quicker execution. Modify the config sections to increase coverage. |
| 175 | + |
| 176 | +## Test Coverage |
| 177 | + |
| 178 | +The CodeOcean tests are configured to provide good coverage while keeping runtime reasonable: |
| 179 | + |
| 180 | +| Test | CodeOcean Configuration | Original Configuration | Coverage | |
| 181 | +|------|------------------------|------------------------|----------| |
| 182 | +| **FairNNV** | 100 samples × 7 epsilon × 2 models | Same | **100%** | |
| 183 | +| **ProbVer** | 3 VNNLIB instances | 3 instances (72 available) | **100%** of original | |
| 184 | +| **VideoStar** | 10 samples × 3 epsilon | Same | **100%** | |
| 185 | +| **GNNV** | 10 scenarios × 3 epsilon × 3 models | Same | **100%** | |
| 186 | +| **ModelStar** | 100 images × 3 layers × 4 fracs | Same | **100%** | |
| 187 | + |
| 188 | +### Increasing Coverage |
| 189 | + |
| 190 | +To run more comprehensive tests, modify the config sections in each `run_*.m` file: |
| 191 | + |
| 192 | +- **ProbVer**: Increase `numSamples` to run more VNNLIB instances (up to 72 available) |
| 193 | +- **VideoStar**: Expand `config.sampleIndices` beyond `1:10` |
| 194 | + |
| 195 | +## Implementation Differences |
| 196 | + |
| 197 | +The CodeOcean test runners are adapted versions of the original NNV 3.0 example scripts. This section documents the key differences. |
| 198 | + |
| 199 | +**Important:** All tests properly call NNV engine functions for verification. The inline helper functions are either: |
| 200 | +1. **Exact copies** from the original scripts (e.g., `perturbationIF`, `L_inf_attack`) |
| 201 | +2. **Data preprocessing** that constructs NNV objects (`Star`, `ImageStar`, `VolumeStar`, `GraphStar`) |
| 202 | + |
| 203 | +### NNV Engine Functions Used |
| 204 | + |
| 205 | +| Test | Key NNV Engine Calls | |
| 206 | +|------|---------------------| |
| 207 | +| FairNNV | `matlab2nnv()`, `net.reach()`, `verify_specification()`, `Star` | |
| 208 | +| ProbVer | `Prob_reach()`, `verify_specification()`, `ImageStar`, `load_vnnlib()` | |
| 209 | +| VideoStar | `net.verify_robustness()`, `VolumeStar` | |
| 210 | +| GNNV | `GNN.reach()`, `GCNLayer`, `GINELayer`, `GraphStar`, `verify_specification()` | |
| 211 | +| ModelStar | `WPutils.*`, `matlab2nnv()`, `load_images_MNIST()` | |
| 212 | + |
| 213 | +### General Changes (All Tests) |
| 214 | +- **Function format**: CodeOcean scripts are wrapped in functions; originals are scripts |
| 215 | +- **Hardcoded paths**: CodeOcean uses `/data/` and `/results/`; originals use relative paths |
| 216 | +- **Self-contained**: CodeOcean scripts include all helper functions inline; originals use external dependencies |
| 217 | +- **Reduced samples**: Sample counts reduced for faster CodeOcean execution |
| 218 | +- **Warning suppression**: CodeOcean scripts suppress warnings for cleaner output |
| 219 | + |
| 220 | +### FairNNV (`run_fairnnv.m`) |
| 221 | + |
| 222 | +| Aspect | Original | CodeOcean | |
| 223 | +|--------|----------|-----------| |
| 224 | +| Structure | Calls external `adult_verify_fm26.m` and `plot_fm26_results.m` | All logic inline | |
| 225 | +| Model loading | Loads ONNX directly via `importNetworkFromONNX` | Uses pre-converted `.mat` files | |
| 226 | +| Figure generation | Generates PNG/PDF figures | Skipped (CSV output only) | |
| 227 | +| Helper function | Uses external `perturbationIF` | Includes `perturbationIF` inline | |
| 228 | +| Lines | ~150 | ~270 | |
| 229 | + |
| 230 | +### ProbVer (`run_probver.m`) |
| 231 | + |
| 232 | +| Aspect | Original | CodeOcean | |
| 233 | +|--------|----------|-----------| |
| 234 | +| Model loading | `importNetworkFromONNX` | Same, with ONNX layer removal logic | |
| 235 | +| Python integration | Uses `pyenv()` for loading `.npz` | Uses `system()` calls only; saves `.mat` | |
| 236 | +| Parallel pool | Enabled | Disabled (ONNX layers not serializable) | |
| 237 | +| Instance limit | Full VNN-LIB instances | Limited to 3 instances | |
| 238 | +| Lines | ~427 | ~450 | |
| 239 | + |
| 240 | +**Key fix (Python)**: The original used MATLAB's `pyenv()` and `py.numpy.load()` to read Python-generated `.npz` files. This fails on CodeOcean because venv Python lacks `libpython.so`. The fix changes Python to save `.mat` files via `scipy.io.savemat()`, and MATLAB loads them natively. |
| 241 | + |
| 242 | +**ONNX workarounds**: ONNX-imported networks contain layers that cannot be serialized to parallel workers: |
| 243 | +- Removes `nnet.onnx.layer.*` layers (e.g., `VerifyBatchSizeLayer`, `CustomOutputLayer`) from the layer graph |
| 244 | +- Disables parallel pool (`numCores = 1`) because `ONNXParameters` also fail serialization |
| 245 | +- Falls back to `importONNXNetwork()` if `importNetworkFromONNX()` fails |
| 246 | + |
| 247 | +### VideoStar (`run_videostar.m`) |
| 248 | + |
| 249 | +| Aspect | Original | CodeOcean | |
| 250 | +|--------|----------|-----------| |
| 251 | +| Model loading | Loads ONNX via `importNetworkFromONNX` | Uses pre-converted `.mat` files | |
| 252 | +| Verification | Calls external `verifyvideo()` from FORMALISE2025 | Inline `L_inf_attack()` and `verify_robustness()` | |
| 253 | +| External deps | Requires `FORMALISE2025/src/vvn/` on path | Self-contained | |
| 254 | +| npy-matlab | Requires separate npy-matlab installation | Bundled `readNPY` function | |
| 255 | +| Lines | ~255 | ~237 | |
| 256 | + |
| 257 | +### GNNV (`run_gnnv.m`) |
| 258 | + |
| 259 | +| Aspect | Original | CodeOcean | |
| 260 | +|--------|----------|-----------| |
| 261 | +| Scenarios | 10 (indices 1:100:1000) | 10 (indices 1:100:1000) - **matches original** | |
| 262 | +| Figure generation | Calls `generate_cav26_dashboard()` and `generate_latex_table()` | Skipped | |
| 263 | +| Logging | Optional quiet mode with diary | Always verbose | |
| 264 | +| Results storage | Includes `voltage_bounds` and `bound_widths` for figures | Only verification counts | |
| 265 | +| Lines | ~558 | ~462 | |
| 266 | + |
| 267 | +### ModelStar (`run_modelstar.m`) |
| 268 | + |
| 269 | +| Aspect | Original | CodeOcean | |
| 270 | +|--------|----------|-----------| |
| 271 | +| Architecture | Class-based (`EXPT` class with YAML config) | Simple function | |
| 272 | +| Configuration | YAML files for experiment definitions | Inline config struct | |
| 273 | +| Layers tested | 3 layers (fc_6, fc_5, fc_4) | 3 layers (fc_6, fc_5, fc_4) - **matches original** | |
| 274 | +| Perturbation fracs | 4 per layer | 4 per layer - **matches original** | |
| 275 | +| Images | 100 | 100 - **matches original** | |
| 276 | +| Results | YAML files with plotting support | CSV and TXT summary | |
| 277 | +| Model source | Various MNIST architectures | Single `mnist_model_fc.mat` | |
| 278 | + |
| 279 | +### NNV Engine Modifications |
| 280 | + |
| 281 | +The following NNV engine files were modified for CodeOcean compatibility: |
| 282 | + |
| 283 | +**`Direction_trainer.py`** (`engine/nn/Prob_reach/`): |
| 284 | +- Changed `np.savez()` to `scipy.io.savemat()` for `.mat` output |
| 285 | +- Enables MATLAB to load directions without Python integration |
| 286 | + |
| 287 | +**`ProbReach_ImageStar.m`** (`engine/nn/Prob_reach/`): |
| 288 | +- Removed `pyenv()` call and `py.numpy.load()` |
| 289 | +- Uses native MATLAB `load()` for `.mat` files |
| 290 | + |
| 291 | +**`Prob_reach.m`** (`engine/utils/`): |
| 292 | +- Removed `pyenv` executable path setup (no longer needed) |
| 293 | + |
| 294 | +## Troubleshooting |
| 295 | + |
| 296 | +### "Model file not found" |
| 297 | +Ensure all data files are uploaded to the correct paths in `/data/`. Check that ONNX files are in the correct directories. |
| 298 | + |
| 299 | +### "NNV not found" |
| 300 | +The postInstall script should clone NNV automatically. If issues persist, manually add NNV to the MATLAB path. |
| 301 | + |
| 302 | +### GPU errors on ProbVer |
| 303 | +ProbVer's CP-Star method uses PyTorch which defaults to CPU mode on CodeOcean. For faster execution, select a GPU-enabled compute environment. |
| 304 | + |
| 305 | +### Python/pyenv errors on ProbVer |
| 306 | +ProbVer uses Python scripts for CP-Star training. The scripts run via `system()` calls (not MATLAB's Python integration). Data is exchanged using `.mat` files via `scipy.io.savemat()` and MATLAB's `load()`. If you see Python-related errors: |
| 307 | +- Ensure the postInstall script created the venv at `/deps/probver_venv` |
| 308 | +- Check that torch, numpy, and scipy are installed in the venv |
| 309 | +- The run.m script creates a symlink from `nnvroot()/.venv` to the actual venv location |
| 310 | + |
| 311 | +### ONNX layer serialization warnings |
| 312 | +When running ProbVer, you may see warnings about ONNX layers not being serializable to parallel workers. This is expected - the parallel pool is disabled automatically to work around this limitation. |
0 commit comments