- Overview
- Requirements
- Setup
- Running the Application
- Project Structure
- Database Setup
- Additional Configuration
- User Interface Guide
## Overview

DeepBench Analysis is a visualization and analysis tool for evaluating the robustness of machine learning models against various image perturbations. It provides an interactive dashboard that allows researchers and practitioners to:
- Compare different ML models' performance under various image augmentations
- Analyze model stability across different use cases (Medical, Autonomous Driving, etc.)
- Visualize the impact of perturbations through detailed metrics and plots
- Browse and compare results across different model collections
## Requirements

- Python 3.10 or higher
- MongoDB instance (local or remote)
- Git
## Setup

- Clone the repository:

  ```bash
  git clone [repository-url]
  cd deepbench_analysis
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv

  # On Windows
  venv\Scripts\activate

  # On Unix or macOS
  source venv/bin/activate
  ```

- Install the package and development dependencies:

  ```bash
  pip install -e .
  ```

- Create a `.env` file in the root directory with the following variables:
  ```ini
  # MongoDB Credentials
  DBUSER=your_mongodb_username
  DBPASSWD=your_mongodb_password

  # Custom MongoDB URI
  MONGODB_URI=custom_host
  ```

## Running the Application

Start the Streamlit application:

```bash
streamlit run src/deepbench_analysis.py
```

The dashboard will be available at http://localhost:8501 by default.
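The credential and URI variables above are presumably combined into a connection string at startup. A minimal sketch of that logic (the function name, fallback host, and port are illustrative assumptions, not the tool's actual code):

```python
import os


def build_mongodb_uri() -> str:
    """Build a MongoDB connection URI from the .env variables described above.

    If MONGODB_URI is set, it wins; otherwise fall back to a credential-based
    URI against a local instance (host/port here are assumptions).
    """
    custom_uri = os.environ.get("MONGODB_URI")
    if custom_uri:
        return custom_uri
    user = os.environ.get("DBUSER", "")
    passwd = os.environ.get("DBPASSWD", "")
    return f"mongodb://{user}:{passwd}@localhost:27017/"
```

With `MONGODB_URI` set, the custom host takes precedence; otherwise `DBUSER`/`DBPASSWD` are used against a default local instance.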
## Project Structure

- `configs/` - Configuration files
- `src/` - Source code
- `src/mappings/` - Dataset mapping files
- `src/app/` - Streamlit application code
- `src/deepbench_analysis/` - Core analysis functionality
- `src/deepbench_analysis/additional_metrics/` - Additional metrics calculation code
- `src/deepbench_analysis/config/` - Configuration parsing code
- `src/deepbench_analysis/db/` - Database connection and query code
- `src/deepbench_analysis/logger/` - Logging and performance monitoring code
- `src/deepbench_analysis/mongodb_data_processing/` - Data extraction and visualization code
- `src/deepbench_analysis/tabs/` - Streamlit tab code
## Database Setup

The application requires a MongoDB instance with:
- A database named "Deepbench"
- Collections containing model evaluation results
- Each collection must follow the project's schema for model results:
  - the collection must contain more than 10 documents
  - only one model per collection
  - models are comparable only if they share the same augmentations
- General document structure:

```json
{
  "experiment_name": "test_collection_2025-04-19-23_44_55",
  "git": "ec853d3e45b5ae525b061ba66d24b5568d35a11f",
  "image": "/path/to/image.jpg",
  "gt": "0",
  "resolution": {
    "original": [256, 256],
    "scaled": [224, 224]
  },
  "augment_method": {
    "SatelliteImaging": {
      "Contrast": {
        "contrast": -100
      }
    }
  },
  "model": "model_name",
  "label_score": {
    "0": 0.87,
    "1": 0.09,
    "2": 0.03,
    "3": 0.01
  },
  "prime_img": false,
  "img_array": []
}
```
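The collection rules above can be checked mechanically. A sketch of such a validation, assuming plain Python dicts as returned by a MongoDB driver (the function and its return convention are illustrative, not part of the project):

```python
# Top-level fields from the document structure shown above.
REQUIRED_KEYS = {
    "experiment_name", "git", "image", "gt", "resolution",
    "augment_method", "model", "label_score", "prime_img", "img_array",
}


def validate_collection(docs):
    """Check the project's collection rules; return (ok, reason)."""
    if len(docs) <= 10:
        return False, "collection must contain more than 10 documents"
    if len({d.get("model") for d in docs}) != 1:
        return False, "only one model per collection"
    for d in docs:
        missing = REQUIRED_KEYS - d.keys()
        if missing:
            return False, f"document missing keys: {sorted(missing)}"
    return True, "ok"
```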
## Additional Configuration

- Modify `configs/default_config.toml` for:
  - MongoDB database name
  - Augmentation methods
  - Use cases
  - Debug options:
    - Performance logging: set to True to measure performance
    - Collection browsing: set to True to enable browsing MongoDB collections
- Streamlit settings:
  - Adjust `.streamlit/config.toml` for Streamlit-specific configurations
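A hypothetical shape for `configs/default_config.toml`, matching the options listed above (all key and table names here are guesses for illustration; consult the file shipped with the repository for the real schema):

```toml
# Illustrative only - key names are assumptions, not the shipped schema
[mongodb]
database = "Deepbench"

[debug]
performance_logging = false
collection_browsing = false

[use_cases]
enabled = ["Medical", "AutonomousDriving", "SatelliteImaging"]
```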
## User Interface Guide

### Sidebar

The application's sidebar contains the main controls for data selection and visualization:

- **Collection Selection**
  - Primary Collection: select the main model collection to analyze
  - Comparable Collections: choose one or more collections to compare against the primary collection
  - Only collections with a matching schema will be available for comparison
- **Feature Toggles**
  - Show Prime Images: display original (unaugmented) images when expanding augmentation results, if any are present in the collection
  - Display Additional Metrics: show advanced metrics such as confusion matrices, ECE, and ROC curves
### Analysis Tab

The main analysis dashboard, where you can:
- View performance comparisons between selected models
- Analyze accuracy across different augmentation methods
- Examine detailed metrics and visualizations
- Download data tables and plots
- Filter results by use case and augmentation type
- Expand sections to see detailed performance breakdowns
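The per-augmentation accuracy comparison can be sketched as a small aggregation over documents following the schema shown in Database Setup (prediction taken as the argmax of `label_score`; the helper below is illustrative, not the tool's code):

```python
from collections import defaultdict


def accuracy_by_augmentation(docs):
    """Accuracy per augmentation method, with prediction = argmax of label_score."""
    counts = defaultdict(lambda: [0, 0])  # method -> [n_correct, n_total]
    for d in docs:
        # augment_method nests use case -> method -> parameters,
        # e.g. {"SatelliteImaging": {"Contrast": {"contrast": -100}}}
        use_case = next(iter(d["augment_method"]))
        method = next(iter(d["augment_method"][use_case]))
        pred = max(d["label_score"], key=d["label_score"].get)
        counts[method][0] += int(pred == d["gt"])
        counts[method][1] += 1
    return {m: correct / total for m, (correct, total) in counts.items()}
```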
### Info Tab

Contains essential information about:
- Project TAHAI overview and objectives
- DeepBench Analysis tool description
- Links to related resources:
  - Research paper
  - Project websites
### Debug Tab

Available when debug mode is enabled in `configs/default_config.toml`. This tab is useful if you host a local MongoDB instance without its own UI. It lets you:
- Browse and inspect MongoDB collections
- View detailed document structures
- Download collection data in CSV or JSON format
- Filter and search through documents
- Delete collections
- Upload TinyDB JSON files to MongoDB
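The TinyDB upload amounts to flattening TinyDB's JSON layout (`{table: {doc_id: document}}`) into a list that a MongoDB driver can pass to `insert_many`. A minimal sketch (the function is illustrative, not the tool's actual upload code):

```python
import json


def tinydb_to_documents(tinydb_json: str, table: str = "_default"):
    """Flatten a TinyDB JSON dump into a list of documents.

    TinyDB stores each table as a mapping of string doc IDs to documents;
    MongoDB wants a flat list, so we drop the IDs and keep the values.
    """
    data = json.loads(tinydb_json)
    return list(data.get(table, {}).values())
```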
### Metrics and Visualizations

- Each augmentation method shows:
  - Accuracy comparison plots
  - Performance metrics
  - Downloadable data tables
  - Expandable sections for detailed analysis
- When "Display Additional Metrics" is enabled:
  - Confusion matrices
  - ROC curves
  - Expected Calibration Error (ECE)
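Expected Calibration Error, the last metric listed, can be sketched with the standard equal-width-binning formulation (the tool's own implementation in `src/deepbench_analysis/additional_metrics/` may differ in binning details):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - confidence| per bin.

    Confidences in [0, 1] are sorted into n_bins equal-width bins
    [i/n, (i+1)/n); each bin contributes its sample fraction times the
    gap between its mean accuracy and mean confidence.
    """
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]  # [count, sum_correct, sum_conf]
    for conf, ok in zip(confidences, correct):
        i = min(int(conf * n_bins), n_bins - 1)
        bins[i][0] += 1
        bins[i][1] += bool(ok)
        bins[i][2] += conf
    n = len(confidences)
    ece = 0.0
    for count, s_correct, s_conf in bins:
        if count:
            ece += (count / n) * abs(s_correct / count - s_conf / count)
    return ece
```

A model whose confidence matches its empirical accuracy in every bin scores an ECE of 0; larger values indicate over- or under-confidence.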