This document provides comprehensive instructions for AI agents working with the OpenDataHub Notebooks repository. It outlines the project structure, development workflows, and best practices for contributing to this containerized notebook environment project.
The OpenDataHub Notebooks repository provides a collection of containerized notebook environments tailored for data analysis, machine learning, research, and coding within the OpenDataHub ecosystem. The project includes:
- Jupyter Notebooks: Various flavors (minimal, datascience, pytorch, tensorflow, trustyai)
- Code Server: VS Code-based development environments
- RStudio: R development environments
- Runtime Images: For pipeline execution with Elyra
- Base Images: CUDA and ROCm GPU-accelerated base images
```
.
├── .github/                  # GitHub-specific configuration (workflows, issue templates, etc.)
├── jupyter/                  # Jupyter Notebook image definitions, organized by flavor and accelerator
│   ├── datascience/
│   ├── minimal/
│   ├── pytorch/
│   ├── pytorch+llmcompressor/
│   ├── rocm/
│   │   ├── pytorch/
│   │   └── tensorflow/
│   ├── tensorflow/
│   └── trustyai/
├── runtimes/                 # Container images that the Elyra plugin uses to execute pipeline nodes
│   ├── datascience/
│   ├── minimal/
│   ├── pytorch/
│   ├── pytorch+llmcompressor/
│   ├── rocm-pytorch/
│   ├── rocm-tensorflow/
│   └── tensorflow/
├── codeserver/               # Code-Server (VS Code in the browser) image definitions and configs
│   └── ubi9-python-3.11/
├── rstudio/                  # RStudio image definitions and configs
│   ├── rhel9-python-3.11/
│   └── c9s-python-3.11/
├── ci/                       # Continuous Integration scripts, checks, and configuration
├── cuda/                     # CUDA-specific files (NVIDIA GPU support), e.g., repo files, licenses
├── manifests/                # Kubernetes manifests for deploying the images
│   ├── odh/base/             # ODH (OpenDataHub) imagestream manifests — used when KONFLUX=no
│   └── rhoai/base/           # RHOAI (Red Hat AI) imagestream manifests — used when KONFLUX=yes
├── scripts/
├── tests/
├── README.md
├── Makefile                  # Build orchestration tool for local development
└── …
```
When working with this project, ensure these tools are available:
- Container Runtime: podman/docker
- Python: 3.14 (required)
- Package Manager: uv (preferred) or pipenv
- Build System: make (gmake on macOS)
- Version Control: git with proper signing
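A quick way to verify the environment before building can be sketched in Python. The tool list mirrors the prerequisites above; the helper itself is illustrative and not part of the repository:

```python
import shutil
import sys

# Prerequisite tools from the list above; "gmake" is substituted on macOS.
REQUIRED_TOOLS = ["git", "uv", "gmake" if sys.platform == "darwin" else "make"]
CONTAINER_RUNTIMES = ["podman", "docker"]  # either one is sufficient


def missing_tools(which=shutil.which):
    """Return the prerequisite tools not found on PATH (injectable for testing)."""
    missing = [tool for tool in REQUIRED_TOOLS if which(tool) is None]
    if not any(which(runtime) for runtime in CONTAINER_RUNTIMES):
        missing.append("podman/docker")
    return missing


if __name__ == "__main__":
    gaps = missing_tools()
    print("all prerequisites found" if not gaps else f"missing: {', '.join(gaps)}")
```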
The project uses a Makefile-based build system:
```shell
# Build a specific workbench
make ${WORKBENCH_NAME} -e IMAGE_REGISTRY=quay.io/${YOUR_USER}/workbench-images -e RELEASE=2023x

# Example builds
make jupyter-minimal-ubi9-python-3.12
make jupyter-datascience-ubi9-python-3.12
make jupyter-pytorch-cuda-ubi9-python-3.12
```

The `KONFLUX` Makefile variable (default: `no`) switches between two build variants:

- `KONFLUX=no` — builds from `Dockerfile.*`, uses `manifests/odh/base/` imagestream manifests
- `KONFLUX=yes` — builds from `Dockerfile.konflux.*`, uses `manifests/rhoai/base/` imagestream manifests
This variable must be set consistently across the build step and the test step (`make test-*`), as the test script (`scripts/test_jupyter_with_papermill.sh`) reads the imagestream manifest to derive expected package versions.
The Python equivalent (`tests/manifests.py::get_source_of_truth_filepath`) accepts a `konflux: bool` keyword argument for the same purpose.
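The variant switch can be sketched as follows. This is a simplified illustration of the idea, not the actual `tests/manifests.py` code; the path constants mirror the tree above, and the Dockerfile-prefix helper is an assumption:

```python
from pathlib import Path

# Simplified sketch of the KONFLUX switch; the real logic lives in
# tests/manifests.py::get_source_of_truth_filepath and may differ in detail.
ODH_MANIFESTS = Path("manifests/odh/base")      # KONFLUX=no
RHOAI_MANIFESTS = Path("manifests/rhoai/base")  # KONFLUX=yes


def manifest_dir(*, konflux: bool) -> Path:
    """Pick the imagestream manifest directory for the chosen build variant."""
    return RHOAI_MANIFESTS if konflux else ODH_MANIFESTS


def dockerfile_prefix(*, konflux: bool) -> str:
    """KONFLUX=yes builds use Dockerfile.konflux.* instead of Dockerfile.*."""
    return "Dockerfile.konflux." if konflux else "Dockerfile."
```

Because the test scripts read the manifest picked by this switch to learn expected package versions, passing a different `konflux` value at build time and at test time would compare an image against the wrong source of truth.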
The project uses pytest with testcontainers for container testing:

```shell
# Setup environment
./uv venv --python $(which python3.14)
./uv sync --locked

# Run tests
make test                                            # Quick static tests (pytest + Dockerfile alignment)
make test-unit                                       # Python unit tests + doctests + Go tests (no container runtime)
make test-integration PYTEST_ARGS="--image=<image>"  # Container integration tests
make test-${NOTEBOOK_NAME}                           # Specific notebook tests
```

- Avoid unnecessary complexity: Aim for the simplest solution that works, while keeping the code clean.
- Avoid obvious comments: Only add comments to explain especially complex code blocks.
- Maintain code consistency: Follow existing code patterns and architecture.
- Maintain locality of behavior: Keep code close to where it's used.
- Make small, focused changes, unless explicitly asked otherwise.
- Keep security in mind: Avoid exposing sensitive information and avoid running destructive commands.
- When in doubt about something, ask the user.
- Understand the Inheritance Model: Notebook images inherit from parent images in a hierarchical structure:
  - Minimal → DataScience → Specialized (PyTorch, TensorFlow, TrustyAI)
  - Always check parent dependencies before adding new packages
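The inheritance chain can be pictured with a small sketch. The mapping below is hypothetical; the real parent/child relationships are encoded in the images' Dockerfiles (`FROM` lines), not in Python:

```python
# Hypothetical parent map for the chain Minimal -> DataScience -> Specialized.
PARENT = {
    "datascience": "minimal",
    "pytorch": "datascience",
    "tensorflow": "datascience",
    "trustyai": "datascience",
}


def ancestry(flavor: str) -> list[str]:
    """Return the flavor and all of its parents, most-derived first."""
    chain = [flavor]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


# Before adding a package to e.g. pytorch, walk ancestry("pytorch") and check
# whether a parent image already provides it.
```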
- Package Management:
  - Use `pyproject.toml` and `pylock.toml` for Python dependencies
  - Always regenerate lock files after dependency changes by running `make refresh-pipfilelock-files`
- Testing:
  - Run `make test` and analyze logs
- Check Dependencies:
  - Review parent image changes
  - Test downstream images
  - Update version compatibility files
- Security Updates:
  - Scan for vulnerabilities using `ci/security-scan/`
  - Test with security scanning tools
- Python Code:
  - Follow PEP 8 style guidelines
  - Use type hints where appropriate
  - Run `ruff` for linting
  - Use `pyright` for type checking
  - This project requires Python 3.14. PEP 758 allows `except ExcA, ExcB:` without parentheses to catch multiple exception types. This is NOT the old Python 2 syntax for `except ExcA as ExcB`. Parentheses are still required when binding with `as`: use `except (ExcA, ExcB) as e:`, not `except ExcA, ExcB as e:`. Ruff format enforces the parenthesis-free style when there is no `as` clause.
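A short illustration of the exception-syntax rules above. The helper names are hypothetical, and the PEP 758 parenthesis-free form appears only in a comment so the snippet also parses on Python < 3.14:

```python
def parse_port(raw):
    """Return raw as an int, or None if it cannot be converted."""
    try:
        return int(raw)
    except (ValueError, TypeError):
        # On Python 3.14+ (PEP 758) the non-binding form may drop parentheses:
        #     except ValueError, TypeError:
        # and Ruff format enforces that style there. Pre-3.14, parentheses are
        # required even without `as`.
        return None


def parse_port_logged(raw):
    """Same conversion, but binding the exception requires parentheses."""
    try:
        return int(raw)
    except (ValueError, TypeError) as e:  # `except ValueError, TypeError as e:` is a SyntaxError
        print(f"rejected {raw!r}: {type(e).__name__}")
        return None
```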
- Dockerfiles:
  - Minimize layers
  - Follow security best practices
- Documentation:
  - Update README files for new features
  - Add inline comments for complex logic
  - Update this Agents.md file for new patterns
- Unit Tests (`tests/unit/`): Self-tests for scripts, CI utilities, and shared code. Mirror the source layout (e.g., `scripts/cve/` → `tests/unit/scripts/cve/`). Run with `./uv run pytest tests/unit/`
- Static Tests (`tests/*.py`): Use pytest to test config and manifests for consistency
- Container Tests (`tests/containers/`): Use testcontainers for integration testing
- Browser Tests (`tests/browser/`): Use Playwright for UI testing
- Manual Tests (`tests/manual/`): Document manual testing procedures
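A static test in the style of the `tests/*.py` checks might look like this. The manifest mapping and naming convention below are invented for illustration; the real checks differ:

```python
# Illustrative static consistency check: each notebook image name should point
# at a Dockerfile under the matching flavor directory. Sample data only.
SAMPLE_MANIFEST = {
    "jupyter-minimal-ubi9-python-3.12": "jupyter/minimal/ubi9-python-3.12/Dockerfile.cpu",
    "jupyter-pytorch-cuda-ubi9-python-3.12": "jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda",
}


def flavor_of(image_name: str) -> str:
    """Extract the flavor segment: 'jupyter-minimal-ubi9-...' -> 'minimal'."""
    return image_name.split("-")[1]


def test_dockerfile_paths_match_flavor():
    # Collected automatically by pytest; plain asserts need no pytest import.
    for name, dockerfile in SAMPLE_MANIFEST.items():
        assert f"/{flavor_of(name)}/" in dockerfile
```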
The project uses GitHub Actions for:
- Automated testing
- Security scanning
- Dependency updates
- Image building and publishing
Key CI files:

- `.github/workflows/`: GitHub Actions workflows
- `ci/`: Custom CI scripts and configurations
- Local Development:

  ```shell
  podman run -it -p 8888:8888 quay.io/opendatahub/workbench-images:jupyter-minimal-ubi9-python-3.12-latest
  ```

- Kubernetes/OpenShift:

  ```shell
  make deploy9-${NOTEBOOK_NAME}    # Deploy
  make test-${NOTEBOOK_NAME}       # Test
  make undeploy9-${NOTEBOOK_NAME}  # Cleanup
  ```
- Build Failures:
  - Verify dependency versions
  - Review Dockerfile syntax
- Test Failures:
  - Ensure container runtime is running
  - Check test environment setup
  - Review test logs for specific errors
- Dependency Conflicts:
  - Use `uv` for dependency resolution
  - Check version compatibility files
  - Test with minimal dependencies first
- Documentation: Check the `docs/` directory for detailed guides
- Issues: Report issues on GitHub with detailed reproduction steps
- Community: Engage with the OpenDataHub community for support
- Always test changes in isolated environments before committing
- Follow the inheritance model when adding dependencies
- Update documentation alongside code changes
- Use semantic versioning for releases
- Maintain backward compatibility when possible
- Security first - scan images and update dependencies regularly
- Performance optimization - use multi-stage builds and minimal base images
When contributing to this project:
- Fork and branch from main
- Write clear commit messages
- Add tests for new functionality
- Update documentation
For detailed contribution guidelines, see CONTRIBUTING.md.
This document should be updated as the project evolves. AI agents working with this repository should refer to this guide for consistent and effective contributions.