All notable changes to the AIXCL project will be documented in this file.
Release Candidate 9 for v1.0.0. This release includes 25+ commits since RC8 with major improvements to engine stability, volume management, and CI/CD infrastructure. Key highlights include GPU startup fixes for llama.cpp, standardized volume naming across contexts, and comprehensive CI tests for all three inference engines.
- CI/CD Testing: Comprehensive devcontainer engine tests for all three engines (Ollama, llama.cpp, vLLM) (#882)
- Automated testing in CPU-only mode for GitHub Actions
- Volume persistence validation
- Engine switching tests
- Volume consistency validation script
- llama.cpp GPU Support: Fixed container startup error by properly configuring volumes and entrypoint in GPU compose (#884)
- Removed broken shell entrypoint override
- Added proper volume mounts for models and entrypoint script
- Container now starts successfully with GPU support
- Volume Management: Standardized volume naming across local Docker, devcontainer, and GitHub Codespaces (#883)
- Renamed all volumes to use
aixcl-*prefix (e.g.,aixcl-ollama-data,aixcl-llamacpp-data) - Volumes now marked as
external: truefor persistence across contexts - Added
init-volumes.shscript for one-time volume initialization - Stack start automatically checks and initializes volumes
- Renamed all volumes to use
- llama.cpp Model Format: Fixed default INFERENCE_MODEL to use full HuggingFace path format (#879)
- Changed from filename-only to full path:
Qwen/Qwen2.5-Coder-0.5B-Instruct-GGUF/qwen2.5-coder-0.5b-instruct-q4_k_m.gguf - Aligns with README documentation
- Fixes model download issues in devcontainer workflows
- Changed from filename-only to full path:
Thanks to all contributors who helped improve engine stability, volume management, and testing infrastructure.
Release Candidate 8 for v1.0.0. This release includes 12 commits since RC7 with focus on documentation accuracy and transparency. Major improvements include correcting Podman support claims, fixing documentation inconsistencies, and enhancing README completeness.
- Podman Support: Corrected claims to reflect experimental status (#864)
- README: Added missing CLI commands to Quick Start table (#862)
- Documentation: Fixed orphaned links, outdated paths, and formatting issues (#857-#861)
Thanks to all contributors who helped improve documentation clarity and accuracy.
Release Candidate 7 for v1.0.0. This release includes 35 commits since RC6 with significant additions including the new /release automation command, improved slash command infrastructure with context-aware execution, Alertmanager service integration, Open WebUI upgrade to v0.9.1, and enhanced Grafana dashboards.
- Release Automation: New
/releaseslash command to automate the complete release process from version detection to GitHub Release publication (#845) - Release Templates: Standardized release note templates in
ai/templates/release/and.github/RELEASE_TEMPLATE.md(#843) - Alertmanager Service: Integrated Alertmanager for observability stack alerting (#822)
- Platform Commands: New slash commands for comprehensive platform health reporting:
/platform- Live platform health report with models, ports, volumes, firing alerts/status- Quick triage command for inference, postgres, webui, docker/report- Workflow progress reporting
- Context-Aware Execution: Enhanced
/workflow,/commit,/pr,/branchcommands with automatic state detection
- Open WebUI Upgrade: Updated from v0.8.12 to v0.9.1 incorporating latest security fixes and features (#839)
- Grafana Dashboard: Updated docker-containers dashboard to include all 13 services including vllm, alertmanager, nvidia-gpu-exporter, alloy, loki, cadvisor, node-exporter, postgres-exporter (#841)
- Security hardening compatibility fixes for vLLM entrypoint (#836)
- Removed security hardening from pgAdmin due to su authentication failure (#837)
- Added security hardening to postgres container (#835)
- Added security hardening to nvidia-gpu-exporter container (#825)
- Service addition checklist and references documentation
- AGENTS.md v1.5 alignment and output formatting guidance
- Comprehensive security hardening documentation
- Fixes #844 - Release automation command
- Fixes #842 - Release note templates
- Fixes #840 - Grafana dashboard updates
- Fixes #838 - Open WebUI upgrade
- Fixes #836 - vLLM security hardening compatibility
- Fixes #837 - pgAdmin security reversion
- Fixes #835 - PostgreSQL container security
- Fixes #825 - nvidia-gpu-exporter security
- Fixes #822 - Alertmanager service
- Part of #802 - Context-aware slash commands
Release Candidate 6 for v1.0.0. This release includes 15+ commits since RC5 focusing on container security hardening with Linux capability restrictions and defense-in-depth controls for all observability services.
-
Container Capability Restrictions: Implemented comprehensive security hardening for 6 observability services (prometheus, grafana, loki, postgres-exporter, node-exporter, alloy) with the following controls:
cap_drop: ALL- Remove all Linux capabilitiessecurity_opt: no-new-privileges:true- Prevent privilege escalationread_only: true- Read-only root filesystem (where applicable)tmpfsmounts - Writable temporary space with noexec,nosuid:robind mounts - Read-only configuration mounts
-
Service Security Matrix: Each hardened service now runs with minimal privileges:
Service User cap_drop no-new-priv read_only prometheus default ALL ✅ ✅ grafana default ALL ✅ ❌* loki default ALL ✅ ❌* postgres-exporter 65534:65534 ALL ✅ ✅ node-exporter 65534:65534 ALL ✅ ✅ alloy 12345:12345 ALL ✅ ✅ *Requires data volume writes
-
Security Documentation: Added comprehensive Section 6 to
docs/operations/security.mdcovering:- Container security hardening overview
- Service security matrix with all 9 services
- Verification commands for container inspection
- Troubleshooting guide for restricted containers
- Capability restrictions Phase 1: prometheus, grafana, loki, postgres-exporter (#784, #785)
- Capability restrictions Phase 2: node-exporter, alloy (#786, #787)
- Security options Phase 3: no-new-privileges, read_only, tmpfs mounts (#788, #789)
- Documentation Phase 4: Complete security hardening documentation (#790, #791)
- AGENTS.md output formatting guidance for consistent tabular reports (#782, #783)
- Fixes #705 - Container Capability Restrictions Implementation (all 4 phases complete)
- Part of #698 - Container Security Hardening Initiative
Release Candidate 5 for v1.0.0. This release includes 27 commits since RC4 with critical bug fixes, llamacpp model pre-flight checks, and continued non-root container migrations.
- Llamacpp Pre-flight Check: Added validation to prevent stack start when no llamacpp model is configured (#775)
- Preserved DATABASE_URL environment variable when Open WebUI switches to non-root user (#773)
- Moved PostgreSQL wait logic to entrypoint script for better startup handling (#771)
- Pre-created logs directory to prevent root ownership issues (#770)
- Added PostgreSQL readiness check to Open WebUI startup sequence (#768)
- Ensured CLI profile flag updates .env on fresh installations (#766)
- Added network_mode and fixed permissions for pgAdmin container (#764)
- Fixed pgAdmin servers.json import permission issues (#759)
- Fixed docker-compose.yml corruption from models add command (#744)
- Fixed three critical issues: model selection, pgadmin connection, OpenCode token limit (#740)
- Fixed pgAdmin permission errors with entrypoint script (#737)
- Removed database deletion from engine switch command (#736)
- Fixed environment variable consistency issues (#733)
- Fixed vLLM command syntax in docker-compose.yml (#732)
- Fixed llamacpp model name handling for full HF path API calls (#729)
- Fixed Ollama volume permissions with entrypoint script (#728)
- Fixed Open WebUI configuration for non-Ollama engines (#723)
- Fixed llama.cpp model configuration synchronization (#720)
- Run Open WebUI as non-root user (#721)
- Run vLLM as non-root user via entrypoint script
- Run llama.cpp as non-root user
- Run nvidia-gpu-exporter as non-root user
- Run node-exporter as non-root user (#718)
- Run Ollama as non-root user (#717)
- Hardened Alloy container security configuration with read_only and tmpfs (#716, #715)
- Run postgres-exporter as non-root user (#714)
- Run pgAdmin container as non-root user (#703)
- Run Loki container as non-root user (#702)
- Run Grafana container as non-root user (#701)
- Run Prometheus container as non-root user (#700)
- Run PostgreSQL container as non-root user (#699)
Release Candidate 4 for v1.0.0. This release includes critical bug fixes for Open WebUI PostgreSQL support, pgAdmin integration, and engine management.
- PostgreSQL readiness check to Open WebUI entrypoint
- Pre-create logs directory with correct ownership
- Environment configuration documentation
- Improved pgAdmin servers.json import handling
- Enhanced CLI profile flag behavior on fresh install
- Updated default vLLM model from 7B to 0.5B
- Fixed pgAdmin connection and permission issues
- Fixed Open WebUI SQLite fallback when PostgreSQL unavailable
- Fixed root-owned logs directory creation
- Fixed CLI profile persistence
- Fixed docker-compose.yml corruption from models add
- Fixed vLLM command syntax
- Fixed llamacpp model validation
- Added Open WebUI Direct Connections documentation for vLLM/llama.cpp setup
- Updated environment configuration guide
Release Candidate 3 for v1.0.0. This release includes 35+ commits since RC2 focusing on rootless/Podman support, multi-registry model pulls, vLLM stability fixes, and infrastructure improvements.
- Rootless & Podman Support: Code support for running AIXCL in rootless environments with both Docker and Podman. Docker rootless is verified; Podman support is implemented but experimental. Includes automated socket detection and permission handling for volumes (Fixes #498).
- Native Multi-Registry Pulls: Support for
hf.co/andhuggingface.co/URIs in themodels addcommand, enabling direct pulls from Hugging Face for all supported engines (Fixes #497). - Podman Quadlet Generation: New
stack export-quadletcommand to generate native Systemd unit files for robust, headless deployments. Note: Quadlet generation is functional but not fully tested with Podman (Fixes #499). - Integrated Model Inference Testing: Merged prompt/response verification into the main
platform-tests.shsuite for end-to-end reliability.
- Renamed service/container from
openwebuitowebuiacross codebase and documentation (Fixes #433). Directorywebui/and volume pathwebui-data/; display name "Open WebUI". Service contract filewebui.mdrenamed towebui.md; scriptbuild_and_push_openwebui.shrenamed tobuild_and_push_webui.sh. - Updated Open WebUI to v0.8.0 (Fixes #454)
- Updated Grafana to 12.4.2 (latest stable) (Fixes #680)
- Updated various service container images to latest versions (Fixes #677)
- vLLM Token Limit Error: Fixed vLLM compatibility issues with OpenCode using
--enforce-eagerflag to disable CUDA graph capture (Fixes #685, #682, #682) - ShellCheck SC2168: Resolved ShellCheck errors in test infrastructure (Fixes #678)
- Profile Services: Fixed PROFILE_SERVICES to use current INFERENCE_ENGINE from .env (Fixes #675)
- Grafana Version: Corrected Grafana image tag to use valid stable version
- Added HuggingFace cache volume to vLLM service
- Improved vLLM test error handling for long startup times
- Enhanced workflow documentation with plain text formatting guidelines
- Added assignee requirements to issue and PR templates
- Updated workflow report format with consistent markdown tables (Fixes #688)
- Added documentation for test suite fixes (Fixes #670)
Release Candidate 2 for v1.0.0. This release includes 58 commits since RC1 covering refactoring, bug fixes, feature enhancements, dependency updates, and documentation improvements.
- Token usage reporting with actual Ollama counts for accurate model usage tracking
- OpenCode orchestrator YAML configuration replacing legacy agent config
- Commercial licensing documentation (
COMMERCIAL.md) - Explicit timeouts to platform test script API calls
- orchestrator members test timeout increase to prevent false failures
- Consolidated duplicated code between
aixclCLI and lib modules - Replaced DEBUG print statements with proper logging throughout codebase
- Removed hardcoded confidence penalty heuristics from orchestrator
- Aligned primary model confidence wording and stage 2 ranking criteria
- Defaulted orchestrator to plain text responses
- Updated Open WebUI to v0.7.2
- Bumped
wheeldependency from 0.45.1 to 0.46.2 - Updated README with privacy emphasis and profile testing options
- Improved documentation consistency across project
- Guarded interactive prompts against
set -eon EOF - Removed duplicate database save in non-streaming chat completions path
- Aligned platform test profile messaging
- Updated command references in help messages and error output
- Removed temporary
pr-body-temp.mdfile
- Clarified OpenCode plugin omission from
RUNTIME_CORE_SERVICES - Added assignee and PR labeling requirements to development workflow
- Improved overall documentation consistency
- Governance Framework: Added comprehensive architectural governance model in
docs/architecture/governance/- Runtime Core vs Operational Services separation
- Service contracts defining dependencies and boundaries
- Profile definitions (core, dev, ops, full)
- AI guidance for preserving architectural invariants
- Stack status specification
- Documentation Updates: Updated README.md, docs, and manpage to reflect governance model
- Bash Completion: Updated completion script to reflect service categorization
-
PostgreSQL Integration: Added automatic PostgreSQL-based storage for orchestrator conversations
- Automatic schema creation on startup via
ensure_schema()function - Migration system with
001_create_chat_table.sqlfor initial schema setup - Support for both Open WebUI and OpenCode plugin conversations via
sourcefield - Conversation tracking with unique IDs generated from message hashes
- Full message history preservation with stage data (Stage 1, 2, 3 responses)
- Automatic schema creation on startup via
-
Database Storage Module (
orchestrator/backend/db_storage.py):create_opencode_conversation()- Create new OpenCode conversationsget_opencode_conversation()- Retrieve conversations by IDadd_message_to_conversation()- Add messages to existing conversationslist_opencode_conversations()- List all OpenCode conversationsdelete_conversation()- Delete conversationsfind_conversation_by_messages()- Find conversations by message content
-
Database Connection Management (
orchestrator/backend/db.py):- Connection pool management with asyncpg
- Automatic schema verification and creation
- Graceful degradation when database is unavailable
- Environment-based configuration (ENABLE_DB_STORAGE flag)
-
Conversation Tracker (
orchestrator/backend/conversation_tracker.py):- Deterministic conversation ID generation from message hashes
- Message entry creation with proper formatting
- Integration with database storage
-
API Endpoints:
- Conversation deletion endpoint:
DELETE /v1/chat/completions/{conversation_id} - Automatic conversation persistence on chat completion requests
- Conversation ID returned in API responses
- Conversation deletion endpoint:
- Test Scripts (moved to
orchestrator/scripts/test/):test_db_connection.py- Comprehensive database connection and operation teststest_db_in_container.sh- Container-based test wrappertest_api.sh- API endpoint integration teststest_request.json- Sample API request for testing
- Utility Scripts (organized in
scripts/db/):002_add_source_column.sql- Migration script for adding source column to existing databasesquery_opencode_chats.sql- Query script for OpenCode conversationsquery_all_chats.sql- Query script for all conversationscheck_db.sh- Quick database inspection scriptREADME.md- Documentation for database utilities
- Updated main
README.mdwith database persistence features - Created
orchestrator/scripts/test/README.mdwith test script documentation - Created
scripts/db/README.mdwith database utility documentation - Updated
orchestrator/TESTING.mdwith new script paths and testing procedures
-
Script Organization:
- Moved SQL utility files from root to
scripts/db/directory - Moved test scripts from
orchestrator/toorchestrator/scripts/test/directory - Created logical directory structure for better maintainability
- Moved SQL utility files from root to
-
File Cleanup:
- Removed duplicate
check_opencode.sqlfile (consolidated withcheck_opencode_chats.sql) - Organized temporary test files into appropriate directories
- Updated all script paths in documentation
- Removed duplicate
- Added
ENABLE_DB_STORAGEenvironment variable (default:true) - Database connection uses same PostgreSQL instance as Open WebUI
- Automatic migration execution on service startup
The chat table structure:
id(UUID) - Primary key, auto-generatedtitle(TEXT) - Conversation titlechat(JSONB) - Full conversation data with messages arraymeta(JSONB) - Additional metadatasource(TEXT) - Source identifier ('openwebui' or 'opencode')created_at(TIMESTAMP) - Creation timestampupdated_at(TIMESTAMP) - Auto-updated on changesuser_id(TEXT) - Optional user identifier
Indexes created for performance:
idx_chat_source- Index on source fieldidx_chat_created_at- Index on creation timestamp (DESC)idx_chat_meta- GIN index on metadata JSONBidx_chat_user_id- Partial index on user_id
- Migrations are automatically executed on startup via
ensure_schema() - Migration files located in
orchestrator/backend/migrations/ - Uses
IF NOT EXISTSclauses for idempotent execution - Graceful error handling for existing schemas
For existing installations upgrading to include database persistence:
-
Automatic Migration: The system will automatically create the schema on next startup if
ENABLE_DB_STORAGE=true -
Manual Migration (if needed):
docker exec -i postgres psql -U ${POSTGRES_USER} -d ${POSTGRES_DATABASE} < orchestrator/backend/migrations/001_create_chat_table.sql
-
Adding Source Column (for databases created before source column was added):
docker exec -i postgres psql -U ${POSTGRES_USER} -d ${POSTGRES_DATABASE} < scripts/db/002_add_source_column.sql
None - This is a backward-compatible addition. Existing functionality remains unchanged.
None
- Removed duplicate
check_opencode.sqlfile (functionality preserved incheck_opencode_chats.sql)
- Fixed script paths in test scripts after reorganization
- Updated documentation references to reflect new script locations
- Database credentials are managed via environment variables
- Connection pooling with configurable pool size
- Graceful degradation when database is unavailable (service opencodes without persistence)