UX Overhaul: CLI, templates, TUI, local mode, and docs #226
Draft
RyanTheRobothead wants to merge 38 commits into unstable
Implement the E2E test harness and validation framework as the foundation for validating all future UX improvements. This infrastructure prevents regressions and serves as living documentation.

E2E Test Harness (madsci.common.testing):
- types.py: Pydantic models for test definitions, steps, and validations
- validators.py: 14 validator types (exit_code, file_exists, http_health, json_contains, python_syntax, ruff_check, etc.)
- runner.py: E2ETestRunner with pure Python and Docker mode support
- template_validator.py: Template validation with syntax/lint checking

Test Infrastructure:
- 43 passing unit tests for validators and runner
- tests/e2e/ directory for tutorial test definitions
- YAML-based test definitions for human-readable test cases

CI Integration:
- .github/workflows/validation.yml for automated validation
- Runs E2E framework tests, tutorial tests, and template validation

Code Quality:
- All code passes ruff linting with no warnings
- Full docstrings on public classes and methods
- Refactored complex methods for maintainability

Documentation:
- docs/guides/ux_overhaul_progress.md tracking implementation status
This update establishes a clear separation between nodes (runtime servers that communicate with workcells) and modules (complete packages containing node, interfaces, drivers, types, tests, and deployment configs).

Key changes:
- CLI: Add `madsci new module` and `madsci new interface` commands
- Templates: Restructure from node/ to module/ hierarchy with interface variants
- TUI: Update wizard from "New Node" to "New Module" with interface selection
- Registry: Add "module" as a component type with interface_variants metadata
- Migration: Add `madsci migrate module` for standardizing external repos
- Settings: Add ModuleSettings, InterfaceSettings, and foo_types.py pattern

This clarifies the Equipment Integrator workflow and enables testing without hardware via fake/sim interfaces.
- Add concrete deprecation timeline: deprecated in v0.7.0, removed in v0.8.0
- Define TUI phased delivery: MVP in Phase 1, full features in later phases
- Add multi-workcell support to ID Registry design
- Add air-gapped environment support for template installation
- Improve docker-compose migration safety (ruamel.yaml, validation, rollback)
- Add OTEL integration notes across CLI, TUI, and ID Registry designs
- Correct documentation URL to https://ad-sdl.github.io/MADSci/
- Add madsci run workflow convenience command
- Clarify Windows compatibility requirements (filelock, pathlib)
- Add schema export capability to Settings Consolidation
- Add per-step timeout support to E2E test harness
- Expand ExperimentCampaign design notes
Add unified madsci CLI entry point with the following commands:
- madsci version: Display installed MADSci packages and versions
- madsci doctor: System diagnostics (Python, Docker, ports)
- madsci status: Service health status with watch mode
- madsci logs: Log viewing from Event Manager with filtering
- madsci tui: Launch interactive Textual-based TUI

Core infrastructure includes:
- AliasedGroup for command aliases support
- MadsciCLIConfig for TOML-based configuration
- Rich-based output formatting utilities
- Textual TUI with dashboard, status, and logs screens

Also adds comprehensive test suite for CLI commands.
This commit completes Phase 2 of the MADSci UX overhaul plan:

## ID Registry System
- Add LocalRegistryManager for file-based name-to-ID mappings
- Add LockManager with heartbeat-based distributed locking
- Add IdentityResolver for lab-level coordination
- Add registry CLI commands (resolve, list, export, import, clean)

## Migration Tools
- Add MigrationScanner to detect definition files needing migration
- Add MigrationConverter to convert definitions to new format
- Add MigrationRollback for safe rollback capability
- Add migrate CLI commands (scan, convert, status, finalize, rollback)

## Template System
- Add TemplateEngine for rendering Jinja2 templates
- Add TemplateRegistry for discovering bundled/user templates
- Add template types with validation and hooks

## Settings Consolidation
- Add ModuleSettings and NodeModuleSettings for node development
- Add interface settings hierarchy (Serial, Socket, USB, HTTP)
- Add settings export endpoint to AbstractManagerBase

## Deprecation Layer
- Add deprecation utilities with timeline (deprecated v0.7.0, removed v0.8.0)
- Integrate warnings into manager definition loading

## Code Quality
- Fix all ruff linting issues (36 errors)
- Convert all f-string logging to structured kwargs (37 violations)
- Add comprehensive test coverage for new components
Add madsci new command group with subcommands for creating MADSci components from templates:
- madsci new module: Create complete module with node, interfaces, types
- madsci new interface: Add interface variants to existing modules
- madsci new node/experiment/workflow/workcell/lab: Component scaffolding
- madsci new list: Discover available templates

Add bundled templates in madsci_common:
- module/basic: Complete module with fake interface for testing
- interface/fake: Simulated interface for testing without hardware
- experiment/script: Simple run-once experiment
- workflow/basic: Single-step workflow YAML
- workcell/basic: Workcell configuration
- lab/minimal: Lab configuration without Docker

All commands support interactive and non-interactive modes with Rich prompts for parameter collection and file preview.

Includes 11 CLI tests, all passing.
Refactor experiment execution infrastructure to support different execution contexts using composition over inheritance:
- ExperimentBase: Core class using MadsciClientMixin composition with lifecycle methods (start, end, pause, cancel, fail) and manage_experiment() context manager
- ExperimentScript: Simple run-once experiments with run() and main() methods
- ExperimentNotebook: Jupyter notebook support with start()/end() pattern, run_workflow() convenience method, and Rich display integration
- ExperimentTUI: Interactive terminal UI using Textual (optional dependency) with status display, log viewer, and control buttons
- ExperimentNode: Server mode exposing run_experiment as REST API action
- Updated experiment templates to use new modalities
- Added deprecation warning to ExperimentApplication (deprecated v0.7.0, removed v0.8.0)
- 31 tests for all new modalities
Reduce test suite runtime from ~32 minutes to ~6 minutes (5.2x faster) by changing cleanup fixtures from function to module scope.

Key changes:
- Root conftest.py: Add module-scoped cleanup_resources_after_module fixture
- Root conftest.py: Only run gc.collect() between modules, not every test
- madsci_node_module conftest.py: Change cleanup_temp_files to module scope
- madsci_client conftest.py: Change cleanup_temp_files and cleanup_logging_handlers to module scope
- test_stress.py: Mark test_burst_traffic with @pytest.mark.slow
- test_middleware.py: Reduce rate limit windows for faster tests

The function-scoped fixtures were causing massive I/O overhead (989s system time) from running glob operations, file deletion, and gc.collect() after every single test (~1920 tests). Module scope reduces this to ~50-100 cleanup operations instead.

Also includes EventClient test isolation fixes to prevent tests from attempting to connect to real event servers.
Add comprehensive tutorials and persona-based guides:
- Tutorials (docs/tutorials/):
  - 01-exploration.md: CLI, TUI, and concepts introduction
  - 02-first-node.md: Module creation with fake interface
  - 03-first-experiment.md: Experiment scripts and notebooks
  - 04-first-workcell.md: Multi-node coordination with workflows
  - 05-full-lab.md: Complete lab deployment with Docker
- Equipment Integrator Guide (docs/guides/integrator/):
  - README.md: Guide overview and quick reference
  - 01-understanding-modules.md: Node/Module/Interface concepts
  - 02-creating-a-module.md: Module scaffolding walkthrough
  - 03-developing-interfaces.md: Interface patterns
  - 04-fake-interfaces.md: Simulated interface patterns
- Lab Operator Guide (docs/guides/operator/):
  - README.md: Quick reference for daily operations
- Experimentalist Guide (docs/guides/experimentalist/):
  - README.md: Quick reference for running experiments
- Tutorial automation:
  - tutorial_02_first_node.tutorial.yaml: E2E test for module creation
- Updated myst.yml with new navigation structure
- Updated ux_overhaul_progress.md to reflect Phase 5 progress (75%)
Fix 3 failing tests in tests/e2e/test_tutorials.py by correcting the
tutorial YAML files to match the E2ETestDefinition Pydantic schema and
fixing a bug in the template engine where rendered source paths were
passed to Jinja2 get_template() instead of the original unrendered
paths (which match the actual filenames on disk with {{variable}}
placeholders).
…tes, and template tests

CLI improvements:
- Fix migrate.py and registry.py to respect global --no-color/--quiet/--json flags
- Replace module-level Console() with context-aware _get_console(ctx)
- Add @click.pass_context to all migrate (5) and registry (6) subcommands
- Replace raise SystemExit(1) with ctx.exit(1) for proper Click lifecycle
- Replace literal Unicode characters with escape sequences
- Fix variable shadowing (error -> err) in migrate commands

New templates (8 -> 11 total):
- experiment/tui: ExperimentTUI modality with pause/cancel support
- experiment/node: ExperimentNode modality with REST server
- workflow/multi_step: 3-step workflow with node parameters
- module/device: 11-file device module using @action decorator pattern, resource management with Slots, full device lifecycle, fake interface with command_history tracking

Bug fix:
- Fix workflow/basic template referencing undefined author_name variable

Tests:
- Add 74 template engine/registry tests across 5 test classes
- Validate all 11 templates for existence, defaults, rendering, and syntax
… CLI imports, and E2E tests Add missing node/basic template that was referenced by CLI but empty, enhance lab/minimal template with .gitignore, pyproject.toml, and example workflow. Implement lazy imports across all CLI commands for faster startup performance. Add 3 new E2E tutorial tests covering workflow/workcell/lab creation and full lifecycle validation. Update template tests accordingly.
…bility, and documentation accuracy

Fix 3 failing E2E tutorial tests:
- Fix generate_from_template() to map --name flag to all template parameter types (lab_name, workflow_name, workcell_name) using category-based resolution instead of hardcoded parameter list
- Fix device module template E501 lint error (line too long)
- Fix tutorial_05 Python 3.10 compatibility (tomllib fallback)

Fix documentation accuracy across tutorials and guides:
- Replace non-existent --output flag with positional DIRECTORY arg
- Fix ExperimentDesign field names and import paths
- Fix run() -> run_experiment() method override
- Fix --template -> --modality for experiment CLI
- Fix invalid --check all, --service, --verbose CLI flags
- Fix broken workflow_schema.md link

Add .scratch/ to .gitignore for local testing.
…, redaction, utcnow, deprecation warnings, lazy imports
Resolves all issues identified in the ux_overhaul merge code review:
- CR-1 (High): Fix stdlib logging with kwargs in 7 files (~37 call sites).
Converted structlog-style logger.warning("msg", key=val) to stdlib-compatible
logger.warning("msg: key=%s", val) pattern to prevent runtime TypeError.
- CR-2 (Medium): Fix _sync_to_lab in identity_resolver.py to include
component_id in the POST body, enabling distributed ID coordination.
- CR-3 (Medium): Fix over-aggressive settings redaction in manager_base.py.
Replace broad substring matching with word-boundary-aware patterns to
avoid redacting legitimate fields like primary_key or auth_enabled.
- CR-4 (Medium): Replace all datetime.utcnow() calls across 6 files with
datetime.now(tz=timezone.utc) for Python 3.12+ compatibility. Uses
timezone.utc (not UTC constant) for Python 3.10 compatibility.
- CR-5 (Low): Fix DeprecatedClass double-warning on nested subclasses by
checking cls.__dict__ instead of hasattr() and tracking wrapped state.
- CR-6 (Low): Implement true lazy CLI command loading via
AliasedGroup.get_command() with a _LAZY_COMMANDS registry, replacing eager
imports that defeated the stated lazy loading intent.
All 2004 tests pass.
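
To illustrate CR-1, here is a minimal sketch of why structlog-style keyword arguments fail on a stdlib logger and what the %-style replacement looks like. The logger name, message, and `ListHandler` helper are hypothetical and exist only for this example; they are not taken from the MADSci code.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages in memory so the example prints nothing."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


logger = logging.getLogger("cr1_example")  # hypothetical logger name
logger.setLevel(logging.WARNING)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# structlog-style call: the stdlib Logger only accepts exc_info, stack_info,
# stacklevel, and extra as keyword arguments, so this raises TypeError.
try:
    logger.warning("Lock expired", component_id="node_1")
    raised_type_error = False
except TypeError:
    raised_type_error = True

# stdlib-compatible call: values are passed as %-style positional arguments.
logger.warning("Lock expired: component_id=%s", "node_1")
```

The error only surfaces when the log level is enabled, which is one reason such call sites can go unnoticed until runtime.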
The structlog ConsoleRenderer with colors=True was writing ANSI escape codes to both console and file handlers. Add AnsiStrippingFormatter to the file handler so log files remain clean while console output stays colorized. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
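
A formatter of this kind might look roughly like the sketch below. Only the class name `AnsiStrippingFormatter` comes from the commit message; the regular expression and the method body are assumptions, not the actual MADSci implementation.

```python
import logging
import re

# Matches CSI escape sequences such as "\x1b[32m" (color) and "\x1b[0m" (reset).
ANSI_ESCAPE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


class AnsiStrippingFormatter(logging.Formatter):
    """Formats a record as usual, then removes ANSI escape sequences."""

    def format(self, record: logging.LogRecord) -> str:
        return ANSI_ESCAPE.sub("", super().format(record))


# Build a record by hand so the example has no side effects.
record = logging.LogRecord(
    name="example",
    level=logging.INFO,
    pathname="example.py",
    lineno=1,
    msg="\x1b[32mservice healthy\x1b[0m",
    args=None,
    exc_info=None,
)
clean_line = AnsiStrippingFormatter("%(levelname)s %(message)s").format(record)
```

Attaching such a formatter only to the file handler leaves the console handler free to keep its colors.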
Fix double-escaped unicode in doctor command that printed literal "\u2713" instead of the checkmark character. Add version field to ManagerHealth and populate it via the health endpoint using importlib.metadata, with a fallback for __main__ module resolution so it works when managers are run with python -m. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
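
As a rough sketch of the version lookup, `importlib.metadata.version` can be wrapped with a default for packages that are not installed. The helper name `get_package_version` and the `"unknown"` default are assumptions for illustration; the `__main__` module resolution fallback described above is not shown here.

```python
from importlib.metadata import PackageNotFoundError, version


def get_package_version(package_name: str, default: str = "unknown") -> str:
    """Return the installed distribution version, or a default if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return default


# A distribution name that is assumed not to be installed anywhere.
missing = get_package_version("definitely-not-installed-pkg-xyz")
```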
Add 59 new tests covering the three CLI commands that previously had zero dedicated tests (registry, migrate, tui), plus integration tests validating the migration tool against example_lab. All 2063 tests pass.
- test_registry.py: 21 tests covering all 6 subcommands (list, resolve, rename, clean, export, import) with empty/populated registries, JSON output, type filtering, and error cases
- test_migrate.py: 21 tests covering all 5 subcommands (scan, convert, status, finalize, rollback) with empty dirs, definition files, and edge cases
- test_migrate_example_lab.py: 12 integration tests validating the scanner finds all 20 definition files (7 manager + 6 node + 7 workflow), extracts component IDs, plans migration actions, and dry-run doesn't modify files
- test_tui.py: 5 tests covering help, screen choices, mocked launch, graceful import error handling, and the 'ui' alias
- Update completion plan to mark Phase A complete, unblocking Phase B
- Add .scratch directory directive to AGENTS.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds start, stop, init, validate, run, completion, and backup commands to the madsci CLI, bringing CLI command coverage from 53% to 100%. All commands follow existing patterns (lazy loading, Click, Rich output) and include dedicated test files (47 new tests, 2110 total passing). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…est coverage Add 14 new templates across 4 categories (module, interface, lab, comm), bringing the template library from 12 to 26 templates. Add COMM category to TemplateCategory enum. Expand template test suite from ~85 to 150 tests covering existence, defaults validation, rendering, and Python syntax for all templates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eming, and command palette
- D.1: Trogon command palette via `madsci commands` CLI + Ctrl+P in TUI
- D.2: Auto-refresh (5s) on Dashboard and Status screens with 'a' toggle
- D.3: Node management screen (nodes from Workcell Manager, detail panel)
- D.4: Workflow visualization screen (active/queued workflows, step progress, pause/resume/cancel)
- D.5: External CSS theming (styles/theme.tcss), removed all inline DEFAULT_CSS
- Updated app.py with 5 screens, new keybindings (n/w/Ctrl+P), updated help
- 14 TUI tests (up from 5), 2119 total tests passing, ruff clean

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add migration guide (docs/guides/migration.md) documenting the transition from definition YAML files to settings+registry configuration. Create 8 integration tests that run the converter against a copy of example_lab, validating backup creation, ID registration, deprecation markers, and rollback. Update example_lab/.env with commented-out settings-equivalent environment variables. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rides Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enable running the full MADSci stack without Docker via `madsci start --mode=local`. All 7 managers run in a single process with in-memory replacements for Redis, MongoDB, and PostgreSQL (SQLite), requiring zero changes to manager business logic. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ification
Replace implicit auto-writing of definition/info files with explicit CLI
commands and metadata-driven secret handling. G.1 adds model_dump_safe()
and json_schema_extra={"secret": True} annotations across all manager
settings. G.2-G.3 remove auto-writing behavior (managers/nodes). G.4 adds
madsci config export/create commands. G.5 adds NodeInfo.from_config()
factory with identity fields on NodeConfig. G.6 adds migration guide and
deprecation warnings. 54 new tests, 2359 total passing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
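
The idea behind metadata-driven redaction can be sketched with the standard library alone. The example below swaps Pydantic's `json_schema_extra={"secret": True}` for dataclass field metadata so that it runs without third-party packages; the class names, field names, placeholder string, and `dump_safe` helper are all hypothetical and do not reproduce `model_dump_safe()` itself.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any

REDACTED = "***REDACTED***"  # placeholder chosen for this sketch


@dataclass
class DatabaseSettings:
    host: str = "localhost"
    password: str = field(default="hunter2", metadata={"secret": True})


@dataclass
class ManagerSettings:
    manager_name: str = "event_manager"
    api_token: str = field(default="abc123", metadata={"secret": True})
    database: DatabaseSettings = field(default_factory=DatabaseSettings)


def dump_safe(obj: Any) -> dict:
    """Dump a settings object, masking every field marked as secret."""
    result = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        if f.metadata.get("secret"):
            result[f.name] = REDACTED
        elif is_dataclass(value):
            result[f.name] = dump_safe(value)  # recurse into nested settings
        else:
            result[f.name] = value
    return result


safe = dump_safe(ManagerSettings())
```

Marking secrets on the field itself avoids the name-based guessing that led to over-redaction of fields such as primary_key.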
…ger/node start, and TUI wizard
- H.1: Update README quick start, create CLI reference, template catalog, CHANGELOG, and update operator guide to use madsci CLI commands as primary workflow.
- H.2: Add `just verify-ux` target, CLI smoke test for all 17 commands, and fix CI template validation to use the real test suite.
- H.3: Add health check polling after `madsci start -d` with Rich Live display, --wait/--no-wait and --wait-timeout options.
- H.4: Convert start/stop to Click groups with manager/node subcommands and PID tracking in ~/.madsci/pids/.
- H.5: Add `madsci new --tui` Textual template browser with category tree, searchable table, and detail panel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…y improvements Fix shell injection in template hooks (shlex.split + shell=False), add no_color context propagation, nested secret redaction for MadsciBaseSettings, detached process logging, stop-manager name validation, importlib.resources path fix, parameterized DB credentials in distributed lab template, defensive imports, PATH parameter validation, min_length=0 truthiness fix, shared get_console utility, shared TUI constants, bounded seen_ids, --screen option wiring, manager_base cleanup, updated docs/CI, and strengthened test assertions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
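
The shell injection fix (shlex.split + shell=False) can be illustrated with a small sketch. The `run_hook` helper and the sample commands are hypothetical, and the example assumes POSIX-style quoting; it is not the actual template hook code.

```python
import shlex
import subprocess
import sys


def run_hook(command: str) -> subprocess.CompletedProcess:
    """Run a hook command without a shell, so metacharacters stay literal."""
    argv = shlex.split(command)
    return subprocess.run(
        argv, shell=False, capture_output=True, text=True, check=False
    )


# With shell=True, "; rm -rf ..." would start a second command.
# shlex.split keeps the ";" attached to an ordinary argument instead.
tokens = shlex.split("echo hello; rm -rf /tmp/x")

# Run the current interpreter and echo back the arguments it received.
hook = shlex.join(
    [sys.executable, "-c", "import sys; print(sys.argv[1:])", "hello;", "echo", "injected"]
)
result = run_hook(hook)
```

Because no shell parses the string, `;`, `&&`, and backticks reach the child process as plain argument text.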
…y improvements Harden process management (start_new_session, SIGTERM→SIGKILL escalation, close log handles), add template security (path traversal prevention, SandboxedEnvironment for user templates), fix local backend shared locks and Redlock cleanup, handle list-of-model secret redaction, wire MadsciCLIConfig to all CLI commands and TUI screens, and clean up dead code and console access patterns. All 2427 tests pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
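
A path traversal check of the kind mentioned above is often written by resolving the candidate path and confirming it stays under the output directory. The sketch below assumes Python 3.9+ for `Path.is_relative_to`; the `safe_join` helper and directory names are hypothetical, not the template engine's real code.

```python
from pathlib import Path


def safe_join(base_dir: Path, relative_path: str) -> Path:
    """Join a template-supplied path onto base_dir, rejecting escapes."""
    base = base_dir.resolve()
    candidate = (base / relative_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"Path escapes output directory: {relative_path}")
    return candidate


output_dir = Path("generated_module")  # hypothetical output directory
inside = safe_join(output_dir, "src/my_node.py")

try:
    safe_join(output_dir, "../../etc/passwd")
    traversal_blocked = False
except ValueError:
    traversal_blocked = True
```

Resolving before comparing means that `..` segments and absolute paths are both caught by the same check.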
…ovements from PR review

Security fixes:
- Add path traversal validation in uninstall_template (registry.py)
- Add source path traversal check in template engine render (engine.py)
- Use model_dump_safe() instead of model_dump() for startup settings logging (manager_base.py)
- Add secret annotations to auth_password and auth_token fields (interface_types.py)

Robustness fixes:
- Store process metadata in PID files and verify identity before signaling (start.py)
- Wrap Popen in try/except to prevent file handle leak on error (start.py)
- Always double-quote .env values and escape special characters (converter.py)
- Fix double datetime.now() call in _mark_deprecated (converter.py)
- Use .pop() for thread-safe heartbeat cleanup (lock_manager.py)

Consistency improvements:
- Replace swallowed exception with debug logging in config export (config.py)
- Warn on stderr when TOML config file cannot be parsed (utils/config.py)
- Catch invalid regex in --grep with user-friendly error (logs.py)
- Fix config save serialization to only stringify Path/AnyUrl types (utils/config.py)
- Use ctx.get_parameter_source() for lab URL override detection (cli/__init__.py)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
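
For the .env quoting fix, a minimal sketch follows. Which characters converter.py actually escapes is not stated in the commit message, so the set used here (backslash, double quote, newline) is an assumption, as are the `format_env_line` helper and the variable names.

```python
def format_env_line(key: str, value: str) -> str:
    """Render KEY="value" with the value always double-quoted and escaped."""
    escaped = (
        value.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')  # then embedded double quotes
        .replace("\n", "\\n")  # keep the value on a single line
    )
    return f'{key}="{escaped}"'


plain = format_env_line("LAB_NAME", "example lab")
tricky = format_env_line("DB_PASSWORD", 'p@ss "word"')
```

Quoting unconditionally keeps values with spaces or `#` characters from being truncated by .env parsers.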
…packaging Migration tests used exact file counts that differed between local (where a gitignored default.node.yaml exists) and CI. Changed to subset checks with minimum count assertions. Module template pyproject.toml used packages.find which couldn't discover loose .py modules in src/; replaced with explicit package-dir and py-modules configuration across all 6 module templates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
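
The change from exact counts to subset checks can be sketched as follows. The `check_scan_results` helper and all file names except default.node.yaml are hypothetical; they only illustrate why a subset assertion tolerates an extra gitignored file.

```python
def check_scan_results(found: set, expected: set, min_count: int) -> None:
    """Assert every expected file was found, tolerating extra local files."""
    missing = expected - found
    assert not missing, f"Missing definition files: {sorted(missing)}"
    assert len(found) >= min_count, f"Expected at least {min_count} files"


expected_files = {"event.manager.yaml", "liquidhandler.node.yaml"}

# CI checkout: only tracked files are present.
ci_found = {"event.manager.yaml", "liquidhandler.node.yaml"}
# Local checkout: a gitignored default.node.yaml is also present.
local_found = ci_found | {"default.node.yaml"}

check_scan_results(ci_found, expected_files, min_count=2)
check_scan_results(local_found, expected_files, min_count=2)
```

An exact `len(found) == 2` assertion would pass in CI and fail locally, which is the mismatch the commit describes.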
Remove src/__init__.py from module templates (inconsistent with py-modules setuptools config). Use PYTHONPATH in tutorial test to ensure module imports work regardless of pip install location (PDM+uv venvs may not include pip). Print subprocess stdout/stderr on step failure for CI debuggability. Update CHANGELOG test count to 2427+. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CI venv (PDM with uv backend) does not include pip, so `python -m pip install -e` fails. The PYTHONPATH approach from the previous commit already handles imports correctly, so the pip install step is no longer needed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Complete UX overhaul of the MADSci framework across 8 phases (A-H), delivering a production-ready CLI, template system, TUI dashboard, and comprehensive documentation.
- 17 CLI commands (`init`, `new`, `start`, `stop`, `status`, `doctor`, `run`, `validate`, `config`, `logs`, `backup`, `version`, `completion`, `commands`, `tui`, `registry`, `migrate`)
- Local mode (`madsci start --mode=local`) with in-memory drop-in backends for MongoDB and Redis; no Docker required for development
- Secret-aware settings export (`model_dump_safe()`), `madsci config export/create`, and no auto-writing
- Manager/node subcommands for start and stop (`madsci start manager event -d`, `madsci stop manager event`)
- Health check polling after `madsci start -d` with Rich Live progress display
- `just verify-ux` target, fixed template validation workflow

Phase breakdown
Test plan
- `pytest`: all 2427+ tests pass
- `just checks`: ruff check and format pass
- `just verify-ux`: combined checks + tests
- `madsci --help`: CLI loads and shows all 17 commands
- `madsci start --mode=local`: all managers start in-process
- `madsci new list`: shows all 26 templates
- `madsci tui`: TUI dashboard launches
- `madsci doctor`: environment diagnostics run
- `madsci config export`: exports current config with secret redaction

🤖 Generated with Claude Code