UX Overhaul: CLI, templates, TUI, local mode, and docs by RyanTheRobothead · Pull Request #226 · AD-SDL/MADSci

RyanTheRobothead · 2026-02-13T19:25:29Z

Summary

Complete UX overhaul of the MADSci framework across 8 phases (A-H), delivering a production-ready CLI, template system, TUI dashboard, and comprehensive documentation.

17 CLI commands with lazy loading, shell completion, and consistent error handling (init, new, start, stop, status, doctor, run, validate, config, logs, backup, version, completion, commands, tui, registry, migrate)
26 scaffolding templates across 8 categories (module, interface, node, experiment, workflow, workcell, lab, communication) with Jinja2 engine, parameter validation, and interactive wizard
6 TUI screens (dashboard, status, logs, nodes, workflows, template browser) with Textual, auto-refresh, CSS theming, and Trogon command palette
Pure Python local mode (madsci start --mode=local) with in-memory drop-in backends for MongoDB and Redis — no Docker required for development
Explicit configuration management with secret classification (model_dump_safe()), madsci config export/create, and no auto-writing
Individual manager/node start/stop with PID tracking (madsci start manager event -d, madsci stop manager event)
Health check polling after madsci start -d with Rich Live progress display
Security hardening: path traversal validation, SandboxedEnvironment for user templates, secret annotations on auth fields, PID identity verification, SIGTERM→SIGKILL escalation
Documentation: CLI reference, template catalog, CHANGELOG, updated README and operator guide
CI: CLI smoke tests for all 17 commands, just verify-ux target, fixed template validation workflow
2427+ tests passing with full ruff compliance

Phase breakdown

Phase	Description
A	Test coverage and stability foundation
B	7 missing CLI lifecycle commands
C	Template library expansion to 26 templates
D	TUI enhancements (5 screens, auto-refresh, theming, Trogon)
E	Settings consolidation, registry resolution, migration validation
F	Pure Python local mode with in-memory backends
G	Explicit config management with secret classification
H	Docs, CI smoke tests, health polling, manager/node start/stop, TUI wizard
Review fixes	Security hardening, robustness improvements, CI test fixes

Test plan

pytest — all 2427+ tests pass
just checks — ruff check and format pass
just verify-ux — combined checks + tests
madsci --help — CLI loads and shows all 17 commands
madsci start --mode=local — all managers start in-process
madsci new list — shows all 26 templates
madsci tui — TUI dashboard launches
madsci doctor — environment diagnostics run
madsci config export — exports current config with secret redaction

🤖 Generated with Claude Code

Implement the E2E test harness and validation framework as the foundation for validating all future UX improvements. This infrastructure prevents regressions and serves as living documentation.

E2E Test Harness (madsci.common.testing):
- types.py: Pydantic models for test definitions, steps, and validations
- validators.py: 14 validator types (exit_code, file_exists, http_health, json_contains, python_syntax, ruff_check, etc.)
- runner.py: E2ETestRunner with pure Python and Docker mode support
- template_validator.py: Template validation with syntax/lint checking

Test Infrastructure:
- 43 passing unit tests for validators and runner
- tests/e2e/ directory for tutorial test definitions
- YAML-based test definitions for human-readable test cases

CI Integration:
- .github/workflows/validation.yml for automated validation
- Runs E2E framework tests, tutorial tests, and template validation

Code Quality:
- All code passes ruff linting with no warnings
- Full docstrings on public classes and methods
- Refactored complex methods for maintainability

Documentation:
- docs/guides/ux_overhaul_progress.md tracking implementation status

This update establishes a clear separation between nodes (runtime servers that communicate with workcells) and modules (complete packages containing node, interfaces, drivers, types, tests, and deployment configs).

Key changes:
- CLI: Add `madsci new module` and `madsci new interface` commands
- Templates: Restructure from node/ to module/ hierarchy with interface variants
- TUI: Update wizard from "New Node" to "New Module" with interface selection
- Registry: Add "module" as a component type with interface_variants metadata
- Migration: Add `madsci migrate module` for standardizing external repos
- Settings: Add ModuleSettings, InterfaceSettings, and foo_types.py pattern

This clarifies the Equipment Integrator workflow and enables testing without hardware via fake/sim interfaces.

- Add concrete deprecation timeline: deprecated in v0.7.0, removed in v0.8.0
- Define TUI phased delivery: MVP in Phase 1, full features in later phases
- Add multi-workcell support to ID Registry design
- Add air-gapped environment support for template installation
- Improve docker-compose migration safety (ruamel.yaml, validation, rollback)
- Add OTEL integration notes across CLI, TUI, and ID Registry designs
- Correct documentation URL to https://ad-sdl.github.io/MADSci/
- Add madsci run workflow convenience command
- Clarify Windows compatibility requirements (filelock, pathlib)
- Add schema export capability to Settings Consolidation
- Add per-step timeout support to E2E test harness
- Expand ExperimentCampaign design notes

Add unified madsci CLI entry point with the following commands:
- madsci version: Display installed MADSci packages and versions
- madsci doctor: System diagnostics (Python, Docker, ports)
- madsci status: Service health status with watch mode
- madsci logs: Log viewing from Event Manager with filtering
- madsci tui: Launch interactive Textual-based TUI

Core infrastructure includes:
- AliasedGroup for command aliases support
- MadsciCLIConfig for TOML-based configuration
- Rich-based output formatting utilities
- Textual TUI with dashboard, status, and logs screens

Also adds comprehensive test suite for CLI commands.

This commit completes Phase 2 of the MADSci UX overhaul plan:

## ID Registry System
- Add LocalRegistryManager for file-based name-to-ID mappings
- Add LockManager with heartbeat-based distributed locking
- Add IdentityResolver for lab-level coordination
- Add registry CLI commands (resolve, list, export, import, clean)

## Migration Tools
- Add MigrationScanner to detect definition files needing migration
- Add MigrationConverter to convert definitions to new format
- Add MigrationRollback for safe rollback capability
- Add migrate CLI commands (scan, convert, status, finalize, rollback)

## Template System
- Add TemplateEngine for rendering Jinja2 templates
- Add TemplateRegistry for discovering bundled/user templates
- Add template types with validation and hooks

## Settings Consolidation
- Add ModuleSettings and NodeModuleSettings for node development
- Add interface settings hierarchy (Serial, Socket, USB, HTTP)
- Add settings export endpoint to AbstractManagerBase

## Deprecation Layer
- Add deprecation utilities with timeline (deprecated v0.7.0, removed v0.8.0)
- Integrate warnings into manager definition loading

## Code Quality
- Fix all ruff linting issues (36 errors)
- Convert all f-string logging to structured kwargs (37 violations)
- Add comprehensive test coverage for new components

Add madsci new command group with subcommands for creating MADSci components from templates:
- madsci new module: Create complete module with node, interfaces, types
- madsci new interface: Add interface variants to existing modules
- madsci new node/experiment/workflow/workcell/lab: Component scaffolding
- madsci new list: Discover available templates

Add bundled templates in madsci_common:
- module/basic: Complete module with fake interface for testing
- interface/fake: Simulated interface for testing without hardware
- experiment/script: Simple run-once experiment
- workflow/basic: Single-step workflow YAML
- workcell/basic: Workcell configuration
- lab/minimal: Lab configuration without Docker

All commands support interactive and non-interactive modes with Rich prompts for parameter collection and file preview. Includes 11 CLI tests, all passing.

Refactor experiment execution infrastructure to support different execution contexts using composition over inheritance:
- ExperimentBase: Core class using MadsciClientMixin composition with lifecycle methods (start, end, pause, cancel, fail) and manage_experiment() context manager
- ExperimentScript: Simple run-once experiments with run() and main() methods
- ExperimentNotebook: Jupyter notebook support with start()/end() pattern, run_workflow() convenience method, and Rich display integration
- ExperimentTUI: Interactive terminal UI using Textual (optional dependency) with status display, log viewer, and control buttons
- ExperimentNode: Server mode exposing run_experiment as REST API action
- Updated experiment templates to use new modalities
- Added deprecation warning to ExperimentApplication (deprecated v0.7.0, removed v0.8.0)
- 31 tests for all new modalities

Reduce test suite runtime from ~32 minutes to ~6 minutes (5.2x faster) by changing cleanup fixtures from function to module scope.

Key changes:
- Root conftest.py: Add module-scoped cleanup_resources_after_module fixture
- Root conftest.py: Only run gc.collect() between modules, not every test
- madsci_node_module conftest.py: Change cleanup_temp_files to module scope
- madsci_client conftest.py: Change cleanup_temp_files and cleanup_logging_handlers to module scope
- test_stress.py: Mark test_burst_traffic with @pytest.mark.slow
- test_middleware.py: Reduce rate limit windows for faster tests

The function-scoped fixtures were causing massive I/O overhead (989s system time) from running glob operations, file deletion, and gc.collect() after every single test (~1920 tests). Module scope reduces this to ~50-100 cleanup operations instead.

Also includes EventClient test isolation fixes to prevent tests from attempting to connect to real event servers.

Add comprehensive tutorials and persona-based guides:
- Tutorials (docs/tutorials/):
  - 01-exploration.md: CLI, TUI, and concepts introduction
  - 02-first-node.md: Module creation with fake interface
  - 03-first-experiment.md: Experiment scripts and notebooks
  - 04-first-workcell.md: Multi-node coordination with workflows
  - 05-full-lab.md: Complete lab deployment with Docker
- Equipment Integrator Guide (docs/guides/integrator/):
  - README.md: Guide overview and quick reference
  - 01-understanding-modules.md: Node/Module/Interface concepts
  - 02-creating-a-module.md: Module scaffolding walkthrough
  - 03-developing-interfaces.md: Interface patterns
  - 04-fake-interfaces.md: Simulated interface patterns
- Lab Operator Guide (docs/guides/operator/):
  - README.md: Quick reference for daily operations
- Experimentalist Guide (docs/guides/experimentalist/):
  - README.md: Quick reference for running experiments
- Tutorial automation:
  - tutorial_02_first_node.tutorial.yaml: E2E test for module creation
- Updated myst.yml with new navigation structure
- Updated ux_overhaul_progress.md to reflect Phase 5 progress (75%)

Fix 3 failing tests in tests/e2e/test_tutorials.py by correcting the tutorial YAML files to match the E2ETestDefinition Pydantic schema and fixing a bug in the template engine where rendered source paths were passed to Jinja2 get_template() instead of the original unrendered paths (which match the actual filenames on disk with {{variable}} placeholders).

…tes, and template tests

CLI improvements:
- Fix migrate.py and registry.py to respect global --no-color/--quiet/--json flags
- Replace module-level Console() with context-aware _get_console(ctx)
- Add @click.pass_context to all migrate (5) and registry (6) subcommands
- Replace raise SystemExit(1) with ctx.exit(1) for proper Click lifecycle
- Replace literal Unicode characters with escape sequences
- Fix variable shadowing (error -> err) in migrate commands

New templates (8 -> 11 total):
- experiment/tui: ExperimentTUI modality with pause/cancel support
- experiment/node: ExperimentNode modality with REST server
- workflow/multi_step: 3-step workflow with node parameters
- module/device: 11-file device module using @action decorator pattern, resource management with Slots, full device lifecycle, fake interface with command_history tracking

Bug fix:
- Fix workflow/basic template referencing undefined author_name variable

Tests:
- Add 74 template engine/registry tests across 5 test classes
- Validate all 11 templates for existence, defaults, rendering, and syntax

… CLI imports, and E2E tests

Add missing node/basic template that was referenced by CLI but empty, enhance lab/minimal template with .gitignore, pyproject.toml, and example workflow. Implement lazy imports across all CLI commands for faster startup performance. Add 3 new E2E tutorial tests covering workflow/workcell/lab creation and full lifecycle validation. Update template tests accordingly.

…bility, and documentation accuracy

Fix 3 failing E2E tutorial tests:
- Fix generate_from_template() to map --name flag to all template parameter types (lab_name, workflow_name, workcell_name) using category-based resolution instead of hardcoded parameter list
- Fix device module template E501 lint error (line too long)
- Fix tutorial_05 Python 3.10 compatibility (tomllib fallback)

Fix documentation accuracy across tutorials and guides:
- Replace non-existent --output flag with positional DIRECTORY arg
- Fix ExperimentDesign field names and import paths
- Fix run() -> run_experiment() method override
- Fix --template -> --modality for experiment CLI
- Fix invalid --check all, --service, --verbose CLI flags
- Fix broken workflow_schema.md link

Add .scratch/ to .gitignore for local testing.

…, redaction, utcnow, deprecation warnings, lazy imports

Resolves all issues identified in the ux_overhaul merge code review:
- CR-1 (High): Fix stdlib logging with kwargs in 7 files (~37 call sites). Converted structlog-style logger.warning("msg", key=val) to stdlib-compatible logger.warning("msg: key=%s", val) pattern to prevent runtime TypeError.
- CR-2 (Medium): Fix _sync_to_lab in identity_resolver.py to include component_id in the POST body, enabling distributed ID coordination.
- CR-3 (Medium): Fix over-aggressive settings redaction in manager_base.py. Replace broad substring matching with word-boundary-aware patterns to avoid redacting legitimate fields like primary_key or auth_enabled.
- CR-4 (Medium): Replace all datetime.utcnow() calls across 6 files with datetime.now(tz=timezone.utc) for Python 3.12+ compatibility. Uses timezone.utc (not UTC constant) for Python 3.10 compatibility.
- CR-5 (Low): Fix DeprecatedClass double-warning on nested subclasses by checking cls.__dict__ instead of hasattr() and tracking wrapped state.
- CR-6 (Low): Implement true lazy CLI command loading via AliasedGroup.get_command() with _LAZY_COMMANDS registry, replacing eager imports that defeated the stated lazy loading intent.

All 2004 tests pass.
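The CR-1 and CR-4 fixes can be illustrated with the standard library alone. The following is a minimal sketch, not the PR's actual code: the logger name, the message text, and the helper function names are assumptions made for illustration.

```python
import logging
from datetime import datetime, timezone

# Hypothetical logger used only for this illustration.
logger = logging.getLogger("cr1_demo")
logger.setLevel(logging.WARNING)
logger.addHandler(logging.NullHandler())  # keep the demo silent


def structlog_style_call_fails(component_id: str) -> bool:
    """Return True if a structlog-style call raises TypeError on a stdlib logger."""
    try:
        # stdlib Logger methods only accept exc_info, extra, stack_info and
        # stacklevel as keyword arguments, so an arbitrary key is rejected.
        logger.warning("registry sync failed", component_id=component_id)
    except TypeError:
        return True
    return False


def stdlib_style_call(component_id: str) -> None:
    # Values are passed positionally and interpolated by the logging module.
    logger.warning("registry sync failed: component_id=%s", component_id)


def utc_now() -> datetime:
    # Timezone-aware replacement for datetime.utcnow(); timezone.utc is
    # available on Python 3.10, unlike the datetime.UTC alias (3.11+).
    return datetime.now(tz=timezone.utc)
```

The stdlib-style call also defers string formatting until a handler actually emits the record.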

The structlog ConsoleRenderer with colors=True was writing ANSI escape codes to both console and file handlers. Add AnsiStrippingFormatter to the file handler so log files remain clean while console output stays colorized.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
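A formatter of this kind might look like the sketch below. Only the class name AnsiStrippingFormatter comes from the commit message; the regular expression and the method body are assumptions about one reasonable way to implement it.

```python
import logging
import re

# Matches CSI escape sequences such as "\x1b[32m" (color) and "\x1b[0m" (reset).
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


class AnsiStrippingFormatter(logging.Formatter):
    """Format a record normally, then remove ANSI escape sequences."""

    def format(self, record: logging.LogRecord) -> str:
        return ANSI_ESCAPE.sub("", super().format(record))


def format_colored_message() -> str:
    """Build a colorized record by hand and format it for a file handler."""
    record = logging.LogRecord(
        "demo", logging.INFO, "demo.py", 1, "\x1b[32mnode ready\x1b[0m", (), None
    )
    return AnsiStrippingFormatter("%(levelname)s %(message)s").format(record)
```

Attaching the formatter only to the file handler (via `handler.setFormatter(...)`) leaves the console handler's colorized output untouched.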

Fix double-escaped unicode in doctor command that printed literal "\u2713" instead of the checkmark character. Add version field to ManagerHealth and populate it via the health endpoint using importlib.metadata, with a fallback for __main__ module resolution so it works when managers are run with python -m.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add 59 new tests covering the three CLI commands that previously had zero dedicated tests (registry, migrate, tui), plus integration tests validating the migration tool against example_lab. All 2063 tests pass.

- test_registry.py: 21 tests covering all 6 subcommands (list, resolve, rename, clean, export, import) with empty/populated registries, JSON output, type filtering, and error cases
- test_migrate.py: 21 tests covering all 5 subcommands (scan, convert, status, finalize, rollback) with empty dirs, definition files, and edge cases
- test_migrate_example_lab.py: 12 integration tests validating the scanner finds all 20 definition files (7 manager + 6 node + 7 workflow), extracts component IDs, plans migration actions, and dry-run doesn't modify files
- test_tui.py: 5 tests covering help, screen choices, mocked launch, graceful import error handling, and the 'ui' alias
- Update completion plan to mark Phase A complete, unblocking Phase B
- Add .scratch directory directive to AGENTS.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Adds start, stop, init, validate, run, completion, and backup commands to the madsci CLI, bringing CLI command coverage from 53% to 100%. All commands follow existing patterns (lazy loading, Click, Rich output) and include dedicated test files (47 new tests, 2110 total passing).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…est coverage

Add 14 new templates across 4 categories (module, interface, lab, comm), bringing the template library from 12 to 26 templates. Add COMM category to TemplateCategory enum. Expand template test suite from ~85 to 150 tests covering existence, defaults validation, rendering, and Python syntax for all templates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…eming, and command palette

- D.1: Trogon command palette via `madsci commands` CLI + Ctrl+P in TUI
- D.2: Auto-refresh (5s) on Dashboard and Status screens with 'a' toggle
- D.3: Node management screen (nodes from Workcell Manager, detail panel)
- D.4: Workflow visualization screen (active/queued workflows, step progress, pause/resume/cancel)
- D.5: External CSS theming (styles/theme.tcss), removed all inline DEFAULT_CSS
- Updated app.py with 5 screens, new keybindings (n/w/Ctrl+P), updated help
- 14 TUI tests (up from 5), 2119 total tests passing, ruff clean

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add migration guide (docs/guides/migration.md) documenting the transition from definition YAML files to settings+registry configuration. Create 8 integration tests that run the converter against a copy of example_lab, validating backup creation, ID registration, deprecation markers, and rollback. Update example_lab/.env with commented-out settings-equivalent environment variables.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…rides

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Enable running the full MADSci stack without Docker via `madsci start --mode=local`. All 7 managers run in a single process with in-memory replacements for Redis, MongoDB, and PostgreSQL (SQLite), requiring zero changes to manager business logic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
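The drop-in idea can be sketched as a small class that mirrors a handful of redis-py method names (`set`, `get`, `delete`, `exists`) over a plain dictionary. This is an illustration under stated assumptions, not the PR's backend: the class name, the string-only values, and the `record_heartbeat` helper are hypothetical.

```python
import threading
from typing import Dict, Optional


class InMemoryKeyValueStore:
    """Hypothetical stand-in exposing a small subset of redis-py's method names."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._lock = threading.Lock()  # managers share one process in local mode

    def set(self, name: str, value: object) -> bool:
        with self._lock:
            self._data[name] = str(value)
        return True

    def get(self, name: str) -> Optional[str]:
        with self._lock:
            return self._data.get(name)

    def delete(self, *names: str) -> int:
        # Like Redis DEL, report how many keys were actually removed.
        with self._lock:
            return sum(1 for n in names if self._data.pop(n, None) is not None)

    def exists(self, *names: str) -> int:
        with self._lock:
            return sum(1 for n in names if n in self._data)


def record_heartbeat(store, node_name: str, timestamp: float) -> None:
    """Caller code depends only on the method names, not on the backend type."""
    store.set(f"heartbeat:{node_name}", timestamp)
```

Because `record_heartbeat` only calls `set`, the same function would work against a real client object that offers that method, which is the property that lets business logic stay unchanged.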

…ification

Replace implicit auto-writing of definition/info files with explicit CLI commands and metadata-driven secret handling. G.1 adds model_dump_safe() and json_schema_extra={"secret": True} annotations across all manager settings. G.2-G.3 remove auto-writing behavior (managers/nodes). G.4 adds madsci config export/create commands. G.5 adds NodeInfo.from_config() factory with identity fields on NodeConfig. G.6 adds migration guide and deprecation warnings. 54 new tests, 2359 total passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
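Metadata-driven redaction of the kind G.1 describes can be sketched with standard-library dataclasses. Note the substitution: the PR annotates Pydantic fields with json_schema_extra={"secret": True}, whereas this sketch uses dataclass field metadata so it runs without third-party packages. The settings class, its fields, and the `dump_safe` function are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict

REDACTED = "***REDACTED***"


@dataclass
class ExampleManagerSettings:
    """Hypothetical settings class; field names are invented for illustration."""

    server_url: str = "http://localhost:8001"
    # The metadata flag plays the role of the PR's {"secret": True} annotation.
    db_password: str = field(default="hunter2", metadata={"secret": True})


def dump_safe(settings: Any) -> Dict[str, Any]:
    """Return the settings as a dict with secret-flagged values replaced."""
    result: Dict[str, Any] = {}
    for f in fields(settings):
        value = getattr(settings, f.name)
        result[f.name] = REDACTED if f.metadata.get("secret") else value
    return result
```

Driving redaction from explicit per-field metadata, rather than from field-name matching, avoids the false positives noted in CR-3 (for example primary_key or auth_enabled).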

…ger/node start, and TUI wizard

H.1: Update README quick start, create CLI reference, template catalog, CHANGELOG, and update operator guide to use madsci CLI commands as primary workflow.
H.2: Add `just verify-ux` target, CLI smoke test for all 17 commands, and fix CI template validation to use the real test suite.
H.3: Add health check polling after `madsci start -d` with Rich Live display, --wait/--no-wait and --wait-timeout options.
H.4: Convert start/stop to Click groups with manager/node subcommands and PID tracking in ~/.madsci/pids/.
H.5: Add `madsci new --tui` Textual template browser with category tree, searchable table, and detail panel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
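The polling behaviour described in H.3 amounts to retrying a health probe until it succeeds or a deadline passes. The sketch below shows that loop in isolation; the function name, its parameters, and the choice to treat OSError as "not up yet" are assumptions, and the Rich Live display is omitted.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5
) -> bool:
    """Poll `check` until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout  # monotonic clock ignores wall-clock jumps
    while True:
        try:
            if check():
                return True
        except OSError:
            # Connection refused, timeouts and similar errors mean "not up yet".
            pass
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(interval, remaining))
```

In this sketch a value such as the one passed to --wait-timeout would map onto `timeout`, and `check` would wrap an HTTP request to a manager's health endpoint.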

…y improvements

Fix shell injection in template hooks (shlex.split + shell=False), add no_color context propagation, nested secret redaction for MadsciBaseSettings, detached process logging, stop-manager name validation, importlib.resources path fix, parameterized DB credentials in distributed lab template, defensive imports, PATH parameter validation, min_length=0 truthiness fix, shared get_console utility, shared TUI constants, bounded seen_ids, --screen option wiring, manager_base cleanup, updated docs/CI, and strengthened test assertions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
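The shell-injection fix named first in this commit relies on a standard pattern: tokenize the hook command with shlex.split and execute the resulting argument list with shell=False, so metacharacters such as `;` are passed through as literal text. A minimal sketch follows; the `run_hook` name and its parameters are assumptions, not the PR's function.

```python
import shlex
import subprocess
from typing import Optional


def run_hook(command: str, cwd: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run a template hook command without handing it to a shell."""
    argv = shlex.split(command)  # tokenizes on whitespace, honoring quotes
    # With shell=False, argv[0] is executed directly and the remaining items
    # are passed as arguments; ";" or "&&" are never interpreted.
    return subprocess.run(
        argv, shell=False, cwd=cwd, capture_output=True, text=True, check=False
    )
```

For example, `shlex.split("ruff format .; echo injected")` yields a single command with four arguments rather than two chained commands.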

…y improvements

Harden process management (start_new_session, SIGTERM→SIGKILL escalation, close log handles), add template security (path traversal prevention, SandboxedEnvironment for user templates), fix local backend shared locks and Redlock cleanup, handle list-of-model secret redaction, wire MadsciCLIConfig to all CLI commands and TUI screens, and clean up dead code and console access patterns. All 2427 tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
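The process-management hardening can be sketched with subprocess alone: launch the child in its own session, then stop it by sending SIGTERM and escalating to SIGKILL if it has not exited within a grace period. The function names and the default grace period are assumptions, and the signal semantics described in the comments are POSIX-specific.

```python
import subprocess
from typing import Sequence


def spawn_detached(argv: Sequence[str]) -> subprocess.Popen:
    """Start a child in its own session so terminal signals do not reach it."""
    return subprocess.Popen(
        list(argv),
        start_new_session=True,  # POSIX: the child calls setsid()
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Send SIGTERM, then SIGKILL if the process outlives the grace period."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        return proc.wait()
```

On POSIX the return value is the negative signal number when the child was terminated by a signal, which lets a caller report whether escalation was needed.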

…ovements from PR review

Security fixes:
- Add path traversal validation in uninstall_template (registry.py)
- Add source path traversal check in template engine render (engine.py)
- Use model_dump_safe() instead of model_dump() for startup settings logging (manager_base.py)
- Add secret annotations to auth_password and auth_token fields (interface_types.py)

Robustness fixes:
- Store process metadata in PID files and verify identity before signaling (start.py)
- Wrap Popen in try/except to prevent file handle leak on error (start.py)
- Always double-quote .env values and escape special characters (converter.py)
- Fix double datetime.now() call in _mark_deprecated (converter.py)
- Use .pop() for thread-safe heartbeat cleanup (lock_manager.py)

Consistency improvements:
- Replace swallowed exception with debug logging in config export (config.py)
- Warn on stderr when TOML config file cannot be parsed (utils/config.py)
- Catch invalid regex in --grep with user-friendly error (logs.py)
- Fix config save serialization to only stringify Path/AnyUrl types (utils/config.py)
- Use ctx.get_parameter_source() for lab URL override detection (cli/__init__.py)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
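A path traversal check of the kind listed under the security fixes usually resolves the requested path and confirms it is still inside the allowed base directory. The sketch below shows that pattern; the function name and the example directory are hypothetical and do not reflect the PR's registry.py or engine.py code.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Resolve `user_path` under `base_dir`, rejecting anything that escapes it."""
    base = base_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks; an absolute
    # user_path replaces the base entirely and is caught by the check below.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"path escapes template directory: {user_path!r}")
    return candidate
```

Comparing resolved paths, instead of searching the raw string for "..", also covers absolute paths and symlinks that point outside the base directory.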

…packaging

Migration tests used exact file counts that differed between local (where a gitignored default.node.yaml exists) and CI. Changed to subset checks with minimum count assertions. Module template pyproject.toml used packages.find which couldn't discover loose .py modules in src/; replaced with explicit package-dir and py-modules configuration across all 6 module templates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove src/__init__.py from module templates (inconsistent with py-modules setuptools config). Use PYTHONPATH in tutorial test to ensure module imports work regardless of pip install location (PDM+uv venvs may not include pip). Print subprocess stdout/stderr on step failure for CI debuggability. Update CHANGELOG test count to 2427+.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The CI venv (PDM with uv backend) does not include pip, so `python -m pip install -e` fails. The PYTHONPATH approach from the previous commit already handles imports correctly, so the pip install step is no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-14T01:49:13Z

Coverage report

Warning

The diff for this PR is too large to be retrieved from GitHub's API (maximum 300 files). Diff coverage is not available for this PR.

This PR does not seem to contain any modification to coverable code.

RyanTheRobothead and others added 30 commits February 8, 2026 00:05

Initial ux_overhaul design plan

080805f

Address some minor logging annoyances

9614e74

Regenerate docs

61d7503

Merge branch 'unstable' into ux_overhaul

52936c8

UX Overhaul Completion plan added

b58f342

Complete Phase E.2: add opt-in registry resolution to manager base

65e13b4

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Complete Phase E.3: settings consolidation for structural config over…

0bfb202

…rides Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

RyanTheRobothead and others added 8 commits February 13, 2026 13:13

Add yarn and pre-commit tools to devbox

8c810f6

RyanTheRobothead wants to merge 38 commits into unstable from ux_overhaul
