UX Overhaul: CLI, templates, TUI, local mode, and docs#226

Draft
RyanTheRobothead wants to merge 38 commits into unstable from ux_overhaul

RyanTheRobothead commented Feb 13, 2026

Summary

Complete UX overhaul of the MADSci framework across 8 phases (A-H), delivering a production-ready CLI, template system, TUI dashboard, and comprehensive documentation.

  • 17 CLI commands with lazy loading, shell completion, and consistent error handling (init, new, start, stop, status, doctor, run, validate, config, logs, backup, version, completion, commands, tui, registry, migrate)
  • 26 scaffolding templates across 8 categories (module, interface, node, experiment, workflow, workcell, lab, communication) with Jinja2 engine, parameter validation, and interactive wizard
  • 6 TUI screens (dashboard, status, logs, nodes, workflows, template browser) with Textual, auto-refresh, CSS theming, and Trogon command palette
  • Pure Python local mode (madsci start --mode=local) with in-memory drop-in backends for MongoDB and Redis — no Docker required for development
  • Explicit configuration management with secret classification (model_dump_safe()), madsci config export/create, and no auto-writing
  • Individual manager/node start/stop with PID tracking (madsci start manager event -d, madsci stop manager event)
  • Health check polling after madsci start -d with Rich Live progress display
  • Security hardening: path traversal validation, SandboxedEnvironment for user templates, secret annotations on auth fields, PID identity verification, SIGTERM→SIGKILL escalation
  • Documentation: CLI reference, template catalog, CHANGELOG, updated README and operator guide
  • CI: CLI smoke tests for all 17 commands, just verify-ux target, fixed template validation workflow
  • 2427+ tests passing with full ruff compliance

Phase breakdown

| Phase | Description |
|-------|-------------|
| A | Test coverage and stability foundation |
| B | 7 missing CLI lifecycle commands |
| C | Template library expansion to 26 templates |
| D | TUI enhancements (5 screens, auto-refresh, theming, Trogon) |
| E | Settings consolidation, registry resolution, migration validation |
| F | Pure Python local mode with in-memory backends |
| G | Explicit config management with secret classification |
| H | Docs, CI smoke tests, health polling, manager/node start/stop, TUI wizard |
| Review fixes | Security hardening, robustness improvements, CI test fixes |

Test plan

  • pytest — all 2427+ tests pass
  • just checks — ruff check and format pass
  • just verify-ux — combined checks + tests
  • madsci --help — CLI loads and shows all 17 commands
  • madsci start --mode=local — all managers start in-process
  • madsci new list — shows all 26 templates
  • madsci tui — TUI dashboard launches
  • madsci doctor — environment diagnostics run
  • madsci config export — exports current config with secret redaction

🤖 Generated with Claude Code

RyanTheRobothead and others added 30 commits February 8, 2026 00:05
Implement the E2E test harness and validation framework as the foundation
for validating all future UX improvements. This infrastructure prevents
regressions and serves as living documentation.

E2E Test Harness (madsci.common.testing):
- types.py: Pydantic models for test definitions, steps, and validations
- validators.py: 14 validator types (exit_code, file_exists, http_health,
  json_contains, python_syntax, ruff_check, etc.)
- runner.py: E2ETestRunner with pure Python and Docker mode support
- template_validator.py: Template validation with syntax/lint checking

Test Infrastructure:
- 43 passing unit tests for validators and runner
- tests/e2e/ directory for tutorial test definitions
- YAML-based test definitions for human-readable test cases

CI Integration:
- .github/workflows/validation.yml for automated validation
- Runs E2E framework tests, tutorial tests, and template validation

Code Quality:
- All code passes ruff linting with no warnings
- Full docstrings on public classes and methods
- Refactored complex methods for maintainability

Documentation:
- docs/guides/ux_overhaul_progress.md tracking implementation status

This update establishes a clear separation between nodes (runtime servers
that communicate with workcells) and modules (complete packages containing
node, interfaces, drivers, types, tests, and deployment configs).

Key changes:
- CLI: Add `madsci new module` and `madsci new interface` commands
- Templates: Restructure from node/ to module/ hierarchy with interface variants
- TUI: Update wizard from "New Node" to "New Module" with interface selection
- Registry: Add "module" as a component type with interface_variants metadata
- Migration: Add `madsci migrate module` for standardizing external repos
- Settings: Add ModuleSettings, InterfaceSettings, and foo_types.py pattern

This clarifies the Equipment Integrator workflow and enables testing
without hardware via fake/sim interfaces.

- Add concrete deprecation timeline: deprecated in v0.7.0, removed in v0.8.0
- Define TUI phased delivery: MVP in Phase 1, full features in later phases
- Add multi-workcell support to ID Registry design
- Add air-gapped environment support for template installation
- Improve docker-compose migration safety (ruamel.yaml, validation, rollback)
- Add OTEL integration notes across CLI, TUI, and ID Registry designs
- Correct documentation URL to https://ad-sdl.github.io/MADSci/
- Add madsci run workflow convenience command
- Clarify Windows compatibility requirements (filelock, pathlib)
- Add schema export capability to Settings Consolidation
- Add per-step timeout support to E2E test harness
- Expand ExperimentCampaign design notes

Add unified madsci CLI entry point with the following commands:
- madsci version: Display installed MADSci packages and versions
- madsci doctor: System diagnostics (Python, Docker, ports)
- madsci status: Service health status with watch mode
- madsci logs: Log viewing from Event Manager with filtering
- madsci tui: Launch interactive Textual-based TUI

Core infrastructure includes:
- AliasedGroup for command aliases support
- MadsciCLIConfig for TOML-based configuration
- Rich-based output formatting utilities
- Textual TUI with dashboard, status, and logs screens

Also adds a comprehensive test suite for CLI commands.

This commit completes Phase 2 of the MADSci UX overhaul plan:

## ID Registry System
- Add LocalRegistryManager for file-based name-to-ID mappings
- Add LockManager with heartbeat-based distributed locking
- Add IdentityResolver for lab-level coordination
- Add registry CLI commands (resolve, list, export, import, clean)

## Migration Tools
- Add MigrationScanner to detect definition files needing migration
- Add MigrationConverter to convert definitions to new format
- Add MigrationRollback for safe rollback capability
- Add migrate CLI commands (scan, convert, status, finalize, rollback)

## Template System
- Add TemplateEngine for rendering Jinja2 templates
- Add TemplateRegistry for discovering bundled/user templates
- Add template types with validation and hooks

## Settings Consolidation
- Add ModuleSettings and NodeModuleSettings for node development
- Add interface settings hierarchy (Serial, Socket, USB, HTTP)
- Add settings export endpoint to AbstractManagerBase

## Deprecation Layer
- Add deprecation utilities with timeline (deprecated v0.7.0, removed v0.8.0)
- Integrate warnings into manager definition loading

## Code Quality
- Fix all ruff linting issues (36 errors)
- Convert all f-string logging to structured kwargs (37 violations)
- Add comprehensive test coverage for new components

Add madsci new command group with subcommands for creating MADSci
components from templates:
- madsci new module: Create complete module with node, interfaces, types
- madsci new interface: Add interface variants to existing modules
- madsci new node/experiment/workflow/workcell/lab: Component scaffolding
- madsci new list: Discover available templates

Add bundled templates in madsci_common:
- module/basic: Complete module with fake interface for testing
- interface/fake: Simulated interface for testing without hardware
- experiment/script: Simple run-once experiment
- workflow/basic: Single-step workflow YAML
- workcell/basic: Workcell configuration
- lab/minimal: Lab configuration without Docker

All commands support interactive and non-interactive modes with
Rich prompts for parameter collection and file preview.

Includes 11 CLI tests, all passing.

Refactor experiment execution infrastructure to support different execution
contexts using composition over inheritance:

- ExperimentBase: Core class using MadsciClientMixin composition with
  lifecycle methods (start, end, pause, cancel, fail) and manage_experiment()
  context manager

- ExperimentScript: Simple run-once experiments with run() and main() methods

- ExperimentNotebook: Jupyter notebook support with start()/end() pattern,
  run_workflow() convenience method, and Rich display integration

- ExperimentTUI: Interactive terminal UI using Textual (optional dependency)
  with status display, log viewer, and control buttons

- ExperimentNode: Server mode exposing run_experiment as REST API action

- Updated experiment templates to use new modalities
- Added deprecation warning to ExperimentApplication (deprecated v0.7.0,
  removed v0.8.0)
- 31 tests for all new modalities

Reduce test suite runtime from ~32 minutes to ~6 minutes (5.2x faster)
by changing cleanup fixtures from function to module scope.

Key changes:
- Root conftest.py: Add module-scoped cleanup_resources_after_module fixture
- Root conftest.py: Only run gc.collect() between modules, not every test
- madsci_node_module conftest.py: Change cleanup_temp_files to module scope
- madsci_client conftest.py: Change cleanup_temp_files and
  cleanup_logging_handlers to module scope
- test_stress.py: Mark test_burst_traffic with @pytest.mark.slow
- test_middleware.py: Reduce rate limit windows for faster tests

The function-scoped fixtures were causing massive I/O overhead
(989s system time) from running glob operations, file deletion, and
gc.collect() after every single test (~1920 tests). Module scope
reduces this to ~50-100 cleanup operations instead.

Also includes EventClient test isolation fixes to prevent tests
from attempting to connect to real event servers.

Add comprehensive tutorials and persona-based guides:

- Tutorials (docs/tutorials/):
  - 01-exploration.md: CLI, TUI, and concepts introduction
  - 02-first-node.md: Module creation with fake interface
  - 03-first-experiment.md: Experiment scripts and notebooks
  - 04-first-workcell.md: Multi-node coordination with workflows
  - 05-full-lab.md: Complete lab deployment with Docker

- Equipment Integrator Guide (docs/guides/integrator/):
  - README.md: Guide overview and quick reference
  - 01-understanding-modules.md: Node/Module/Interface concepts
  - 02-creating-a-module.md: Module scaffolding walkthrough
  - 03-developing-interfaces.md: Interface patterns
  - 04-fake-interfaces.md: Simulated interface patterns

- Lab Operator Guide (docs/guides/operator/):
  - README.md: Quick reference for daily operations

- Experimentalist Guide (docs/guides/experimentalist/):
  - README.md: Quick reference for running experiments

- Tutorial automation:
  - tutorial_02_first_node.tutorial.yaml: E2E test for module creation

- Updated myst.yml with new navigation structure
- Updated ux_overhaul_progress.md to reflect Phase 5 progress (75%)

Fix 3 failing tests in tests/e2e/test_tutorials.py by correcting the
tutorial YAML files to match the E2ETestDefinition Pydantic schema and
fixing a bug in the template engine where rendered source paths were
passed to Jinja2 get_template() instead of the original unrendered
paths (which match the actual filenames on disk with {{variable}}
placeholders).

…tes, and template tests

CLI improvements:
- Fix migrate.py and registry.py to respect global --no-color/--quiet/--json flags
- Replace module-level Console() with context-aware _get_console(ctx)
- Add @click.pass_context to all migrate (5) and registry (6) subcommands
- Replace raise SystemExit(1) with ctx.exit(1) for proper Click lifecycle
- Replace literal Unicode characters with escape sequences
- Fix variable shadowing (error -> err) in migrate commands

New templates (8 -> 11 total):
- experiment/tui: ExperimentTUI modality with pause/cancel support
- experiment/node: ExperimentNode modality with REST server
- workflow/multi_step: 3-step workflow with node parameters
- module/device: 11-file device module using @action decorator pattern,
  resource management with Slots, full device lifecycle, fake interface
  with command_history tracking

Bug fix:
- Fix workflow/basic template referencing undefined author_name variable

Tests:
- Add 74 template engine/registry tests across 5 test classes
- Validate all 11 templates for existence, defaults, rendering, and syntax

… CLI imports, and E2E tests

Add missing node/basic template that was referenced by CLI but empty,
enhance lab/minimal template with .gitignore, pyproject.toml, and example
workflow. Implement lazy imports across all CLI commands for faster startup
performance. Add 3 new E2E tutorial tests covering workflow/workcell/lab
creation and full lifecycle validation. Update template tests accordingly.
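
The lazy-import idea can be illustrated with a small stdlib-only sketch. It is not the project's implementation (which is built on Click); the registry contents and the `get_command` helper here are hypothetical stand-ins, with stdlib modules playing the role of command modules. The point is that a module is imported only when its command is actually requested.

```python
import importlib
from typing import Any

# Hypothetical registry: command name -> (module path, attribute name).
# Stdlib modules stand in for real CLI command modules.
_LAZY_COMMANDS: dict[str, tuple[str, str]] = {
    "dumps": ("json", "dumps"),
    "quote": ("shlex", "quote"),
}


def get_command(name: str) -> Any:
    """Import the module backing `name` only when the command is requested."""
    try:
        module_path, attr = _LAZY_COMMANDS[name]
    except KeyError:
        raise LookupError(f"unknown command: {name}") from None
    # The import cost is paid here, not at CLI startup.
    module = importlib.import_module(module_path)
    return getattr(module, attr)
```

With this shape, startup only builds a small dictionary, and `--help` for one command never imports the others.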
…bility, and documentation accuracy

Fix 3 failing E2E tutorial tests:
- Fix generate_from_template() to map --name flag to all template
  parameter types (lab_name, workflow_name, workcell_name) using
  category-based resolution instead of hardcoded parameter list
- Fix device module template E501 lint error (line too long)
- Fix tutorial_05 Python 3.10 compatibility (tomllib fallback)

Fix documentation accuracy across tutorials and guides:
- Replace non-existent --output flag with positional DIRECTORY arg
- Fix ExperimentDesign field names and import paths
- Fix run() -> run_experiment() method override
- Fix --template -> --modality for experiment CLI
- Fix invalid --check all, --service, --verbose CLI flags
- Fix broken workflow_schema.md link

Add .scratch/ to .gitignore for local testing.

…, redaction, utcnow, deprecation warnings, lazy imports

Resolves all issues identified in the ux_overhaul merge code review:

- CR-1 (High): Fix stdlib logging with kwargs in 7 files (~37 call sites).
  Converted structlog-style logger.warning("msg", key=val) to stdlib-compatible
  logger.warning("msg: key=%s", val) pattern to prevent runtime TypeError.

- CR-2 (Medium): Fix _sync_to_lab in identity_resolver.py to include
  component_id in the POST body, enabling distributed ID coordination.

- CR-3 (Medium): Fix over-aggressive settings redaction in manager_base.py.
  Replace broad substring matching with word-boundary-aware patterns to
  avoid redacting legitimate fields like primary_key or auth_enabled.

- CR-4 (Medium): Replace all datetime.utcnow() calls across 6 files with
  datetime.now(tz=timezone.utc) for Python 3.12+ compatibility. Uses
  timezone.utc (not UTC constant) for Python 3.10 compatibility.

- CR-5 (Low): Fix DeprecatedClass double-warning on nested subclasses by
  checking cls.__dict__ instead of hasattr() and tracking wrapped state.

- CR-6 (Low): Implement true lazy CLI command loading via AliasedGroup.
  get_command() with _LAZY_COMMANDS registry, replacing eager imports
  that defeated the stated lazy loading intent.
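
To make CR-1 concrete, the following sketch shows why structlog-style keyword arguments fail on a stdlib logger and what the `%s` form looks like. The logger name and function names are hypothetical. Note that the `TypeError` only surfaces when the log level is enabled, which is why such bugs can hide until a warning is actually emitted.

```python
import logging

logger = logging.getLogger("madsci.example")  # hypothetical logger name
logger.setLevel(logging.WARNING)  # make sure WARNING calls are processed


def warn_structlog_style(node_id: str) -> bool:
    """Return True if the structlog-style call raises on a stdlib logger."""
    try:
        # Stdlib loggers accept only exc_info, extra, stack_info, stacklevel.
        logger.warning("node unreachable", node_id=node_id)  # type: ignore[call-arg]
    except TypeError:
        return True
    return False


def warn_stdlib_style(node_id: str) -> None:
    """Stdlib-compatible form: values are passed as positional format args."""
    logger.warning("node unreachable: node_id=%s", node_id)
```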

All 2004 tests pass.
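
As an illustration of the CR-3 fix, here is one way word-boundary-aware redaction could look. The pattern list and the `redact` helper are assumptions, not the code in manager_base.py. Because `\b` treats underscores as word characters, the sketch matches on explicit underscore or string boundaries instead.

```python
import re
from typing import Any

# Hypothetical list of sensitive name segments; a segment must be delimited
# by underscores or the start/end of the field name to count as a match.
_SECRET_NAME = re.compile(
    r"(?:^|_)(?:password|secret|token|api_key|private_key)(?:_|$)",
    re.IGNORECASE,
)


def redact(settings: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `settings` with secret-looking values masked."""
    return {
        key: "***" if _SECRET_NAME.search(key) else value
        for key, value in settings.items()
    }
```

Under these assumed patterns, `primary_key` and `auth_enabled` pass through untouched, while `db_password` and `auth_token` are masked.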
The structlog ConsoleRenderer with colors=True was writing ANSI escape
codes to both console and file handlers. Add AnsiStrippingFormatter to
the file handler so log files remain clean while console output stays
colorized.
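
A formatter of this kind can be quite small. The sketch below is an assumption about how such a class might be written with the stdlib alone, not the commit's actual code; the regex covers standard CSI escape sequences (such as color codes) and nothing more exotic.

```python
import logging
import re

# Matches CSI sequences: ESC [ parameters, optional intermediates, final byte.
_ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


class AnsiStrippingFormatter(logging.Formatter):
    """Format the record as usual, then drop ANSI escape sequences."""

    def format(self, record: logging.LogRecord) -> str:
        return _ANSI_ESCAPE.sub("", super().format(record))
```

Attaching it only to the file handler (for example via `handler.setFormatter(AnsiStrippingFormatter())`) leaves the console handler's colored output unchanged.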

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix double-escaped unicode in doctor command that printed literal "\u2713"
instead of the checkmark character. Add version field to ManagerHealth and
populate it via the health endpoint using importlib.metadata, with a
fallback for __main__ module resolution so it works when managers are run
with python -m.
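
Looking up an installed version with a fallback can be sketched as follows. The helper name and the fallback string are hypothetical; only `importlib.metadata.version` and `PackageNotFoundError` are real stdlib names.

```python
from importlib.metadata import PackageNotFoundError, version


def package_version(distribution: str, fallback: str = "unknown") -> str:
    """Return the installed version of `distribution`, or `fallback`."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # Not installed as a distribution (e.g. run from a source checkout).
        return fallback
```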

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 59 new tests covering the three CLI commands that previously had zero
dedicated tests (registry, migrate, tui), plus integration tests validating
the migration tool against example_lab. All 2063 tests pass.

- test_registry.py: 21 tests covering all 6 subcommands (list, resolve,
  rename, clean, export, import) with empty/populated registries, JSON
  output, type filtering, and error cases
- test_migrate.py: 21 tests covering all 5 subcommands (scan, convert,
  status, finalize, rollback) with empty dirs, definition files, and
  edge cases
- test_migrate_example_lab.py: 12 integration tests validating the scanner
  finds all 20 definition files (7 manager + 6 node + 7 workflow), extracts
  component IDs, plans migration actions, and dry-run doesn't modify files
- test_tui.py: 5 tests covering help, screen choices, mocked launch,
  graceful import error handling, and the 'ui' alias
- Update completion plan to mark Phase A complete, unblocking Phase B
- Add .scratch directory directive to AGENTS.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds start, stop, init, validate, run, completion, and backup commands
to the madsci CLI, bringing CLI command coverage from 53% to 100%.
All commands follow existing patterns (lazy loading, Click, Rich output)
and include dedicated test files (47 new tests, 2110 total passing).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…est coverage

Add 14 new templates across 4 categories (module, interface, lab, comm),
bringing the template library from 12 to 26 templates. Add COMM category
to TemplateCategory enum. Expand template test suite from ~85 to 150 tests
covering existence, defaults validation, rendering, and Python syntax for
all templates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eming, and command palette

- D.1: Trogon command palette via `madsci commands` CLI + Ctrl+P in TUI
- D.2: Auto-refresh (5s) on Dashboard and Status screens with 'a' toggle
- D.3: Node management screen (nodes from Workcell Manager, detail panel)
- D.4: Workflow visualization screen (active/queued workflows, step progress, pause/resume/cancel)
- D.5: External CSS theming (styles/theme.tcss), removed all inline DEFAULT_CSS
- Updated app.py with 5 screens, new keybindings (n/w/Ctrl+P), updated help
- 14 TUI tests (up from 5), 2119 total tests passing, ruff clean

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add migration guide (docs/guides/migration.md) documenting the transition from
definition YAML files to settings+registry configuration. Create 8 integration
tests that run the converter against a copy of example_lab, validating backup
creation, ID registration, deprecation markers, and rollback. Update
example_lab/.env with commented-out settings-equivalent environment variables.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rides

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enable running the full MADSci stack without Docker via `madsci start --mode=local`.
All 7 managers run in a single process with in-memory replacements for Redis,
MongoDB, and PostgreSQL (SQLite), requiring zero changes to manager business logic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ification

Replace implicit auto-writing of definition/info files with explicit CLI
commands and metadata-driven secret handling. G.1 adds model_dump_safe()
and json_schema_extra={"secret": True} annotations across all manager
settings. G.2-G.3 remove auto-writing behavior (managers/nodes). G.4 adds
madsci config export/create commands. G.5 adds NodeInfo.from_config()
factory with identity fields on NodeConfig. G.6 adds migration guide and
deprecation warnings. 54 new tests, 2359 total passing.
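
The metadata-driven approach can be illustrated with a stdlib analog. This sketch deliberately uses `dataclasses` field metadata rather than Pydantic's `json_schema_extra`, so it is not the project's `model_dump_safe()`; the class, field names, and mask string are all hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Any


@dataclass
class ExampleSettings:
    """Hypothetical settings class; `secret` metadata marks sensitive fields."""

    server_url: str = "http://localhost:8000"
    db_password: str = field(default="hunter2", metadata={"secret": True})

    def dump_safe(self) -> dict[str, Any]:
        """Dump all fields, masking those annotated as secret."""
        return {
            f.name: "***REDACTED***" if f.metadata.get("secret") else getattr(self, f.name)
            for f in fields(self)
        }
```

The design benefit is that redaction follows an explicit annotation on each field instead of guessing from the field name.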

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
RyanTheRobothead and others added 8 commits February 13, 2026 13:13
…ger/node start, and TUI wizard

H.1: Update README quick start, create CLI reference, template catalog, CHANGELOG,
and update operator guide to use madsci CLI commands as primary workflow.

H.2: Add `just verify-ux` target, CLI smoke test for all 17 commands, and fix CI
template validation to use the real test suite.

H.3: Add health check polling after `madsci start -d` with Rich Live display,
--wait/--no-wait and --wait-timeout options.

H.4: Convert start/stop to Click groups with manager/node subcommands and PID
tracking in ~/.madsci/pids/.

H.5: Add `madsci new --tui` Textual template browser with category tree, searchable
table, and detail panel.
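
The health check polling in H.3 follows a common deadline-based pattern, sketched here without the Rich Live display. The function name, parameters, and the idea of passing in a `check` callable are assumptions for illustration; a real check would typically issue an HTTP request to a health endpoint.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll `check` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if check():
                return True
        except OSError:
            pass  # service is not accepting connections yet
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

Using `time.monotonic()` keeps the timeout correct even if the system clock is adjusted while waiting.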

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…y improvements

Fix shell injection in template hooks (shlex.split + shell=False), add no_color
context propagation, nested secret redaction for MadsciBaseSettings, detached
process logging, stop-manager name validation, importlib.resources path fix,
parameterized DB credentials in distributed lab template, defensive imports,
PATH parameter validation, min_length=0 truthiness fix, shared get_console
utility, shared TUI constants, bounded seen_ids, --screen option wiring,
manager_base cleanup, updated docs/CI, and strengthened test assertions.
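
The `shlex.split` plus `shell=False` pattern mentioned for template hooks can be sketched like this. The `run_hook` helper is hypothetical. The key property is that shell metacharacters such as `;` become ordinary arguments rather than command separators.

```python
import shlex
import subprocess
import sys


def run_hook(command: str) -> subprocess.CompletedProcess:
    """Run a hook command without handing the string to a shell."""
    argv = shlex.split(command)  # tokenizes, but never interprets ; | && etc.
    return subprocess.run(
        argv, shell=False, capture_output=True, text=True, check=False
    )


if __name__ == "__main__":
    # The trailing "; echo injected" is passed to Python as plain arguments.
    cmd = f"{shlex.quote(sys.executable)} -c \"print('hi')\" ; echo injected"
    print(run_hook(cmd).stdout)
```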

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…y improvements

Harden process management (start_new_session, SIGTERM→SIGKILL escalation, close log handles),
add template security (path traversal prevention, SandboxedEnvironment for user templates),
fix local backend shared locks and Redlock cleanup, handle list-of-model secret redaction,
wire MadsciCLIConfig to all CLI commands and TUI screens, and clean up dead code and
console access patterns. All 2427 tests pass.
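
A SIGTERM to SIGKILL escalation typically looks like the sketch below. The `stop_process` name and the grace period are assumptions, and the code works on a `Popen` handle rather than the PID files the commit describes.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask `proc` to exit, then force-kill it if it does not comply in time."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        return proc.wait()


if __name__ == "__main__":
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,  # detach from the parent's session (POSIX only)
    )
    print("exit code:", stop_process(child))
```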

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ovements from PR review

Security fixes:
- Add path traversal validation in uninstall_template (registry.py)
- Add source path traversal check in template engine render (engine.py)
- Use model_dump_safe() instead of model_dump() for startup settings logging (manager_base.py)
- Add secret annotations to auth_password and auth_token fields (interface_types.py)
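
Path traversal checks of the kind listed above usually resolve both paths and verify containment. The following is a minimal sketch with a hypothetical `safe_join` helper, assuming Python 3.9+ for `Path.is_relative_to`; it is not the code in registry.py or engine.py.

```python
from pathlib import Path


def safe_join(base_dir: Path, user_path: str) -> Path:
    """Join `user_path` onto `base_dir`, rejecting paths that escape it."""
    base = base_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {user_path!r}")
    return candidate
```

Absolute inputs are rejected as well, because joining an absolute path onto `base` discards `base` entirely.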

Robustness fixes:
- Store process metadata in PID files and verify identity before signaling (start.py)
- Wrap Popen in try/except to prevent file handle leak on error (start.py)
- Always double-quote .env values and escape special characters (converter.py)
- Fix double datetime.now() call in _mark_deprecated (converter.py)
- Use .pop() for thread-safe heartbeat cleanup (lock_manager.py)
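
For the .env quoting fix, a sketch of always-quoted output might look like this. The exact set of escaped characters is an assumption (backslash, double quote, newline); .env parsers differ, so the real converter may escape more or fewer characters.

```python
def format_env_line(key: str, value: str) -> str:
    """Render KEY="value" with the value always double-quoted and escaped."""
    escaped = (
        value.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'{key}="{escaped}"'
```

Escaping backslashes before quotes matters: doing it in the other order would double-escape the backslashes that were just added.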

Consistency improvements:
- Replace swallowed exception with debug logging in config export (config.py)
- Warn on stderr when TOML config file cannot be parsed (utils/config.py)
- Catch invalid regex in --grep with user-friendly error (logs.py)
- Fix config save serialization to only stringify Path/AnyUrl types (utils/config.py)
- Use ctx.get_parameter_source() for lab URL override detection (cli/__init__.py)
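
Catching an invalid `--grep` pattern amounts to wrapping `re.compile`. In this sketch the helper name and error message are hypothetical, and a plain `ValueError` stands in for whatever user-facing error the CLI raises.

```python
import re


def compile_grep_pattern(pattern: str) -> re.Pattern:
    """Compile a user-supplied regex, converting failures to a clear error."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"Invalid --grep pattern {pattern!r}: {exc}") from exc
```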

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…packaging

Migration tests used exact file counts that differed between local (where a
gitignored default.node.yaml exists) and CI. Changed to subset checks with
minimum count assertions. Module template pyproject.toml used packages.find
which couldn't discover loose .py modules in src/; replaced with explicit
package-dir and py-modules configuration across all 6 module templates.
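
The switch from exact counts to subset checks can be sketched as a small assertion helper. The function name and file names are hypothetical; the idea is that extra, environment-specific files no longer break the test.

```python
def assert_found_expected(found: set[str], required: set[str], minimum: int) -> None:
    """Check that all required files were found, tolerating extra ones."""
    missing = required - found
    assert not missing, f"missing expected files: {sorted(missing)}"
    assert len(found) >= minimum, f"expected at least {minimum}, found {len(found)}"
```

For example, a local run that also picks up a gitignored `default.node.yaml` still passes, because only the required subset and a lower bound are asserted.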

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove src/__init__.py from module templates (inconsistent with py-modules
setuptools config). Use PYTHONPATH in tutorial test to ensure module imports
work regardless of pip install location (PDM+uv venvs may not include pip).
Print subprocess stdout/stderr on step failure for CI debuggability. Update
CHANGELOG test count to 2427+.
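
Making a generated module importable through `PYTHONPATH` instead of installing it can be sketched as follows. The helper name and the example module are hypothetical, not the tutorial test's actual code.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_pythonpath(src_dir: Path, code: str) -> subprocess.CompletedProcess:
    """Run `code` in a child interpreter with `src_dir` on its import path."""
    env = os.environ.copy()
    existing = env.get("PYTHONPATH")
    # Prepend src_dir while keeping any PYTHONPATH the caller already set.
    env["PYTHONPATH"] = (
        os.pathsep.join([str(src_dir), existing]) if existing else str(src_dir)
    )
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=False,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "example_mod.py").write_text("VALUE = 42\n")
        print(run_with_pythonpath(Path(tmp), "import example_mod").returncode)
```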

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CI venv (PDM with uv backend) does not include pip, so
`python -m pip install -e` fails. The PYTHONPATH approach from
the previous commit already handles imports correctly, so the
pip install step is no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

Coverage report

Warning

The diff for this PR is too large to be retrieved from GitHub's API (maximum 300 files). Diff coverage is not available for this PR.

This PR does not seem to contain any modification to coverable code.
