Skip to content

arnavbathla/lunar-mining-ai-agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Lunar MineOps AI OS

Mission Readiness Agent for Lunar ISRU. Validate whether an autonomous lunar excavation-to-processing concept closes operationally before launch.

Simulation only. Not flight critical. No hardware control.

This README explains what problem the product addresses, how it solves it (product + engineering), how the Claude AI agent works step by step, and includes screenshots of the running UI. Technical architecture, environment variables, and API tables follow below.


Screenshots

Landing page & system status

Backend connectivity, demo mission seed, and whether Anthropic Claude is configured (ANTHROPIC_API_KEY in apps/api/.env). Deterministic features work without a key; agent endpoints require one.

Landing page with system status panel

Mission dashboard — brief & agent entry point

The Shackleton Ridge ISRU Demo brief (targets, duration, slope limit, battery margin). Run Mission Readiness Analysis kicks off the server-side Claude tool-use loop when the API key is present; otherwise you still get seeded site, plan, and simulation via REST.

Mission dashboard — mission brief

Readiness verdict & executive recommendation

Five-dimension readiness cards (Site Mineability, Production Target, Power Budget, Autonomy Risk, Mission Readiness) plus the executive summary strip. With ANTHROPIC_API_KEY, Claude fills Executive Recommendation, source grounding, next actions, and tool-call trace. Without the key, deterministic scoring still populates verdicts after a plan + simulation exist.

Readiness verdict and executive recommendation

Lunar site map (mineability layer)

Interactive 30×30 polar grid with layer toggles (mineability, resource, hazard, illumination, slope, comms). Useful for visually grounding dig zones, placements, and hazard corridors alongside the numeric simulation.

Lunar site map — mineability layer


The problem

Autonomous lunar ISRU missions combine terrain, power, mobility, processing, autonomy, and failure modes (dust, thermal, comms, battery). Teams often decide on architectures using slides or partial analyses, then discover integration issues after hardware commitment.

The core question this MVP attacks:

Before spending millions on hardware, can we show evidence that a specific excavation-to-processing concept closes operationally under stated constraints—using traceable inputs (public NASA/PDS context), reproducible terrain and simulation, and reviewable outputs (verdicts, telemetry, anomalies, approvals, reports)?


The solution (product)

Lunar MineOps AI OS is a single closed-loop product:

  1. Ingest & cache public NASA/PDS pages as structured, timestamped context.
  2. Seed a demo mission on a deterministic synthetic lunar polar site.
  3. Plan a balanced excavation/processing schedule and draft autonomy artifacts (behavior tree, ROS-shaped messages, state machine, runbook).
  4. Simulate 168 hours hour-by-hour with power, production, battery, and deterministic anomalies.
  5. Score readiness across five dimensions with an overall Go / Conditional Go / No-Go style outcome.
  6. Use Claude with tools (server-side only) to run readiness analysis, anomaly response, mission report, and optional source refresh—only invoking backend functions so numbers stay honest.
  7. Support human approvals when an anomaly path requires operator sign-off.
  8. Export a markdown mission readiness report.

The UI ties this together in one mission dashboard; the API supports the same story without the frontend.


How the AI agent works (step by step)

A. What you do in the UI

  1. Open the app — health panel shows backend online and Claude configured or missing key.
  2. Launch Demo Mission — seeds mission, 30×30 site, assets, and sources.
  3. Run Mission Readiness Analysis (requires API key) — Claude follows the scripted tool order in app/agents/prompts.py: load source + mission context, score mineability, build plan, generate autonomy artifacts, run the deterministic simulator, pull simulation details, then return structured JSON (verdict narrative, grounding, next actions).
  4. Review readiness cards, source evidence, map, plan & risks, simulation charts, anomalies, approvals, report.
  5. Ask Claude for Response on an anomaly — Claude reads telemetry + uses recommend_anomaly_response; may create an approval when required.
  6. Generate Mission Readiness Report — Claude orchestrates context + generate_mission_report tool; download .md when available.

B. What happens on the server

  1. FastAPI receives /agent/* requests only if ANTHROPIC_API_KEY is set (otherwise HTTP 400).
  2. agent_loop.run_agent_loop calls the Anthropic Messages API with CORE_SYSTEM_PROMPT, the task-specific user prompt, and tool definitions from tool_registry.py.
  3. Each turn: if Claude emits tool_use, the backend runs execute_tool → real services (planning_service, simulation.engine, readiness_service, report_service, etc.). Results are returned as tool_result blocks (max 8 iterations).
  4. The final assistant message is parsed for JSON (typically inside a Markdown fenced code block) when required; responses are persisted as AgentRun rows for audit.

Security model: the browser talks only to FastAPI. No API key is exposed to Next.js or any NEXT_PUBLIC_* variable.


What this is (data-flow snapshot)

Lunar MineOps AI OS is a narrow-scope MVP that closes one end-to-end loop:

NASA / PDS public source context  →  seeded lunar ISRU mission
        →  synthetic 30×30 polar site  →  mineability scoring
        →  Claude-guided balanced plan  →  deterministic 168 h simulation
        →  five-dimension readiness verdict (Go / Conditional Go / No-Go)
        →  Claude anomaly response  →  human approvals
        →  Claude-generated mission readiness report (markdown download)

The Claude agent runs server-side only. The frontend never sees the Anthropic API key.

Scope

In scope (built):

  • One seeded mission ("Shackleton Ridge ISRU Demo")
  • One deterministic synthetic lunar polar site (30×30, seedable)
  • One balanced plan strategy
  • One deterministic hourly simulation engine (168 h)
  • One readiness scoring policy (5 dimensions)
  • One Anthropic Claude tool-use agent loop with 13 real tools
  • One markdown report
  • One mission dashboard UX

Out of scope (intentionally not built): user auth, teams, billing, multi-user collaboration, scenario lab, custom asset editor, raw DEM ingestion, real rover integration, flight software, hardware control.

Architecture

┌────────────────────────────────┐         ┌──────────────────────────┐
│  Next.js dashboard             │         │  FastAPI backend         │
│  (apps/web, port 3000)         │ ──REST─►│  (apps/api, port 8000)   │
│                                │         │                          │
│  - Landing page                │         │  - SQLite persistence    │
│  - Mission dashboard           │         │  - Source ingestion      │
│  - Recharts                    │         │  - Site generator        │
│                                │         │  - Mineability scoring   │
│  No Anthropic calls here.      │         │  - Balanced planner      │
│  Only NEXT_PUBLIC_API_BASE_URL │         │  - Simulation engine     │
└────────────────────────────────┘         │  - Anomaly + readiness   │
                                           │  - Markdown report       │
                                           │                          │
                                           │   ┌──────────────────┐   │
                                           │   │ Claude tool-use  │   │
                                           │   │ agent loop       │   │
                                           │   │ (13 tools)       │   │
                                           │   └────────┬─────────┘   │
                                           └────────────┼─────────────┘
                                                        │
                                                        ▼
                                              ┌──────────────────┐
                                              │ Anthropic API    │
                                              │ (server-side)    │
                                              └──────────────────┘
                                                ▲
                                                │ httpx + bs4
                                              ┌─┴─────────────┐
                                              │ NASA / PDS    │
                                              │ public pages  │
                                              └───────────────┘

How source ingestion works

The backend fetches six official public pages with httpx, strips nav/footer/script/style chrome with BeautifulSoup, classifies sentences into nine categories (moon_to_mars_architecture, subarchitecture, isru, topography_data, power, mobility, autonomy, logistics, infrastructure), and persists each as a SourceDocument row with fetched_at, content_hash, is_fallback. If a fetch fails the backend inserts a fallback snapshot and clearly labels it. The frontend shows freshness and the fallback flag.

Sources used:

How the Claude tool-use agent loop works

apps/api/app/agents/agent_loop.py implements the iterative tool-use loop:

  1. Build messages: system prompt + user prompt + tools.
  2. Call client.messages.create(...) (real Anthropic SDK in production, a scriptable fake client in tests).
  3. If Claude returns tool_use blocks, validate the tool name and input, execute the corresponding backend tool, and append the result as a tool_result block.
  4. Stop when Claude returns final text. Hard cap of 8 iterations.
  5. Persist an AgentRun row with input prompt, tool calls, and final response.

Tools exposed (apps/api/app/agents/tool_registry.py):

refresh_public_sources            generate_balanced_mission_plan
get_source_context                run_mission_simulation
get_mission_context               get_simulation_details
generate_synthetic_lunar_site     recommend_anomaly_response
score_site_mineability            create_approval
seed_default_assets               generate_autonomy_artifacts
                                  generate_mission_report

Each tool maps to real backend logic (no static stubs).

Required environment variables

.env.example:

ANTHROPIC_API_KEY=
ANTHROPIC_MODEL=claude-sonnet-4-6
DATABASE_URL=sqlite:///./lunar_mineops.db
CORS_ORIGINS=http://localhost:3000
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000
SOURCE_REFRESH_MODE=live
  • ANTHROPIC_API_KEY is required only for /agent/* endpoints. The backend boots and serves all deterministic routes without it.
  • NEXT_PUBLIC_API_BASE_URL is the only env var the frontend reads. It is safe to expose. Do not add ANTHROPIC_API_KEY to the web container or any NEXT_PUBLIC_* variable.

Local development

Backend

cd apps/api
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp ../../.env.example .env       # add ANTHROPIC_API_KEY here
uvicorn main:app --reload --port 8000

Health check: curl localhost:8000/health{"status":"ok","anthropic_configured":true|false}.

Frontend

cd apps/web
npm install
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000 npm run dev

Open http://localhost:3000.

Docker

ANTHROPIC_API_KEY=<your-api-key> docker compose up --build

Brings up api (port 8000, SQLite volume api_data) and web (port 3000).

Tests

cd apps/api
. .venv/bin/activate
pytest -q

Tests cover source ingestion (with fallback + stub fetcher), site generation, mineability scoring, asset seeding, planning, simulation, readiness scoring, anomaly cadence, agent tools, agent loop (using the fake Anthropic client), and report markdown shape.

Make targets

make install   # set up backend venv + frontend node_modules
make api       # run backend
make web       # run frontend
make test      # run pytest
make seed      # POST /demo/seed via curl

Demo walkthrough

  1. Add ANTHROPIC_API_KEY to apps/api/.env. Set ANTHROPIC_MODEL=claude-sonnet-4-6.
  2. Run the backend: cd apps/api && . .venv/bin/activate && uvicorn main:app --reload --port 8000.
  3. Run the frontend: cd apps/web && npm run dev.
  4. Open http://localhost:3000.
  5. Click Launch Demo Mission.
  6. The dashboard loads with mission, site, assets, and source context (live or fallback).
  7. Click Run Mission Readiness Analysis. Claude will call the source + mission tools, generate the plan, run the simulation, and return the readiness verdict.
  8. Review the five readiness verdict cards (Site, Production, Power, Autonomy, Mission).
  9. Inspect source context, refresh sources if needed.
  10. Inspect the lunar map: toggle layers (mineability, resource, hazard, illumination, slope, comms), click cells.
  11. Inspect the plan timeline, risk register, and autonomy artifacts.
  12. Review production / battery margin / power Recharts.
  13. Click Ask Claude for Response on any anomaly.
  14. Approve or reject any pending approvals (operator note optional).
  15. Click Generate Mission Readiness Report, then Download .md.

API endpoint overview

Path Description
GET /health Health + anthropic_configured flag
GET /sources List source documents
POST /sources/refresh Refresh source documents
GET /sources/context Source-grounded facts and freshness
POST /demo/seed Idempotent demo mission/site/assets seed
POST /demo/reset Wipe all mission-scoped state
GET /demo/default-dashboard Snapshot of latest mission state
GET /missions/{id} Mission bundle (mission, site, assets, latest plan/sim, anomalies)
GET /missions/{id}/plans Plans for mission
GET /plans/{plan_id} Plan detail
GET /simulation-runs/{id} Simulation summary
GET /simulation-runs/{id}/telemetry Telemetry + curves
GET /simulation-runs/{id}/anomalies Anomaly list
GET /simulation-runs/{id}/approvals Approvals for simulation
POST /approvals/{id}/approve Approve with optional operator note
POST /approvals/{id}/reject Reject with optional operator note
GET /simulation-runs/{id}/report.md Deterministic markdown report
GET /agent/status anthropic_configured, model, tools
POST /agent/run-readiness-analysis Full Claude workflow
POST /agent/anomaly-response Claude anomaly response + optional approval
POST /agent/report Claude-generated markdown report
POST /agent/refresh-sources Claude-driven source refresh

Known limitations

  • Synthetic terrain. The MVP does not ingest raw LOLA DEM products.
  • Public-source context only. No proprietary mission data.
  • Simulation only. No hardware control. Autonomy artifacts are draft.
  • Single seeded mission with one balanced plan strategy.
  • Optional operator chat is intentionally not implemented to keep scope narrow.
  • Live source fetches depend on the network. Fallback snapshots are used on failure.

Production hardening roadmap

  • Replace SQLite with Postgres; add Alembic migrations.
  • Add user auth and audit logs for approvals.
  • Replace synthetic terrain with LOLA / NAC pipelines.
  • Replace JSON snapshots with proper telemetry storage (TimescaleDB or DuckDB).
  • Container hardening: rootless, non-root user, read-only FS for web.
  • Add rate limiting and observability (OpenTelemetry).
  • Replace the fake Anthropic test client with VCR-style recording for integration tests.
  • Add Playwright end-to-end tests for the dashboard.
  • Replace the deterministic anomaly cadence with a calibrated risk model.
  • Add multi-mission and multi-plan workflows.
  • Sign and store agent runs alongside cryptographic provenance.

Security notes

  • The Anthropic API key is read only from apps/api/.env and never forwarded to the web container.
  • The frontend reads only NEXT_PUBLIC_API_BASE_URL.
  • All Claude calls happen inside FastAPI; frontend talks to FastAPI only.
  • No flight-critical commands are emitted. All artifacts are draft.

About

A demo version of an AI Agent harness for Lunar Mining Location Readiness

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors