Hackathon Demo: AI-Powered Incident Response System

Overview

This demo walks through a complete incident response workflow built on Motia's event-driven architecture. It has three steps:

Ingest (API Step) - Receives incident alerts
Analyze (Event Step) - AI-powered analysis and decision making
Remediate (Event Step) - Durable workflow with idempotency

Architecture

HTTP POST → 1-ingest.step.ts → [incident.detected] → 2-analyze.step.ts → [fix.approved] → 3-remediate.step.ts
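
To make the chaining concrete, here is a minimal sketch of the same flow using a tiny in-memory event bus. It is not Motia's API: `EventBus`, `runDemo`, and the handler bodies are hypothetical stand-ins, and only the topic names (`incident.detected`, `fix.approved`) come from this demo.

```typescript
// Hypothetical in-memory stand-in for the event chain; not Motia's API.
type Handler = (data: unknown) => void;

class EventBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(topic: string, handler: Handler): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }

  emit(topic: string, data: unknown): void {
    for (const handler of this.handlers.get(topic) ?? []) {
      handler(data);
    }
  }
}

// Records the order in which the steps run so the chain can be inspected.
function runDemo(): string[] {
  const trace: string[] = [];
  const bus = new EventBus();

  // 2-analyze: subscribes to incident.detected, emits fix.approved.
  bus.subscribe("incident.detected", (data) => {
    trace.push("analyze");
    bus.emit("fix.approved", data);
  });

  // 3-remediate: subscribes to fix.approved.
  bus.subscribe("fix.approved", () => {
    trace.push("remediate");
  });

  // 1-ingest: the HTTP handler would emit incident.detected.
  trace.push("ingest");
  bus.emit("incident.detected", { serviceName: "payment-service" });
  return trace;
}

console.log(runDemo().join(" → ")); // ingest → analyze → remediate
```

The steps never call each other directly. Each one only knows the topic it listens to and the topic it emits, which is why matching topic names matter.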

Files Created

steps/1-ingest.step.ts - API endpoint for incident ingestion
steps/2-analyze.step.ts - AI analysis engine
steps/3-remediate.step.ts - Durable remediation workflow

How to Run

1. Start the Development Server

npm run dev

This will start:

The Motia backend server
The Workbench UI (visual workflow designer)

2. Test the Workflow

Send an Incident Alert

curl -X POST http://localhost:3000/incidents \
  -H "Content-Type: application/json" \
  -d '{
    "serviceName": "payment-service",
    "severity": "critical",
    "message": "High memory usage detected - 95% utilization"
  }'

Expected response:

{
  "status": "accepted",
  "incidentId": "incident-1234567890-abc123"
}

3. Watch the Logs

You'll see the workflow progress through the logs:

Ingest Step: Incident received and emitted
Analyze Step (after ~2 seconds): AI analysis complete, fix approved
Remediate Step (after ~10 seconds): Remediation completed

4. Test Durability (The Cool Part!)

To demonstrate durability and idempotency:

1. Send an incident alert (as above)
2. Watch the logs - you'll see "⚠️ DURABILITY TEST: Kill the server now to test recovery! ⚠️"
3. Kill the server (Ctrl+C) during the 10-second wait
4. Restart the server with npm run dev
5. The remediation step will automatically resume from where it left off!

After the restart, the step reads its saved state and logs: "Resuming remediation after server restart..."

Key Features Demonstrated

1. Event-Driven Architecture

API Step emits incident.detected event
Analyze Step subscribes to incident.detected, emits fix.approved
Remediate Step subscribes to fix.approved

2. Type Safety

All steps use Zod schemas for validation
TypeScript types auto-generated in types.d.ts
Full type inference across the workflow
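
As an illustration of what the validation buys you, the sketch below checks an incoming alert before it enters the workflow. The demo itself uses Zod; this example swaps in a hand-written type guard so it runs without dependencies. The field names come from the curl examples in this README, while `IncidentAlert` and `isIncidentAlert` are hypothetical names.

```typescript
// Hand-written stand-in for a Zod schema (the real steps use Zod).
type Severity = "critical" | "warning" | "info";

interface IncidentAlert {
  serviceName: string;
  severity: Severity;
  message: string;
}

const SEVERITIES: readonly string[] = ["critical", "warning", "info"];

// Type guard: narrows unknown JSON to IncidentAlert when every field checks out.
function isIncidentAlert(value: unknown): value is IncidentAlert {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.serviceName === "string" &&
    typeof v.message === "string" &&
    typeof v.severity === "string" &&
    SEVERITIES.indexOf(v.severity) !== -1
  );
}

const body: unknown = JSON.parse(
  '{"serviceName":"payment-service","severity":"critical","message":"High memory usage detected - 95% utilization"}'
);

if (isIncidentAlert(body)) {
  // Inside this branch, body.severity is typed as Severity.
  console.log(`accepted ${body.severity} alert for ${body.serviceName}`);
}
```

A schema library such as Zod gives the same narrowing plus generated types and error messages, which is presumably why the demo uses it.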

3. Durability & Idempotency

State management tracks remediation progress
Work is not lost when the server crashes
Steps can resume from checkpoints

4. AI Simulation

2-second delay simulates AI processing
Decision logic based on severity:
- critical → restart pod
- warning → scale resources
- info → monitor only
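
The decision table above can be sketched as a small function. This is an assumption about how the analyze step might be written, not a copy of it: `decideAction`, `analyze`, and the action labels are hypothetical names.

```typescript
type Severity = "critical" | "warning" | "info";
// Hypothetical action labels mirroring the list above.
type Action = "restart-pod" | "scale-resources" | "monitor-only";

// Pure mapping from severity to action; easy to unit test on its own.
function decideAction(severity: Severity): Action {
  switch (severity) {
    case "critical":
      return "restart-pod";
    case "warning":
      return "scale-resources";
    case "info":
      return "monitor-only";
  }
}

// Simulates the "AI" latency with a delay (2 seconds by default).
async function analyze(severity: Severity, delayMs = 2000): Promise<Action> {
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  return decideAction(severity);
}

console.log(decideAction("warning")); // scale-resources
```

Keeping the mapping separate from the delay means a real LLM call could later replace `analyze` without touching the tests for `decideAction`.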

Testing Different Scenarios

Critical Incident (triggers restart)

curl -X POST http://localhost:3000/incidents \
  -H "Content-Type: application/json" \
  -d '{
    "serviceName": "auth-service",
    "severity": "critical",
    "message": "Service unresponsive"
  }'

Warning (triggers scaling)

curl -X POST http://localhost:3000/incidents \
  -H "Content-Type: application/json" \
  -d '{
    "serviceName": "api-gateway",
    "severity": "warning",
    "message": "High latency detected"
  }'

Info (monitoring only, no remediation)

curl -X POST http://localhost:3000/incidents \
  -H "Content-Type: application/json" \
  -d '{
    "serviceName": "cache-service",
    "severity": "info",
    "message": "Cache hit rate below threshold"
  }'

Viewing in Workbench

1. Open the Workbench UI (URL shown in terminal after npm run dev)
2. Navigate to the workflow visualization
3. See the three steps connected by events
4. Watch real-time execution as incidents flow through

State Management

The remediation step uses Motia's state management:

Group ID: remediation-status
Key: fix-{serviceName}
Values: rebooting → healthy

Check state in logs or via Workbench state inspector.
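
The sketch below shows one way the checkpoint logic could look, under stated assumptions. The group ID, key format, and status values come from this section, and the resume log line is the one quoted earlier. Everything else (`InMemoryState`, `planRemediation`, `remediate`) is hypothetical. The in-memory `Map` does not survive a restart, so a real run needs Motia's persisted state in its place.

```typescript
// Hypothetical sketch: an in-memory Map stands in for persisted state.
type RemediationStatus = "rebooting" | "healthy";
type Plan = "start" | "resume" | "skip";

const GROUP_ID = "remediation-status";
const stateKey = (serviceName: string): string => `fix-${serviceName}`;

// Decides what to do from the saved status, so a re-delivered event is harmless.
function planRemediation(saved: RemediationStatus | undefined): Plan {
  if (saved === "healthy") return "skip";
  if (saved === "rebooting") return "resume";
  return "start";
}

class InMemoryState {
  private data = new Map<string, RemediationStatus>();

  get(groupId: string, key: string): RemediationStatus | undefined {
    return this.data.get(`${groupId}/${key}`);
  }

  set(groupId: string, key: string, value: RemediationStatus): void {
    this.data.set(`${groupId}/${key}`, value);
  }
}

async function remediate(
  state: InMemoryState,
  serviceName: string,
  waitMs = 10000
): Promise<void> {
  const key = stateKey(serviceName);
  const plan = planRemediation(state.get(GROUP_ID, key));

  if (plan === "skip") return; // already healthy: nothing to do
  if (plan === "resume") {
    console.log("Resuming remediation after server restart...");
  } else {
    state.set(GROUP_ID, key, "rebooting"); // checkpoint before the slow part
  }

  // Stand-in for the 10-second remediation wait.
  await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
  state.set(GROUP_ID, key, "healthy");
}
```

Writing `rebooting` before the wait is what makes the kill-and-restart test meaningful: the restarted step finds the checkpoint and takes the resume branch instead of starting over.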

Next Steps for Hackathon

Potential enhancements:

Add real LLM integration (OpenAI, Anthropic)
Connect to actual Kubernetes API
Add streaming status updates to frontend
Implement rollback logic
Add approval workflow before remediation
Create dashboard for incident history

Troubleshooting

Types not found?

npm run generate-types

Server won't start?

Check if port 3000 is available
Ensure Redis is running (if using BullMQ)

Steps not executing?

Check logs for errors
Verify event topic names match between emits and subscribes
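
A quick way to spot a topic mismatch is to list every subscription that no step emits. The helper below is a hypothetical sketch: `StepTopics` is a simplified view of a step's config (just a name plus its `subscribes` and `emits` lists), and the sample data includes a deliberate typo.

```typescript
// Hypothetical minimal view of a step's config: its name and topic lists.
interface StepTopics {
  name: string;
  subscribes: string[];
  emits: string[];
}

// Returns "step: topic" entries for subscriptions that no step emits.
function findUnmatchedSubscriptions(steps: StepTopics[]): string[] {
  const emitted = new Set<string>();
  for (const step of steps) {
    for (const topic of step.emits) emitted.add(topic);
  }

  const unmatched: string[] = [];
  for (const step of steps) {
    for (const topic of step.subscribes) {
      if (!emitted.has(topic)) unmatched.push(`${step.name}: ${topic}`);
    }
  }
  return unmatched;
}

const steps: StepTopics[] = [
  { name: "ingest", subscribes: [], emits: ["incident.detected"] },
  { name: "analyze", subscribes: ["incident.detected"], emits: ["fix.approved"] },
  // Deliberate typo to show what a mismatch looks like.
  { name: "remediate", subscribes: ["fix.aproved"], emits: [] },
];

console.log(findUnmatchedSubscriptions(steps)); // [ 'remediate: fix.aproved' ]
```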
