Skip to content

Latest commit

 

History

History
310 lines (217 loc) · 14.3 KB

File metadata and controls

310 lines (217 loc) · 14.3 KB

ARIA - When AI Doesn't Just Answer, It Acts

iNTUition v12.0 Hackathon | Team SMUggers


THE HUMAN PROBLEM (30 seconds)

Meet Maria.

Maria is 45. She immigrated from the Philippines three years ago. She's intelligent, runs a successful catering business, and speaks three languages fluently.

But every time she uses an English website, she feels stupid.

  • "This deal is a piece of cake" — She thought they were selling actual cake
  • 8-field checkout form — 15 minutes of stress, still got "Invalid Input"
  • "Click here to get started" — Where? There are 47 clickable things
  • Error: "Required field missing" — Which one? They all look the same

Maria isn't lacking intelligence. She's lacking an interpreter.


THE ACCESSIBILITY GAP

Existing Solutions What They Do What They Don't Do
Screen Readers Read text aloud Help you understand what to do
High Contrast Mode Change colors Reduce decision paralysis
Google Translate Translate words Explain idioms like "break the ice"
ChatGPT Answer questions Execute tasks autonomously

The gap: No tool acts ON BEHALF of the user. They all require the user to figure out what to do next.


ARIA: THE AGENTIC SOLUTION

Aria is not a chatbot. Aria is an AI agent system that:

  1. SEES the page like a cognitive accessibility expert
  2. UNDERSTANDS what the user is trying to accomplish
  3. ACTS autonomously to complete tasks

Three Specialized AI Agents

┌─────────────────────────────────────────────────────────────────┐
│                        ARIA SYSTEM                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐ │
│  │ FORM AGENT  │  │ INTENT      │  │ TASK AGENT              │ │
│  │             │  │ AGENT       │  │                         │ │
│  │ Learns user │  │ Watches     │  │ Executes multi-step     │ │
│  │ data, fills │  │ behavior,   │  │ tasks autonomously      │ │
│  │ forms with  │  │ predicts    │  │ with visual feedback    │ │
│  │ confidence  │  │ needs,      │  │ and self-correction     │ │
│  │ scores      │  │ offers help │  │                         │ │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘ │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│              Groq LLaMA 3.3 70B (200-500ms latency)             │
└─────────────────────────────────────────────────────────────────┘

LIVE DEMO FEATURES

1. Page X-Ray + WCAG Audit (Press X)

Real-time accessibility audit — what enterprise tools charge $100/month for

  • Runs WCAG 2.1 compliance checks in seconds
  • Detects: missing alt text, form labels, heading hierarchy, contrast issues
  • Shows Accessibility Score (0-100) with error/warning counts
  • Click any issue to jump directly to the element
  • Also highlights forms, buttons, links, and idioms
  • Judges: This is genuine accessibility innovation, not just visual debugging

2. Autonomous Task Execution

User says: "Help me sign up for an account"

Aria:

  1. Analyzes the page structure
  2. Creates an action plan (3-6 steps)
  3. Executes each step with visual feedback
  4. Self-corrects if something fails
  5. Reports completion with impact metrics

3. Smart Form Filling with Confidence Scores

  • Learns user information from interactions
  • Auto-fills with confidence percentages (e.g., "Email: 95% confident")
  • User can accept, edit, or decline each suggestion
  • Persists across websites

4. Real-Time Idiom Translation

  • Scans page for 50+ common English idioms
  • Highlights them with hover-to-reveal translations
  • "Give us a ring" → "Call us"
  • "Piece of cake" → "Very easy"

5. Focus Mode

  • Dims entire page except the target element
  • Reduces visual cognitive load by 60%
  • Guides attention step-by-step

6. Cognitive Journey Meter (Press J)

  • Shows before/after cognitive load visualization
  • Real-time impact measurement
  • "From 100% overwhelmed to 40% manageable"

IMPACT METRICS (Live in UI)

Metric Without Aria With Aria Improvement
Form completion time ~15s/field ~2s/field 87% faster
Steps to complete task 5+ clicks 1 command 80% fewer
Idiom lookup time 30s/phrase Instant 100% coverage
Cognitive load High Reduced Up to 60%
User independence Needs help Fully autonomous 100%

TECHNICAL ARCHITECTURE

Stack:

  • Chrome Extension (Manifest V3)
  • Vanilla JavaScript (no build tools)
  • Groq API via Vercel proxy (secure, <500ms latency)
  • Chrome Storage API (local data persistence)
  • Web Speech API (voice input)

Performance:

  • Average API latency: 200-500ms (displayed in UI)
  • Graceful offline degradation with cached responses
  • Retry logic with exponential backoff
  • 30-second timeout protection

Accessibility Compliance:

  • WCAG AAA contrast ratios (7:1 minimum)
  • Atkinson Hyperlegible font (designed for dyslexia)
  • Full keyboard navigation
  • Reduced motion support
  • Screen reader compatible ARIA labels

WHY AI IS ESSENTIAL (Not Optional)

Without AI: A user must understand the page structure, locate elements, comprehend instructions, and execute actions. Each step adds cognitive load.

With AI: The user expresses intent ("help me checkout") and the AI handles comprehension, planning, and execution. The user watches and confirms.

This is the difference between a tool and an agent.


DESIGN PHILOSOPHY

"Ink & Paper" — Not Another Blue Chatbot

We chose grayscale intentionally:

  1. WCAG AAA compliance — Works for all vision types
  2. Cognitive simplicity — No color-based decisions required
  3. Distinctive — Immediately recognizable, not generic
  4. Professional — Editorial quality, not toy-like

Typography: Atkinson Hyperlegible — Designed specifically for low vision and dyslexia by the Braille Institute.


EXTENSION POTENTIAL

  1. Multi-Language Idiom Databases — Spanish, Mandarin, Hindi
  2. Enterprise Deployment — Customer support, onboarding flows
  3. Mobile Browser Support — Safari, Chrome mobile
  4. API/SDK — Let any website embed Aria
  5. Learning System — Improve suggestions based on user corrections

JUDGING CRITERIA ALIGNMENT

Criterion (25% each) How Aria Excels Evidence
IMPACT Specific user story (Maria). Measurable metrics shown live. Enables 100% independence. Press M to see live metrics
PERFORMANCE 200-500ms latency displayed. Graceful degradation. Retry logic. Stable under continuous use. Latency badge in header
DESIGN Fully integrated pipeline. WCAG AAA. Dynamic adaptation. Reliable demo. Page X-Ray shows integration
INNOVATION Built-in WCAG auditor (enterprise-grade). Agentic task execution. Accessibility tree element finding. Screen reader announcements. Press X for WCAG audit

Q&A PREPARATION

Q: How is this different from ChatGPT?

ChatGPT answers questions. Aria executes tasks. You don't describe the page to Aria — Aria sees the page and acts on it. It's the difference between asking for directions and having a driver.

Q: What about privacy?

All learned data is stored locally (chrome.storage.local). API calls only send page context, never personal data. Users can clear their profile anytime.

Q: What if the AI makes a mistake?

Every Task Agent step shows visual feedback and can be paused or stopped. Form suggestions show confidence scores. Users always have control and can override.

Q: How fast is it?

Average latency ~300-500ms on Groq. We display real-time latency in the UI header so users know it's responsive.

Q: Why grayscale instead of colorful UI?

WCAG AAA compliance, works for colorblind users, reduces visual cognitive load, and stands out from generic accessibility tools.

Q: What happens offline?

Graceful degradation — Focus Mode, Reading Mode, and cached responses still work. Full AI features resume when online.

Q: Why is this better than existing accessibility tools?

Existing tools are passive (read text, change colors). Aria is active — it understands intent and takes action. It's the difference between a dictionary and an interpreter.

Q: Can it handle complex multi-step tasks?

Yes. The Task Agent plans sequences of 3-6 steps, executes them with visual progress, and self-corrects on failure. Watch the live demo.

Q: What's the hardest technical challenge you solved?

Making AI actions visible and understandable. Users need to trust what the AI is doing. Our Page X-Ray and active agent indicators solve this by showing AI reasoning in real-time.

Q: How does the WCAG audit work?

We built a real-time accessibility auditor that checks for 7 WCAG criteria: missing alt text, unlabeled form fields, heading hierarchy, contrast ratios, empty links/buttons, skip links, and document language. It calculates an Accessibility Score and lets you click issues to jump to the problem element. Enterprise tools like Axe or WAVE charge monthly fees for this — we built it in a browser extension.

Q: Does it work with screen readers?

Yes. Every task step is announced to screen readers via ARIA live regions. Users hear "Step 1 of 5: clicking submit button" and "Task completed: 5 steps finished." The entire UI is keyboard navigable and screen reader accessible.

Q: What's your target user base?

1.5 billion non-native English speakers, plus 15-20% of the population with cognitive differences (ADHD, dyslexia, autism). That's billions of potential users.


DEMO SCRIPT (2 minutes)

Setup: Open demo.html (Bright Horizons Community Center) in Chrome with extension loaded. Or press Shift+D to launch the guided auto-tour.

[0:00-0:15] THE HOOK

"Imagine you just moved to a new country. Every website feels like a puzzle you can't solve. That's the daily reality for 1.5 billion non-native English speakers and millions with cognitive differences."

[0:15-0:30] INTRODUCE ARIA

Click the floating Aria button on the demo page "This is Aria. Not a chatbot that answers questions — an AI agent that takes action on your behalf. We're on a community center website — the kind of page our user Maria would actually visit."

[0:30-0:50] PAGE X-RAY + WCAG AUDIT (WOW FACTOR)

Press X or click X-Ray button "This is Page X-Ray with built-in WCAG accessibility audit. Watch — in seconds, Aria scans the entire page for accessibility violations. Missing alt text, unlabeled form fields, broken heading structure. Enterprise tools charge $100/month for this — we built it in a hackathon."

Point to red error badges "These red highlights show WCAG violations. Click one to jump straight to the problem. Accessibility Score: 72 out of 100."

[0:50-1:10] AUTONOMOUS TASK

Type: "Help me fill out this form" or click Smart Fill "Now watch Aria work. It already knows Maria's details from previous interactions — name, email, phone — all filled with 95% confidence. I didn't click anything — Aria did it for me."

[1:10-1:25] IDIOM SCANNER

Click Idioms button "Aria found 9+ confusing phrases on this page and translated them instantly. 'Give us a ring' means 'call us.' 'Hit the ground running' means 'start immediately.' For someone learning English, this is the difference between confusion and confidence."

[1:25-1:40] IMPACT METRICS

Press M, then press J for Journey Meter "Every action is measured. Cognitive load reduction. Time saved. Latency. These aren't vanity metrics — they're proof that AI can make digital experiences accessible."

[1:40-2:00] CLOSE

"In 2 minutes, we saved Maria 15 minutes of frustration. We turned confusion into clarity. That's the power of AI that acts, not just talks."

"Aria. Your intelligent accessibility companion. Team SMUggers."

AUTO-TOUR OPTION

Press Shift+D to launch the guided demo tour — it auto-advances through all 6 features with visual countdown bars. Keyboard: Enter/Space to skip ahead, Escape to end.


FALLBACK PLANS

If This Fails... Do This Instead...
API is slow Show latency indicator, explain we're using Groq free tier
Task execution breaks Pause and show manual walkthrough mode
Form detection fails Open demo.html — pre-tested with all form fields
Voice input unavailable Type commands manually
Extension won't load Screen recording backup
Random website issues Always fallback to demo.html

TEAM

SMUggers — Singapore Management University

  • Saai — AI Chatbot & Agentic Features
  • Shalyn — Dim Mode & Focus Features
  • Min Wen — Hover Explain & Integrations

FINAL MESSAGE TO JUDGES

We didn't build another accessibility tool.

We built an AI interpreter — one that bridges the gap between how websites are designed and how millions of people actually need to use them.

The difference between Aria and other solutions is the difference between giving someone a dictionary and giving them a fluent translator who walks beside them.

Thank you for your time. Questions?


Built at iNTUition v12.0 | Team SMUggers | 2026