Clausefill-AI – Implementation Roadmap

Status: MVP Complete - Production Ready
Live URL: https://clausefill-ai.vercel.app/
Future Improvements: See future-enhancements.md

This roadmap focuses on execution and is meant to be extended as the project evolves. The core conversational flow is deterministic (no AI required); AI integration is optional and was added later, in Phase 7.


Phase 0 – Project Setup

  • Initialize Next.js with TypeScript.
  • Add Tailwind CSS and basic layout components.
  • Wire environment variables for optional AI integrations (e.g. OPENAI_API_KEY).

Phase 1 – Upload & Parse Document

  • Build upload UI:
    • Drag-and-drop + click-to-upload for .docx.
    • File validation (type, size) and user-friendly error messages.
  • Implement /api/parse-document:
    • Accept .docx via FormData.
    • Convert to HTML/text using mammoth (or similar).
    • Return parsed content to the client.
  • Render a scrollable preview of the parsed document.
  • Add an optional "Use sample document" button for quick demos.
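
The file validation step above could be sketched as follows. This is a minimal illustration, not the project's actual code: `validateUpload`, `MAX_UPLOAD_BYTES`, and the error strings are hypothetical, and the 4 MB cap is borrowed from the "Large files (>4MB)" case listed in Phase 5.

```typescript
// Hypothetical limit; the 4 MB figure mirrors the large-file case in Phase 5.
const MAX_UPLOAD_BYTES = 4 * 1024 * 1024;

type UploadCheck = { ok: true } | { ok: false; error: string };

// Checks only the file name and size; it does not inspect the file contents.
function validateUpload(fileName: string, sizeBytes: number): UploadCheck {
  if (!/\.docx$/i.test(fileName)) {
    return { ok: false, error: "Please upload a .docx file." };
  }
  if (sizeBytes === 0) {
    return { ok: false, error: "The file is empty." };
  }
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    return { ok: false, error: "The file is larger than 4 MB." };
  }
  return { ok: true };
}
```

Running the same check on the client (before upload) and again in the API route would give fast feedback without trusting the browser; whether the project does both is not stated here.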

Phase 2 – Placeholder Detection & Highlighting

  • Implement regex-based detection of placeholder patterns, including:
    • Square-bracketed tokens like [Company Name], [Investor Name], [Date of Safe].
    • Bracketed blanks like $[_____________].
    • (Optional) curly-braced tokens like {company_name}.
  • Deduplicate placeholders and store them as a list of keys.
  • Highlight placeholders in the preview (e.g. colored spans with background color).
  • Show a sidebar or panel listing all detected placeholders and their fill status.
  • CRITICAL FOR REAL DOCS: Add support for additional common patterns:
    • Standalone underscores: _________ (3 or more underscores)
    • Empty brackets: [] or [ ]
    • Placeholder indicators: [TBD], [INSERT], [FILL IN]
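
As a rough sketch of the regex-based detection and deduplication described in this phase: the pattern and the `detectPlaceholders` name below are assumptions for illustration, and the project's real patterns may differ.

```typescript
// One combined pattern, tried left to right at each position:
//   1. anything in square brackets: [Company Name], [_____], [ ], [TBD]
//   2. curly-braced identifiers:    {company_name}
//   3. runs of 3+ underscores:      ________
// Brackets come first so "[_____]" is captured whole, not as bare underscores.
const PLACEHOLDER_PATTERN = /\[[^\[\]]*\]|\{[A-Za-z_][A-Za-z0-9_]*\}|_{3,}/g;

// Returns each distinct placeholder once, in order of first appearance.
function detectPlaceholders(text: string): string[] {
  const matches: string[] = text.match(PLACEHOLDER_PATTERN) || [];
  return matches.filter((m, i) => matches.indexOf(m) === i);
}

const sample =
  "This Safe is issued by [Company Name] to [Investor Name] for $[_____________]. " +
  "[Company Name] signs on ________ in {state}.";

console.log(detectPlaceholders(sample));
```

A pattern this broad also matches bracketed text that is not a placeholder, such as a footnote marker like [1], so a real implementation would likely need an exclusion rule or a user confirmation step.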

Phase 3 – Deterministic Conversational Flow (No AI)

Implement a scripted, state-driven chat experience that walks through placeholders one by one.

  • Define client-side state:
    • placeholders: string[]
    • answers: Record<string, string>
    • messages: { role: 'user' | 'assistant'; content: string }[]
    • currentPlaceholderIndex: number
  • Conversation logic:
    • After parsing, initialize placeholders and currentPlaceholderIndex = 0.
    • Append an assistant message introducing the flow.
    • For the current placeholder, generate a deterministic question, e.g.: "What is the Company Name?" from [Company Name].
    • On user answer:
      • Store value in answers[key].
      • Append user message to messages.
      • Re-render preview with updated values.
      • Increment currentPlaceholderIndex and ask the next question.
    • When all placeholders are filled, append a final assistant message and enable download.
  • UI polish:
    • Chat bubble styling (assistant vs user).
    • Basic typing indicator / loading affordance.
    • Allow jumping to a placeholder by clicking it in the sidebar and editing the answer.
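
The state shape and answer handling above can be sketched as pure functions. The field names come from the list above; `questionFor`, `startConversation`, `submitAnswer`, and the message wording are hypothetical.

```typescript
type Message = { role: "user" | "assistant"; content: string };

interface ChatState {
  placeholders: string[];
  answers: Record<string, string>;
  messages: Message[];
  currentPlaceholderIndex: number;
}

// "[Company Name]" -> "What is the Company Name?"; blanks get a generic question.
function questionFor(placeholder: string): string {
  const label = placeholder.replace(/^[\[{]|[\]}]$/g, "").replace(/_/g, " ").trim();
  return label ? `What is the ${label}?` : "What value should go in this blank?";
}

function startConversation(placeholders: string[]): ChatState {
  const messages: Message[] = [
    { role: "assistant", content: `I found ${placeholders.length} placeholder(s). Let's fill them in.` },
  ];
  if (placeholders.length > 0) {
    messages.push({ role: "assistant", content: questionFor(placeholders[0]) });
  }
  return { placeholders, answers: {}, messages, currentPlaceholderIndex: 0 };
}

// Stores the answer, records the user message, then asks the next question
// or appends a closing message when every placeholder has been visited.
function submitAnswer(state: ChatState, value: string): ChatState {
  const key = state.placeholders[state.currentPlaceholderIndex];
  if (key === undefined) return state; // nothing left to fill
  const nextIndex = state.currentPlaceholderIndex + 1;
  const userMessage: Message = { role: "user", content: value };
  const reply: Message = {
    role: "assistant",
    content:
      nextIndex < state.placeholders.length
        ? questionFor(state.placeholders[nextIndex])
        : "All placeholders are filled. You can download the document now.",
  };
  return {
    placeholders: state.placeholders,
    answers: Object.assign({}, state.answers, { [key]: value }),
    messages: state.messages.concat(userMessage, reply),
    currentPlaceholderIndex: nextIndex,
  };
}
```

Keeping these transitions free of side effects makes them easy to drop into a React `useState` or `useReducer` setup and to unit test without rendering anything.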

Phase 4 – Completed Document & Download

  • Implement /api/generate-doc:
    • Accept original template text and the answers map.
    • Replace placeholders with the corresponding values, leaving unfilled ones visibly marked.
    • CRITICAL FIX: Use docxtemplater + pizzip to preserve original formatting.
    • Store original file buffer in client state.
    • Generate a .docx and return as downloadable file.
  • Add a "Download" button on the UI that calls this endpoint.
  • Use filename convention: {original-name}-clausefill-ai-v1.docx.
  • TESTING: Verify with real legal documents that formatting is preserved.
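
To illustrate the replacement rule and the filename convention, here is a plain-text sketch. It deliberately ignores .docx structure (the roadmap handles that with docxtemplater + pizzip), and `fillTemplate` and `outputFileName` are hypothetical names.

```typescript
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replaces every answered placeholder in a single pass. Placeholders without an
// answer are matched too but returned unchanged, so they stay visibly marked
// and a short key (e.g. "___") can never eat part of a longer unfilled one.
function fillTemplate(
  template: string,
  placeholders: string[],
  answers: Record<string, string>
): string {
  if (placeholders.length === 0) return template;
  const longestFirst = placeholders.slice().sort((a, b) => b.length - a.length);
  const pattern = new RegExp(longestFirst.map(escapeRegExp).join("|"), "g");
  return template.replace(pattern, (match) => {
    const value = answers[match];
    return value ? value : match;
  });
}

// "safe.docx" -> "safe-clausefill-ai-v1.docx"
function outputFileName(originalName: string): string {
  return originalName.replace(/\.docx$/i, "") + "-clausefill-ai-v1.docx";
}
```

A single pass with a replacer function avoids two pitfalls of chained `replace` calls: answers containing `$` are inserted literally, and an answer that happens to look like another placeholder is not replaced again.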

Phase 5 – Polish & QA

  • Improve error handling and empty states:
    • Invalid file type.
    • Parsing failures.
    • No placeholders detected with helpful guidance.
    • Large files (>4MB) with helpful guidance.
  • Responsive layout and basic accessibility checks.
  • MD3 DESIGN SYSTEM: Implement Material Design 3 color scheme with dark theme default.
    • MD3 color variables for light and dark themes.
    • Theme toggle component with localStorage persistence.
    • All UI components updated to use MD3 colors.
    • Smooth theme transitions.
  • Copy polish (helper text around upload, placeholders, and conversation).
  • USER GUIDANCE: Add help section explaining:
    • Supported placeholder formats with examples.
    • What to expect from the conversational flow (footer).
    • Privacy note (no data stored).
  • Loading states (download button, parsing indicator).
  • Keyboard accessibility (Enter to submit, autofocus).
  • UX IMPROVEMENTS (Post-Testing):
    • Auto-scroll chat messages as conversation progresses.
    • Change download button to success green (was red/tertiary).
    • Add skip placeholder functionality (type 'skip' or click button in sidebar).
    • Add reset button to clear document and start fresh without page refresh.
    • Add typing indicator animation (three bouncing dots) with 500ms delay.
    • CRITICAL FIX: Proper XML paragraph-level replacement to preserve formatting with any placeholder format.
  • Smoke test full flow with sample documents.
  • DEPLOYMENT: Deploy to Vercel for public URL access (https://clausefill-ai.vercel.app/).
  • REAL-WORLD TEST: Test with actual legal documents (SAFE agreement with 8 placeholders - SUCCESS!).

Phase 6 – Pre-Launch Checklist (Real-World Readiness)

Before sharing with unknown testers:

  • Landing page clarity:
    • Clear value proposition visible immediately.
    • Instructions on how to use the tool (collapsible panel).
    • Example placeholder formats shown.
    • Feature highlights (preserves formatting, no data stored, works in browser).
  • Error recovery:
    • If no placeholders detected, suggest manual placeholder format with examples.
    • If parsing fails, provide clear next steps and recovery instructions.
  • Core functionality tested:
    • Chrome browser tested and working
    • Real legal document (SAFE) tested successfully
    • All placeholder formats working
    • Formatting preservation verified

Note: Additional items (sample documents, cross-browser testing, analytics, performance testing) moved to future-enhancements.md as post-MVP improvements.


✅ MVP COMPLETE - READY FOR LAUNCH

All critical features implemented and tested. App is production-ready at https://clausefill-ai.vercel.app/


Phase 7 – AI-Enhanced Question Generation

Add OpenAI integration to generate contextual, natural questions instead of deterministic ones.

Setup & Configuration

  • Install OpenAI SDK: npm install openai
  • Support for user-provided API keys (in-app input field)
  • Support for default API key via environment variable
  • Rate limiting: 50 requests/hour per IP when using default key
  • Add OPENAI_API_KEY to .env.local for local development (optional)
  • Add OPENAI_API_KEY to Vercel environment variables for production (optional)
  • Create env.example file documenting required environment variables
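
The 50 requests/hour per-IP limit could be approximated with a fixed-window counter like the one below. This is a sketch under assumptions: `createRateLimiter` is a hypothetical helper, and the roadmap does not say which algorithm or storage the project actually uses.

```typescript
type WindowEntry = { windowStart: number; count: number };

// Fixed-window limiter: each IP gets `limit` requests per `windowMs`.
// The window starts at the first request and resets once it has elapsed.
function createRateLimiter(limit: number = 50, windowMs: number = 60 * 60 * 1000) {
  const entries = new Map<string, WindowEntry>();

  // `now` is injectable so the behaviour can be tested without waiting.
  return function allow(ip: string, now: number = Date.now()): boolean {
    const entry = entries.get(ip);
    if (!entry || now - entry.windowStart >= windowMs) {
      entries.set(ip, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= limit) return false;
    entry.count += 1;
    return true;
  };
}

// Apply the limit only when the default key is in use, as described above.
const allowDefaultKeyRequest = createRateLimiter();
```

An in-memory map like this is per process, so on a serverless host where several instances may run it only approximates a global limit; a shared store would be needed for a strict one.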

Backend Implementation

  • Create /api/generate-question/route.ts endpoint:
    • Accept: { placeholder, documentContext, documentType? }
    • Check if OPENAI_API_KEY exists
    • If API key exists:
      • Call OpenAI API (GPT-4o-mini for cost efficiency)
      • System prompt: "You are a helpful assistant for legal document filling. Generate a clear, professional question."
      • User prompt: Include placeholder name and document context
      • Return natural language question
    • If no API key or API fails:
      • Fallback to deterministic question generation
      • Return simple "What is the [placeholder]?" format
    • Add error handling and logging
    • Add rate limiting considerations (handled by OpenAI)
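
The fallback behaviour described above can be sketched independently of the OpenAI SDK by passing the AI call in as a function. `AiQuestionFn`, `generateQuestion`, and the `source` field are hypothetical; the real route would build `askAi` from the OpenAI client only when an API key is available.

```typescript
// Stand-in for the real OpenAI call; undefined when no API key is configured.
type AiQuestionFn = (placeholder: string, documentContext: string) => Promise<string>;

type QuestionResult = { question: string; source: "ai" | "fallback" };

// Simple "What is the X?" question derived from the placeholder text.
function deterministicQuestion(placeholder: string): string {
  const label = placeholder.replace(/^[\[{]|[\]}]$/g, "").replace(/_/g, " ").trim();
  return label ? `What is the ${label}?` : "What value should go in this blank?";
}

// Tries the AI path first; any missing key, thrown error, or empty reply
// falls through to the deterministic question.
async function generateQuestion(
  placeholder: string,
  documentContext: string,
  askAi?: AiQuestionFn
): Promise<QuestionResult> {
  if (askAi) {
    try {
      const question = (await askAi(placeholder, documentContext)).trim();
      if (question) return { question, source: "ai" };
    } catch (err) {
      console.error("AI question generation failed; using fallback:", err);
    }
  }
  return { question: deterministicQuestion(placeholder), source: "fallback" };
}
```

Returning the `source` alongside the question is optional, but it makes it easy to log how often the fallback is hit.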

Frontend Integration

  • Update generateQuestion function in app/page.tsx:
    • Make it async
    • Call /api/generate-question endpoint
    • Show loading state while waiting for AI response (typing indicator)
    • Handle errors gracefully with fallback
  • Update handleParsedDocument to use async question generation
  • Update handleSubmitAnswer to use async question generation
  • Update handleSkipPlaceholder to use async question generation
  • Ensure typing indicator shows during AI question generation

Testing & Polish

  • Test with API key present (AI-generated questions) ✅
  • Test without API key (deterministic fallback) ✅
  • Test API failure scenarios (network error, rate limit, etc.) ✅
  • Verify questions are contextual and professional ✅
  • Monitor API costs (~$0.0001 per question) ✅
  • Batch processing optimization (~78% faster, 89% fewer API calls) ✅
  • Smart value normalization (states, dates, amounts, business entities) ✅
  • Markdown support in chat for better formatting ✅

Documentation

  • Update README with OpenAI setup instructions ✅
  • Document environment variable requirements ✅
  • Add note about optional AI features ✅
  • Include cost estimates for AI usage ✅
  • Document BYOK (Bring Your Own Key) feature ✅
  • Document rate limiting (50 requests/hour per IP) ✅

Status: ✅ COMPLETE
Actual Effort: ~5 hours
Cost Impact: ~$0.01 per 100 questions (with batch optimization)


Phase 7 Summary - What Was Built

🎯 Core Features

  1. Batch Question Generation - All questions generated in one API call (about 4.5x faster in the measured case: ~18 s down to ~4 s)
  2. Smart Field Detection - Auto-categorizes: company, person, date, amount, address, email, phone
  3. Question Caching - Questions generated once, retrieved instantly
  4. Rate Limiting - 50 AI questions/hour per IP (only for default key)
  5. BYOK Support - Users can provide their own API key (no rate limit)
  6. Graceful Fallbacks - Works without AI, handles all errors

🎨 UX Enhancements

  1. Markdown Chat - Proper formatting with bullets, lists, bold text
  2. Smart Value Normalization:
    • States: DE → Delaware
    • Dates: tomorrow → November 15, 2025
    • Amounts: 100000 → $100,000
    • Business entities: ABC llc → ABC LLC
  3. Better Error Messages - Helpful, actionable feedback
  4. Typing Indicators - Shows AI is "thinking"
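
The value normalization listed above might look roughly like this. Everything here is an illustrative assumption: the function names are hypothetical, the state table is deliberately partial, and only the inputs shown in the examples are handled.

```typescript
// Partial lookup for illustration; a real table would cover every state.
const STATE_NAMES: Record<string, string> = { DE: "Delaware", CA: "California", NY: "New York" };

const MONTHS = ["January", "February", "March", "April", "May", "June", "July",
  "August", "September", "October", "November", "December"];

// "DE" -> "Delaware"; unknown input is returned trimmed but otherwise unchanged.
function normalizeState(input: string): string {
  return STATE_NAMES[input.trim().toUpperCase()] || input.trim();
}

// "100000" -> "$100,000"; non-numeric input is returned trimmed.
function normalizeAmount(input: string): string {
  const digits = input.trim().replace(/[$,]/g, "");
  if (!/^\d+(\.\d{1,2})?$/.test(digits)) return input.trim();
  const parts = digits.split(".");
  const grouped = parts[0].replace(/\B(?=(\d{3})+(?!\d))/g, ",");
  return parts.length > 1 ? `$${grouped}.${parts[1]}` : `$${grouped}`;
}

// "ABC llc" -> "ABC LLC" (only a trailing LLC/LLP suffix is upper-cased).
function normalizeEntity(input: string): string {
  return input.trim().replace(/\b(llc|llp)$/i, (suffix) => suffix.toUpperCase());
}

// "tomorrow" -> the day after `today`, formatted like "November 15, 2025".
// `today` is a parameter so the result does not depend on the system clock.
function normalizeDate(input: string, today: Date): string {
  if (input.trim().toLowerCase() !== "tomorrow") return input.trim();
  const d = new Date(today.getTime());
  d.setDate(d.getDate() + 1);
  return `${MONTHS[d.getMonth()]} ${d.getDate()}, ${d.getFullYear()}`;
}
```

Each function returns the trimmed input when it does not recognise the value, so normalization never discards what the user typed.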

📊 Performance

  • Before: 9 API calls × 2s = ~18 seconds
  • After: 1 API call × 4s = ~4 seconds
  • Improvement: ~78% faster (18 s → 4 s), 89% fewer API calls (9 → 1)

🔒 Security & Reliability

  • Rate limiting per IP address
  • API key validation
  • Error handling at every level
  • Fallback to deterministic questions
  • No data persistence

What's Next?

Optional Enhancements (Post-Launch)

See future-enhancements.md for:

  • PDF file support
  • Advanced AI features (context awareness, multi-turn conversations)
  • Analytics and usage tracking
  • Performance optimizations
  • Cross-browser testing
  • Sample document library

Ready for Production! 🚀

  • ✅ All MVP features complete
  • ✅ AI integration working perfectly
  • ✅ Rate limiting protecting API costs
  • ✅ Smart value normalization
  • ✅ Beautiful UX with markdown support
  • ✅ Comprehensive error handling
  • ✅ Documentation complete

Next Step: Deploy to Vercel with your OpenAI API key!