team-headstart/medsender-interview-challenge

Medical Document Demographics Extractor

🎯 Interview Challenge Overview

Time Limit: 45 minutes

Welcome to the medical document demographics extraction challenge! Your task is to implement pattern matching and text extraction logic to pull demographic information from various types of medical documents.

📋 Challenge Description

You need to implement functions in src/lib/extractor.ts that can extract the following demographic information from medical documents:

  • Patient Name (First Name, Last Name)
  • Date of Birth (standardized to YYYY-MM-DD format)
  • Gender (standardized to "Male" or "Female")
  • Address (full address string)
  • Phone Number (various formats)
  • Medical Record Number (MRN/MR Number)
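
One way to hold these fields is a single object with one nullable entry per item above. The interface and field names below are hypothetical and are not taken from the repository; check src/lib/mock-data.ts for the shape the expected results actually use.

```typescript
// Hypothetical result shape: the names here are assumptions for illustration.
interface Demographics {
  firstName: string | null;
  lastName: string | null;
  dateOfBirth: string | null; // standardized to YYYY-MM-DD
  gender: "Male" | "Female" | null;
  address: string | null;
  phoneNumber: string | null;
  medicalRecordNumber: string | null;
}

// Start from an all-null result so a field that never matches stays null
// instead of being undefined or an empty string.
function emptyDemographics(): Demographics {
  return {
    firstName: null,
    lastName: null,
    dateOfBirth: null,
    gender: null,
    address: null,
    phoneNumber: null,
    medicalRecordNumber: null,
  };
}
```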

🚀 Getting Started

  1. Install dependencies:

    npm install

  2. Start the development server:

    npm run dev

  3. Open your browser: Navigate to http://localhost:3000

  4. Start implementing: Open src/lib/extractor.ts and implement the TODOs

πŸ“ Project Structure

src/
├── lib/
│   ├── mock-data.ts          # Sample medical documents with expected results
│   └── extractor.ts          # 🎯 YOUR IMPLEMENTATION GOES HERE
├── app/
│   ├── api/
│   │   ├── documents/route.ts    # API to get test documents
│   │   └── extract/route.ts      # API to run extraction and calculate accuracy
│   └── page.tsx              # Testing interface

🎯 What You Need to Implement

TODOs in src/lib/extractor.ts:

  1. extractName() - Extract patient name from various formats
  2. extractDateOfBirth() - Extract and standardize date of birth
  3. extractGender() - Extract and standardize gender
  4. extractAddress() - Extract full address information
  5. extractPhoneNumber() - Extract phone numbers in various formats
  6. extractMedicalRecordNumber() - Extract MRN/Medical Record Numbers
  7. extractDemographics() - Main function that orchestrates all extraction
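
The orchestration step can be sketched as a loop over per-field extractors. The signatures in src/lib/extractor.ts are not shown here, so the `Extractor` type and the `runExtractors` helper below are assumptions for illustration, not the repository's API.

```typescript
// Assumed shape: each extractor takes the raw document text and returns a
// string, or null when nothing matched.
type Extractor = (text: string) => string | null;

// Run every extractor independently so that one failing pattern does not
// prevent the other fields from being filled in.
function runExtractors(
  text: string,
  extractors: Record<string, Extractor>
): Record<string, string | null> {
  const result: Record<string, string | null> = {};
  for (const field of Object.keys(extractors)) {
    try {
      result[field] = extractors[field](text);
    } catch {
      result[field] = null; // treat a thrown error as "not found"
    }
  }
  return result;
}
```

Keeping each field independent also makes it easier to test one extractor at a time before wiring them together.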

📄 Sample Document Formats

The challenge includes 3 types of medical documents with varying formats:

1. Discharge Summary

Patient: John Smith
DOB: 03/15/1985
Gender: Male
MRN: MR-789456123
Address: 123 Main Street, Springfield, IL 62701
Phone: (555) 123-4567

2. Lab Report

Patient Name: Maria Rodriguez
Date of Birth: July 22, 1992
Gender: Female
Medical Record #: MR-456789012
Contact: 456 Oak Avenue, Denver, CO 80203, Tel: 555-987-6543

3. Consultation Note

PT: Robert Chen
D.O.B: 12/04/1968
Sex: M
MR Number: MR-123654789
Address: 789 Pine Road, Austin, TX 78701
Phone Number: (555) 456-7890
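
The three samples label gender and the record number differently ("Gender" vs. "Sex", "MRN" vs. "Medical Record #" vs. "MR Number"). A minimal sketch that covers those labels follows; the function names are hypothetical, and whether the expected MRN keeps the "MR-" prefix is an assumption to verify against src/lib/mock-data.ts.

```typescript
// Map "M"/"Male" and "F"/"Female" (any casing) to the standardized values.
function normalizeGender(text: string): "Male" | "Female" | null {
  const match = text.match(/\b(?:Gender|Sex):\s*([A-Za-z]+)/i);
  if (!match) return null;
  const value = match[1].toLowerCase();
  if (value === "m" || value === "male") return "Male";
  if (value === "f" || value === "female") return "Female";
  return null; // unrecognized value: report as missing rather than guess
}

// Accept the three label variants seen in the samples.
// Assumption: the value is returned as written, including the "MR-" prefix.
function findMrn(text: string): string | null {
  const match = text.match(
    /\b(?:MRN|MR Number|Medical Record\s*(?:#|Number)?):\s*([A-Z0-9-]+)/i
  );
  return match ? match[1] : null;
}
```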

🧪 Testing Your Implementation

Web Interface

  1. Use the browser interface at http://localhost:3000
  2. Select a test document or paste custom text
  3. Click "Extract" to test your implementation
  4. View extraction results and accuracy scores
  5. Use "Run All Tests" to test against all documents

API Endpoints

  • GET /api/documents - Get available test documents
  • POST /api/extract - Extract demographics from text
  • GET /api/extract/test - Run extraction on all test documents
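
To call the extraction endpoint outside the browser interface, a request could be assembled roughly as follows. The JSON body shape (`{ text }`) is an assumption, since the request format is not documented here; check src/app/api/extract/route.ts for the fields it actually reads.

```typescript
// Build the pieces of a POST request to the local dev server.
// Assumption: the route accepts a JSON body with a `text` field.
function buildExtractRequest(text: string) {
  return {
    url: "http://localhost:3000/api/extract",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text }),
    },
  };
}

// Usage with the dev server running (not executed here):
//   const { url, init } = buildExtractRequest("Patient: John Smith");
//   const response = await fetch(url, init);
//   console.log(await response.json());
```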

📊 Scoring Criteria

Your implementation will be evaluated on:

  • Accuracy: How well your extraction matches expected results
  • Pattern Handling: Ability to handle different text formats
  • Edge Cases: Handling of missing or malformed data
  • Code Quality: Clean, readable implementation
  • Completeness: Implementation of all required functions

Target Accuracy: Aim for >80% overall accuracy across all test documents
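
How the accuracy score is computed is not specified here. As a rough self-check, one plausible measure is the fraction of expected fields that match exactly; the helper below is a sketch under that assumption and may differ from what the extract route does.

```typescript
// Fraction of expected fields whose extracted value matches exactly.
// A field missing from `actual` is treated as null.
function fieldAccuracy(
  expected: Record<string, string | null>,
  actual: Record<string, string | null>
): number {
  const fields = Object.keys(expected);
  if (fields.length === 0) return 0;
  let correct = 0;
  for (const field of fields) {
    if ((actual[field] ?? null) === expected[field]) correct++;
  }
  return correct / fields.length;
}
```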

💡 Implementation Tips

Regex Pattern Examples

// Name patterns
/(?:Patient|Patient Name|PT):\s*([A-Za-z]+)\s+([A-Za-z]+)/

// Date patterns
/(?:DOB|Date of Birth|D\.O\.B):\s*(\d{1,2}\/\d{1,2}\/\d{4})/
/(?:Date of Birth):\s*([A-Za-z]+)\s+(\d{1,2}),?\s+(\d{4})/

// Phone patterns
/(?:Phone|Tel|Phone Number):\s*\(?(\d{3})\)?[\s-]?(\d{3})[\s-]?(\d{4})/
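
As an illustration of applying such patterns, the sketch below wraps the name and phone expressions in small helpers. The helper names and the `(555) 123-4567` output format are assumptions; the expected phone format should be confirmed against src/lib/mock-data.ts.

```typescript
// [ \t] is used instead of \s so a match cannot run past the end of the line.
const NAME_PATTERN =
  /\b(?:Patient Name|Patient|PT):[ \t]*([A-Za-z]+)[ \t]+([A-Za-z]+)/i;
const PHONE_PATTERN =
  /\b(?:Phone Number|Phone|Tel):\s*\(?(\d{3})\)?[\s-]?(\d{3})[\s-]?(\d{4})/i;

function findName(text: string): { firstName: string; lastName: string } | null {
  const match = text.match(NAME_PATTERN);
  return match ? { firstName: match[1], lastName: match[2] } : null;
}

// Rebuild the number from its three captured parts so every input format
// produces the same output format (assumed here to be "(AAA) PPP-LLLL").
function findPhone(text: string): string | null {
  const match = text.match(PHONE_PATTERN);
  if (!match) return null;
  const [, area, prefix, line] = match;
  return `(${area}) ${prefix}-${line}`;
}
```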

Date Standardization

// Convert MM/DD/YYYY to YYYY-MM-DD
function slashDateToIso(dateStr: string): string {
  const [month, day, year] = dateStr.split("/");
  return `${year}-${month.padStart(2, "0")}-${day.padStart(2, "0")}`;
}

// Convert "July 22, 1992" to "1992-07-22"
const months = { January: "01", February: "02" /* ... */ };
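
Putting the two conversions together, a single helper could accept either date format and return null for anything else. This is a sketch, not the reference solution: the name `toIsoDate` is hypothetical, and the function does not validate that the month and day are in range.

```typescript
const MONTHS: Record<string, string> = {
  january: "01", february: "02", march: "03", april: "04",
  may: "05", june: "06", july: "07", august: "08",
  september: "09", october: "10", november: "11", december: "12",
};

function toIsoDate(raw: string): string | null {
  // Numeric form, e.g. "03/15/1985" or "3/5/1985"
  const numeric = raw.match(/^\s*(\d{1,2})\/(\d{1,2})\/(\d{4})\s*$/);
  if (numeric) {
    const [, month, day, year] = numeric;
    return `${year}-${month.padStart(2, "0")}-${day.padStart(2, "0")}`;
  }
  // Written form, e.g. "July 22, 1992" (the comma is optional)
  const written = raw.match(/^\s*([A-Za-z]+)\s+(\d{1,2}),?\s+(\d{4})\s*$/);
  if (written) {
    const month = MONTHS[written[1].toLowerCase()];
    if (!month) return null; // unknown month name
    return `${written[3]}-${month}-${written[2].padStart(2, "0")}`;
  }
  return null;
}
```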

Pattern Matching Strategy

  1. Start with specific patterns (exact keyword matches)
  2. Use capture groups to extract the actual data
  3. Handle case-insensitive matching with /i flag
  4. Test patterns against all sample documents
  5. Add fallback patterns for edge cases
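
The strategy above can be sketched as an ordered list of patterns tried from most to least specific. The `firstMatch` helper and the address patterns are illustrative assumptions based on the three sample documents, not code from the repository.

```typescript
// Try each pattern in order and return the first non-empty capture group.
// The patterns must not use the /g flag: exec() on a global regex keeps
// state in lastIndex between calls.
function firstMatch(text: string, patterns: RegExp[]): string | null {
  for (const pattern of patterns) {
    const match = pattern.exec(text);
    if (match && match[1]) return match[1].trim();
  }
  return null; // nothing matched: report the field as missing
}

// Specific label first, then a fallback for the lab report's "Contact:" line.
const ADDRESS_PATTERNS: RegExp[] = [
  /\bAddress:\s*(.+)/i,
  /\bContact:\s*(.+?),\s*Tel:/i,
];
```

Because `.` does not match a newline by default, the first pattern stops at the end of the address line.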

🚀 Bonus Challenges (if time permits)

  1. Handle nickname variations (e.g., "Bob" for "Robert")
  2. Extract middle names or initials
  3. Parse international phone formats
  4. Handle multiple addresses (home vs. work)
  5. Extract additional demographics (age, occupation, etc.)

πŸ› Debugging

  • Use console.log() to debug your regex patterns
  • Test individual functions before integrating
  • Use the browser's developer tools to inspect API responses
  • Check the terminal for any server-side errors

πŸ“ Submission Notes

Focus on:

  1. Completing the core functionality within the time limit
  2. Testing your implementation against the provided documents
  3. Handling the most common patterns first
  4. Clean, readable code with appropriate comments

Good luck! 🍀

Development Commands

npm run dev          # Start development server
npm run build        # Build for production
npm run start        # Start production server
npm run lint         # Run ESLint
