This feature allows users to upload handwritten historical documents and uses AI-powered OCR (Optical Character Recognition) to transcribe them. Users can then review, edit, and correct the transcriptions, and the system learns from these corrections to improve accuracy over time.
- Document Upload: Upload images of handwritten documents (JPG, PNG, etc.)
- AI Transcription: Automatic transcription using Google Cloud Vision API
- User Corrections: Easy-to-use interface for reviewing and correcting transcriptions
- Learning System: Tracks user corrections to improve future transcriptions
- Multi-Team Support: Transcriptions are team-scoped for proper access control
- Statistics Dashboard: View transcription stats including accuracy and completion rates
To enable handwriting recognition, you need to configure the Google Cloud Vision API:
- Go to Google Cloud Console
- Create a new project or select an existing one
- Enable the "Cloud Vision API"
- Create credentials (API Key)
- Add the API key to your .env file:

    GOOGLE_VISION_API_KEY=your_api_key_here

Ensure your config/filesystems.php has the public disk configured:

    'public' => [
        'driver' => 'local',
        'root' => storage_path('app/public'),
        'url' => env('APP_URL').'/storage',
        'visibility' => 'public',
    ],

Run the storage link command:

    php artisan storage:link

Run the migrations to create the required tables:

    php artisan migrate

This will create:
- document_transcriptions: Stores uploaded documents and transcriptions
- transcription_corrections: Tracks user corrections for learning
Navigate to /transcriptions while logged in to access the transcription interface.
To upload a document:
- Click "Choose an image file" to select a handwritten document
- Preview the image to ensure it uploaded correctly
- Click "Upload & Transcribe" to start the transcription process
- The AI will process the document and provide an initial transcription
To review and correct a transcription:
- Select a transcription from the list on the left
- View the original document image and transcription side-by-side
- Click "Edit" to start making corrections
- Make your changes in the text editor
- Click "Save Correction" to save your changes
The system will:
- Save your corrected version
- Track the correction for future learning
- Update the transcription immediately
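As a rough illustration of what "tracking a correction" could involve, the sketch below builds a record shaped like a transcription_corrections row. The function name buildCorrectionRecord() and the metadata keys (word counts, word-level substitutions) are assumptions made for this example; the actual learning logic in HandwritingRecognitionService may differ.

```php
<?php

/**
 * Build a correction record from the text before and after a user edit.
 * Top-level keys mirror the transcription_corrections columns; the keys
 * inside correction_metadata are hypothetical.
 */
function buildCorrectionRecord(int $transcriptionId, int $userId, string $original, string $corrected): array
{
    $originalWords  = preg_split('/\s+/', trim($original), -1, PREG_SPLIT_NO_EMPTY);
    $correctedWords = preg_split('/\s+/', trim($corrected), -1, PREG_SPLIT_NO_EMPTY);

    // Only pair words position by position when the word counts match;
    // otherwise the alignment is ambiguous and no substitutions are recorded.
    $substitutions = [];
    if (count($originalWords) === count($correctedWords)) {
        foreach ($originalWords as $i => $word) {
            if ($word !== $correctedWords[$i]) {
                $substitutions[] = ['from' => $word, 'to' => $correctedWords[$i]];
            }
        }
    }

    return [
        'document_transcription_id' => $transcriptionId,
        'user_id'                   => $userId,
        'original_text'             => $original,
        'corrected_text'            => $corrected,
        'correction_metadata'       => [
            'original_word_count'  => count($originalWords),
            'corrected_word_count' => count($correctedWords),
            'substitutions'        => $substitutions,
        ],
    ];
}
```

Recording substitutions such as "Jonh" to "John" is one simple way a system could later spot recurring OCR mistakes; it is shown here only as a possible design.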
The dashboard shows:
- Total Transcriptions: All documents uploaded by your team
- Completed: Successfully transcribed documents
- Total Corrections: Number of user corrections made
- Avg. Confidence: Average AI confidence score (0-100%)
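The four dashboard figures can be derived from a plain list of transcription rows. The following is a minimal sketch, not the project's getTeamStats() implementation: the function name summarizeTranscriptions() and the row keys 'corrections' and 'confidence' (a float between 0 and 1) are assumptions for illustration.

```php
<?php

/**
 * Compute dashboard-style statistics from a list of transcription rows.
 * Each row is assumed to carry 'status', 'corrections' (int) and
 * 'confidence' (float in 0..1, or null when unknown).
 */
function summarizeTranscriptions(array $rows): array
{
    $completed   = 0;
    $corrections = 0;
    $confidences = [];

    foreach ($rows as $row) {
        if (($row['status'] ?? null) === 'completed') {
            $completed++;
        }
        $corrections += (int) ($row['corrections'] ?? 0);
        // isset() is false for null, so rows without a score are skipped.
        if (isset($row['confidence'])) {
            $confidences[] = (float) $row['confidence'];
        }
    }

    // Avoid division by zero when no row has a confidence score.
    $average = $confidences === [] ? 0.0 : array_sum($confidences) / count($confidences);

    return [
        'total'             => count($rows),
        'completed'         => $completed,
        'total_corrections' => $corrections,
        'avg_confidence'    => round($average * 100, 1), // percentage, 0-100
    ];
}
```

Rows without a confidence score are left out of the average rather than counted as zero, so failed or pending documents do not drag the figure down.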
If you don't configure a Google Vision API key, the system will use a fallback mode that provides placeholder transcriptions. This is useful for:
- Development and testing
- Demonstrations
- Environments where the API is not available
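The choice between the real API and fallback mode can be sketched as a single check on the API key. The function names planTranscription() and buildVisionRequest() and the placeholder text are hypothetical. The request shape follows the public Cloud Vision REST API (images:annotate with a DOCUMENT_TEXT_DETECTION feature), but whether the service builds its request this way is an assumption. The sketch returns a plan instead of sending the request, so it runs without network access.

```php
<?php

/**
 * Build the JSON body for a Cloud Vision images:annotate request.
 * DOCUMENT_TEXT_DETECTION is the feature type intended for dense text
 * and handwriting.
 */
function buildVisionRequest(string $imageBytes): array
{
    return [
        'requests' => [[
            'image'    => ['content' => base64_encode($imageBytes)],
            'features' => [['type' => 'DOCUMENT_TEXT_DETECTION']],
        ]],
    ];
}

/**
 * Decide between the Vision API and fallback mode based on the API key.
 * No HTTP call is made here; the caller receives a description of what
 * would be done.
 */
function planTranscription(?string $apiKey, string $imageBytes): array
{
    if ($apiKey === null || trim($apiKey) === '') {
        return [
            'mode' => 'fallback',
            // Hypothetical placeholder text for development and demos.
            'text' => '[Placeholder transcription: OCR provider not configured]',
        ];
    }

    return [
        'mode' => 'vision',
        'url'  => 'https://vision.googleapis.com/v1/images:annotate?key=' . urlencode($apiKey),
        'body' => buildVisionRequest($imageBytes),
    ];
}
```

A caller might pass `getenv('GOOGLE_VISION_API_KEY') ?: null` as the key. With a real request, the recognized text would normally be read from `responses[0].fullTextAnnotation.text` in the API response.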
You can extend the HandwritingRecognitionService to support other OCR services:
- Open app/Services/HandwritingRecognitionService.php
- Add a new method like performAzureOCR() or performAWSTextract()
- Update the performOCR() method to call your new service
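One way to structure the dispatch step described above is a small registry that maps a provider name to a callable. This is a sketch under assumptions, not the service's actual code: performOcrWith() and the stub providers are hypothetical, and the real performOCR() may simply call methods such as performAzureOCR() directly.

```php
<?php

/**
 * Dispatch OCR to a named provider. Each provider is a callable that
 * takes an image path and returns the recognized text.
 */
function performOcrWith(string $provider, string $imagePath, array $providers): string
{
    if (!isset($providers[$provider])) {
        throw new InvalidArgumentException("Unknown OCR provider: {$provider}");
    }

    return $providers[$provider]($imagePath);
}

// Hypothetical stub providers; real ones would call the respective cloud APIs.
$providers = [
    'google' => fn (string $path): string => 'google:' . basename($path),
    'azure'  => fn (string $path): string => 'azure:' . basename($path),
];

echo performOcrWith('azure', '/tmp/letter-1842.jpg', $providers), PHP_EOL;
// prints "azure:letter-1842.jpg"
```

Keeping providers behind a common signature means adding a new OCR backend does not require changing the calling code, only registering another entry.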
- DocumentTranscription: Represents an uploaded document and its transcription
  - team_id: Links to the team that owns the document
  - user_id: User who uploaded the document
  - document_path: Path to the stored image
  - raw_transcription: Initial AI transcription
  - corrected_transcription: User-corrected version
  - metadata: Additional data (confidence scores, processing time, etc.)
  - status: Processing status (pending, processing, completed, failed)
- TranscriptionCorrection: Tracks individual corrections
  - document_transcription_id: Links to the transcription
  - user_id: User who made the correction
  - original_text: Text before correction
  - corrected_text: Text after correction
  - correction_metadata: Additional context for learning
HandwritingRecognitionService provides:
- processDocument(): Handles document upload and OCR
- performOCR(): Orchestrates OCR with different providers
- applyCorrection(): Saves user corrections
- learnFromCorrection(): Implements learning logic
- getTeamStats(): Calculates team statistics
DocumentTranscriptionComponent handles:
- Document upload with real-time preview
- Transcription list management
- Editing interface
- Real-time updates
Run the test suite:

    php artisan test --filter=Transcription

Tests cover:
- Document upload and processing
- OCR functionality
- User corrections
- Team isolation
- Statistics calculation
- Component interactions
Security measures:
- Files are validated (images only, max 10MB)
- Team-based access control
- Soft deletes for data recovery
- API keys stored securely in environment variables
- User authentication required for all operations
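The file validation rule above (images only, max 10MB) can be expressed in framework-free PHP as follows. The function name validateUpload(), the constants, the error messages, and the restriction to JPEG and PNG are assumptions for illustration; in a Laravel application the same constraint is commonly written with validation rules such as `image` and `max:10240` (file sizes in kilobytes).

```php
<?php

const MAX_UPLOAD_BYTES = 10 * 1024 * 1024; // 10MB limit
const ALLOWED_MIME_TYPES = ['image/jpeg', 'image/png']; // assumed allow-list

/**
 * Validate an upload by MIME type and size.
 * Returns a list of error messages; an empty list means the file passes.
 */
function validateUpload(string $mimeType, int $sizeBytes): array
{
    $errors = [];

    if (!in_array($mimeType, ALLOWED_MIME_TYPES, true)) {
        $errors[] = 'File must be an image (JPG or PNG).';
    }
    if ($sizeBytes > MAX_UPLOAD_BYTES) {
        $errors[] = 'File must not exceed 10MB.';
    }

    return $errors;
}
```

The MIME type should come from inspecting the file contents on the server, not from the client-supplied file name or header, since those are easy to spoof.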
Performance recommendations:
- Image Optimization: Resize large images before upload to reduce processing time
- Batch Processing: For many documents, consider implementing a queue system
- Caching: API results can be cached to avoid redundant calls
- Background Jobs: Move OCR processing to background jobs for large documents
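To illustrate the caching point, the sketch below keys OCR results by a hash of the image contents so that an identical upload does not trigger a second API call. The function cachedOcr() is hypothetical, and a plain array stands in for the cache; an application would more likely use a persistent cache store.

```php
<?php

/**
 * Return a cached OCR result keyed by the image's SHA-256 hash, calling
 * $ocr only on a cache miss. $cache is passed by reference so that
 * results persist across calls.
 */
function cachedOcr(string $imageBytes, callable $ocr, array &$cache): string
{
    $key = 'ocr:' . hash('sha256', $imageBytes);

    if (!array_key_exists($key, $cache)) {
        $cache[$key] = $ocr($imageBytes);
    }

    return $cache[$key];
}
```

Hashing the contents instead of the file name means a renamed copy of the same scan still hits the cache, while an edited or re-scanned image is processed again.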
If an upload fails:
- Check the file size (max 10MB)
- Verify the file is in an image format
- Ensure storage permissions are correct
If transcription quality is poor:
- Ensure the image is clear and high resolution
- Check lighting and contrast
- Try preprocessing images (enhance contrast, remove noise)
If API requests fail:
- Verify the API key is correct
- Check API quota and billing
- Ensure network connectivity to Google Cloud
Potential improvements:
- Support for multi-page documents
- Batch upload and processing
- Export transcriptions to various formats
- Advanced learning with custom ML models
- Integration with genealogy records
- Collaborative correction features
- Mobile app support
For issues or questions:
- Check the logs: storage/logs/laravel.log
- Review the test suite for usage examples
- Consult the code comments in the service class
- Open an issue on GitHub
This feature is part of the Genealogy Laravel project and follows the same MIT license.