This guide will help you set up Google Cloud services for enhanced text extraction from multimedia files in Hackalyze.
We've integrated the following Google Cloud services to enhance the Hackalyze platform:
- Google Cloud Vision API - For extracting text from images with advanced OCR
- Google Cloud Speech-to-Text - For transcribing audio files with multilingual support
- Google Cloud Video Intelligence - For extracting both visible text and speech from videos
These services work alongside your existing Gemini 1.5 Flash API integration for hackathon submission evaluation: they turn multimedia files into text that Gemini can evaluate.
If you already have a Google Cloud project configured for the Gemini API, you can use the same project.
- Go to Google Cloud Console
- Select your existing project or create a new one
For each service, you need to enable the corresponding API:
- Navigate to "APIs & Services" > "Library"
- Search for and enable the following APIs:
  - Cloud Vision API
  - Cloud Speech-to-Text API
  - Video Intelligence API
- Go to "IAM & Admin" > "Service Accounts"
- Click "Create Service Account"
- Enter a name (e.g., "hackalyze-media-processor")
- Assign the following roles:
  - Cloud Vision API User
  - Cloud Speech-to-Text User
  - Video Intelligence API User
- Complete the service account creation
- Click the service account, then open the "Keys" tab
- Click "Add Key" > "Create new key" > JSON
- Save the JSON file securely
- Place the downloaded JSON credentials file in a secure location
- Update your .env file with the path to the credentials:
GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/google-credentials.json
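Before starting the application, it can help to confirm that the variable above points at a readable service account key. The following is a minimal sketch using only the Python standard library; the function name `check_credentials` and the list of required fields are assumptions made for illustration, not part of Hackalyze or of any Google library.

```python
import json
import os
from pathlib import Path

# Fields normally present in a service account key downloaded as JSON.
# (Assumed checklist for this sketch; adjust to your needs.)
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_credentials(env=None):
    """Return a list of problems found with the configured credentials file."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    file = Path(path)
    if not file.is_file():
        return [f"credentials file not found: {path}"]
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        # ValueError covers both invalid JSON and undecodable bytes.
        return [f"credentials file is not readable JSON: {exc}"]
    if not isinstance(data, dict):
        return ["credentials file does not contain a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected key type: {data['type']}")
    return problems


if __name__ == "__main__":
    for problem in check_credentials() or ["credentials file looks OK"]:
        print(problem)
```

An empty list means the file exists, parses as JSON, and has the expected fields. It does not prove that the key is still valid or that the roles are assigned correctly.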
Google Cloud services are billed based on usage. To manage costs:
- Set up billing alerts in Google Cloud Console
- Consider implementing quotas for API requests
- Monitor usage regularly through Google Cloud Console
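One way to apply a request quota on the application side is a simple per-day counter checked before each API call. The sketch below is an in-memory illustration only: the class `DailyQuota` and the limit of 500 are hypothetical, and a real deployment would likely need shared, persistent storage as well as the quota settings in Google Cloud Console.

```python
import datetime


class DailyQuota:
    """Allow at most `limit` requests per calendar day (in-memory only)."""

    def __init__(self, limit, today=datetime.date.today):
        self.limit = limit
        self._today = today  # injectable clock, useful for testing
        self._day = today()
        self._used = 0

    def try_acquire(self, cost=1):
        """Reserve `cost` requests; return False if the budget is exhausted."""
        day = self._today()
        if day != self._day:  # a new day started, so reset the counter
            self._day, self._used = day, 0
        if self._used + cost > self.limit:
            return False
        self._used += cost
        return True


vision_quota = DailyQuota(limit=500)  # hypothetical daily limit
if vision_quota.try_acquire():
    print("OK to call the Vision API")
else:
    print("Daily budget used up; skip or queue this file")
```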
- Processing time grows with file size
- Video analysis is particularly resource-intensive and may take several minutes
- Consider implementing caching for repeated access to the same content
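The caching suggestion above could be sketched as follows: key each result by a hash of the file's bytes, so the same upload is never sent to a paid API twice. The class `ExtractionCache` and the on-disk JSON layout are assumptions for illustration; `extract` stands in for whichever extraction function the application actually uses.

```python
import hashlib
import json
from pathlib import Path


class ExtractionCache:
    """Store extracted text on disk, keyed by the SHA-256 of the file bytes."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _entry(self, content):
        # Identical bytes always map to the same cache file.
        return self.directory / (hashlib.sha256(content).hexdigest() + ".json")

    def get_or_extract(self, content, extract):
        """Return cached text for `content`, calling `extract` only on a miss."""
        entry = self._entry(content)
        if entry.exists():
            return json.loads(entry.read_text(encoding="utf-8"))["text"]
        text = extract(content)
        entry.write_text(json.dumps({"text": text}), encoding="utf-8")
        return text
```

Hashing the content rather than the file name means a re-upload under a different name still hits the cache, while an edited file with the same name is processed again.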
Text extracted from files is stored in the submission's description field, making it available to the Gemini 1.5 Flash API for ideation judging. The resulting workflow:
- Students upload files (images, audio, video, documents)
- The system automatically extracts text content using Google Cloud services
- The extracted text is stored with the submission
- The Gemini API evaluates the content against hackathon criteria
- Teachers can view both the original file and extracted text on the redesigned TeacherHackathonPage
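The extraction and storage steps of this workflow can be sketched as a small dispatcher that picks a service from the file type and appends the result to the description. Everything named here (`Submission`, `pick_service`, `attach_extracted_text`, the `extractors` mapping, and the label format) is hypothetical and not taken from the Hackalyze codebase; the extractor callables stand in for the real Google Cloud calls, and document files are outside this sketch.

```python
import mimetypes
from dataclasses import dataclass


@dataclass
class Submission:
    description: str = ""


def pick_service(filename):
    """Map a file name to the service that would handle it, or None."""
    mime, _ = mimetypes.guess_type(filename)
    kind = (mime or "").split("/")[0]
    return {"image": "vision", "audio": "speech", "video": "video"}.get(kind)


def attach_extracted_text(submission, filename, content, extractors):
    """Run the matching extractor and append its text to the description."""
    service = pick_service(filename)
    extract = extractors.get(service)
    if extract is None:
        return False  # unsupported type: leave the submission unchanged
    text = extract(content).strip()
    if text:
        separator = "\n\n" if submission.description else ""
        submission.description += f"{separator}[Extracted from {filename}]\n{text}"
    return True
```

Labelling each block with its source file name keeps the original description and the extracted text distinguishable when both are shown to a teacher or passed to the evaluator.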
If you encounter issues:
- Check that the credentials JSON file is accessible and properly formatted
- Verify that the required APIs are enabled in your Google Cloud project
- Ensure billing is enabled for your Google Cloud project
- Check the logs for specific error messages