Welcome to the Real-Time Avatar Interaction project! This application lets a user hold a live conversation with a virtual avatar. It uses HeyGen's streaming avatar technology for the video, OpenAI Whisper to convert speech to text, and OpenAI's GPT models to generate the responses.
- Real-Time Avatar Interaction: Engage in live, face-to-face conversations with a virtual avatar
- Multiple Input Methods:
  - Voice Input: Speak naturally; the app converts your speech to text using OpenAI Whisper
  - Text Chat: Type messages directly in the chat panel
  - Quick Prompts: Use pre-defined badges for instant conversation starters
- Multi-Language Support: Choose from 12 languages including English, Spanish, French, German, Italian, Portuguese, Japanese, Korean, Chinese, Arabic, Russian, and Hindi
- Natural Conversations: AI responds in a casual, conversational style with natural filler words and expressions
- Modern UI:
  - Full-screen video display for immersive experience
  - Glassmorphism design with backdrop blur effects
  - Full-width bottom control bar
  - Responsive chat panel with scrollable message history
  - Beautiful landing page with feature showcase
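The twelve languages listed above could be modeled as a small lookup table that the language selector reads from. This is only a sketch: the `SupportedLanguage` type, the `LANGUAGES` array, and `resolveLanguage` are hypothetical names, and pairing each language with its ISO 639-1 code is an assumption about how the app identifies languages.

```typescript
// Hypothetical language table. Names come from the feature list above,
// codes are the standard ISO 639-1 codes (an assumed identifier scheme).
interface SupportedLanguage {
  code: string; // ISO 639-1 code
  name: string; // display name for the selector
}

const DEFAULT_LANGUAGE: SupportedLanguage = { code: "en", name: "English" };

const LANGUAGES: readonly SupportedLanguage[] = [
  DEFAULT_LANGUAGE,
  { code: "es", name: "Spanish" },
  { code: "fr", name: "French" },
  { code: "de", name: "German" },
  { code: "it", name: "Italian" },
  { code: "pt", name: "Portuguese" },
  { code: "ja", name: "Japanese" },
  { code: "ko", name: "Korean" },
  { code: "zh", name: "Chinese" },
  { code: "ar", name: "Arabic" },
  { code: "ru", name: "Russian" },
  { code: "hi", name: "Hindi" },
];

// Returns the matching language, falling back to English for unknown codes.
function resolveLanguage(code: string): SupportedLanguage {
  const normalized = code.trim().toLowerCase();
  return LANGUAGES.find((l) => l.code === normalized) ?? DEFAULT_LANGUAGE;
}
```

Falling back to English for an unrecognized code is a design choice made here for illustration; the app may handle that case differently.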
- Next.js 14: React framework with App Router
- React 18: For building the user interface
- TypeScript: For type safety and better developer experience
- Tailwind CSS: For modern, responsive styling
- Shadcn UI: For beautiful, accessible UI components
- OpenAI API:
  - GPT-4.1-mini for natural language generation
  - Whisper for speech-to-text transcription
- HeyGen Streaming Avatar API: For real-time avatar video streaming
- Radix UI: For accessible component primitives
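To make the GPT-4.1-mini step more concrete, here is a minimal sketch of building the JSON body for a request to OpenAI's Chat Completions endpoint (`POST https://api.openai.com/v1/chat/completions`). The `ChatTurn` type, the `buildChatRequest` function, and the wording of the system prompt are assumptions for illustration, not the project's actual code.

```typescript
type ChatRole = "system" | "user" | "assistant";

interface ChatTurn {
  role: ChatRole;
  content: string;
}

interface ChatRequestBody {
  model: string;
  messages: ChatTurn[];
}

// Builds a Chat Completions request body. The system prompt text is a
// hypothetical example of asking for casual replies in a chosen language.
function buildChatRequest(
  languageName: string,
  history: ChatTurn[],
  userText: string,
): ChatRequestBody {
  const system: ChatTurn = {
    role: "system",
    content:
      `You are a friendly avatar. Reply in ${languageName}, ` +
      "in a casual, conversational tone, and keep answers short.",
  };
  return {
    model: "gpt-4.1-mini",
    messages: [system, ...history, { role: "user", content: userText }],
  };
}
```

The resulting object would be serialized with `JSON.stringify` and sent with an `Authorization: Bearer <key>` header.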
- Node.js 18+ and npm
- OpenAI API key
- HeyGen API key, Avatar ID, and Voice ID
- Clone the repository:
    git clone <repository-url>
    cd avatar
- Install the dependencies:
    npm install
- Set up environment variables:
  Create a .env.local file in the root directory with the following variables:
    NEXT_PUBLIC_OPENAI_API_KEY=your_openai_api_key_here
    NEXT_PUBLIC_HEYGEN_API_KEY=your_heygen_api_key_here
    NEXT_PUBLIC_HEYGEN_AVATARID=your_avatar_id_here
    NEXT_PUBLIC_HEYGEN_VOICEID=your_voice_id_here
  Important:
  - Make sure there are no quotes around the values
  - No spaces around the = sign
  - Restart the dev server after changing environment variables
- Start the development server:
    npm run dev
- Open your browser:
  Navigate to http://localhost:3000 to see the application in action.
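A quick startup check can catch the environment problems described in the setup steps before the avatar fails to initialize. The sketch below is an assumption about how such a check might look; `checkEnv`, `EnvSource`, and `REQUIRED_ENV_KEYS` are hypothetical names that do not come from the project.

```typescript
// The four variables named in the setup steps.
const REQUIRED_ENV_KEYS = [
  "NEXT_PUBLIC_OPENAI_API_KEY",
  "NEXT_PUBLIC_HEYGEN_API_KEY",
  "NEXT_PUBLIC_HEYGEN_AVATARID",
  "NEXT_PUBLIC_HEYGEN_VOICEID",
] as const;

type EnvSource = Record<string, string | undefined>;

// Returns a list of human-readable problems; an empty list means all is well.
function checkEnv(env: EnvSource): string[] {
  const problems: string[] = [];
  for (const key of REQUIRED_ENV_KEYS) {
    const value = env[key];
    if (value === undefined || value.trim() === "") {
      problems.push(`${key} is missing or empty`);
    } else if (/^["'].*["']$/.test(value)) {
      problems.push(`${key} is wrapped in quotes`);
    } else if (value !== value.trim()) {
      problems.push(`${key} has leading or trailing whitespace`);
    }
  }
  return problems;
}
```

Because Next.js inlines `NEXT_PUBLIC_` variables into the client bundle at build time, browser code needs to reference each one by its full name (for example `process.env.NEXT_PUBLIC_HEYGEN_AVATARID`) and pass the values in as an object, rather than looking them up dynamically.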
- Start a Conversation:
  - Click "Start Conversation" on the landing page
  - Wait for the avatar to initialize
- Interact with the Avatar:
  - Voice: Click the microphone button and speak naturally
  - Text: Open the chat panel (bottom right) and type your message
  - Quick Prompts: Click on badge prompts in the chat panel
- Change Language:
  - Use the language selector in the top-left corner (video view) or chat panel header
  - Select from 12 available languages
  - The avatar will respond in the selected language
- End Conversation:
  - Click the red phone button to stop the avatar
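The interaction steps above amount to one turn of a pipeline: voice input is transcribed, the text goes to the language model, and the reply is spoken by the avatar. The sketch below shows that flow with the three services injected as plain functions. `handleUserTurn`, `TurnDeps`, and `UserInput` are hypothetical names, and the real wiring in page.tsx may differ.

```typescript
// Either recorded audio (voice input) or typed text (chat panel, quick prompts).
type UserInput =
  | { kind: "voice"; audio: Uint8Array }
  | { kind: "text"; text: string };

// Hypothetical service hooks; in the app these would call Whisper, GPT and HeyGen.
interface TurnDeps {
  transcribe: (audio: Uint8Array, languageCode: string) => Promise<string>;
  generateReply: (text: string, languageCode: string) => Promise<string>;
  speak: (reply: string) => Promise<void>;
}

// Runs one conversation turn and returns the reply, or null if there was
// nothing to respond to (for example, silence or an empty message).
async function handleUserTurn(
  input: UserInput,
  languageCode: string,
  deps: TurnDeps,
): Promise<string | null> {
  const text =
    input.kind === "voice"
      ? await deps.transcribe(input.audio, languageCode)
      : input.text;
  const trimmed = text.trim();
  if (trimmed === "") return null;
  const reply = await deps.generateReply(trimmed, languageCode);
  await deps.speak(reply);
  return reply;
}
```

Injecting the services keeps the turn logic testable without network access or a live avatar session.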
To build the application for production:
    npm run build
    npm start
The production build will be optimized and ready for deployment.
src/
├── app/
│   ├── layout.tsx               # Root layout with metadata
│   ├── page.tsx                 # Main page component with avatar logic
│   └── globals.css              # Global styles and Tailwind directives
├── components/
│   ├── reusable/
│   │   ├── Badges.tsx           # Quick prompt badges
│   │   ├── ChatMessage.tsx      # Individual chat message component
│   │   ├── LandingComponent.tsx # Landing page component
│   │   ├── LanguageSelector.tsx # Language selection dropdown
│   │   ├── MicButton.tsx        # Microphone and phone controls
│   │   └── Video.tsx            # Video player component
│   └── ui/                      # Shadcn UI components
│       ├── avatar.tsx
│       ├── badge.tsx
│       ├── button.tsx
│       ├── card.tsx
│       ├── dropdown-menu.tsx
│       ├── toast.tsx
│       └── toaster.tsx
├── hooks/
│   └── use-toast.ts             # Toast notification hook
├── lib/
│   └── utils.ts                 # Utility functions (cn helper)
└── services/
    └── api.ts                   # HeyGen API service
- Supports 12 languages with native-level responses
- System prompts adapt to each language for natural conversation
- Language selector available in both video view and chat panel
- Scrollable message history
- Large message area to minimize scrolling
- Real-time message updates
- Quick prompt badges for common topics
- Automatic silence detection (stops recording after 2 seconds of silence)
- Real-time audio transcription
- Visual feedback with button state changes
- Full-screen video for immersive experience
- Glassmorphism effects with backdrop blur
- Smooth animations and transitions
- Responsive design for all screen sizes
- Modern dark theme with accent colors
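The automatic silence detection mentioned above (recording stops after 2 seconds of silence) can be sketched as a small state machine over audio loudness. This is an assumed approach rather than the project's actual implementation: `SilenceDetector` and `rms` are hypothetical names, and the 0.01 loudness threshold is an illustrative guess.

```typescript
// Root-mean-square loudness of a buffer of samples in the range [-1, 1].
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples.length; i++) {
    sum += samples[i] * samples[i];
  }
  return Math.sqrt(sum / samples.length);
}

class SilenceDetector {
  private readonly threshold: number;
  private readonly silenceMs: number;
  private silentSince: number | null = null;

  // threshold: levels below this count as silence (assumed value).
  // silenceMs: how long silence must last before recording should stop.
  constructor(threshold: number = 0.01, silenceMs: number = 2000) {
    this.threshold = threshold;
    this.silenceMs = silenceMs;
  }

  // Feed one loudness reading with its timestamp in milliseconds.
  // Returns true once the input has been silent for at least silenceMs.
  update(level: number, nowMs: number): boolean {
    if (level >= this.threshold) {
      this.silentSince = null; // sound detected, restart the timer
      return false;
    }
    if (this.silentSince === null) {
      this.silentSince = nowMs;
    }
    return nowMs - this.silentSince >= this.silenceMs;
  }
}
```

In a browser, the samples could come from a Web Audio `AnalyserNode` via `getFloatTimeDomainData`, polled on a timer or animation frame, with the recorder stopped the first time `update` returns true.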
- 400 Bad Request Error:
  - Verify your environment variables are set correctly
  - Check that Avatar ID and Voice ID are valid in your HeyGen account
  - Ensure your API key has proper permissions
- Avatar Not Starting:
  - Check browser console for detailed error messages
  - Verify all environment variables are loaded (check console logs)
  - Restart the dev server after changing .env.local
- Audio Not Working:
  - Grant microphone permissions in your browser
  - Check browser console for permission errors
  - Try refreshing the page
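The troubleshooting advice above could also be surfaced in the UI, for instance through the toast hook, by translating a failed response status into a hint. The helper below is a hypothetical sketch: `troubleshootingHint` is not part of the project, and the 401/403 and 5xx cases follow general HTTP conventions rather than anything documented for this app.

```typescript
// Hypothetical helper: maps an HTTP status from a failed API call to a hint.
function troubleshootingHint(status: number): string {
  if (status === 400) {
    return "Check the values in .env.local and confirm the Avatar ID and Voice ID exist in your HeyGen account.";
  }
  if (status === 401 || status === 403) {
    return "Check that the API key is valid and has the required permissions.";
  }
  if (status >= 500) {
    return "The upstream service reported an error; try again shortly.";
  }
  return `Unexpected status ${status}; see the browser console for details.`;
}
```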
Contributions are welcome! Please feel free to submit a Pull Request.
This project is private and proprietary.
- HeyGen for the streaming avatar technology
- OpenAI for GPT and Whisper APIs
- Shadcn for the beautiful UI components