README.md: 70 additions & 25 deletions
@@ -1,20 +1,46 @@
-# Veo 3 Gemini API Quickstart
+# Gemini API Veo 3 & Nano Banana Quickstart
+
+A Next.js quickstart for creating and editing images and videos using Google's latest Gemini API models, including [Veo 3](https://ai.google.dev/gemini-api/docs/video), [Imagen 4](https://ai.google.dev/gemini-api/docs/imagen), and [Gemini 2.5 Flash Image aka nano banana](https://ai.google.dev/gemini-api/docs/image-generations).
+> If you want a full studio, consider [Google's Flow](https://labs.google/fx/tools/flow) (a professional environment for Veo/Imagen). Use this repo as a lightweight studio to learn how to build your own UI that generates content with Google's AI models via the Gemini API.
 
-[Veo 3](https://ai.google.dev/gemini-api/docs/video) is Google's state-of-the-art video generation model available in the Gemini API. This repository is a quickstart that demonstrates how to build a simple UI to generate videos with Veo 3, play them, and download the results. It also includes an image + text to video generation using the [Imagen 4](https://ai.google.dev/gemini-api/docs/imagen) model.
+(This is not an official Google product.)
 
-
+## Features
 
-> [!NOTE]
-> If you want a full studio, consider [Google's Flow](https://labs.google/fx/tools/flow) (a professional environment for Veo/Imagen). Use this repo as a lightweight quickstart to learn how to build your own UI that generates videos with Veo 3 via the Gemini API.
+The quickstart provides a unified composer UI with different modes for content creation:
 
-(This is not an official Google product.)
+- **Create Image**: Generate images from text prompts using **Imagen 4** or **Gemini 2.5 Flash Image**.
+- **Edit Image**: Edit an image based on a text prompt using **Gemini 2.5 Flash Image**.
+- **Compose Image**: Combine multiple images with a text prompt to create a new image using **Gemini 2.5 Flash Image**.
+- **Create Video**: Generate videos from text prompts or an initial image using **Veo 3**.
 
-## Features
+### Quick Actions & UI Features
+- Seamless navigation between modes after generating content
+- Download generated images & videos
+- Cut videos directly in the browser to specific time ranges
 
-- Generate videos from text prompts using the Veo-3 model.
-- Generate videos from images + text prompts using the Imagen 4.0 model or upload a starting image.
-- Play and download generated videos.
-- Cut videos directly in the browser to a specific time range.
 
 ## Getting Started: Development and Local Testing
 
@@ -26,7 +52,7 @@ Follow these steps to get the application running locally for development and te
 - **`GEMINI_API_KEY`**: The application requires a [GEMINI API key](https://aistudio.google.com/app/apikey). Either create a `.env` file in the project root and add your API key: `GEMINI_API_KEY="YOUR_API_KEY"` or set the environment variable in your system.
 
 > [!WARNING]
-> Google Veo 3 and Imagen 4 are both part of the Gemini API Paid tier. You will need to be on the paid tier to use these models.
+> Google Veo 3, Imagen 4, and Gemini 2.5 Flash Image are part of the Gemini API Paid tier. You will need to be on the paid tier to use these models.
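
Because every route depends on `GEMINI_API_KEY`, it can help to fail fast with a clear message when the variable is missing. The following is a minimal sketch, not code from this repository: the helper name `requireGeminiApiKey` and its error text are hypothetical, and it takes the environment as a parameter so it can be exercised without touching real settings.

```typescript
// Hypothetical helper (not part of the repository): returns the Gemini API key
// from an environment-like object, or throws a descriptive error if it is unset.
function requireGeminiApiKey(
  env: Record<string, string | undefined>,
): string {
  // Treat a missing, empty, or whitespace-only value as "not configured".
  const key = env.GEMINI_API_KEY?.trim();
  if (!key) {
    throw new Error(
      "GEMINI_API_KEY is not set. Add it to .env or export it in your shell.",
    );
  }
  return key;
}
```

In a server-side route this would presumably be called as `requireGeminiApiKey(process.env)` before constructing the API client.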
 
 **2. Install Dependencies:**
 
@@ -46,11 +72,22 @@ Open your browser and navigate to `http://localhost:3000` to see the application
 
 The project is a standard Next.js application with the following key directories:
 
-- `app/`: Contains the main application logic, including the user interface and API routes.
-  - `api/`: API routes for generating videos and images, and checking operation status.
-- `components/`: Reusable React components used throughout the application.
-- `lib/`: Utility functions and schema definitions.
-- `public/`: Static assets.
+- `app/`: Contains the main application logic and pages
+  - `page.tsx`: Main page with the unified composer UI.
+  - `api/`: API routes for different operations
+    - `imagen/generate/`: Image generation with Imagen 4
+    - `gemini/generate/`: Image generation with Gemini 2.5 Flash Image
+    - `gemini/edit/`: Image editing/composition with Gemini 2.5 Flash Image
+    - `veo/generate/`: Video generation operations
+    - `veo/operation/`: Check video generation status
+    - `veo/download/`: Download generated videos
+- `components/`: Reusable React components
+  - `ui/Composer.tsx`: The main unified composer for all interactions.
+  - `ui/VideoPlayer.tsx`: Video player with trimming
+  - `ui/ModelSelector.tsx`: Model selection component
+  - `ui/dropzone.tsx`: Drag-and-drop component for file uploads.
+- `lib/`: Utility functions and schema definitions
+- `public/`: Static assets
 
 ## Official Docs and Resources
 
@@ -62,17 +99,25 @@ The project is a standard Next.js application with the following key directories
 
 The application uses the following API routes to interact with the Google models:
 
-- `app/api/veo/generate/route.ts`: Handles video generation requests. It takes a text prompt as input and initiates a video generation operation with the Veo-3 model.
-- `app/api/veo/operation/route.ts`: Checks the status of a video generation operation.
-- `app/api/veo/download/route.ts`: Downloads the generated video.
-- `app/api/imagen/generate/route.ts`: Handles image generation requests with the Imagen model.
+### Image APIs
+- `app/api/imagen/generate/route.ts`: Handles image generation requests with Imagen 4
+- `app/api/gemini/generate/route.ts`: Handles image generation requests with Gemini 2.5 Flash Image
+- `app/api/gemini/edit/route.ts`: Handles image editing and composition with Gemini 2.5 Flash Image (supports multiple images)
+
+### Video APIs
+- `app/api/veo/generate/route.ts`: Handles video generation requests with Veo 3
+- `app/api/veo/operation/route.ts`: Checks the status of video generation operations
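
The split between a generate route and an operation route suggests a start-then-poll flow for video generation. The sketch below shows one way a client could wait for completion. It is an illustration under stated assumptions, not the repository's code: the `OperationStatus` shape, the `pollUntilDone` helper, and the default interval and attempt limit are all hypothetical, and the status check is injected as a function because the real request and response formats are not described here.

```typescript
// Hypothetical status payload; the real operation route's response may differ.
type OperationStatus<T> = { done: boolean; result?: T; error?: string };

// Repeatedly calls checkStatus until the operation reports done, fails,
// or the attempt budget is exhausted.
async function pollUntilDone<T>(
  checkStatus: () => Promise<OperationStatus<T>>,
  intervalMs = 5000, // assumed delay between status checks
  maxAttempts = 60, // assumed upper bound on status checks
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.error) {
      throw new Error(status.error);
    }
    if (status.done) {
      if (status.result === undefined) {
        throw new Error("Operation finished without a result");
      }
      return status.result;
    }
    // Not finished yet: wait before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Operation still pending after ${maxAttempts} attempts`);
}
```

In the app, `checkStatus` would presumably wrap a request to the operation route, and the returned result would then be passed to the download step. Bounding the number of attempts keeps a stalled operation from polling forever.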