Commit 6fb92de

Merge pull request #22 from shinpr/feature/nanobanana-pro-update
feat: Upgrade to Gemini 3 Pro Image (nano banana pro) with 4K support
2 parents b08630a + 6bd3626 commit 6fb92de

11 files changed

Lines changed: 327 additions & 205 deletions

README.md

Lines changed: 31 additions & 10 deletions
@@ -1,20 +1,25 @@
11
# 🍌 MCP Image Generator
22

3-
> Powered by Gemini 2.5 Flash Image - Nano Banana 🍌
3+
> Powered by Gemini 3 Pro Image - Nano Banana Pro 🍌
44
5-
A powerful MCP (Model Context Protocol) server that enables AI assistants to generate and edit images using Google's Gemini 2.5 Flash Image (Nano Banana 🍌). Seamlessly integrate advanced image generation capabilities into Codex, Cursor, Claude Code, and other MCP-compatible AI tools.
5+
A powerful MCP (Model Context Protocol) server that enables AI assistants to generate and edit images using Google's Gemini 3 Pro Image (Nano Banana Pro 🍌). Seamlessly integrate advanced image generation capabilities into Codex, Cursor, Claude Code, and other MCP-compatible AI tools.
66

77
## ✨ Features
88

9-
- **AI-Powered Image Generation**: Create images from text prompts using Gemini 2.5 Flash Image (Nano Banana)
9+
- **AI-Powered Image Generation**: Create images from text prompts using Gemini 3 Pro Image (Nano Banana Pro)
1010
- **Intelligent Prompt Enhancement**: Automatically optimizes your prompts using Gemini 2.0 Flash for superior image quality
1111
- Adds photographic and artistic details
1212
- Enriches lighting, composition, and atmosphere descriptions
1313
- Preserves your intent while maximizing generation quality
1414
- **Image Editing**: Transform existing images with natural language instructions
1515
- Context-aware editing that preserves original style
1616
- Maintains visual consistency with source image
17-
- **Advanced Options**:
17+
- **High-Resolution Output**: Support for 2K and 4K image generation
18+
- Standard quality for fast generation
19+
- 2K resolution for enhanced detail
20+
- 4K resolution for professional-grade images with superior text rendering
21+
- **Flexible Aspect Ratios**: Multiple aspect ratio options (1:1, 16:9, 9:16, 21:9, and more)
22+
- **Advanced Options**:
1823
- Multi-image blending for composite scenes
1924
- Character consistency across generations
2025
- World knowledge integration for accurate context
@@ -134,18 +139,31 @@ The system automatically enhances this to include rich details about lighting, m
134139

135140
### Advanced Features
136141

142+
**Character Consistency:**
137143
```
138144
"Generate a portrait of a medieval knight, maintaining character consistency for future variations"
139145
(with maintainCharacterConsistency: true)
140146
```
141147

148+
**High-Resolution 4K Generation:**
149+
```
150+
"Generate a professional product photo of a smartphone with clear text on the screen"
151+
(with imageSize: "4K")
152+
```
153+
154+
**Custom Aspect Ratio:**
155+
```
156+
"Generate a cinematic landscape of a desert at golden hour"
157+
(with aspectRatio: "21:9")
158+
```
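
The prompt-plus-option examples above are sent to the server as an MCP `tools/call` request. The following is a minimal sketch of such a request, not the project's own client code. `buildGenerateImageRequest` is a hypothetical helper, and `prompt` is an assumed argument name that does not appear in this diff; `imageSize` and `aspectRatio` are taken from the README text.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps tool arguments in a `tools/call` request
// addressed to the `generate_image` tool.
function buildGenerateImageRequest(
  id: number,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "generate_image", arguments: args },
  };
}

// `prompt` is an assumed argument name; the other two keys come from the README.
const request = buildGenerateImageRequest(1, {
  prompt: "Generate a cinematic landscape of a desert at golden hour",
  aspectRatio: "21:9",
  imageSize: "4K",
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP-compatible client (Cursor, Claude Code, and so on) builds this request for you; the sketch only shows where the two new options end up.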
159+
142160
## 🔧 API Reference
143161

144162
### `generate_image` Tool
145163

146164
The MCP server exposes a single tool for all image operations. Internally, it uses a two-stage process:
147165
1. **Prompt Optimization**: Gemini 2.0 Flash analyzes and enriches your prompt
148-
2. **Image Generation**: Gemini 2.5 Flash Image creates the final image
166+
2. **Image Generation**: Gemini 3 Pro Image creates the final image
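
The two-stage process can be pictured as two asynchronous steps chained together. This is an illustrative sketch only: `runTwoStagePipeline`, `PromptOptimizer`, and `ImageGenerator` are hypothetical names, and the stand-in stages below do not call the Gemini API or reflect how the server is actually implemented.

```typescript
// Each stage is injected so the sketch runs without network access.
type PromptOptimizer = (prompt: string) => Promise<string>;
type ImageGenerator = (prompt: string) => Promise<Uint8Array>;

interface PipelineResult {
  optimizedPrompt: string;
  image: Uint8Array;
}

async function runTwoStagePipeline(
  prompt: string,
  optimize: PromptOptimizer,
  generate: ImageGenerator,
): Promise<PipelineResult> {
  // Stage 1: enrich the prompt (the README assigns this to Gemini 2.0 Flash).
  const optimizedPrompt = await optimize(prompt);
  // Stage 2: generate the image from the enriched prompt
  // (the README assigns this to Gemini 3 Pro Image).
  const image = await generate(optimizedPrompt);
  return { optimizedPrompt, image };
}

// Stand-in stages for demonstration; they return placeholder data.
const fakeOptimize: PromptOptimizer = async (p) => `${p}, soft golden-hour lighting`;
const fakeGenerate: ImageGenerator = async (p) => new Uint8Array(p.length);

runTwoStagePipeline("a desert landscape", fakeOptimize, fakeGenerate).then((result) => {
  console.log(result.optimizedPrompt, result.image.length);
});
```

Injecting the stages as functions keeps the ordering explicit: the image generator only ever sees the optimized prompt, never the raw one.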
149167

150168
#### Parameters
151169

@@ -155,9 +173,10 @@ The MCP server exposes a single tool for all image operations. Internally, it us
155173
| `inputImagePath` | string | - | Absolute path to input image for editing |
156174
| `fileName` | string | - | Custom filename for output (auto-generated if not specified) |
157175
| `aspectRatio` | string | - | Aspect ratio for the generated image. Supported values: `1:1` (square, default), `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
158-
| `blendImages` | boolean | - | Enable multi-image blending |
159-
| `maintainCharacterConsistency` | boolean | - | Maintain character appearance across generations |
160-
| `useWorldKnowledge` | boolean | - | Use real-world knowledge for context |
176+
| `imageSize` | string | - | Image resolution for high-quality output. Specify `2K` or `4K` for higher resolution images with better text rendering and fine details. Leave unspecified for standard quality. Supported values: `2K`, `4K` |
177+
| `blendImages` | boolean | - | Enable multi-image blending for combining multiple visual elements naturally |
178+
| `maintainCharacterConsistency` | boolean | - | Maintain character appearance consistency across different poses and scenes |
179+
| `useWorldKnowledge` | boolean | - | Use real-world knowledge for accurate context (recommended for historical figures, landmarks, or factual scenarios) |
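
The value lists in the table translate naturally into TypeScript literal types. The sketch below is a hedged illustration of how a caller might check `aspectRatio` and `imageSize` before sending a request. `GenerateImageOptions` and `validateOptions` are hypothetical names rather than part of the server's code, and the interface covers only the rows visible in this diff hunk.

```typescript
// Supported values as listed in the parameter table.
const ASPECT_RATIOS = [
  "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
] as const;
const IMAGE_SIZES = ["2K", "4K"] as const;

type AspectRatio = (typeof ASPECT_RATIOS)[number];
type ImageSize = (typeof IMAGE_SIZES)[number];

// Hypothetical options type mirroring the table rows shown in this hunk.
interface GenerateImageOptions {
  inputImagePath?: string;
  fileName?: string;
  aspectRatio?: AspectRatio;
  imageSize?: ImageSize; // leave unset for standard quality
  blendImages?: boolean;
  maintainCharacterConsistency?: boolean;
  useWorldKnowledge?: boolean;
}

// Returns a list of problems; an empty list means both values are acceptable.
function validateOptions(raw: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const { aspectRatio, imageSize } = raw;
  if (
    aspectRatio !== undefined &&
    !(ASPECT_RATIOS as readonly unknown[]).includes(aspectRatio)
  ) {
    errors.push(`Unsupported aspectRatio: ${String(aspectRatio)}`);
  }
  if (
    imageSize !== undefined &&
    !(IMAGE_SIZES as readonly unknown[]).includes(imageSize)
  ) {
    errors.push(`Unsupported imageSize: ${String(imageSize)}`);
  }
  return errors;
}

const options: GenerateImageOptions = { aspectRatio: "21:9", imageSize: "4K" };
console.log(validateOptions({ ...options }));
console.log(validateOptions({ imageSize: "8K" }));
```

Omitting `imageSize` passes validation, which matches the table's note that leaving it unspecified yields standard quality.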
161180

162181
#### Response
163182

@@ -170,7 +189,7 @@ The MCP server exposes a single tool for all image operations. Internally, it us
170189
"mimeType": "image/png"
171190
},
172191
"metadata": {
173-
"model": "gemini-2.5-flash-image",
192+
"model": "gemini-3-pro-image-preview",
174193
"processingTime": 5000,
175194
"timestamp": "2024-01-01T12:00:00.000Z"
176195
}
@@ -199,15 +218,17 @@ The MCP server exposes a single tool for all image operations. Internally, it us
199218

200219
- Image generation: 30-60 seconds typical (includes prompt optimization)
201220
- Image editing: 15-45 seconds typical (includes context analysis)
221+
- High-resolution generation (2K/4K): May take longer but provides superior quality
202222
- Simple prompts work great - the AI automatically adds professional details
203223
- Complex prompts are preserved and further enhanced
204224
- Consider enabling `useWorldKnowledge` for historical or factual subjects
225+
- Use `imageSize: "4K"` when text clarity and fine details are critical
205226

206227
## 💰 Usage Notes
207228

208229
- This MCP server uses the paid Gemini API for both prompt optimization and image generation
209230
- Gemini 2.0 Flash for intelligent prompt enhancement (minimal token usage)
210-
- Gemini 2.5 Flash Image for actual image generation
231+
- Gemini 3 Pro Image for actual image generation
211232
- Check current pricing and rate limits at [Google AI Studio](https://aistudio.google.com/)
212233
- Monitor your API usage to avoid unexpected charges
213234
- The prompt optimization step adds minimal cost while significantly improving output quality
