Commit ceec947

Bump version to 0.0.1
1 parent 656ab60 commit ceec947

4 files changed: +366 -182 lines changed

README.md

Lines changed: 48 additions & 10 deletions
@@ -31,12 +31,26 @@ pip install -e .
 
 You need a Google Gemini API key to use this server. Get one from [Google AI Studio](https://aistudio.google.com/apikey).
 
-Set the API key as an environment variable:
+## Environment Variables
+
+| Variable | Required | Default | Description |
+|----------|----------|---------|-------------|
+| `GEMINI_API_KEY` | Yes | - | Your Google Gemini API key |
+| `GEMINI_DOWNLOAD_PATH` | No | `/tmp/gemini_gen_mcp` | Directory where generated files are saved |
+
+Set the environment variables:
 
 ```bash
 export GEMINI_API_KEY='your-api-key-here'
+export GEMINI_DOWNLOAD_PATH='/path/to/downloads' # optional
 ```
 
+Generated files are organized by type and date:
+- Images: `$GEMINI_DOWNLOAD_PATH/images/YYYY-MM-DD/`
+- Audio: `$GEMINI_DOWNLOAD_PATH/audios/YYYY-MM-DD/`
+
+Each generated file includes a companion `.info.json` file with generation metadata.
+
 ## Usage
 
 ### Running the Server
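
Note on the hunk above: the new README text describes a directory layout (`<root>/images/YYYY-MM-DD/`, `<root>/audios/YYYY-MM-DD/`) and a companion `.info.json` file, but this diff does not show the code that produces them. The following is a minimal sketch of how such a layout could be built, not the server's actual implementation. The helper names `output_dir` and `save_with_metadata`, the metadata keys, and the choice to name the companion file `<stem>.info.json` are all assumptions made for illustration.

```python
import json
import os
from datetime import date
from pathlib import Path

# Default taken from the README table; everything else here is hypothetical.
DEFAULT_DOWNLOAD_PATH = "/tmp/gemini_gen_mcp"


def output_dir(kind: str, day: date, env=os.environ) -> Path:
    """Return <root>/<kind>/YYYY-MM-DD, where kind is 'images' or 'audios'."""
    root = Path(env.get("GEMINI_DOWNLOAD_PATH", DEFAULT_DOWNLOAD_PATH))
    return root / kind / day.isoformat()


def save_with_metadata(directory: Path, filename: str, data: bytes, metadata: dict) -> Path:
    """Write the generated file and a companion '<stem>.info.json' next to it."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / filename
    target.write_bytes(data)
    # Assumed naming scheme: sunset.png -> sunset.info.json
    info_path = target.with_name(target.stem + ".info.json")
    info_path.write_text(json.dumps(metadata, indent=2))
    return target
```

Passing `env` explicitly keeps the path logic testable without touching the real process environment.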
@@ -76,36 +90,60 @@ Add this to your `claude_desktop_config.json`:
 
 #### text_to_image
 
-Generate images from text descriptions.
+Generate images from text descriptions using Gemini's image generation models.
 
 **Parameters:**
 - `prompt` (string, required): Text description of the image to generate
-- `model` (string, optional): Gemini model to use (default: "gemini-2.0-flash-exp")
-- `num_images` (integer, optional): Number of images to generate, 1-4 (default: 1)
+- `model` (string, optional): Gemini model to use
+  - `gemini-2.5-flash-image` (default)
+  - `gemini-3-pro-image-preview`
+- `aspect_ratio` (string, optional): Aspect ratio for the generated image (default: "1:1")
+  - Supported: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`
+- `temperature` (float, optional): Sampling temperature for image generation (default: 1.0)
+- `top_p` (float, optional): Nucleus sampling parameter (optional)
 
 **Example:**
 ```json
 {
   "prompt": "A serene mountain landscape at sunset with a lake",
-  "model": "gemini-2.0-flash-exp",
-  "num_images": 1
+  "model": "gemini-2.5-flash-image",
+  "aspect_ratio": "16:9",
+  "temperature": 1.0
 }
 ```
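
Note on the `text_to_image` parameters above: the README now lists two models, ten aspect ratios, and defaults for `aspect_ratio` and `temperature`. A rough sketch of how a caller might apply those defaults and reject unsupported values before invoking the tool is shown below. The function name `normalize_image_args` is hypothetical, and whether the server itself rejects unlisted models or ratios is not stated in this diff, so the strict checks are an assumption.

```python
# Values copied from the README hunk above.
SUPPORTED_MODELS = ("gemini-2.5-flash-image", "gemini-3-pro-image-preview")
SUPPORTED_ASPECT_RATIOS = (
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
)


def normalize_image_args(args: dict) -> dict:
    """Fill in documented defaults and validate against the documented lists."""
    prompt = args.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")

    model = args.get("model", SUPPORTED_MODELS[0])
    if model not in SUPPORTED_MODELS:  # assumed strictness, see note above
        raise ValueError(f"unsupported model: {model}")

    aspect_ratio = args.get("aspect_ratio", "1:1")
    if aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio}")

    normalized = {
        "prompt": prompt,
        "model": model,
        "aspect_ratio": aspect_ratio,
        "temperature": float(args.get("temperature", 1.0)),
    }
    # top_p has no documented default, so it is only forwarded when given.
    if "top_p" in args:
        normalized["top_p"] = float(args["top_p"])
    return normalized
```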

 #### text_to_audio
 
-Generate audio/speech from text.
+Generate audio/speech from text using Gemini's TTS models. Output is saved as WAV format.
 
 **Parameters:**
 - `text` (string, required): Text to convert to speech
-- `model` (string, optional): Gemini model to use (default: "gemini-2.0-flash-exp")
-- `voice` (string, optional): Voice to use for speech generation
+- `model` (string, optional): Gemini TTS model to use
+  - `gemini-2.5-flash-preview-tts` (default)
+  - `gemini-2.5-pro-preview-tts`
+- `voice` (string, optional): Voice to use for speech generation (default: "Kore")
+
+**Available Voices:**
+
+| Voice | Style | Voice | Style | Voice | Style |
+|-------|-------|-------|-------|-------|-------|
+| Zephyr | Bright | Puck | Upbeat | Charon | Informative |
+| Kore | Firm | Fenrir | Excitable | Leda | Youthful |
+| Orus | Firm | Aoede | Breezy | Callirrhoe | Easy-going |
+| Autonoe | Bright | Enceladus | Breathy | Iapetus | Clear |
+| Umbriel | Easy-going | Algieba | Smooth | Despina | Smooth |
+| Erinome | Clear | Algenib | Gravelly | Rasalgethi | Informative |
+| Laomedeia | Upbeat | Achernar | Soft | Alnilam | Firm |
+| Schedar | Even | Gacrux | Mature | Pulcherrima | Forward |
+| Achird | Friendly | Zubenelgenubi | Casual | Vindemiatrix | Gentle |
+| Sadachbia | Lively | Sadaltager | Knowledgeable | Sulafat | Warm |
 
 **Example:**
 ```json
 {
   "text": "Hello, this is a test of the Gemini text to speech system.",
-  "model": "gemini-2.0-flash-exp"
+  "model": "gemini-2.5-flash-preview-tts",
+  "voice": "Kore"
 }
 ```
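
Note on "Output is saved as WAV format": the diff does not show how the audio is written. One plausible approach, sketched here with Python's standard `wave` module, is to wrap raw PCM samples in a WAV container. The function name `pcm_to_wav_bytes` is hypothetical, and the audio parameters (16-bit mono PCM at 24 kHz) are an assumption about what the TTS model returns, not something this commit confirms.

```python
import io
import wave


def pcm_to_wav_bytes(
    pcm: bytes,
    sample_rate: int = 24000,  # assumed TTS output rate
    channels: int = 1,         # assumed mono
    sample_width: int = 2,     # assumed 16-bit samples (2 bytes each)
) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    # wave does not close a file object it did not open, so the buffer is still readable.
    return buffer.getvalue()
```

If the real sample rate or width differs, only the keyword defaults would need to change.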

pyproject.toml

Lines changed: 7 additions & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "gemini-gen-mcp"
-version = "0.1.0"
+version = "0.0.1"
 description = "MCP Server for Gemini Image and Audio generation"
 readme = "README.md"
 requires-python = ">=3.10"
@@ -43,3 +43,9 @@ include = [
     "/README.md",
     "/LICENSE",
 ]
+
+[dependency-groups]
+dev = [
+    "pytest>=9.0.2",
+    "pytest-asyncio>=1.3.0",
+]
