A native AI PPT generation application based on nano banana pro🍌
Go from idea to presentation in minutes: no tedious formatting, edits made by plain-language instructions, a step toward a true "Vibe PPT"
🚀 Online Demo • 📚 Documentation • English
If this project is helpful to you, please consider giving it a Star 🌟 & Fork 🍴
Have you ever found yourself in this predicament: the presentation is due tomorrow, but your PPT is still a blank slate; you have countless brilliant ideas in your head, but all your enthusiasm is drained by tedious layout and design?
We long to quickly create presentations that are both professional and well-designed. While traditional AI PPT generation apps generally meet the "speed" requirement, they still suffer from the following issues:
- 1️⃣ Limited to preset templates, with no flexibility to adjust styles
- 2️⃣ Low degree of freedom, making it difficult to perform multiple rounds of revisions
- 3️⃣ Similar visual results with severe homogenization
- 4️⃣ Low-quality assets that lack relevance to the content
- 5️⃣ Disjointed text-image layouts and poor design sense
These deficiencies make it difficult for traditional AI PPT generators to simultaneously satisfy our two major needs: "speed" and "beauty." Even those claiming to be "Vibe PPT" are, in my eyes, still far from being truly "Vibe."
However, the emergence of the nano banana🍌 model has changed everything. I tried using 🍌pro to generate PPT pages and found that the results were excellent in terms of quality, aesthetics, and consistency. It can accurately render almost all the text requested in the prompt while following the style of reference images. So, why not build a native "Vibe PPT" application based on 🍌pro?
- Beginners: Quickly generate beautiful PPTs with zero barrier to entry and no design experience required, reducing the hassle of choosing templates.
- PPT Professionals: Reference AI-generated layouts and combinations of graphics and text to quickly gain design inspiration.
- Educators: Quickly convert teaching content into illustrated lesson plan PPTs to enhance classroom effectiveness.
- Students: Quickly complete class presentations and focus energy on content rather than layout and beautification.
- Business Professionals: Quickly visualize business proposals and product introductions with fast adaptation to multiple scenarios.
🎯 Goal: Lower the barrier to entry for PPT creation, empowering everyone to quickly create beautiful and professional presentations.
See more Use Cases
Supports three starting modes: Idea, Outline, and Page Description, catering to different creative habits.
- One-sentence Generation: Enter a topic, and AI automatically generates a well-structured outline and page-by-page content descriptions.
- Natural Language Editing: Supports modifying the outline or description through conversational "Vibes" (e.g., "change the third page to a case study"), with AI responding and adjusting in real-time.
- Outline/Description Mode: Supports both one-click batch generation and manual adjustment of details.
- Multi-format Support: Upload PDF, Docx, MD, Txt, and other files for automatic background content parsing.
- Intelligent Extraction: Automatically identify key points, image links, and chart information in the text to provide rich source material for generation.
- Style Reference: Support uploading reference images or templates to customize PPT styles.
No longer limited by complex menu buttons: issue modification commands directly in natural language.
- In-painting: Perform conversational modifications on specific areas (e.g., "Change this chart to a pie chart").
- Full-page Optimization: Generate high-definition, stylistically consistent pages based on nano banana pro🍌.
- Multi-format Support: One-click export to standard PPTX or PDF files.
- Perfect Fit: Default 16:9 aspect ratio, no manual layout adjustments needed, ready for presentation.
- Export images as high-fidelity, clean-background PPT pages with freely editable images and text
- See Anionex#121 for related updates
🌟 Comparison with notebooklm slide deck features
| Feature | notebooklm | This Project |
|---|---|---|
| Page Limit | 15 pages | Unlimited |
| Secondary Editing | Prompt-based modification | Selection-based editing + Verbal editing |
| Asset Addition | Cannot add after generation | Add freely after generation |
| Export Formats | Supports PDF, (non-editable image) pptx | Export as PDF, (image or editable) pptx |
| Watermark | Watermarked in free version | No watermark, freely add/remove elements |
Note: This comparison may become outdated as new features are added
- 【2-9】:
  - New Features
    - Support for pasting images on the home page, outline, and description cards for immediate recognition, providing a better interactive experience.
    - Manual Outline Editing: Supports manually adjusting the chapter (part) a page belongs to.
    - Docker Multi-architecture: Images now support amd64 / arm64 builds.
    - Internationalization + Dark Mode: Added Chinese-English switching; supports light/dark/follow-system themes; all components adapted for dark mode.
  - Fixes and Experience Optimizations
    - Fixed export-related 500 errors, reference file association timing, outline/page data misalignment, task polling errors, infinite polling in description generation, image preview memory leaks, and partial failure handling in batch deletion.
    - Optimized format example tips, HTTP error message copy, Modal closing experience, cleaned up old project localStorage, and removed redundant prompts during initial project creation.
    - Various other optimizations and fixes.
- 【1-4】: v0.4.0 Release: Major upgrade for editable pptx export:
  - Supports maximum restoration of text font size, color, bolding, and other styles from images;
  - Added support for recognizing text content within tables;
  - More precise logic for text size and position restoration;
  - Optimized export workflow, significantly reducing residual text on background images after export;
  - Supports page multi-select logic, allowing flexible selection of specific pages to generate and export.
  - Detailed effects and usage can be found at Anionex#121
- 【12-27】: Added support for a template-free mode and high-quality text presets; you can now control PPT page styles through pure text descriptions.
| Status | Milestone |
|---|---|
| ✅ Completed | Create PPT via three paths: idea, outline, and page description |
| ✅ Completed | Parse Markdown-formatted images in text |
| ✅ Completed | Add more assets to single PPT slides |
| ✅ Completed | Area selection and Vibe-style verbal editing for single slides |
| ✅ Completed | Asset module: asset generation, upload, etc. |
| ✅ Completed | Support for multi-file upload and parsing |
| ✅ Completed | Support Vibe-style verbal adjustments for outlines and descriptions |
| ✅ Completed | Preliminary support for exporting editable .pptx files |
| 🔄 In Progress | Support multi-layered, precise image cutout in editable .pptx exports |
| 🔄 In Progress | Web search |
| 🔄 In Progress | Agent mode |
| 🚍 Partial | Optimize frontend loading speed |
| 🧭 Planned | Online presentation/playback feature |
| 🧭 Planned | Simple animations and slide transitions |
| 🚍 Partial | Multi-language support |
| 🏢 Commercial Feature | User system |
This is the simplest method, requiring no Docker installation or project downloading. You can access the application directly after creation.
- Deploy and start this application with one click via Rainyun (High bandwidth, suitable for HD image generation and downloading. New users enjoy a 15-day free trial).
- Coming soon
Quickly start frontend and backend services using Docker Compose.
📒 Instructions for Windows/Mac Users
If you are using Windows or macOS, please install Docker Desktop first and ensure that Docker is running (check the system tray icon on Windows or the menu bar icon on macOS), then follow the same steps as in the documentation.
Tip: If you encounter issues, Windows users should enable the WSL 2 backend in Docker Desktop settings (recommended); also ensure that ports 3000 and 5000 are not occupied.
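The port check mentioned in the tip can be sketched in a few lines of Python. This is only an illustrative helper, not part of the project: the function name `port_is_free` is hypothetical, and it assumes the services are published on the local loopback address.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    # 3000 (frontend) and 5000 (backend) are the ports named in the tip above.
    for port in (3000, 5000):
        state = "free" if port_is_free(port) else "in use"
        print(f"port {port}: {state}")
```

If either port is reported as in use, stop the process that holds it before starting the containers.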
- Clone the Repository

```bash
git clone https://github.com/Anionex/banana-slides
cd banana-slides
```

- Configure Environment Variables

Create the `.env` file (refer to `.env.example`):

```bash
cp .env.example .env
```

Edit the `.env` file and configure the necessary environment variables:
The LLM API in this project follows the AIHubMix platform format. It is recommended to use AIHubMix (Click here to visit) to obtain an API key to reduce migration costs.
Friendly Reminder: The Google Nano Banana Pro model API costs are relatively high, please be mindful of usage costs.
```env
# AI Provider Configuration Format (gemini / openai / vertex)
AI_PROVIDER_FORMAT=gemini

# Gemini Format Configuration (Used when AI_PROVIDER_FORMAT=gemini)
GOOGLE_API_KEY=your-api-key-here
GOOGLE_API_BASE=https://generativelanguage.googleapis.com
# Proxy Example: https://aihubmix.com/gemini

# OpenAI Format Configuration (Used when AI_PROVIDER_FORMAT=openai)
OPENAI_API_KEY=your-api-key-here
OPENAI_API_BASE=https://api.openai.com/v1
# Proxy Example: https://aihubmix.com/v1

# Vertex AI Configuration (AI_PROVIDER_FORMAT=vertex)
# GCP Project and Service Account Key Required
# VERTEX_PROJECT_ID=your-gcp-project-id
# VERTEX_LOCATION=global
# GOOGLE_APPLICATION_CREDENTIALS=./gcp-service-account.json

# Lazyllm Format Configuration (Used when AI_PROVIDER_FORMAT=lazyllm)
# Select vendors for text generation and image generation
TEXT_MODEL_SOURCE=deepseek # Text generation model provider
IMAGE_MODEL_SOURCE=doubao # Image editing model provider
IMAGE_CAPTION_MODEL_SOURCE=qwen # Image captioning model provider

# API Keys for Various Providers (Only configure the ones you want to use)
DOUBAO_API_KEY=your-doubao-api-key # Volcengine/Doubao
DEEPSEEK_API_KEY=your-deepseek-api-key # DeepSeek
QWEN_API_KEY=your-qwen-api-key # Alibaba Cloud/Qwen
GLM_API_KEY=your-glm-api-key # Zhipu GLM
SILICONFLOW_API_KEY=your-siliconflow-api-key # SiliconFlow
SENSENOVA_API_KEY=your-sensenova-api-key # SenseTime SenseNova
MINIMAX_API_KEY=your-minimax-api-key # MiniMax
...
```

Use the new version of the editable export configuration method to achieve better editable export results: You need to obtain an API KEY from the Baidu AI Cloud Platform (click here to enter) and fill it in the BAIDU_API_KEY field in the .env file (there is a sufficient free usage quota). For details, see the instructions in Anionex#121.
📒 Vertex AI Configuration Guide (for GCP users)
Google Cloud Vertex AI allows calling Gemini models through GCP service accounts; new users can use promotional credits. Configuration steps:
- Go to the GCP Console, create a service account, and download the JSON format key file.
- Save the key file as `gcp-service-account.json` in the project root directory.
- Set in `.env`:

```env
AI_PROVIDER_FORMAT=vertex
VERTEX_PROJECT_ID=your-gcp-project-id
VERTEX_LOCATION=global
```

- If deploying with Docker, you also need to uncomment relevant sections in `docker-compose.yml` to mount the key file into the container and set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.

The `gemini-3-*` series models require `VERTEX_LOCATION=global`.
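Before starting the services, it can help to confirm that the `.env` file contains the keys that match the chosen `AI_PROVIDER_FORMAT`. The sketch below is a hypothetical pre-flight check, not the project's own loader. The helper names `parse_env` and `missing_keys` are made up for illustration, the per-provider key lists are inferred from the template above, and falling back to `gemini` when the variable is absent is an assumption. The `lazyllm` format is left out because its keys depend on which vendors are selected.

```python
# Keys each provider format appears to need, inferred from the .env template.
# Assumption: this table is illustrative and may not match the backend's rules.
REQUIRED_KEYS = {
    "gemini": ["GOOGLE_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
    "vertex": [
        "VERTEX_PROJECT_ID",
        "VERTEX_LOCATION",
        "GOOGLE_APPLICATION_CREDENTIALS",
    ],
}


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and commented-out lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "deepseek # Text model".
        values[key.strip()] = value.split(" #", 1)[0].strip()
    return values


def missing_keys(env: dict) -> list:
    """List required keys that are absent or still hold a template placeholder."""
    provider = env.get("AI_PROVIDER_FORMAT", "gemini")  # assumed default
    required = REQUIRED_KEYS.get(provider)
    if required is None:
        raise ValueError(f"no rule defined for provider format: {provider}")
    return [
        key for key in required
        if not env.get(key) or env[key].startswith("your-")
    ]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        print(missing_keys(parse_env(handle.read())))
```

An empty list means every key the table expects for the selected format has a non-placeholder value.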
- Start Service
⚡ Use Pre-built Images (Recommended)
The project provides pre-built frontend and backend images on Docker Hub (synced with the latest version of the main branch), allowing you to skip local build steps for rapid deployment:
```bash
# Launch with pre-built images (no need to build from scratch)
docker compose -f docker-compose.prod.yml up -d
```

Image names:

- `anoinex/banana-slides-frontend:latest`
- `anoinex/banana-slides-backend:latest`

Build images from scratch

```bash
docker compose up -d
```

Tip
If you encounter network issues, you can uncomment the mirror source configuration in the .env file and then rerun the startup command:
```env
# Uncomment the following in the .env file to use domestic mirror sources
DOCKER_REGISTRY=docker.1ms.run/
GHCR_REGISTRY=ghcr.nju.edu.cn/
APT_MIRROR=mirrors.aliyun.com
PYPI_INDEX_URL=https://mirrors.cloud.tencent.com/pypi/simple
NPM_REGISTRY=https://registry.npmmirror.com/
```

- Access the Application
- Frontend: http://localhost:3000
- Backend API: http://localhost:5000
- View Logs

```bash
# View backend logs (last 200 lines)
docker logs --tail 200 banana-slides-backend

# Follow backend logs in real time (starting from the last 100 lines)
docker logs -f --tail 100 banana-slides-backend

# View frontend logs (last 100 lines)
docker logs --tail 100 banana-slides-frontend
```

- Stop Services

```bash
docker compose down
```

- Update Project
Using Pre-built Images (docker-compose.prod.yml)

```bash
docker compose -f docker-compose.prod.yml pull
docker compose -f docker-compose.prod.yml up -d
```

Using Local Build (docker-compose.yml)

Note: This method does not apply if you have manually modified the code. First revert the code to the version you originally pulled.

```bash
git pull
docker compose down
docker compose build --no-cache
docker compose up -d
```

Note: Thanks to developer @ShellMonster for providing a Newbie Deployment Tutorial, designed for beginners with no server deployment experience. You can click the link to view it.
- Python 3.10 or higher
- uv - Python package manager
- Node.js 16+ and npm
- A valid Google Gemini API key
- (Optional) LibreOffice: required only when uploading PPTX files through the "PPT Refurbishment" feature, where it converts PPTX to PDF. Converting PPTX to PDF locally before uploading is recommended, because server-side rendering by LibreOffice may misalign layouts when fonts are missing (e.g., Microsoft YaHei, Calibri) and cannot fully restore some special effects. LibreOffice is not required if you upload PDF files. For Docker users who still need to support PPTX uploads within the container, run:

```bash
docker exec -it banana-slides-backend bash -c "apt-get update && apt-get install -y libreoffice-impress && rm -rf /var/lib/apt/lists/*"
```

Note: LibreOffice installed this way is lost when the container is rebuilt and must be reinstalled.
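For the recommended local PPTX-to-PDF conversion, one option is LibreOffice's headless mode on a machine that has the deck's fonts installed. The sketch below is not taken from the project's backend. The function names are hypothetical, and it assumes a `soffice` (or `libreoffice`) executable is on `PATH`.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(pptx_path, out_dir, soffice="soffice"):
    """Build the LibreOffice headless command that converts one file to PDF."""
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(pptx_path),
    ]


def convert_pptx_to_pdf(pptx_path, out_dir):
    """Run the conversion and return the path where the PDF is expected."""
    soffice = shutil.which("soffice") or shutil.which("libreoffice")
    if soffice is None:
        raise FileNotFoundError("LibreOffice (soffice) was not found on PATH")
    subprocess.run(build_convert_command(pptx_path, out_dir, soffice), check=True)
    # LibreOffice names the output after the input file's stem.
    return Path(out_dir) / (Path(pptx_path).stem + ".pdf")
```

Calling `convert_pptx_to_pdf("deck.pptx", "out")` should leave the PDF at `out/deck.pdf`, which can then be uploaded instead of the PPTX.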
- Clone the repository

```bash
git clone https://github.com/Anionex/banana-slides
cd banana-slides
```

- Install uv (if not already installed)

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

- Install dependencies

Run the following command in the project root directory:

```bash
uv sync
```

This will automatically install all dependencies based on pyproject.toml.

- Configure environment variables

Copy the environment variable template:

```bash
cp .env.example .env
```

Edit the .env file and configure your API keys:
The LLM interfaces in this project follow the AIHubMix platform standard. It is recommended to use AIHubMix to obtain API keys to minimize migration costs.
```env
# AI Provider Format Configuration (gemini / openai / vertex)
AI_PROVIDER_FORMAT=gemini

# Gemini Format Configuration (Used when AI_PROVIDER_FORMAT=gemini)
GOOGLE_API_KEY=your-api-key-here
GOOGLE_API_BASE=https://generativelanguage.googleapis.com
# Proxy Example: https://aihubmix.com/gemini

# OpenAI Format Configuration (Used when AI_PROVIDER_FORMAT=openai)
OPENAI_API_KEY=your-api-key-here
OPENAI_API_BASE=https://api.openai.com/v1
# Proxy Example: https://aihubmix.com/v1

# Vertex AI Configuration (AI_PROVIDER_FORMAT=vertex)
# Requires GCP Project and Service Account Key
# VERTEX_PROJECT_ID=your-gcp-project-id
# VERTEX_LOCATION=global
# GOOGLE_APPLICATION_CREDENTIALS=./gcp-service-account.json

# Modify this variable to control the backend service port
BACKEND_PORT=5000
...
```

- Navigate to the frontend directory

```bash
cd frontend
```

- Install dependencies

```bash
npm install
```

- Configure API address
The frontend will automatically connect to the backend service at http://localhost:5000. If you need to modify this, please edit src/api/client.ts.
(Optional) If you have important local data, it is recommended to back up the database before upgrading:
```bash
cp backend/instance/database.db backend/instance/database.db.bak
```

```bash
cd backend
uv run alembic upgrade head && uv run python app.py
```

The backend service will start at http://localhost:5000.

Visit http://localhost:5000/health to verify if the service is running correctly.
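The same health check can be scripted, for example in a start-up script that waits for the backend. This is a minimal sketch using only the Python standard library. It assumes the endpoint answers with HTTP 200 when the service is healthy, since the response body format is not documented here. The function name and the injectable `opener` parameter are illustrative.

```python
import urllib.error
import urllib.request


def backend_is_healthy(
    url="http://localhost:5000/health",
    timeout=3.0,
    opener=urllib.request.urlopen,
):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return False


if __name__ == "__main__":
    print("backend healthy:", backend_is_healthy())
```

Passing a different `opener` keeps the function testable without a running server.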
```bash
cd frontend
npm run dev
```

The frontend development server will start at http://localhost:3000.

Open your browser to access and use the application.
- Framework: React 18 + TypeScript
- Build Tool: Vite 5
- State Management: Zustand
- Routing: React Router v6
- UI Components: Tailwind CSS
- Drag and Drop: @dnd-kit
- Icons: Lucide React
- HTTP Client: Axios
- Language: Python 3.10+
- Framework: Flask 3.0
- Package Management: uv
- Database: SQLite + Flask-SQLAlchemy
- AI Capabilities: Google Gemini API
- PPT Processing: python-pptx
- Image Processing: Pillow
- Concurrency Handling: ThreadPoolExecutor
- CORS Support: Flask-CORS
banana-slides/
├── frontend/ # React frontend application
│ ├── src/
│ │ ├── pages/ # Page components
│ │ │ ├── Home.tsx # Home (Create Project)
│ │ │ ├── OutlineEditor.tsx # Outline editing page
│ │ │ ├── DetailEditor.tsx # Detailed description editing page
│ │ │ ├── SlidePreview.tsx # Slide preview page
│ │ │ └── History.tsx # History version management page
│ │ ├── components/ # UI components
│ │ │ ├── outline/ # Outline-related components
│ │ │ │ └── OutlineCard.tsx
│ │ │ ├── preview/ # Preview-related components
│ │ │ │ ├── SlideCard.tsx
│ │ │ │ └── DescriptionCard.tsx
│ │ │ ├── shared/ # Shared components
│ │ │ │ ├── Button.tsx
│ │ │ │ ├── Card.tsx
│ │ │ │ ├── Input.tsx
│ │ │ │ ├── Textarea.tsx
│ │ │ │ ├── Modal.tsx
│ │ │ │ ├── Loading.tsx
│ │ │ │ ├── Toast.tsx
│ │ │ │ ├── Markdown.tsx
│ │ │ │ ├── MaterialSelector.tsx
│ │ │ │ ├── MaterialGeneratorModal.tsx
│ │ │ │ ├── TemplateSelector.tsx
│ │ │ │ ├── ReferenceFileSelector.tsx
│ │ │ │ └── ...
│ │ │ ├── layout/ # Layout components
│ │ │ └── history/ # History version components
│ │ ├── store/ # Zustand state management
│ │ │ └── useProjectStore.ts
│ │ ├── api/ # API interfaces
│ │ │ ├── client.ts # Axios client configuration
│ │ │ └── endpoints.ts # API endpoint definitions
│ │ ├── types/ # TypeScript type definitions
│ │ ├── utils/ # Utility functions
│ │ ├── constants/ # Constant definitions
│ │ └── styles/ # Style files
│ ├── public/ # Static resources
│ ├── package.json
│ ├── vite.config.ts
│ ├── tailwind.config.js # Tailwind CSS configuration
│ ├── Dockerfile
│ └── nginx.conf # Nginx configuration
│
├── backend/ # Flask backend application
│ ├── app.py # Flask application entry point
│ ├── config.py # Configuration file
│ ├── models/ # Database models
│ │ ├── project.py # Project model
│ │ ├── page.py # Page model (slide pages)
│ │ ├── task.py # Task model (asynchronous tasks)
│ │ ├── material.py # Material model (reference materials)
│ │ ├── user_template.py # UserTemplate model (user templates)
│ │ ├── reference_file.py # ReferenceFile model (reference files)
│ │ ├── page_image_version.py # PageImageVersion model (page versions)
│ ├── services/ # Service layer
│ │ ├── ai_service.py # AI generation service (Gemini integration)
│ │ ├── file_service.py # File management service
│ │ ├── file_parser_service.py # File parsing service
│ │ ├── export_service.py # PPTX/PDF export service
│ │ ├── task_manager.py # Asynchronous task management
│ │ ├── prompts.py # AI prompt templates
│ ├── controllers/ # API controllers
│ │ ├── project_controller.py # Project management
│ │ ├── page_controller.py # Page management
│ │ ├── material_controller.py # Material management
│ │ ├── template_controller.py # Template management
│ │ ├── reference_file_controller.py # Reference file management
│ │ ├── export_controller.py # Export functionality
│ │ └── file_controller.py # File upload
│ ├── utils/ # Utility functions
│ │ ├── response.py # Unified response format
│ │ ├── validators.py # Data validation
│ │ └── path_utils.py # Path handling
│ ├── instance/ # SQLite database (auto-generated)
│ ├── exports/ # Export files directory
│ ├── Dockerfile
│ └── README.md
│
├── tests/ # Test files directory
├── v0_demo/ # Early demo version
├── output/ # Output files directory
│
├── pyproject.toml # Python project configuration (uv management)
├── uv.lock # uv dependency lock file
├── docker-compose.yml # Docker Compose configuration
├── .env.example # Environment variable example
├── LICENSE # License
└── README.md # This file
To facilitate communication and mutual assistance, this WeChat group has been created.
Suggestions for new features and other feedback are welcome. I will also answer questions there as time permits.
- Generated page text is garbled or blurry
  - You can choose a higher resolution output (the OpenAI format may not support resolution adjustments; using the Gemini format is recommended). Based on testing, increasing the resolution from 1k to 2k before generating the page significantly improves text rendering quality.
  - Please ensure that the specific text content to be rendered is included in the page description.
- Poor results when exporting editable PPT, such as overlapping text or missing styles
  - In 90% of cases, this is due to API configuration issues. Please refer to issue 121 for troubleshooting and solutions.
- Does it support the free-tier Gemini API Key?
  - The free tier only supports text generation and does not support image generation.
- 503 Error or Retry Error prompted during content generation
  - You can check the Docker backend logs using the commands in the README to locate the detailed error for the 503 issue. This is generally caused by incorrect model configuration.
- Why does the API Key set in .env not take effect?
  - After editing the `.env` file during runtime, you need to restart the Docker container to apply the changes.
  - If parameters were previously configured on the web settings page, they will override the values in `.env`. You can revert to the `.env` settings by selecting "Restore Default Settings".
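The override behavior described in the last answer can be pictured as a two-level lookup. The sketch below only illustrates that precedence rule and is not the project's actual settings code. The function name and the dictionary shapes are assumptions.

```python
def effective_setting(key, env_values, web_overrides):
    """Resolve a setting: a value saved on the web settings page wins over .env."""
    override = web_overrides.get(key)
    if override not in (None, ""):
        return override
    return env_values.get(key)


env_values = {"GOOGLE_API_KEY": "key-from-env"}

# A value saved on the settings page shadows the .env entry.
print(effective_setting("GOOGLE_API_KEY", env_values, {"GOOGLE_API_KEY": "key-from-web"}))

# With no override stored (as after "Restore Default Settings"), .env applies again.
print(effective_setting("GOOGLE_API_KEY", env_values, {}))
```

This is why a corrected key in `.env` can appear to be ignored until the stored web setting is cleared.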
Contributions to this project via Issues and Pull Requests are welcome!
Important: Please read CONTRIBUTING.md before contributing.
This project is open-sourced under the GNU Affero General Public License v3.0 (AGPL-3.0) and can be freely used for non-commercial purposes such as personal learning, research, experimentation, education, or non-profit scientific research.
A Commercial License is required for commercial use (e.g., closed-source use, private deployment and delivery, integrating this project into closed-source products, or providing services without disclosing the corresponding source code). Please contact the author.

- Contact: anionex@qq.com

Thanks to AI Fire for sponsoring this project

"Aggregating global multi-model API service providers. Enjoy secure, stable, and 24/7 access to the world's latest models at lower prices."

- Project Contributors:
- Linux.do: A new ideal community
Open source is not easy 🙏 If this project is valuable to you, feel free to buy the developer a coffee ☕️
Thanks to the following friends for their voluntary sponsorship and support:
@雅俗共赏, @曹峥, @以年观日, @John, @胡yun星Ethan, @azazo1, @刘聪NLP, @🍟, @苍何, @万瑾, @biubiu, @law, @方源, @寒松Falcon

If you have any questions about the sponsorship list, please contact the author.



