πŸ–ΌοΈ A Kaggle Package project for converting natural language prompts into precise SVG code using Python.

Harsh-BH/text-to-svg-genie

Text-to-SVG Genie

A machine learning solution for the Kaggle "Text-to-SVG Generation" competition that converts text descriptions into high-quality SVG images.

🚀 Overview

Text-to-SVG Genie is an AI system that generates Scalable Vector Graphics (SVG) code from text descriptions. Given a text prompt describing an image, the model generates SVG code that renders the described scene as accurately as possible.

This project was developed for a Kaggle competition that challenges participants to build specialized solutions that outperform general-purpose LLMs at generating image-rendering code while making the generation process more transparent.

📋 Features

  • Transform text descriptions into SVG code
  • Ensure compliance with competition constraints
  • Generate aesthetically pleasing vector images
  • Optimize for both visual fidelity and description faithfulness

🔧 Installation

Clone the repository and install the required dependencies:

git clone https://github.com/Harsh-BH/text-to-svg-genie.git
cd text-to-svg-genie
pip install -r requirements.txt

📦 Dependencies

The project relies on the following key libraries:

  • kagglehub - For Kaggle package integration
  • numpy/pandas - For data handling
  • scipy/scikit-learn - For computational tasks
  • PyTorch - For deep learning models
  • Matplotlib - For visualization

🎯 Competition Constraints

Our model adheres to the following competition requirements:

  • Generated SVGs are less than 10,000 bytes
  • Only allowlisted SVG elements and attributes are used
  • No rasterized image data or external sources
  • No CSS style elements
  • SVG generation completes within 5 minutes per prompt
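
The static constraints above can be checked before an SVG is submitted. The following is a minimal sketch, not the competition's own validator: the `ALLOWED_ELEMENTS` set and the `check_svg` helper are hypothetical names introduced here for illustration, and the real allowlist is the one published by the competition.

```python
import xml.etree.ElementTree as ET

MAX_SVG_BYTES = 10_000  # size limit stated in the constraints above

# Hypothetical allowlist for illustration only; the competition defines the real one.
ALLOWED_ELEMENTS = {"svg", "g", "rect", "circle", "ellipse", "line",
                    "polyline", "polygon", "path", "text", "defs",
                    "linearGradient", "radialGradient", "stop"}


def local_name(tag: str) -> str:
    """Strip an XML namespace such as '{http://www.w3.org/2000/svg}rect'."""
    return tag.rsplit("}", 1)[-1]


def check_svg(svg_code: str) -> list[str]:
    """Return a list of constraint violations (empty if none were found)."""
    problems = []
    if len(svg_code.encode("utf-8")) >= MAX_SVG_BYTES:
        problems.append("SVG is not under 10,000 bytes")
    try:
        root = ET.fromstring(svg_code)
    except ET.ParseError as exc:
        return problems + [f"not well-formed XML: {exc}"]
    for element in root.iter():
        name = local_name(element.tag)
        # <style> and <image> are absent from the allowlist, so they are flagged here.
        if name not in ALLOWED_ELEMENTS:
            problems.append(f"element <{name}> is not in the allowlist")
        for attr, value in element.attrib.items():
            # Only same-document references such as "#gradient1" are accepted.
            if local_name(attr) == "href" and not value.startswith("#"):
                problems.append(f"<{name}> references an external source")
    return problems
```

An empty list from `check_svg(svg_code)` means none of these particular checks failed; attribute-level allowlisting and the time limit are not covered by this sketch.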

📊 Evaluation Metrics

The model is optimized for the SVG Image Fidelity Score, which combines:

  1. VQA task results using the PaliGemma model
  2. OCR text detection (with penalties for excess text)
  3. CLIP-based Aesthetic Score
  4. Final score: harmonic mean of VQA and Aesthetic scores
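
To show why the harmonic mean rewards balanced outputs, here is a small sketch of the plain, unweighted form. The function name `harmonic_mean` and the example scores are assumptions for illustration; the official metric may apply weighting and the OCR penalty on top of this.

```python
def harmonic_mean(vqa_score: float, aesthetic_score: float) -> float:
    """Unweighted harmonic mean of two scores in [0, 1]."""
    # A zero in either component drives the combined score to zero.
    if vqa_score <= 0 or aesthetic_score <= 0:
        return 0.0
    return 2 * vqa_score * aesthetic_score / (vqa_score + aesthetic_score)
```

With scores of 0.8 and 0.4, the result is about 0.53, which is below the arithmetic mean of 0.6: the weaker component pulls the combined score down, so improving the lower of the two scores pays off most.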

💻 Usage

Using the Model

# Import the model
from text_to_svg_genie.model import Model

# Initialize the model
model = Model()

# Generate SVG from a text description
text_prompt = "A red apple sitting on a wooden table"
svg_code = model.predict(text_prompt)

# Save the SVG to a file
with open("apple.svg", "w", encoding="utf-8") as f:
    f.write(svg_code)

Testing with Sample Prompts

Run the demo script to test the model with sample prompts:

python demo.py

🧠 Technical Approach

Our approach combines:

  1. Text Understanding: Extracting key visual elements from descriptions
  2. Scene Composition: Determining optimal layout and element relationships
  3. SVG Generation: Creating vector elements that best represent the described scene
  4. Constraint Optimization: Ensuring outputs meet competition requirements
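
The four stages above can be illustrated with a toy, rule-based pipeline. This is only a sketch under strong assumptions and not the project's model: the `COLORS` and `SHAPES` vocabularies and every function name here are hypothetical, and a real system would replace the keyword lookup with a learned model.

```python
# Hypothetical vocabularies for illustration only.
COLORS = {"red", "green", "blue", "yellow", "brown", "black", "white"}
SHAPES = {"apple": "circle", "ball": "circle", "sun": "circle",
          "table": "rect", "box": "rect", "house": "rect"}


def extract_elements(prompt: str) -> list[tuple[str, str]]:
    """Step 1: pair each known object word with the most recent color word."""
    found, color = [], "gray"
    for word in prompt.lower().replace(",", " ").split():
        if word in COLORS:
            color = word
        elif word in SHAPES:
            found.append((SHAPES[word], color))
            color = "gray"  # reset so a color applies to one object only
    return found


def compose(elements, size=256):
    """Step 2: stack elements in horizontal bands, first mentioned on top."""
    band = size / max(len(elements), 1)
    return [(shape, color, size / 2, band * (i + 0.5), band * 0.4)
            for i, (shape, color) in enumerate(elements)]


def render(layout, size=256):
    """Step 3: emit one SVG primitive per placed element."""
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">']
    for shape, color, cx, cy, r in layout:
        if shape == "circle":
            parts.append(f'<circle cx="{cx:.0f}" cy="{cy:.0f}" r="{r:.0f}" fill="{color}"/>')
        else:
            parts.append(f'<rect x="{cx - r:.0f}" y="{cy - r:.0f}" '
                         f'width="{2 * r:.0f}" height="{2 * r:.0f}" fill="{color}"/>')
    parts.append("</svg>")
    return "".join(parts)


def generate_svg(prompt: str, max_bytes: int = 10_000) -> str:
    """Step 4: run the pipeline and fall back to a blank canvas if too large."""
    svg = render(compose(extract_elements(prompt)))
    if len(svg.encode("utf-8")) >= max_bytes:
        svg = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 256 256"></svg>'
    return svg
```

For the prompt "A red apple sitting on a wooden table", this sketch produces a red circle above a gray rectangle. Keeping each stage as a separate function makes it possible to swap in a stronger component for one stage without touching the others.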

📂 Project Structure

text-to-svg-genie/
├── README.md              # Project documentation
├── requirements.txt       # Dependencies
├── .gitignore             # Git ignore file
├── text_to_svg_genie/     # Main package directory
│   ├── __init__.py        # Package initialization
│   ├── model.py           # Model implementation
│   ├── svg_generator.py   # SVG generation utilities
│   └── utils.py           # Helper functions
├── examples/              # Example SVGs and outputs
├── demo.py                # Demonstration script
└── tests/                 # Test suite

📈 Performance

Our model aims to:

  • Generate accurate representations of described scenes
  • Create aesthetically pleasing vector graphics
  • Optimize for the competition evaluation metrics
  • Complete generation within required time constraints
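
One simple way to keep an eye on the time constraint is to measure each call. The helper below is a hedged sketch: `timed_predict` is a hypothetical name, and it accepts any callable that maps a prompt to SVG code, such as the `predict` method shown in the usage section.

```python
import time
from typing import Callable

TIME_LIMIT_SECONDS = 5 * 60  # per-prompt limit from the competition constraints


def timed_predict(predict: Callable[[str], str], prompt: str) -> tuple[str, float]:
    """Call predict(prompt) and return its output with the elapsed seconds."""
    start = time.perf_counter()
    svg_code = predict(prompt)
    elapsed = time.perf_counter() - start
    if elapsed > TIME_LIMIT_SECONDS:
        print(f"warning: {elapsed:.1f}s exceeds the {TIME_LIMIT_SECONDS}s limit")
    return svg_code, elapsed
```

Note that this only measures the call after it returns; it does not interrupt a generation that runs too long.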

📝 License

MIT License
