🚗 Synthetic Image Generation for Autonomous Vehicle Scenarios

This project provides an end-to-end solution for generating synthetic images of autonomous vehicle (AV) scenarios using a Retrieval-Augmented Generation (RAG) pipeline. It leverages scenario metadata, LangChain for retrieval, FAISS for similarity search, and Stable Diffusion for high-quality image generation.

You provide a text description of a driving scenario. The system retrieves related prompts from the scenario dataset and generates a high-resolution synthetic image from your input combined with the retrieved prompts.

query = "rural road sunny day"
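To make the retrieve-then-augment idea concrete, here is a toy sketch that needs no external libraries. It swaps in a bag-of-words cosine similarity for the FAISS embedding search the project uses, and the names `retrieve`, `augment`, and `SCENARIOS` are illustrative assumptions, not functions from this repository.

```python
import math
import re
from collections import Counter

# Sample descriptions standing in for data/scenario_prompts.csv (assumption).
SCENARIOS = [
    "Rainy night in an urban area with busy pedestrian crossings and low visibility.",
    "Sunny day in a suburban area with moderate traffic and clear skies.",
    "Foggy morning on a rural road with low visibility and sparse traffic.",
]


def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, scenarios, k=2):
    """Return the k scenarios most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(scenarios, key=lambda s: cosine(q, bag_of_words(s)), reverse=True)
    return ranked[:k]


def augment(query, retrieved):
    """Append the retrieved descriptions to the user query."""
    return query + ". Context: " + " ".join(retrieved)


query = "rural road sunny day"
prompt = augment(query, retrieve(query, SCENARIOS, k=2))
print(prompt)
```

The augmented string is what would be handed to the image generator. A real embedding model captures meaning rather than exact word overlap, so its rankings may differ from this toy version.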

📂 Project Structure

synthetic-data-av/
├── data/
│   └── scenario_prompts.csv            # Scenario prompts used for retrieval
├── generation/
│   ├── image_generation.py             # Image generation using Stable Diffusion
│   └── rag_pipeline.py                 # Retrieval-Augmented Generation pipeline
├── retrieval/
│   └── langchain_integration.py        # FAISS-based vector store using LangChain
├── streamlit_app.py                    # Interactive UI for image generation
├── environment.yml                     # Conda environment dependencies
├── requirements.txt                    # pip dependencies
└── README.md                           # Project documentation (this file)

🔧 Setup Instructions

1️⃣ Clone the Repository

git clone <repository_url>
cd synthetic-data-av

2️⃣ Create a Conda Environment (Recommended)

Create and activate a Conda environment for dependency management:

conda env create -f environment.yml
conda activate synthetic

🚀 How to Run the Project

1️⃣ Prepare Scenario Metadata

Create a CSV file (data/scenario_prompts.csv) with sample AV scenario prompts such as:

id,weather,location,time,description
1,rain,urban,night,"Scenario: Rainy night in an urban area with busy pedestrian crossings and low visibility."
2,sunny,suburban,day,"Scenario: Sunny day in a suburban area with moderate traffic and clear skies."
3,fog,rural,morning,"Scenario: Foggy morning on a rural road with low visibility and sparse traffic."
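As a rough illustration of how these rows might be turned into retrieval texts, the sketch below parses the same sample with Python's standard `csv` module. The helper names `load_scenarios` and `to_retrieval_text` are hypothetical, and the way metadata is folded into each text is an assumption rather than the repository's actual behavior.

```python
import csv
import io

# The sample rows from above, inlined so the example is self-contained.
SAMPLE_CSV = '''id,weather,location,time,description
1,rain,urban,night,"Scenario: Rainy night in an urban area with busy pedestrian crossings and low visibility."
2,sunny,suburban,day,"Scenario: Sunny day in a suburban area with moderate traffic and clear skies."
3,fog,rural,morning,"Scenario: Foggy morning on a rural road with low visibility and sparse traffic."
'''


def load_scenarios(csv_file):
    """Read scenario rows from an open file object into a list of dicts."""
    return list(csv.DictReader(csv_file))


def to_retrieval_text(row):
    """Combine the description with its metadata columns into one string."""
    return (
        f"{row['description']} "
        f"(weather: {row['weather']}, location: {row['location']}, time: {row['time']})"
    )


rows = load_scenarios(io.StringIO(SAMPLE_CSV))
texts = [to_retrieval_text(row) for row in rows]
for text in texts:
    print(text)
```

To read the real file instead, pass `open("data/scenario_prompts.csv", newline="", encoding="utf-8")` to `load_scenarios`.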

2️⃣ Test Image Generation (Standalone)

You can test Stable Diffusion image generation by running:

python generation/image_generation.py

This will generate an image based on a predefined prompt and save it as generated_sd_image.png.
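For orientation, a standalone generation script could look roughly like the sketch below, which uses the `StableDiffusionPipeline` class from the diffusers library. This is not the contents of image_generation.py: the function names and default values are assumptions, and `model_id` is left as a required argument because this README does not name the checkpoint.

```python
def generation_settings(prompt, height=512, width=512, num_inference_steps=30):
    """Collect keyword arguments for a Stable Diffusion pipeline call.

    The defaults are illustrative. Stable Diffusion expects height and
    width to be divisible by 8.
    """
    if height % 8 or width % 8:
        raise ValueError("height and width must be divisible by 8")
    return {
        "prompt": prompt,
        "height": height,
        "width": width,
        "num_inference_steps": num_inference_steps,
    }


def generate_image(prompt, model_id, output_path="generated_sd_image.png", device="cpu"):
    """Generate one image and save it to output_path.

    model_id is whichever Stable Diffusion checkpoint you have access to.
    """
    # Imported lazily so generation_settings() works without diffusers installed.
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id)
    pipe = pipe.to(device)
    image = pipe(**generation_settings(prompt)).images[0]
    image.save(output_path)
    return output_path
```

Calling `generate_image("Foggy morning on a rural road", model_id=...)` downloads the checkpoint on first use, so expect the first run to take considerably longer than later ones.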

3️⃣ Run the RAG Pipeline

To retrieve similar prompts and generate an augmented scenario image:

python generation/rag_pipeline.py

The resulting image will be saved as rag_generated_image.png.
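The retrieval half of this step could be sketched with LangChain's FAISS wrapper as shown below. Treat it as an assumption-laden outline rather than the code in rag_pipeline.py: the import paths vary between LangChain versions, the embedding model name is only an example, and `retrieve_similar` and `build_augmented_prompt` are hypothetical helper names.

```python
def build_augmented_prompt(query, retrieved_texts):
    """Join the query with retrieved descriptions, skipping duplicates.

    The leading "Scenario:" label from the CSV is dropped so it does not
    end up in the image prompt.
    """
    seen = set()
    parts = []
    for text in retrieved_texts:
        cleaned = text.strip().removeprefix("Scenario:").strip()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            parts.append(cleaned)
    return " ".join([query.strip().rstrip(".") + "."] + parts)


def retrieve_similar(query, texts, k=2,
                     embedding_model="sentence-transformers/all-MiniLM-L6-v2"):
    """Embed the texts, index them with FAISS, and return the k nearest."""
    # Lazy imports: these paths are assumptions and differ across versions.
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import FAISS

    embeddings = HuggingFaceEmbeddings(model_name=embedding_model)
    store = FAISS.from_texts(texts, embeddings)
    return [doc.page_content for doc in store.similarity_search(query, k=k)]
```

A caller would pass the result of `retrieve_similar` to `build_augmented_prompt` and hand the returned string to the image generator. `str.removeprefix` requires Python 3.9 or newer.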

4️⃣ Launch the Streamlit App (Interactive Mode)

To generate images interactively:

streamlit run streamlit_app.py

Open the local URL (usually http://localhost:8501) and input a description (e.g., "An autonomous vehicle driving on a rainy night in a busy urban area."). The app will retrieve similar prompts from the dataset and generate an image.

⚡ Performance Optimization

GPU Support:
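Replace pipe.to("cpu") with pipe.to("cuda") in image_generation.py if you have a CUDA-capable GPU. Removing the call alone is not enough, because diffusers pipelines stay on the CPU by default until they are moved explicitly.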
Lower Resolution for Faster Generation:
Modify the height and width parameters in generate_image() to 512 for faster inference:
```
height=512, width=512
```
Reduce Inference Steps:
Decrease num_inference_steps for faster, though lower-quality, results.
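The three adjustments above could be gathered into one small helper, sketched here. The function names and the concrete numbers in `speed_preset` are illustrative assumptions, not values taken from image_generation.py.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch nothing can run on a GPU anyway.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def speed_preset(device):
    """Pick resolution and step count for the given device.

    The numbers are placeholders: smaller images and fewer steps on CPU,
    larger images and more steps when a GPU is available.
    """
    if device == "cuda":
        return {"height": 768, "width": 768, "num_inference_steps": 50}
    return {"height": 512, "width": 512, "num_inference_steps": 20}


device = pick_device()
settings = speed_preset(device)
print(device, settings)
```

The resulting dictionary can be unpacked into the pipeline call, and `device` passed to `pipe.to(...)`.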

🛠️ Troubleshooting

Torch not compiled with CUDA enabled:
Make sure your PyTorch installation was built with CUDA support. If it was not, reinstall PyTorch from the CUDA wheel index that matches your driver (cu118 shown here):
```
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

sentencepiece Build Errors:
Install cmake before reinstalling sentencepiece:

brew install cmake  # macOS
sudo apt-get install cmake pkg-config  # Ubuntu

Streamlit Deprecation Warning:
Replace:

st.image(image, use_column_width=True)

with:

st.image(image, use_container_width=True)

🔮 Future Work (Checklist)

📄 License

This project is licensed under the CC0 1.0 Universal License.
