This is the official repository for the paper *Exploring Multimodal Prompt for Visualization Authoring with Large Language Models*.

VisPilot is a system that enables users to create visualizations using multimodal prompts: text, sketches, and direct manipulation of existing visualizations. This repository contains the source code for the VisPilot system, which explores the potential of multimodal prompting for visualization authoring with Large Language Models (LLMs).
- Multimodal visualization authoring with text, sketches, and direct manipulation
- Data table view for exploring datasets
- Chat interface for natural language interaction
- Design panel for customizing styles
- History panel for tracking visualization changes
- Interactive interface for trying the system (Online Demo)
- Corpus view for exploring the research dataset (Corpus Page)
Demo video: VisPilot-Video.mp4
- /src/app: Main application routes and pages
  - /corpus: Visualization corpus analysis tools
  - /demo: Interactive demo interface
- /components: React components for the UI
- /public: Static assets including images
- Node.js (18.x or later)
- npm or yarn
- Clone the repository:

```bash
git clone https://github.com/KidsXH/vispilot.git
cd vispilot
```

- Install dependencies:

```bash
npm install
# or
yarn install
```

- Start the development server:

```bash
npm run dev
# or
yarn dev
```

- Open your browser and navigate to http://localhost:3000/vispilot
This project explores how LLMs interpret ambiguous or incomplete text prompts in the context of visualization authoring, and introduces visual prompts as a complementary input modality to improve user intent interpretation. Our research highlights the importance of multimodal prompting in enhancing human-AI collaboration for visualization tasks.
You can cite our work as follows:
@article{wen2025exploring,
title={Exploring Multimodal Prompt for Visualization Authoring with Large Language Models},
author={Zhen Wen and Luoxuan Weng and Yinghao Tang and Runjin Zhang and Yuxin Liu and Bo Pan and Minfeng Zhu and Wei Chen},
journal={arXiv preprint},
year={2025},
doi={10.48550/arXiv.2504.13700}
}

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
