
VisPilot - Multimodal Visualization Authoring with LLMs

This is the official repository for the paper Exploring Multimodal Prompt for Visualization Authoring with Large Language Models.

VisPilot is a system that enables users to create visualizations using multimodal prompts, including text, sketches, and direct manipulation of existing visualizations. This repository contains the source code for the VisPilot system, which explores the potential of multimodal prompting for visualization authoring with Large Language Models (LLMs).

Features

  • Multimodal visualization authoring with text, sketches, and direct manipulation
  • Data table view for exploring datasets
  • Chat interface for natural language interaction
  • Design panel for customizing styles
  • History panel for tracking visualization changes
  • Interactive interface for trying the system - Online Demo
  • Corpus view for exploring the research dataset - Corpus Page

Demo

VisPilot-Video.mp4

Project Structure

  • /src/app: Main application routes and pages
    • /corpus: Visualization corpus analysis tools
    • /demo: Interactive demo interface
  • /components: React components for the UI
  • /public: Static assets including images

Getting Started

Prerequisites

  • Node.js (18.x or later)
  • npm or yarn

Installation

  1. Clone the repository:

     git clone https://github.com/KidsXH/vispilot.git
     cd vispilot

  2. Install dependencies:

     npm install
     # or
     yarn install

  3. Start the development server:

     npm run dev
     # or
     yarn dev

  4. Open your browser and navigate to http://localhost:3000/vispilot
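
The steps above run the app in development mode. For a production build, the standard Next.js scripts should apply; the exact script names depend on this repository's package.json, so treat the following as a sketch rather than verified instructions:

```shell
# Build an optimized production bundle, then serve it.
# Assumes the default Next.js "build" and "start" scripts are
# defined in package.json (not verified against this repository).
npm run build
npm start
# The app should then be reachable at http://localhost:3000/vispilot
```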

Research

This project explores how LLMs interpret ambiguous or incomplete text prompts in the context of visualization authoring, and introduces visual prompts as a complementary input modality to improve user intent interpretation. Our research highlights the importance of multimodal prompting in enhancing human-AI collaboration for visualization tasks.

Citation

You can cite our work as follows:

@article{wen2025exploring,
  title={Exploring Multimodal Prompt for Visualization Authoring with Large Language Models},
  author={Zhen Wen and Luoxuan Weng and Yinghao Tang and Runjin Zhang and Yuxin Liu and Bo Pan and Minfeng Zhu and Wei Chen},
  journal={arXiv preprint},
  year={2025},
  doi={10.48550/arXiv.2504.13700}
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
