
VisPilot - Multimodal Visualization Authoring with LLMs

This is the official repository for the paper Exploring Multimodal Prompt for Visualization Authoring with Large Language Models.

VisPilot is a system that enables users to create visualizations using multimodal prompts, including text, sketches, and direct manipulation of existing visualizations. This repository contains the source code for the VisPilot system, which explores the potential of multimodal prompting for visualization authoring with Large Language Models (LLMs).

Features

  • Multimodal visualization authoring with text, sketches, and direct manipulation
  • Data table view for exploring datasets
  • Chat interface for natural language interaction
  • Design panel for customizing styles
  • History panel for tracking visualization changes
  • Interactive interface for trying the system - Online Demo
  • Corpus view for exploring the research dataset - Corpus Page

Demo

VisPilot-Video.mp4

Project Structure

  • /src/app: Main application routes and pages
    • /corpus: Visualization corpus analysis tools
    • /demo: Interactive demo interface
  • /components: React components for the UI
  • /public: Static assets including images

Getting Started

Prerequisites

  • Node.js (18.x or later)
  • npm or yarn

Installation

  1. Clone the repository:

     git clone https://github.com/KidsXH/vispilot.git
     cd vispilot

  2. Install dependencies:

     npm install
     # or
     yarn install

  3. Start the development server:

     npm run dev
     # or
     yarn dev

  4. Open your browser and navigate to http://localhost:3000/vispilot
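
The steps above run the app in development mode. For a production build, the standard Next.js scripts should apply; the exact script names depend on this repository's package.json, so treat the following as a sketch rather than verified instructions:

```shell
# Build an optimized production bundle, then serve it.
# Assumes the default Next.js "build" and "start" scripts are
# defined in package.json (not verified against this repository).
npm run build
npm start
# The app should then be reachable at http://localhost:3000/vispilot
```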

Research

This project explores how LLMs interpret ambiguous or incomplete text prompts in the context of visualization authoring, and introduces visual prompts as a complementary input modality to improve user intent interpretation. Our research highlights the importance of multimodal prompting in enhancing human-AI collaboration for visualization tasks.

Citation

You can cite our work as follows:

@article{wen2025exploring,
  title={Exploring Multimodal Prompt for Visualization Authoring with Large Language Models},
  author={Zhen Wen and Luoxuan Weng and Yinghao Tang and Runjin Zhang and Yuxin Liu and Bo Pan and Minfeng Zhu and Wei Chen},
  journal={arXiv preprint},
  year={2025},
  doi={10.48550/arXiv.2504.13700}
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
