
VisCoder2: Building Multi-Language Visualization Coding Agents

🌐 Project Page | 📖 arXiv | 🤗 VisCode-Multi-679K | 🤗 VisPlotBench | 🤗 VisCoder2


🔔 News

  • 🔥 [2025-10-25] VisCode-Multi-679K, VisPlotBench, and the VisCoder2 models are now publicly released! Check out our paper and Hugging Face collections.

🧠 Introduction

VisCoder2 is an open-source family of multi-language visualization coding agents that iteratively generate, execute, render, and self-debug visualization code.

This work addresses core challenges where existing models fail:

  • Limited language coverage
  • Unreliable code execution
  • Lack of iterative correction mechanisms

Unlike general code generation, visualization requires grounding across natural language, code, and rendered visual outputs.

To enable this, we introduce three complementary resources:

  1. VisCode-Multi-679K:
    A large-scale supervised dataset with 679K executable visualization samples and multi-turn correction dialogues across 12 programming languages, including Python, Vega-Lite, LaTeX, Mermaid, LilyPond, and more.
  2. VisPlotBench:
    A new benchmark spanning 8 languages and 13 visual categories, designed to systematically evaluate both initial code generation and multi-round self-debug capabilities.
  3. VisCoder2:
    The family of multi-language visualization models trained on VisCode-Multi-679K.

📊 Main Results on VisPlotBench

We evaluate VisCoder2 on VisPlotBench, our new benchmark for executable visualization code generation across 8 diverse languages.
The primary metric is Execution Pass Rate, which measures whether the generated code runs without error and produces a valid visual.
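As a rough illustration (a hypothetical helper, not the official VisPlotBench scorer), the Execution Pass Rate reduces to the fraction of benchmark samples whose code both executes and renders:

```python
def execution_pass_rate(results):
    """Fraction of samples whose generated code executed without error
    and produced a valid rendered output.

    `results` is a list of booleans, one per benchmark sample.
    This is a simplified sketch, not the VisPlotBench scoring code.
    """
    if not results:
        return 0.0
    return sum(results) / len(results)

# e.g. 3 of 4 samples execute and render successfully
rate = execution_pass_rate([True, True, False, True])  # 0.75
```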

With iterative self-debug, VisCoder2-32B achieves an 82.4% overall execution pass rate, matching the performance of GPT-4.1 and significantly outperforming all open-source baselines.


🛠️ Training & Evaluation

We provide both the training dataset and evaluation benchmark for VisCoder2.

  • 📦 Training is performed using the ms-swift framework with full-parameter supervised fine-tuning on our new VisCode-Multi-679K dataset.
  • 📊 Evaluation is based on VisPlotBench, using a standardized execute–render–score pipeline that assesses models across 8 languages.
    This includes a self-debug evaluation mode that allows models to revise failed generations over multiple rounds.
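The self-debug mode described above can be sketched as a simple execute-and-revise loop. Everything below is a minimal assumption-laden sketch, not the VisPlotBench implementation: `generate` stands in for a model call that takes the failed code and its traceback and returns a revision.

```python
import os
import subprocess
import sys
import tempfile

def run_snippet(code: str):
    """Execute a Python snippet in a subprocess; return (ok, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=30,
        )
        return proc.returncode == 0, proc.stderr
    finally:
        os.unlink(path)

def self_debug(generate, code: str, max_rounds: int = 3):
    """Run `code`; on failure, feed the traceback back to `generate`
    (a callable: (code, traceback) -> revised code) for up to
    `max_rounds` rounds. Returns (final_code, passed).
    """
    for _ in range(max_rounds):
        ok, err = run_snippet(code)
        if ok:
            return code, True
        code = generate(code, err)
    ok, _ = run_snippet(code)
    return code, ok
```

In the real pipeline the revision step is the model under evaluation and the execution step also renders and scores the visual output; the loop structure is the same.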

See the following folders for details:

  • train/ : Training scripts and configurations based on ms-swift
  • VisPlotBench/ : Evaluation framework for VisPlotBench

📬 Contact


📖 Citation

BibTeX:

@article{ni2025viscoder2,
  title={VisCoder2: Building Multi-Language Visualization Coding Agents},
  author={Ni, Yuansheng and Cai, Songcheng and Chen, Xiangchao and Liang, Jiarong and Lyu, Zhiheng and Deng, Jiaqi and Zou, Kai and Nie, Ping and Yuan, Fei and Yue, Xiang and others},
  journal={arXiv preprint arXiv:2510.23642},
  year={2025}
}

@article{ni2025viscoder,
  title={VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation},
  author={Ni, Yuansheng and Nie, Ping and Zou, Kai and Yue, Xiang and Chen, Wenhu},
  journal={arXiv preprint arXiv:2506.03930},
  year={2025}
}

About

The official code of "VisCoder2: Building Multi-Language Visualization Coding Agents"
