
WebDancer: Towards Autonomous Information Seeking Agency


🕺 Introduction

  • We propose WebDancer, a novel end-to-end agentic training framework designed to enhance the multi-step information-seeking capabilities of web-based agents.
  • We introduce a four-stage training paradigm comprising browsing data construction, trajectory sampling, supervised fine-tuning for effective cold start, and reinforcement learning for improved generalization, enabling the agent to autonomously acquire robust search and reasoning skills.
  • Our data-centric approach integrates trajectory-level supervision and online learning to develop a scalable pipeline for training agentic systems.
  • We instantiate this framework in a ReAct-based agent and conduct extensive experiments on GAIA and WebWalkerQA benchmarks. Results demonstrate that WebDancer achieves strong performance across diverse tasks, validating the effectiveness of our proposed paradigm and providing systematic insights for future agent development.
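The ReAct-style loop the framework instantiates can be sketched as follows. This is a minimal illustration only: the tool names, the `Action:`/`Answer:` text protocol, and the step budget are assumptions for exposition, not WebDancer's actual implementation.

```python
# Minimal ReAct-style agent loop (illustrative sketch; the tool names,
# action format, and stop condition are assumptions, not WebDancer's code).

def react_loop(question, llm, tools, max_steps=8):
    """Interleave model-generated actions with tool observations.

    llm:   callable mapping the transcript so far to the next step string,
           either "Action: tool[argument]" or "Answer: final answer".
    tools: dict mapping tool names to callables taking a string argument.
    """
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + "\n"
        if step.startswith("Answer:"):          # terminal action: final answer
            return step[len("Answer:"):].strip()
        name, _, arg = step.partition("[")      # parse "Action: tool[arg]"
        name = name.replace("Action:", "").strip()
        observation = tools[name](arg.rstrip("]"))
        transcript += f"Observation: {observation}\n"
    return None                                 # step budget exhausted
```

The key property, multi-step information seeking, comes from feeding each tool observation back into the transcript so the next model call can condition on it.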

🚀 Performance

(Benchmark results figure; see the paper for full numbers.)

🚀 Quick Start

Step 0: Set Up the Environment

conda create -n webdancer python=3.12
conda activate webdancer
pip install -r requirements.txt

Step 1: Deploy the Model

Download the WebDancer model from 🤗 HuggingFace and deploy it using the provided scripts with sglang.

cd scripts
bash depoly_model.sh WebDancer_PATH

Note: Replace WebDancer_PATH with the actual path to the downloaded model.

Step 2: Run the Demo

Edit the following keys in scripts/run_demo.sh:

  • GOOGLE_SEARCH_KEY: available from serpapi or serper.
  • JINA_API_KEY: available from jina.
  • DASHSCOPE_API_KEY: available from dashscope.
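Before launching the demo, it can help to verify that all three keys are actually set in the environment. The helper below is a small sketch (not part of the repository); the variable names are taken from the list above.

```python
import os

# Sanity-check that the API keys required by run_demo.sh are exported
# (helper sketch, not part of the WebDancer repository).
REQUIRED_KEYS = ["GOOGLE_SEARCH_KEY", "JINA_API_KEY", "DASHSCOPE_API_KEY"]

def missing_keys(env=None):
    """Return the names of required keys that are unset or empty."""
    if env is None:
        env = os.environ
    return [k for k in REQUIRED_KEYS if not env.get(k)]

if __name__ == "__main__":
    absent = missing_keys()
    if absent:
        raise SystemExit(f"Missing environment variables: {', '.join(absent)}")
```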

Then, launch the demo with Gradio to interact with the WebDancer model:

cd scripts
bash run_demo.sh

🎥 Demos

We provide demos for WebWalkerQA, GAIA, and daily use. Our model can execute long-horizon tasks involving multiple steps and complex reasoning, such as web traversal, information seeking, and question answering.

WebWalkerQA

WebWalker_case.mp4

GAIA

WebWalker_case.mp4

Daily Use

User_case.mp4

⌛️ The deployment of models and demos will be updated soon.

Four-Stage Training Paradigm

1. Browsing Data Construction


The sampled QA data can be found in datasets/sample_qa.jsonl.
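The sample file follows the JSONL convention of one JSON object per line, so it can be loaded with a few lines of Python. This is a generic sketch: the exact field names inside `datasets/sample_qa.jsonl` may differ from the `question`/`answer` keys assumed here.

```python
import json

# Load QA records from a JSONL file, one JSON object per line
# (generic sketch; field names in datasets/sample_qa.jsonl may differ).
def load_jsonl(path):
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```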

2. Trajectory Sampling

The sampled trajectory data for SFT can be found in datasets/sample_qa.jsonl.
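For SFT, each sampled trajectory must be flattened into the chat-message format that fine-tuning frameworks consume. The converter below is an illustrative sketch: the `(thought_action, observation)` record layout and the role names are assumptions, not the repository's actual trajectory schema.

```python
# Flatten one ReAct trajectory into chat messages for SFT
# (record layout and role names are illustrative assumptions,
#  not the repository's actual trajectory schema).
def trajectory_to_messages(question, steps, answer):
    """steps: list of (thought_action, observation) string pairs."""
    messages = [{"role": "user", "content": question}]
    for thought_action, observation in steps:
        messages.append({"role": "assistant", "content": thought_action})
        messages.append({"role": "tool", "content": observation})
    messages.append({"role": "assistant", "content": answer})
    return messages
```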


3. Supervised Fine-Tuning

For SFT training, you can refer to the training scripts of LLaMA-Factory.

4. Reinforcement Learning

We use the modified verl for RL training.

🤩 Acknowledgements

This work is built on LLaMA-Factory and verl. We greatly appreciate their valuable contributions to the community, as well as the inspiration from WebThinker.

📑 Citation

If you find this work helpful, please cite it as:

@misc{wu2025webdancer,
      title={WebDancer: Towards Autonomous Information Seeking Agency},
      author={Jialong Wu and Baixuan Li and Runnan Fang and Wenbiao Yin and Liwen Zhang and Zhengwei Tao and Dingchu Zhang and Zekun Xi and Yong Jiang and Pengjun Xie and Fei Huang and Jingren Zhou},
      year={2025},
      eprint={2505.22648},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.22648},
}