Spider Agent Quickstart

Spider Agent is the sibling baseline we use during development. It produces a single, image-rich report, in contrast to da-agent, which uses a three-stage flow for tighter control. You can (and should) refine the system prompts to guide your model for your use case.

Download datasets
Follow ../dacomp-da/README.md to download:
- English: ../dacomp-da/tasks/dacomp-da.jsonl
- Chinese: ../dacomp-da/tasks_zh/dacomp-da-zh.jsonl
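
Before running the agent, it can help to confirm that both task files downloaded correctly. The following is a minimal sketch, not part of the repository: `count_tasks` is a hypothetical helper, and it assumes only that each non-empty line of a task file is a standalone JSON value.

```python
import json
from pathlib import Path


def count_tasks(jsonl_path):
    """Return the number of non-empty lines in a JSONL file, checking each parses as JSON."""
    count = 0
    with Path(jsonl_path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{jsonl_path}: line {line_no} is not valid JSON") from exc
            count += 1
    return count


if __name__ == "__main__":
    # Paths as listed above, relative to this directory.
    for path in (
        "../dacomp-da/tasks/dacomp-da.jsonl",
        "../dacomp-da/tasks_zh/dacomp-da-zh.jsonl",
    ):
        if Path(path).exists():
            print(path, count_tasks(path), "tasks")
        else:
            print(path, "missing")
```

A "missing" line means the download step has not completed for that file.
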

Configure your LLM
Update model endpoints/keys in spider_agent/agent/config.py as needed.

Install dependencies

pip install -r requirements.txt
# or, to make sure pip matches your python3 interpreter:
python3 -m pip install -r requirements.txt

Run the agent
-s sets the experiment suffix (the output subfolder), and -t points to the task JSONL.

# English example
python3 run.py --model openai_qwen3-coder-plus -s both1 -t ../../dacomp-da/tasks/dacomp-da.jsonl --image_prompt
# Chinese example
python3 run.py --model openai_qwen3-coder-plus -s try1-zh -t ../../dacomp-da/tasks_zh/dacomp-da-zh.jsonl --language zh --image_prompt

Common flags:

--example_index: index range or list (e.g., 0-10; 2,3; or all)
--example_name: filter by substring in task id
--language: zh (default) or en
--image_prompt: enable image-enhanced prompt for design tasks
--overwriting / --retry_failed: control reruns when outputs exist
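
To illustrate the three --example_index forms listed above, here is a rough sketch of how such a value could be expanded into task indices. `parse_example_index` is a hypothetical helper, not the code in run.py, and it assumes that a range like 0-10 includes both endpoints and that out-of-range indices are dropped; the actual behavior may differ.

```python
def parse_example_index(spec, total):
    """Expand 'all', 'A-B', or 'i,j,...' into a list of indices below `total`."""
    spec = spec.strip()
    if spec == "all":
        return list(range(total))

    indices = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Assumption: both ends of the range are included.
            start, end = (int(p) for p in part.split("-", 1))
            indices.extend(range(start, end + 1))
        else:
            indices.append(int(part))

    # Keep only valid indices, without duplicates, in the order given.
    seen = set()
    result = []
    for i in indices:
        if 0 <= i < total and i not in seen:
            seen.add(i)
            result.append(i)
    return result


if __name__ == "__main__":
    print(parse_example_index("0-3", 10))
    print(parse_example_index("2,3", 10))
    print(parse_example_index("all", 3))
```
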

Export results to the evaluation suite
Collect a run’s outputs into ../dacomp-da/evaluation_suite/agent_results/:

python3 get_results.py openai_qwen3-coder-plus-test1-zh --output_dir ../../dacomp-da/evaluation_suite/agent_results
python3 get_results.py gemini-2.5-pro-both1 --output_dir ../../dacomp-da/evaluation_suite/agent_results
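
Judging only from the examples above, the positional argument to get_results.py looks like the --model value joined to the -s suffix with a hyphen. The sketch below encodes that guess; `run_name` is a hypothetical helper and the naming rule is an inference from the examples, not confirmed behavior.

```python
def run_name(model, suffix):
    """Guess the run folder name as '<model>-<suffix>' (assumed convention)."""
    return f"{model}-{suffix}"


if __name__ == "__main__":
    # Reconstructs the two names used in the commands above.
    print(run_name("openai_qwen3-coder-plus", "test1-zh"))
    print(run_name("gemini-2.5-pro", "both1"))
```

If a run is not found, check the actual output folder name on disk rather than relying on this pattern.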
