This is the sibling baseline we use during development; it produces a single, image-rich report (contrast with da-agent, which uses a three-stage flow for tighter control). You can (and should) refine the system prompts to better guide your model for your use case.
- Download datasets
  Follow `../dacomp-da/README.md` to download:
  - English: `../dacomp-da/tasks/dacomp-da.jsonl`
  - Chinese: `../dacomp-da/tasks_zh/dacomp-da-zh.jsonl`
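Before running the agent, it can help to confirm that both task files downloaded intact. The following is a minimal sketch, not part of the repository: `load_tasks` is a hypothetical helper, and it assumes only that each file is standard JSONL (one JSON object per non-empty line). It makes no assumption about the fields inside each task.

```python
import json
from pathlib import Path


def load_tasks(path):
    """Read a JSONL file and return its records (hypothetical helper)."""
    tasks = []
    with Path(path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                tasks.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the exact line so a truncated download is easy to spot.
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc})") from exc
    return tasks


if __name__ == "__main__":
    # Paths as given in the download step above.
    for p in ("../dacomp-da/tasks/dacomp-da.jsonl",
              "../dacomp-da/tasks_zh/dacomp-da-zh.jsonl"):
        if Path(p).exists():
            print(p, len(load_tasks(p)), "tasks")
        else:
            print(p, "not found")
```

A parse error here usually points to an incomplete download rather than a problem with the agent.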
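- Configure your LLM
  Update model endpoints/keys in `spider_agent/agent/config.py` as needed.
- Install dependencies
  Run either of the following (use the second form if `pip` is not on your PATH or belongs to a different interpreter):
  - `pip install -r requirements.txt`
  - `python3 -m pip install -r requirements.txt`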
- Run the agent
  `-s` sets the experiment suffix (output subfolder); `-t` points to the task JSONL.
  - English example: `python3 run.py --model openai_qwen3-coder-plus -s both1 -t ../../dacomp-da/tasks/dacomp-da.jsonl --image_prompt`
  - Chinese example: `python3 run.py --model openai_qwen3-coder-plus -s try1-zh -t ../../dacomp-da/tasks_zh/dacomp-da-zh.jsonl --language zh --image_prompt`

  Common flags:
  - `--example_index`: index range (e.g., `0-10`, `2,3`, or `all`)
  - `--example_name`: filter by substring in task id
  - `--language`: `zh` (default) or `en`
  - `--image_prompt`: enable image-enhanced prompt for design tasks
  - `--overwriting` / `--retry_failed`: control reruns when outputs exist
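When launching several runs, assembling the command line programmatically avoids typos in long paths. The sketch below is illustrative only: `build_run_command` is a hypothetical helper, and it uses just the flags shown in the examples above (`--model`, `-s`, `-t`, `--language`, `--image_prompt`). It builds the argument list but does not execute anything.

```python
import shlex


def build_run_command(model, suffix, task_file, language=None, image_prompt=False):
    """Return the argv list for one run.py invocation (hypothetical helper)."""
    cmd = ["python3", "run.py", "--model", model, "-s", suffix, "-t", task_file]
    if language is not None:
        cmd += ["--language", language]
    if image_prompt:
        cmd.append("--image_prompt")
    return cmd


if __name__ == "__main__":
    english = build_run_command(
        "openai_qwen3-coder-plus", "both1",
        "../../dacomp-da/tasks/dacomp-da.jsonl",
        image_prompt=True,
    )
    chinese = build_run_command(
        "openai_qwen3-coder-plus", "try1-zh",
        "../../dacomp-da/tasks_zh/dacomp-da-zh.jsonl",
        language="zh", image_prompt=True,
    )
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(english))
    print(shlex.join(chinese))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains `run.py`.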
- Export results to the evaluation suite
  Collect a run’s outputs into `../dacomp-da/evaluation_suite/agent_results/`:
  - `python3 get_results.py openai_qwen3-coder-plus-test1-zh --output_dir ../../dacomp-da/evaluation_suite/agent_results`
  - `python3 get_results.py gemini-2.5-pro-both1 --output_dir ../../dacomp-da/evaluation_suite/agent_results`
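The run names in the export examples look like the `--model` value joined to the `-s` suffix with a hyphen (for instance `gemini-2.5-pro` plus `both1`). That convention is inferred from the examples and is not confirmed here, so treat the sketch below as an assumption; `run_name` and `build_export_command` are hypothetical helpers that only build the command.

```python
import shlex

# Output directory taken from the export examples above.
DEFAULT_OUTPUT_DIR = "../../dacomp-da/evaluation_suite/agent_results"


def run_name(model, suffix):
    """Assumed naming convention: '<model>-<suffix>' (inferred, not confirmed)."""
    return f"{model}-{suffix}"


def build_export_command(model, suffix, output_dir=DEFAULT_OUTPUT_DIR):
    """Return the argv list for one get_results.py invocation."""
    return ["python3", "get_results.py", run_name(model, suffix),
            "--output_dir", output_dir]


if __name__ == "__main__":
    runs = [("openai_qwen3-coder-plus", "test1-zh"), ("gemini-2.5-pro", "both1")]
    for model, suffix in runs:
        print(shlex.join(build_export_command(model, suffix)))
```

If the output folder of a run is named differently on disk, pass that folder name to `get_results.py` directly instead of relying on this convention.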