SciReasoner

Multimodal physics problem solving — caption → reason → critic.

🥇 1st Place — ICML 2025 AI for Math Workshop, Track 2: Physics Reasoning with Diagrams and Expressions

[Image: ICML 2025 AI4Math First Place Certificate]

What it does

A three-stage pipeline that solves multimodal physics problems by describing the figure, deriving an answer, and then critiquing and refining that answer:

Caption (image → text) → Reason (solve) → Critic (review & correct)
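
The flow above can be pictured as three composed functions. The sketch below is illustrative only: `caption`, `reason`, `critic`, `run_pipeline`, and the `Model` callable are hypothetical stand-ins rather than the package's actual internals, and the stub model simply echoes the first line of its prompt so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
Model = Callable[[str], str]


@dataclass
class Result:
    caption: str
    draft: str
    answer: str


def caption(model: Model, image: Optional[str]) -> str:
    """Stage 1: turn the figure into text (empty if there is no figure)."""
    return model(f"Describe the figure: {image}") if image else ""


def reason(model: Model, problem: str, cap: str) -> str:
    """Stage 2: derive a draft answer from the problem text plus the caption."""
    return model(f"Solve: {problem}\nFigure: {cap}")


def critic(model: Model, problem: str, draft: str) -> str:
    """Stage 3: review the draft and return a corrected answer."""
    return model(f"Review and correct: {draft}\nProblem: {problem}")


def run_pipeline(model: Model, problem: str, image: Optional[str] = None) -> Result:
    cap = caption(model, image)
    draft = reason(model, problem, cap)
    return Result(cap, draft, critic(model, problem, draft))


def echo_model(prompt: str) -> str:
    """Offline stub so the sketch runs without an API key."""
    return f"<{prompt.splitlines()[0]}>"


result = run_pipeline(echo_model, "Find I(t).", image="circuit.png")
print(result.answer)
```

Keeping each stage a plain function of its inputs is what makes it possible to swap models per stage or to skip the critic, as the reproduction flags further down suggest.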

Default model: Gemini-3.1-Pro, accessed through any OpenAI-compatible endpoint.

Use it

git clone https://github.com/OpenDCAI/SciReasoner.git
cd SciReasoner
pip install -e .

export OPENAI_API_KEY=<your-key>
export OPENAI_BASE_URL=<endpoint>          # optional, OpenAI-compatible proxy

As a CLI:

scireasoner solve --problem "A 2 kg block slides from rest down a 30° incline of length 5 m, μ=√3/10, g=10. Find the speed at the bottom."
scireasoner solve --problem "Find I(t)." --image circuit.png
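
As a sanity check on what the first command should return, the incline problem has a closed-form answer. Assuming μ is the kinetic friction coefficient and g is in m/s², the acceleration along the incline is a = g(sin θ − μ cos θ), and the speed after sliding a length L from rest is v = √(2aL); the mass cancels. This is an independent hand calculation, not output from the tool:

```python
import math

g = 10.0                      # m/s^2, as given in the problem
theta = math.radians(30.0)    # incline angle
mu = math.sqrt(3) / 10        # assumed to be the kinetic friction coefficient
length = 5.0                  # m, distance travelled along the incline

# Net acceleration down the incline; the 2 kg mass cancels out.
a = g * (math.sin(theta) - mu * math.cos(theta))

# Starting from rest: v^2 = 2 * a * L.
v = math.sqrt(2 * a * length)

print(f"a = {a:.2f} m/s^2, v = {v:.3f} m/s")
# prints: a = 3.50 m/s^2, v = 5.916 m/s
```

So a correct solve should report v = √35 ≈ 5.92 m/s.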

As a Python library:

from scireasoner import solve
res = solve(problem="...", image="figure.png")
print(res.answer, res.reasoning)

Inside Claude Code (one-click install):

cd plugins/claude-code/scireasoner && bash install.sh

Inside Codex (one-click install):

cd plugins/codex/scireasoner && bash install.sh

Both plugins expose three MCP tools — scireasoner_solve, scireasoner_caption, scireasoner_reason — and auto-trigger a solve-physics-problem skill when the user asks for help with a physics problem.
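
For readers unfamiliar with MCP, a tool invocation is a JSON-RPC 2.0 request with method `tools/call`. The sketch below builds such a request for `scireasoner_solve`. The `build_tool_call` helper is hypothetical, and the argument names `problem` and `image` are assumptions borrowed from the Python `solve` call above, since the tools' actual input schema is not listed here.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Argument names are assumed, not taken from the plugin's schema.
message = build_tool_call(
    1, "scireasoner_solve", {"problem": "Find I(t).", "image": "circuit.png"}
)
print(message)
```

In practice Claude Code or Codex constructs this message for you; the sketch only shows what travels over the wire.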

Reproduce SeePhys Pro 2026

pip install -e ".[batch]"
hf download Kun-Xiang/SeePhysPro --repo-type dataset --local-dir ./data/SeePhysPro

python seephys_pro_codabench/scripts/run_v2.py \
    --run v2_pub830 --split testmini --levels level1 level2 level3 level4 level5 \
    --caption-model gemini-3.1-pro-preview --reason-model gemini-3.1-pro-preview \
    --critic-model gemini-3.1-pro-preview --use-critic --k-samples 1 --workers 50

python seephys_pro_codabench/scripts/audit_fix.py --run output/v2_pub830
# Upload submission_audited.zip to https://www.codabench.org/competitions/16010/

Each stage caches its results under output/<run>/cache/, so an interrupted run resumes automatically: re-run the same command to continue.
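
One common way to get this kind of resume behaviour is to write each stage's result to disk, keyed by problem ID, and to skip any item whose file already exists. The sketch below illustrates that idea only; the `cached_stage` helper, the directory layout, and the JSON format are assumptions, not the actual format used under output/<run>/cache/.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List


def cached_stage(cache_dir: Path, stage: str, problem_id: str,
                 compute: Callable[[], str]) -> str:
    """Return the cached result for (stage, problem_id), computing it at most once."""
    path = cache_dir / stage / f"{problem_id}.json"
    if path.exists():
        # A previous (possibly crashed) run already finished this item.
        return json.loads(path.read_text(encoding="utf-8"))["output"]
    output = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file, then rename, so a crash never leaves a partial entry.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"output": output}), encoding="utf-8")
    tmp.replace(path)
    return output


calls: List[str] = []


def fake_caption() -> str:
    calls.append("caption")          # record each real computation
    return "a block on an incline"


with tempfile.TemporaryDirectory() as tmp_dir:
    cache = Path(tmp_dir) / "cache"
    first = cached_stage(cache, "caption", "q001", fake_caption)
    second = cached_stage(cache, "caption", "q001", fake_caption)  # cache hit

print(first == second, len(calls))
```

The second call returns the stored caption without invoking `fake_caption` again, which is the property that lets a re-run pick up where a crash left off.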

Results

ICML 2025 SeePhys Challenge — 🥇 1st Place.

SeePhys Pro 2026 (Codabench 16010, public testmini) — current best:

#  Submission             Overall  L1     L2     L3     L4     L5
1  gemini baseline        0.7651   0.770  0.810  0.755  0.700  0.933
4  + L4 verbatim caption  0.7747   0.765  0.810  0.760  0.740  0.933
8  + L1 few-shot reason   0.7771   0.770  0.800  0.765  0.750  0.933
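
To make the gains easier to read, the snippet below recomputes each row's change in Overall relative to the baseline row, using only the numbers from the table above.

```python
# Overall scores copied from the table above.
overall = {
    "gemini baseline": 0.7651,
    "+ L4 verbatim caption": 0.7747,
    "+ L1 few-shot reason": 0.7771,
}

baseline = overall["gemini baseline"]

# Gain over the baseline, in percentage points.
gains = {name: round((score - baseline) * 100, 2) for name, score in overall.items()}

for name, gain in gains.items():
    print(f"{name:<24} {gain:+.2f} pts")
```

The two later submissions sit 0.96 and 1.20 points above the baseline respectively.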

Full iteration log: seephys_pro_codabench/output/submissions/README.md.

Layout

scireasoner/              Python package (CLI + MCP server, thin shell)
plugins/claude-code/      Claude Code one-click install
plugins/codex/            Codex one-click install
seephys_pro_codabench/    Active 2026 competition workspace (don't touch)
caption.py / answer.py    Original 2025 single-script implementation

The scireasoner/ package imports its stages directly from seephys_pro_codabench/scripts/run_v2.py rather than copying them, so any improvement pushed during the live competition reaches downstream users on their next git pull.
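
Python can import a module straight from a file path with importlib, which is one way to reuse a script without copying it. The sketch below demonstrates the mechanism on a throwaway file; the `load_module_from_path` helper and the `reason` function in the temporary script are hypothetical, and the package may wire up its imports differently.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Import a Python file as a module without copying or installing it."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module      # register before executing, as `import` does
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp_dir:
    script = Path(tmp_dir) / "run_demo.py"
    # Stand-in for a competition script that defines a pipeline stage.
    script.write_text(
        "def reason(problem):\n    return 'solved: ' + problem\n",
        encoding="utf-8",
    )
    stages = load_module_from_path("run_demo", script)
    answer = stages.reason("Find I(t).")

print(answer)
```

Because the module is loaded from the file on disk, editing that file changes behaviour for every caller the next time it is imported.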

Citation

@article{liang2025multimodal,
  title  = {Multimodal Reasoning for Science: Technical Report and 1st Place
            Solution to the ICML 2025 SeePhys Challenge},
  author = {Liang, Hao and Wu, Ruitao and Zeng, Bohan and Niu, Junbo
            and Zhang, Wentao and Dong, Bin},
  journal= {arXiv preprint arXiv:2509.06079},
  year   = {2025}
}

License

GPL-3.0. See LICENSE. Thanks to the ICML 2025 AI for Math Workshop organizers and Codabench / AWS for hosting.
