JianJinglin/awesome-agentic-AIScientists

Multimodal Agentic AI Scientists

A curated list of papers on Agentic Multimodal Large Language Models (MLLMs) for Scientific Discovery

🚀 Join us in building the AI for Science community! Know a great paper we missed? Open an issue — together, let's accelerate scientific discovery with AI!

This repository accompanies our survey paper: "Exploring Agentic Multimodal Large Language Models: A Survey for AIScientists"

AIScientist GitHub Repository Overview

What is an AIScientist?

AIScientists are autonomous agents powered by multimodal large language models (MLLMs) that can understand papers, generate hypotheses, plan and conduct experiments, analyze results, and draft manuscripts across the scientific research lifecycle. Recent systems span open-ended AI research (Lu et al., 2024; Lu et al., 2026), biomedical hypothesis generation (Gottweis et al., 2026), automated biology discovery (Ghareeb et al., 2026), and empirical software generation (Aygün et al., 2026). This survey summarizes a complete pipeline for developing multimodal agentic AIScientists, with representative studies spanning 10 scientific domains.

Comparison with Related Surveys

Prior surveys examine scientific AI agents by workflow stages, autonomy levels, domain resources, or automation-to-autonomy transitions. Our survey adds a pipeline-oriented view across modalities, agent training, inference-time methods, benchmarks, and human-AI collaboration, clarifying how multimodal scientific agents are built, where costs arise, and which human checkpoints remain necessary.

| Paper | Taxonomy | Ag. | DM. | Method | HCI | Ben. | #Dom. |
|---|---|---|---|---|---|---|---|
| Zhang et al. (2024) | Domain | | Seq.+ | Train. only | | | 6 |
| Gridach et al. (2025) | Research Workflow | | | Infer. only | | | 4 |
| Luo et al. (2025) | Research Workflow | | | | | | |
| Zhang et al. (2025) | Research Workflow | | Seq.+ | | | | |
| Ren et al. (2025) | Agent Composition | | | Train. & Infer. | | | 6+ |
| Wei et al. (2025) | Auto. & Domain | | | Infer. only | | | 4 |
| Hu et al. (2025) | Data & Domain | | | | | | 6+ |
| Zheng et al. (2025) | Research Workflow & Auto. | | | Infer. only | | | 6+ |
| Zhou et al. (2025) | Research Workflow | | | Infer. only | | | 6+ |
| Ours | ML & Research Pipeline | | | Train. & Infer. | | | 10 |

Ag. = Agentic AI; DM. = Data Modality; HCI = Human-Computer Interaction; Ben. = Benchmark; #Dom. = Number of domains; Seq.+ = Sequence and more modalities; Train. = Agent Training; Infer. = Agent Inference; Auto. = Autonomy Level

Ours: An End-to-End Developer Pipeline

Overview of the agentic MLLM framework for scientific discovery

Overview of our framework: the pipeline starts from diverse Input & Output modalities, proceeds through Agent Training and Inference methods, and ends with Evaluation benchmarks, with Human-AI Collaboration integrated at every stage.


⚙️ Methods for Scientific MLLM Agents

Scientific MLLM agents need more than generic instruction following: they must learn domain representations, call tools, ground decisions in evidence, and recover when execution contradicts the plan.
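To make the last two requirements concrete, here is a minimal sketch of an agent loop that calls a tool, records the result as evidence, and replans when the observation contradicts the plan. It is a toy illustration, not a method taken from the survey or from any listed paper: the names `Step`, `AgentLog`, `run_agent`, and `replan_with_observation` are hypothetical, and the "tool" is a plain Python function standing in for a real instrument or simulator.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    tool: str        # name of the tool to call (hypothetical registry key)
    argument: float  # input passed to the tool
    expected: float  # value the plan predicts the tool will return


@dataclass
class AgentLog:
    evidence: List[str] = field(default_factory=list)  # one entry per tool call
    replans: int = 0                                   # how often the plan was revised


def run_agent(
    plan: List[Step],
    tools: Dict[str, Callable[[float], float]],
    replan: Callable[[Step, float, List[Step]], List[Step]],
    max_replans: int = 3,
    tol: float = 1e-6,
) -> AgentLog:
    """Execute a plan step by step, replanning when execution contradicts it."""
    log = AgentLog()
    queue = list(plan)
    while queue:
        step = queue.pop(0)
        observed = tools[step.tool](step.argument)
        # Ground later decisions in evidence: record what actually happened.
        log.evidence.append(f"{step.tool}({step.argument}) = {observed}")
        if abs(observed - step.expected) > tol:
            # Execution contradicts the plan: hand control to the recovery policy.
            if log.replans >= max_replans:
                raise RuntimeError("plan kept contradicting execution")
            log.replans += 1
            queue = replan(step, observed, queue)
    return log


def replan_with_observation(failed: Step, observed: float, remaining: List[Step]) -> List[Step]:
    # Assumed recovery policy: trust the observation and re-check the same step.
    corrected = Step(failed.tool, failed.argument, expected=observed)
    return [corrected] + remaining


# Toy usage: the plan wrongly predicts that squaring 3.0 gives 6.0.
tools = {"square": lambda x: x * x}
plan = [Step("square", 3.0, expected=6.0)]
log = run_agent(plan, tools, replan_with_observation)
```

In this run the first call contradicts the plan, the recovery policy corrects the expectation, and the second call confirms it, so `log` holds two evidence entries and one replan. A real scientific agent would replace the numeric comparison with a domain-specific verifier and the recovery policy with model-driven replanning.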

🏋️ Agent Training

Supervised Fine-Tuning & Tool Instruction

Reinforcement Learning & Verifier Feedback

Contrastive & Adversarial Learning

🚀 Agent Inference

Knowledge Grounding: RAG, Knowledge Graphs & ICL

Planning, Tool Use & Workflow Control

Full-Loop & Self-Correcting Agents

🤝 Multi-Agent Systems


📈 Benchmarks & Evaluation


🧑‍🔬 Human-AI Collaboration


License

This project is licensed under the MIT License - see the LICENSE file for details.
