Welcome to Awesome-Agentic-Clinical-Dialogue. This repo includes papers about methods related to agentic clinical dialogue. We believe that the agentic paradigm is still a largely unexplored area, and we hope this repository will provide you with some valuable insights!
Read our survey paper here: Reinventing Clinical Dialogue: Agentic Paradigms for LLM‑Enabled Healthcare Communication
Courses & Tutorial (🤟Check it out!) • Papers • Datasets • Leading Group
This framework facilitates a systematic analysis of the intrinsic trade-offs between creativity and reliability by categorizing methods into four archetypes: Latent Space Clinicians, Emergent Planners, Grounded Synthesizers, and Verifiable Workflow Automators. For each paradigm, we deconstruct the technical realization across the entire cognitive pipeline, encompassing strategic planning, memory management, action execution, collaboration, and evolution, to reveal how distinct architectural choices balance the tension between autonomy and safety. Furthermore, we bridge abstract design philosophies with the pragmatic implementation ecosystem. By mapping real-world applications to our taxonomy and systematically reviewing benchmarks and evaluation metrics specific to clinical agents, we provide a comprehensive reference for future development.
- Key Categories
- Start with Awesome Dataset
- Tutorial and Courses
- Leading Group
- Awesome Methods, Model and Resource List
- Contributing
- Citation
- 🤖Latent Space Clinicians (LSC). These agents leverage the LLM's vast internal knowledge for creative synthesis and forming a coherent understanding of a clinical situation. Their philosophy is to trust the model's emergent reasoning capabilities to function like an experienced clinical assistant providing insights. For example, the zero/few-shot reasoning capabilities of Med-PaLM or MedAgents exemplify this paradigm.
- 🤖Emergent Planners (EP). This paradigm grants the LLM a high degree of autonomy, allowing it to dynamically devise its own multi-step plan to achieve a complex clinical goal. The agent's behavior is emergent, as it independently determines the necessary steps and goals. Frameworks like AgentMD, which uses ReAct-style prompting, exemplify this paradigm.
- 🤖Grounded Synthesizers (GS). These agents operate under the principle that LLMs should function as powerful natural language interfaces to reliable external information rather than as knowledge creators. Their primary role is to retrieve, integrate, and accurately summarize information from verifiable sources like medical databases or imaging data. Exemplars include medical retrieval and indexing frameworks such as Med-RAG and MA-COIR.
- 🤖Verifiable Workflow Automators (VWA). In this paradigm, agent autonomy is strictly constrained within pre-defined, verifiable clinical workflows or decision trees. The LLM acts as a natural language front-end to a structured process, executing tasks rather than making open-ended decisions, which ensures maximum safety and predictability. This approach is exemplified by commercial triage bots, the structured conversational framework of systems like Google's AMIE, and principles from classic task-oriented dialogue systems such as MeDi-TODER.
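The four categories above can also be kept as a small machine-readable lookup, for example when tagging new entries in this list. The following is a minimal sketch and not part of the survey: the `Paradigm` class and the `paradigm_of` helper are hypothetical names, and the exemplars are only the systems named in the category descriptions above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Paradigm:
    """One archetype from the Key Categories list (hypothetical helper type)."""
    abbr: str
    name: str
    exemplars: tuple[str, ...]


# Exemplars are copied from the category descriptions; the list is not exhaustive.
PARADIGMS = (
    Paradigm("LSC", "Latent Space Clinicians", ("Med-PaLM", "MedAgents")),
    Paradigm("EP", "Emergent Planners", ("AgentMD",)),
    Paradigm("GS", "Grounded Synthesizers", ("Med-RAG", "MA-COIR")),
    Paradigm("VWA", "Verifiable Workflow Automators", ("AMIE", "MeDi-TODER")),
)


def paradigm_of(system: str) -> str | None:
    """Return the abbreviation of the paradigm that lists `system` as an exemplar."""
    key = system.strip().lower()
    for paradigm in PARADIGMS:
        if key in (e.lower() for e in paradigm.exemplars):
            return paradigm.abbr
    return None  # system is not named in the category descriptions
```

For instance, `paradigm_of("AMIE")` returns `"VWA"`, while a system that is not named above returns `None`.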
| Institution | Leading Researcher/Group | Source |
|---|---|---|
| Google Health | | Homepage |
| NIH | Zhiyong Lu | Homepage |
| OpenAI | Health AI Team | Homepage |
| Ant Group | AI for Science Team | Homepage |
| Alibaba | Tongyi Lab, Damo, AQ-Med Lab | Homepage, Homepage, Homepage |
| Shanghai AI Lab | AI for Science Team, AI4Med Team | Homepage, Homepage |
| Baichuan AI | AI Lab | Homepage |
| Meta | FAIR Team | Homepage |
| Tencent | Jarvislab, Xiaobin Hu | Homepage, Homepage |
| Huawei | NoAH | Homepage |
| ByteDance | Seed, AI for Science Team | Homepage |
| Microsoft Research | Hoifung Poon | Homepage |
| Harvard | Xiang Li, Faisal Mahmood Lab, Pranav Rajpurkar, Tianxi Cai | Homepage, Homepage, Homepage, Homepage |
| Maryland | Hanan Samet | Homepage |
| MIT | Paul Liang, Peter Szolovits | Homepage, Homepage |
| Oxford | Tingting Zhu, David A. Clifton, Alison Noble | Homepage, Homepage, Homepage |
| Cambridge | van der Schaar Lab, Andreas Vlachos | Homepage, Homepage |
| NTU | Chunyan Miao | Homepage |
| Tsinghua University | Yang Liu, Hong-Yu Zhou, Weizhi Ma, Medical Informatics Lab | Homepage, Homepage, Homepage, Homepage |
| SJTU | Chaoyi Wu, Weidi Xie, MAGIC | Homepage, Homepage, Homepage |
| UNC | Tianlong Chen, Huaxiu Yao | Homepage, Homepage |
| Yale | Clinical NLP Lab | Homepage |
| UBC | Xiaoxiao Li | Homepage |
| UIUC | Jimeng Sun, Jiawei Han | Homepage, Homepage |
| ZJU | DCDmllm, Jian Wu | Homepage, Homepage |
| Notre Dame | SCLab | Homepage |
| Pennsylvania | Tianyu Han, Fenglong Ma, Lyle Ungar | Homepage, Homepage, Homepage |
| Emory | Carl Yang | Homepage |
| Stanford | SNAP, James Zou, Yejin Choi | Homepage, Homepage, Homepage |
| PKU | Liantao Ma, Yasha Wang | Homepage |
| TJU | ADM Group | Homepage |
| Edinburgh | Ewen M Harrison | Homepage |
| Virginia | Aidong Zhang, Xuan Wang | Homepage, Homepage |
| CUHK | Freedom AI, YuanWu, Michael R. Lyu, Benyou Wang | Homepage, Homepage, Homepage, Homepage |
| CityU | Xiangyu Zhao | Homepage |
| Houston Methodist | Wang Lab | Homepage |
| MBZUAI | Jianing Qiu | Homepage |
| DKFZ | German Cancer Research Center | Homepage |
| California | Yuyin Zhou | Homepage |
| ETH | Michael Moor | Homepage |
| Johns Hopkins | Suchi Saria | Homepage |
| Cornell | Fei Wang, Claire Cardie | Homepage, Homepage |
| GE Healthcare | Xiao Cao | Homepage |
| Rutgers | Mu Zhou | Homepage |
| UT | Ying Ding, Wenqi Shi | Homepage, Homepage |
| UC Berkeley | Bin Yu | Homepage |
| UW | Hannaneh Hajishirzi | Homepage |
| LMU Munich | Volker Tresp | Homepage |
| SBU | Chenyu You | Homepage |
| Fudan | Zhongyu Wei | Homepage |
| Minnesota | Rui Zhang | Homepage |
| Monash | AIM Lab | Homepage |
| USYD | Med AI Lab | Homepage |
| Queen's University | Medi Lab | Homepage |
| Open Source Platform | OpenMed Lab | Homepage |
-
BioGPT: generative pre-trained transformer for biomedical text generation and mining (Briefings Bioinf., 2023) paper, code
A domain-specific generative Transformer pre-trained on large-scale biomedical literature to achieve state-of-the-art performance in text generation and mining tasks.
-
BioBART: Pretraining and Evaluation of A Biomedical Generative Language Model (BioNLP, 2022) paper, code
Adapts the BART architecture to the biomedical domain with enhanced pre-training tasks, significantly improving performance on summarization and dialogue generation.
-
ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission (CHIL, 2020) paper, code
Develops contextual embeddings specifically for clinical notes to effectively predict hospital readmission and model long-term clinical dependencies.
-
BioMegatron: Larger Biomedical Domain Language Model (EMNLP, 2020) paper, code
Leverages the Megatron-LM infrastructure to train a large-scale biomedical language model, demonstrating improvements in named entity recognition and QA tasks.
-
Toward expert-level medical question answering with large language models (Nature, 2023) paper, code
Introduces Med-PaLM 2, which combines instruction fine-tuning with ensemble refinement prompting to reach expert-level performance on USMLE-style questions.
-
CoD: Towards an Interpretable Medical Agent using Chain of Diagnosis (ICML AI4Science, 2024) paper, code
Proposes a Chain of Diagnosis (CoD) framework that breaks down the diagnostic process into interpretable steps to enhance transparency and accuracy.
-
HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge (EMNLP Findings, 2023) paper, code
Incorporates a structured medical knowledge graph into the LLaMA model via instruction tuning to significantly enhance Chinese medical QA capabilities.
-
Learning Causal Alignment for Reliable Disease Diagnosis (ICCV, 2023) paper, code
Introduces a causal alignment framework to mitigate confounding biases in medical data, ensuring more reliable and generalizable disease diagnosis.
-
Reasoning with large language models for medical question answering (npj Digit. Med., 2024) paper
Systematically evaluates different reasoning strategies (like Chain-of-Thought) in LLMs to identify the most effective methods for complex medical QA.
-
Empowering biomedical discovery with AI agents (Nature, 2024) paper
Discusses the paradigm shift towards autonomous AI agents capable of planning and executing experiments to accelerate biomedical research and discovery.
-
A fast nonnegative autoencoder-based approach to latent feature analysis on high-dimensional and incomplete data (IEEE TNNLS, 2024) paper
Proposes a highly efficient nonnegative autoencoder designed to extract latent features from high-dimensional, sparse, and incomplete medical datasets.
-
Multiview latent space learning with progressively fine-tuned deep features for unsupervised domain adaptation (Inf. Sci., 2024) paper
Develops a method to align multiview latent spaces using progressively fine-tuned features, improving unsupervised domain adaptation in medical imaging analysis.
-
Autosurv: interpretable deep learning framework for cancer survival analysis incorporating clinical and multi-omics data (npj Precis. Oncol., 2023) paper, code
A comprehensive and interpretable deep learning framework that integrates clinical and multi-omics data to improve cancer survival prediction accuracy.
-
Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model (arXiv, 2023) paper, code
Presents a multi-stage training strategy to inject massive medical knowledge into LLMs, enhancing their reasoning and dialogue performance in Chinese medical contexts.
-
Counterfactual reasoning using causal Bayesian networks as a healthcare governance tool (Sci. Rep., 2024) paper
Applies causal Bayesian networks to perform counterfactual analysis, providing a quantitative tool for evaluating healthcare policies and governance decisions.
-
Large Language Models for Medical Forecasting - Foresight 2 (arXiv, 2024) paper
Introduces a generative foundation model trained on longitudinal patient records to forecast future medical events and health trajectories.
-
Ontology accelerates few-shot learning capability of large language model: A study in extraction of drug efficacy in a rare pediatric epilepsy (Comput. Methods Programs Biomed., 2025) paper
Demonstrates that integrating domain ontologies significantly boosts the few-shot learning performance of LLMs for information extraction in rare diseases.
-
A generalist medical language model for disease diagnosis assistance (Nat. Med., 2024) paper
Presents AMIE, a generalist medical AI system optimized for diagnostic dialogue that matches or exceeds primary care physicians in simulated diagnostic tasks.
-
Taiyi: A Bilingual Fine-Tuned Large Language Model for Diverse Biomedical Tasks (JAMIA, 2024) paper, code
A bilingual (English/Chinese) LLM specifically fine-tuned to handle a diverse range of biomedical tasks, including NER, RE, and QA.
-
Focus on What Matters: Enhancing Medical Vision-Language Models with Automatic Attention Alignment Tuning (arXiv, 2025) paper
Proposes an Automatic Attention Alignment (AAA) mechanism to align the visual attention of VLMs with clinical masks, enhancing interpretability and performance.
-
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? (EMNLP, 2022) paper, code
Demonstrates that the ground-truth accuracy of labels in demonstrations matters less than the label space and distribution, reshaping the understanding of in-context learning.
-
HuatuoGPT-II: One-stage Training for Medical Adaption of LLMs (ACL Findings, 2024) paper, code
Introduces a one-stage training protocol that unifies medical domain adaptation and general instruction following, simplifying the training pipeline.
-
Diagnostic reasoning prompts reveal the potential for large language model interpretability in medicine (npj Digit. Med., 2024) paper, code
Investigates the use of structured prompting strategies to elicit and visualize the diagnostic reasoning paths of LLMs, improving transparency.
-
AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models (MICCAI, 2024) paper, code
A zero-shot framework utilizing attribute-based text prompts to guide visual-language models in detecting nuclei without task-specific training.
-
A context-based chatbot surpasses radiologists and generic ChatGPT in following the ACR appropriateness guidelines (Sci. Rep., 2023) paper
Develops a specialized chatbot that leverages clinical context to adhere to ACR appropriateness guidelines more accurately than human radiologists.
-
MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context (ECCV, 2024) paper, code
Establishes a comprehensive benchmark and evaluation dataset specifically designed to detect and analyze hallucinations in medical vision-language models.
-
The FAIIR conversational AI agent assistant for youth mental health service provision (npj Digit. Med., 2025) paper
Presents FAIIR, a conversational agent designed to assist in the triage and service provision for youth mental health, reducing clinician workload.
-
Galactica: A Large Language Model for Science (arXiv, 2022) paper, code
A large language model trained on a massive corpus of scientific knowledge, designed to store, reason about, and generate scientific content.
-
Clinical ModernBERT: An efficient and long context encoder for biomedical text (arXiv, 2025) paper
Adapts the ModernBERT architecture to the clinical domain, offering a high-efficiency encoder capable of processing long-context electronic health records.
-
DK-BEHRT: Teaching language models international classification of disease (ICD) codes using known disease descriptions (CHIL, 2024) paper, code
Enhances the BEHRT model by incorporating textual descriptions of diseases, significantly improving the accuracy of automated ICD coding.
-
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs (arXiv, 2024) paper
Benchmarks various long-context LLMs on their ability to extract relevant information from lengthy and complex electronic health records.
-
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models (ACL Findings, 2024) paper, code
Proposes a recursive summarization technique to compress dialogue history, enabling LLMs to maintain long-term memory in medical consultations.
-
Adapted large language models can outperform medical experts in clinical text summarization (Nat. Med., 2024) paper
Provides empirical evidence that domain-adapted LLMs generate clinical summaries that are rated higher in quality and accuracy than those written by human experts.
-
BioLORD-2023: Semantic Textual Representations Fusing LLM and Clinical Knowledge Graph Insights (EMNLP Findings, 2023) paper, code
Produces rich semantic textual representations by grounding LLM generation in definitions and relationships from clinical knowledge graphs.
-
AI-Enabled Conversational Journaling for Advancing Parkinson's Disease Symptom Tracking (arXiv, 2025) paper
Develops a conversational agent that engages patients in journaling to track and analyze Parkinson's disease symptoms over time.
-
MEDCO: Medical Education Copilots Based on A Multi-Agent Framework (ECCV Workshops, 2024) paper
Introduces a multi-agent educational copilot system comprising student, patient, and expert agents to simulate realistic clinical training scenarios.
-
ColaCare: Enhancing Electronic Health Record Modeling through Large Language Model-Driven Multi-Agent Collaboration (WWW, 2025) paper, code
Enhances EHR predictive modeling by using a multi-agent "medical team" (DoctorAgents and MetaAgent) to collaborate on patient data analysis.
-
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs (ACL Findings, 2024) paper, code
A multi-agent framework where diverse LLMs engage in round-table discussions to reach consensus, significantly improving reasoning accuracy.
-
MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration (ACL Findings, 2025) paper, code
Decomposes diagnostic tasks into specialized agent roles (General Practitioner, Specialist, Radiologist) to handle multi-modal medical data effectively.
-
MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making (NeurIPS, 2024) paper, code
Dynamically adapts the collaboration structure (solo vs. group) of LLM agents based on the medical complexity of the query.
-
Self-Evolving Multi-Agent Simulations for Realistic Clinical Interactions (MedAgentSim) (MICCAI, 2025) paper, code
Presents MedAgentSim, a framework where doctor and patient agents interact and evolve their diagnostic strategies through experience without human labeling.
-
MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning (arXiv, 2023) paper, code
Leverages a multi-agent debate mechanism to enhance zero-shot clinical reasoning capabilities by simulating medical consultations.
-
AlphaEvolve: A coding agent for scientific and algorithmic discovery (arXiv, 2025) paper
An evolutionary coding agent from DeepMind capable of autonomously discovering novel algorithms and optimizing code for scientific problems.
-
Revolutionizing healthcare: the role of artificial intelligence in clinical practice (BMC Med. Educ., 2023) paper
A comprehensive review discussing the transformative impact and ethical implications of integrating AI agents into clinical workflows.
-
Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents (arXiv, 2024) paper, code
Simulates a full hospital environment where doctor agents continuously evolve and improve their diagnostic skills by treating patient agents.
-
STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Question-Answering (EMNLP, 2024) paper, code
Uses a self-training pipeline with Direct Preference Optimization (DPO) to improve medical VLM performance using auto-generated data.
-
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents (arXiv, 2025) paper
Proposes a framework for open-ended agent evolution where the system can rewrite its own code to continuously improve its learning and reasoning mechanisms.
-
Towards Medical Complex Reasoning with LLMs through Medical Verifiable Problems (ACL Findings, 2025) paper
Introduces the MedVP dataset, focusing on verifiable medical problems to benchmark and enhance the complex reasoning capabilities of LLMs.
-
Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback (AAAI, 2024) paper, code
Enhances Chinese medical LLMs using a complete RLHF pipeline with expert doctors involved in the feedback loop to ensure professional accuracy.
-
Advancing Biomedical Claim Verification by Using Large Language Models with Better Structured Prompting Strategies (BioNLP, 2025) paper
Evaluates various prompting strategies, such as chain-of-thought and self-consistency, to improve the accuracy of biomedical claim verification.
-
Generating Explanations in Medical Question-Answering by Expectation Maximization Inference over Evidence (EMNLP Findings, 2023) paper, code
Proposes a latent variable model using Expectation Maximization to select relevant evidence and generate high-quality explanations for medical questions.
-
Self-Consistency Improves Chain of Thought Reasoning in Language Models (ICLR, 2023) paper, code
Introduces a decoding strategy that samples multiple reasoning paths and selects the most consistent answer, significantly boosting performance on reasoning tasks.
-
S2AF: An action framework to self-check the Understanding Self-Consistency of Large Language Models (Neural Netw., 2025) paper
Develops a framework that enables LLMs to self-evaluate their understanding and consistency through an action-based checking mechanism.
-
Ranked Voting based Self-Consistency of Large Language Models (arXiv, 2025) paper
Proposes a ranked voting mechanism to aggregate outputs from self-consistency sampling, offering better robustness than simple majority voting.
-
A comparative evaluation of chain-of-thought-based prompt engineering techniques for medical question answering (Sci. Rep., 2025) paper
Systematically benchmarks different Chain-of-Thought prompting variations to identify the most effective strategies for medical exams.
-
Tree-Planner: Efficient Close-loop Task Planning with Large Language Models (ICLR, 2024) paper, code
Formulates task planning as a tree search problem, allowing agents to perform efficient closed-loop planning and error correction.
-
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models (ICLR, 2023) paper
A prompting strategy that decomposes complex problems into a sequence of simpler sub-problems, solving them sequentially to guide the model.
-
Prompt engineering in consistency and reliability with the evidence-based guideline for LLMs (npj Digit. Med., 2024) paper
Investigates how guideline-based prompting improves the consistency and clinical reliability of LLM responses in medical decision support.
-
Cost-Effective Framework with Optimized Task Decomposition and Batch Prompting for Medical Dialogue Summary (CIKM, 2023) paper
Proposes a framework that reduces API costs while maintaining summary quality by optimizing task decomposition and using batch prompting.
-
A brain-inspired agentic architecture to improve planning with LLMs (Nat. Commun., 2025) paper
Draws inspiration from human cognitive processes to design an agent architecture that separates planning, execution, and monitoring for better reliability.
-
Self-critiquing models for assisting human evaluators (NeurIPS, 2022) paper
Trains models to generate natural language critiques of their own or others' outputs, helping human annotators find errors more efficiently.
-
FRAME: Feedback-Refined Agent Methodology for Enhancing Medical Research Insights (arXiv, 2025) paper
An agentic framework that iteratively refines its analysis of medical research papers based on structured feedback loops.
-
Agentic Feedback Loop Modeling Improves Recommendation and User Simulation (WWW, 2025) paper
Models the interaction between recommender agents and user simulator agents as a feedback loop to improve long-term recommendation utility.
-
MOTOR: A Time-To-Event Foundation Model For Structured Medical Records (MLHC, 2023) paper, code
A foundation model pre-trained on longitudinal structured medical records to perform time-to-event prediction tasks with high accuracy.
-
Agentic LLM Workflows for Generating Patient-Friendly Medical Reports (arXiv, 2024) paper
Proposes a multi-agent workflow that transforms complex clinical notes into patient-friendly reports, improving accessibility and understanding.
-
Insights from high and low clinical users of telemedicine: a mixed-methods study of clinician workflows, sentiments, and user experiences (npj Digit. Med., 2025) paper
A mixed-methods study analyzing clinician workflows and sentiments to understand the factors driving high versus low adoption of telemedicine.
-
Evaluating large language model workflows in clinical decision support for triage and referral and diagnosis (npj Digit. Med., 2025) paper
Systematically evaluates LLM-based workflows in clinical decision support systems, specifically focusing on their safety and accuracy in triage and referral.
-
SoftTiger: A Clinical Foundation Model for Healthcare Workflows (arXiv, 2024) paper, code
Introduces a LLaMA-based clinical foundation model optimized to integrate seamlessly into various healthcare workflows, from summarization to triage.
-
STAF-LLM: A scalable and task-adaptive fine-tuning framework for large language models in medical domain (Expert Syst. Appl., 2025) paper
Presents a scalable framework for task-adaptive fine-tuning that efficiently adapts general LLMs to specific medical tasks with limited resources.
-
Addressing Overprescribing Challenges: Fine-Tuning Large Language Models for Medication Recommendation Tasks (arXiv, 2025) paper
Investigates fine-tuning strategies for LLMs to generate safer medication recommendations, specifically targeting the reduction of overprescribing errors.
-
From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain (Artif. Intell. Med., 2024) paper
Provides a comprehensive comparative analysis of pre-training versus fine-tuning strategies for adapting LLMs to biomedical downstream tasks.
-
Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models (MICCAI, 2023) paper, code
Utilizes prefix tuning to adapt frozen language models for medical visual question answering, achieving high performance with few trainable parameters.
-
Diagnosing Transformers: Illuminating Feature Spaces for Clinical Decision-Making (NeurIPS, 2023) paper
Analyzes the internal feature spaces of Transformer models to interpret how they represent clinical concepts and make decisions.
-
Embedding dynamic graph attention mechanism into Clinical Knowledge Graph for enhanced diagnostic accuracy (Expert Syst. Appl., 2024) paper
Integrates a dynamic graph attention mechanism into clinical knowledge graphs to capture evolving patient states for more accurate diagnosis.
-
HALO: Hallucination Analysis and Learning Optimization to Empower LLMs with Retrieval-Augmented Context for Guided Clinical Decision Making (AAAI, 2025) paper
A framework designed to detect and mitigate hallucinations in clinical decision-making by optimizing the retrieval-augmented context.
-
Instruction Tuning and CoT Prompting for Contextual Medical QA with LLMs (arXiv, 2025) paper
Explores the synergistic effect of instruction tuning and Chain-of-Thought prompting to enhance the contextual understanding of medical QA models.
-
LIFE-CRAFT: A Multi-agentic Conversational RAG Framework for Lifestyle Medicine Coaching with Context Traceability and Case-Based Evidence Synthesis (HCII, 2024) paper
A multi-agent RAG system designed for lifestyle medicine coaching that ensures advice is traceable to case-based evidence.
-
MedLA: A Logic-Driven Multi-Agent Framework for Complex Medical Reasoning with Large Language Models (arXiv, 2025) paper
Proposes a logic-driven multi-agent framework where agents organize reasoning into explicit syllogistic trees to ensure transparent and verifiable medical decision-making.
-
ConfAgents: A Conformal-Guided Multi-Agent Framework for Cost-Efficient Medical Diagnosis (arXiv, 2025) paper
Introduces a conformal prediction-based triage mechanism that dynamically assigns cases to single agents or multi-agent teams, balancing accuracy and computational cost.
-
Advancing Healthcare Automation: Multi-Agent System for Medical Necessity Justification (BioNLP, 2024) paper
Deploys a multi-agent system to automate the labor-intensive process of prior authorization by justifying medical necessity against clinical guidelines.
-
A Two-Stage Proactive Dialogue Generator for Efficient Clinical Information Collection Using Large Language Model (Expert Syst. Appl., 2025) paper
Develops a diagnostic dialogue system with a two-stage recommendation structure to proactively collect critical patient information and mimic real-doctor conversational styles.
-
Mediator-Guided Multi-Agent Collaboration among Open-Source Models for Medical Decision-Making (arXiv, 2025) paper
Utilizes a mediator agent to facilitate Socratic dialogue and reflection among open-source Vision-Language Models (VLMs), enhancing multimodal diagnostic performance.
-
DynamiCare: A Dynamic Multi-Agent Framework for Interactive and Open-Ended Medical Decision-Making (arXiv, 2025) paper
Models clinical diagnosis as a dynamic, multi-round loop where the agent team iteratively queries a patient system (MIMIC-Patient) and adapts its strategy based on new findings.
-
MAS-PatientCare: Medical Diagnosis and Patient Management System Based on a Multi-agent Architecture (Springer CCIS, 2025) paper
Proposes a comprehensive multi-agent architecture for remote patient monitoring that integrates diagnostic reasoning with patient management workflows.
-
Inquire, Interact, and Integrate: A Proactive Agent Collaborative Framework for Zero-Shot Multimodal Medical Reasoning (arXiv, 2024) paper
A proactive framework that enables agents to autonomously inquire about missing modalities and integrate multimodal evidence for zero-shot medical reasoning.
-
Self-Evolving Multi-Agent Simulations for Realistic Clinical Interactions (MedAgentSim) (MICCAI, 2025) paper, code
Introduces a simulation environment where doctor and patient agents interact and self-evolve through experience replay and feedback, significantly improving diagnostic realism.
-
Integrating Dynamical Systems Learning with Foundational Models: A Meta-Evolutionary AI Framework for Clinical Trials (arXiv, 2025) paper
Combines dynamical systems theory with LLMs to create a meta-evolutionary framework that optimizes clinical trial designs and simulates patient trajectories.
-
MedPAO: A Protocol-Driven Agent for Structuring Medical Reports (HCII, 2025) paper
Presents an agent that strictly follows medical protocols to structure unstructured clinical reports, ensuring high compliance and data quality.
-
Agentic Surgical AI: Surgeon Style Fingerprinting and Privacy Risk Quantification via Discrete Diffusion in a Vision-Language-Action Framework (arXiv, 2025) paper
Explores the privacy risks of agentic surgical AI by demonstrating how "surgeon style" can be identified and protected using discrete diffusion models.
-
Improving Interactive Diagnostic Ability of a Large Language Model Agent Through Clinical Experience Learning (arXiv, 2025) paper
Enhances the initial diagnostic capabilities of LLM agents by simulating clinical experience learning, bridging the gap between passive knowledge and active inquiry.
-
Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making (arXiv, 2025) paper
Introduces a "Catfish Agent" designed to inject structured dissent into multi-agent discussions, preventing premature consensus (groupthink) in medical diagnosis.
-
HyKGE: A Hypothesis Knowledge Graph Enhanced Framework for Accurate and Reliable Medical LLMs Responses (ACL Findings, 2024) paper, code
Constructs a hypothesis-driven knowledge graph to verify intermediate reasoning steps, ensuring LLM responses are grounded in medical facts.
-
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback (ICLR, 2023) paper, code
An iterative framework where the model retrieves external knowledge and refines its answer based on automated feedback to reduce hallucinations.
-
KGARevion: An AI Agent for Knowledge-Intensive Biomedical QA (arXiv, 2024) paper
An agentic system capable of reviewing and refining its own retrieval and reasoning processes for high-difficulty biomedical questions.
-
EvidenceMap: Learning Evidence Analysis to Unleash the Power of Small Language Models for Biomedical Question Answering (arXiv, 2025) paper
Maps complex evidence chains into structured representations, enabling smaller language models to perform expert-level evidence analysis.
-
Infusing Multi-Hop Medical Knowledge Into Smaller Language Models for Biomedical Question Answering (IEEE JBHI, 2025) paper
Proposes a method to inject structured multi-hop reasoning capabilities from Knowledge Graphs into smaller models to improve efficiency.
-
Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions (EMNLP Findings, 2024) paper
Enhances RAG by generating iterative follow-up questions to clarify ambiguities and retrieve more precise medical context.
-
MedicalGLM: A Pediatric Medical Question Answering Model with a Quality Evaluation Mechanism (BMC Med. Inform. Decis. Mak., 2025) paper
A fine-tuned GLM for pediatrics equipped with a self-evaluation module that assesses the reliability of its own generated advice.
-
A cascaded retrieval-while-reasoning multi-document comprehension framework with incremental attention for medical question answering (Expert Syst. Appl., 2024) paper
Introduces a cascaded framework that interleaves retrieval and reasoning steps with incremental attention to handle multi-document contexts.
-
K-COMP: Retrieval-Augmented Medical Domain Question Answering With Knowledge-Injected Compressor (arXiv, 2025) paper
Uses a knowledge-injected compressor to condense retrieved documents, reducing noise and context length while retaining critical medical facts.
-
MEPNet: Medical Entity-balanced Prompting Network for Brain CT Report Generation (arXiv, 2025) paper
A prompting network designed to balance the generation of medical entities in CT reports, ensuring comprehensive and accurate reporting.
-
Knowledge-Induced Medicine Prescribing Network for Medication Recommendation (Artif. Intell. Med., 2025) paper
Integrates pharmaceutical knowledge graphs into a deep learning network to provide safe and effective medication combinations.
-
Improving Clinical Question Answering with Multi-Task Learning: A Joint Approach for Answer Extraction and Medical Categorization (arXiv, 2025) paper
A multi-task learning framework that jointly optimizes for answer extraction and medical category classification to improve overall QA performance.
-
Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs (arXiv, 2025) paper
Decomposes model responses into atomic facts and verifies them against retrieved evidence to enhance reliability and explainability.
-
Bias Evaluation and Mitigation in Retrieval-Augmented Medical Question-Answering Systems (arXiv, 2025) paper
Systematically evaluates sources of bias in medical RAG systems and proposes mitigation strategies to ensure equitable healthcare advice.
-
Rationale-Guided Retrieval Augmented Generation for Medical Question Answering (NAACL, 2025) paper
Generates rationales first to guide the retrieval process, ensuring that retrieved documents support the logical reasoning path.
-
Infusing Multi-Hop Medical Knowledge Into Smaller Language Models for Biomedical Question Answering (IEEE JBHI, 2025) paper
(Also listed in Planning) Enhances the memory capacity of small models by embedding multi-hop relations from medical knowledge graphs.
-
Seek Inner: LLM-Enhanced Information Mining for Medical Visual Question Answering (ACM MM, 2024) paper
Mines implicit medical knowledge from Large Language Models to supplement visual features in Medical VQA tasks.
-
MMedAgent: Learning to Use Medical Tools with Multi-modal Agent (NeurIPS, 2024) paper, code
A multimodal agent framework that learns to retrieve and use external medical tools (such as calculators and search) to solve complex cases.
-
ReflecTool: Towards Reflection-Aware Tool-Augmented Clinical Agents (ACL Findings, 2025) paper
Enables clinical agents to reflect on the sufficiency of their current information and autonomously decide when to use tools.
-
RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering (arXiv, 2025) paper
Introduces a recurrence mechanism where the model's own generation is used to refine subsequent retrieval queries for better factuality.
-
Adaptive Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical Knowledge (arXiv, 2025) paper
Proposes a framework where the medical knowledge graph is adaptively updated based on new findings to keep the QA system current.
-
MedEx: Enhancing Medical Question-Answering with First-Order Logic based Reasoning and Knowledge Injection (COLING, 2025) paper
Combines neural generation with symbolic First-Order Logic to inject strict medical constraints and knowledge into the memory of the QA system.
-
Explainable Knowledge-Based Learning for Online Medical Question Answering (PRICAI, 2024) paper
An online learning approach that updates the model's knowledge base continuously while providing explainable reasoning paths.
-
Efficient Medical Question Answering with Knowledge-Augmented Question Generation (ClinicalNLP, 2024) paper
Augments the training data (memory) of QA models by generating diverse synthetic medical questions grounded in knowledge bases.
-
Leveraging long context in retrieval augmented language models for medical question answering (npj Digit. Med., 2025) paper
Investigates the trade-offs and synergies between using long-context windows and RAG for accessing vast medical knowledge.
-
KoSEL: Knowledge subgraph enhanced large language model for medical question answering (Artif. Intell. Med., 2024) paper
Retrieves relevant subgraphs from a medical knowledge graph to provide structured context, enhancing the LLM's reasoning for medical QA.
-
Are my answers medically accurate? Exploiting medical knowledge graphs for medical question answering (Appl. Intell., 2024) paper
Proposes a framework that cross-references LLM-generated answers with facts extracted from medical knowledge graphs to ensure accuracy.
-
Infusing Multi-Hop Medical Knowledge Into Smaller Language Models for Biomedical Question Answering (IEEE JBHI, 2025) paper
Enables smaller language models to perform complex biomedical QA by injecting multi-hop reasoning paths derived from knowledge graphs.
-
Improving Clinical Question Answering with Multi-Task Learning: A Joint Approach for Answer Extraction and Medical Categorization (arXiv, 2025) paper
A multi-task learning approach that simultaneously optimizes answer extraction and question categorization to improve clinical QA performance.
-
Beyond EHRs: External Clinical knowledge and cohort Features for medication recommendation (Artif. Intell. Med., 2025) paper
Integrates external clinical knowledge graphs with patient cohort features for precise medication recommendation.
-
MEPNet: Medical Entity-balanced Prompting Network for Brain CT Report Generation (arXiv, 2025) paper
A network that balances the generation of various medical entities in CT reports through specialized prompting actions.
-
Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs (arXiv, 2025) paper
Enhances RAG systems by decomposing answers into atomic facts and verifying each against retrieved evidence for better reliability.
-
K-COMP: Retrieval-Augmented Medical Domain Question Answering With Knowledge-Injected Compressor (arXiv, 2025) paper
Employs a compressor module injected with medical knowledge to condense retrieved documents, optimizing the context for the LLM.
-
MedCoT-RAG: Causal Chain-of-Thought RAG for Medical Question Answering (arXiv, 2025) paper
Combines retrieval augmentation with causal chain-of-thought reasoning to explain the causal relationships behind medical answers.
-
Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings (IEEE BigData, 2024) paper
Utilizes knowledge graph embeddings to efficiently retrieve relevant medical concepts, improving QA speed and accuracy.
-
MediTriR: A Triple-Driven Approach to Retrieval-Augmented Generation for Medical Question Answering Tasks (IEEE Access, 2025) paper
A RAG approach driven by knowledge triples (Subject-Predicate-Object) to ensure the retrieval of structured and precise medical information.
-
Medical Knowledge Graph QA for Drug-Drug Interaction Prediction based on Multi-hop Machine Reading Comprehension (arXiv, 2022) paper
Predicts drug-drug interactions by treating the task as a multi-hop machine reading comprehension problem over a knowledge graph.
-
MediSearch: Advanced Medical Web Search Engine (IEEE ICHI, 2023) paper
A specialized search engine framework that aggregates and filters medical information from the web to provide authoritative health answers.
-
Evaluating search engines and large language models for answering health questions (npj Digit. Med., 2025) paper
A comparative study evaluating the accuracy, safety, and completeness of traditional search engines versus LLMs in answering health queries.
-
Leveraging long context in retrieval augmented language models for medical question answering (npj Digit. Med., 2025) paper
Examines the effectiveness of using long-context LLMs to process extensive retrieved medical documents compared to standard chunking methods.
-
Using Internet search engines to obtain medical information: a comparative study (J. Med. Internet Res., 2012) paper
A foundational study (cited for context) comparing the efficacy of general-purpose search engines in retrieving accurate medical information.
-
Large language model agents can use tools to perform clinical calculations (npj Digit. Med., 2025) paper
Demonstrates that LLM agents equipped with external calculator tools significantly outperform base models when computing complex clinical scores (e.g., MELD).
-
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling (arXiv, 2024) paper
Enables LLM agents to execute nested tool calls, allowing them to handle complex medical calculations that require intermediate steps.
-
MMedAgent: Learning to Use Medical Tools with Multi-modal Agent (NeurIPS, 2024) paper, code
A framework where multimodal agents learn to autonomously select and utilize various medical tools (search, calculators) to solve clinical problems.
-
KMTLabeler: An Interactive Knowledge-Assisted Labeling Tool for Medical Text Classification (IEEE ICASSP, 2024) paper
An interactive tool that uses medical knowledge to assist human annotators in labeling clinical text, improving efficiency and consistency.
-
ADEPT: An advanced data exploration and processing tool for clinical data insights (Database, 2025) paper
A comprehensive software tool designed for the exploration, cleaning, and preprocessing of large-scale clinical datasets for research.
-
Error Detection in Medical Note through Multi Agent Debate (BioNLP, 2025) paper
Utilizes a multi-agent debate framework where agents critically analyze medical notes to identify and reach consensus on documentation errors.
-
Multi-modal Medical Diagnosis via Large-small Model Collaboration (IEEE, 2025) paper
Proposes a collaborative framework where large multi-modal models guide smaller, efficient models to improve diagnostic accuracy on resource-constrained devices.
-
MACD: Multi-Agent Clinical Diagnosis with Self-Learned Knowledge for LLM (arXiv, 2025) paper
A multi-agent system where agents self-learn from historical diagnostic cases to build a shared knowledge base, enhancing collaborative decision-making.
-
MedSentry: Understanding and Mitigating Safety Risks in Medical LLM Multi-Agent Systems (arXiv, 2025) paper
A comprehensive study and framework for identifying, categorizing, and mitigating safety risks (e.g., toxicity, bias) arising from agent interactions.
-
MedConMA: A Confidence-Driven Multi-agent Framework for Medical Q&A (Springer, 2025) paper
Introduces a confidence-driven mechanism where agents weigh their contributions to the final answer based on their self-assessed certainty levels.
-
MDTeamGPT: A Self-Evolving LLM-based Multi-Agent Framework for Multi-Disciplinary Team Medical Consultation (arXiv, 2025) paper
Simulates a Multi-Disciplinary Team (MDT) consultation where agents evolve their collaborative strategies over time to solve complex cancer cases.
-
Adaptive Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical Knowledge (arXiv, 2025) paper
(Previously listed as Agentic Medical Knowledge Graphs...) A framework that autonomously updates its knowledge graph using agentic search to reflect the latest medical research.
-
Large language model agents can use tools to perform clinical calculations (npj Digit. Med., 2025) paper
Demonstrates that enabling LLM agents to autonomously identify the need for and use clinical calculators significantly reduces computational errors.
-
MACD: Multi-Agent Clinical Diagnosis with Self-Learned Knowledge for LLM (arXiv, 2025) paper
(Also listed in Cooperation) Highlights the self-evolution aspect: the system improves its diagnostic logic by accumulating self-learned knowledge.
-
MedAgent-Pro: Towards Evidence-based Multi-modal Medical Diagnosis via Reasoning Agentic Workflow (arXiv, 2025) paper
An advanced agentic workflow that iteratively gathers multimodal evidence and refines its reasoning path to provide evidence-based diagnoses.
-
Improving Self-training with Prototypical Learning for Source-Free Domain Adaptation on Clinical Text (BioNLP, 2024) paper
Combines self-training with prototypical learning to adapt clinical NLP models to new hospitals or domains without accessing source data.
-
ReflecTool: Towards Reflection-Aware Tool-Augmented Clinical Agents (ACL Findings, 2025) paper
Enables agents to "reflect" on their outputs and tool usage history, allowing them to self-correct and optimize their tool selection strategies.
-
TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews (arXiv, 2025) paper
A multi-agent framework that assists researchers in performing thematic analysis of clinical interviews, learning from human feedback to improve coding quality.
-
VITA: 'Carefully Chosen and Weighted Less' Is Better in Medication Recommendation (AAAI, 2024) paper, code
Proposes a medication recommendation framework that prioritizes selecting the most critical drugs over comprehensive but redundant lists, improving safety.
-
EMRs2CSP: Mining Clinical Status Pathway from Electronic Medical Records (ACL Findings, 2025) paper
Extracts Clinical Status Pathways (CSP) from EHRs to model the temporal progression of patient states, aiding in proactive clinical planning.
-
HealthBranches: Synthesizing Clinically-Grounded Question Answering Datasets via Decision Pathways (arXiv, 2025) paper
Generates synthetic QA datasets by simulating clinical decision pathways (branches), ensuring the data reflects realistic diagnostic logic.
-
Streamlining evidence based clinical recommendations with large language models (npj Digit. Med., 2025) paper, code
A comprehensive study on using LLMs to translate clinical questions directly into evidence-based recommendations, evaluating their utility in decision support.
-
CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation (arXiv, 2025) paper
Establishes a benchmark for calculating Medical Quality Control Indicators (MQCIs) from medical records, testing LLMs' ability to perform precise administrative planning.
-
Augmenting Black-box LLMs with Medical Textbooks for Biomedical Question Answering (arXiv, 2023) paper
Enhances black-box LLMs by retrieving relevant context from trusted medical textbooks, improving the accuracy of biomedical planning and QA.
-
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering (ACL Findings, 2024) paper
A benchmark designed to evaluate long-context clinical reading comprehension, essential for planning based on extensive patient history.
-
Listening to Patients: Detecting and Mitigating Patient Misreport in Medical Dialogue System (ACL Findings, 2025) paper
Addresses the planning challenge where patients provide incorrect information, proposing a mechanism to detect and mitigate these misreports during dialogue.
-
Visual and Domain Knowledge for Professional-level Graph-of-Thought Medical Reasoning (ICML, 2025) paper
Utilizes a Graph-of-Thought approach integrated with visual and domain knowledge to achieve professional-level reasoning in medical diagnostics.
-
MedPlan: A Two-Stage RAG-Based System for Personalized Medical Plan Generation (arXiv, 2025) paper
Generates personalized treatment plans by first retrieving general guidelines and then adapting them to specific patient data in a two-stage process.
-
PIPA: A Unified Evaluation Protocol for Diagnosing Interactive Planning Agents (arXiv, 2025) paper
A protocol for evaluating interactive agents on their ability to plan diagnostic inquiries and gather information efficiently.
-
RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering (arXiv, 2025) paper
Introduces a recurrence mechanism where the model's own generation is used to refine subsequent retrieval queries for better factuality.
-
End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning (arXiv, 2025) paper
Trains an end-to-end agentic system that not only diagnoses but also provides a traceable reasoning path linked to retrieved evidence.
-
Labeling-free RAG-enhanced LLM for intelligent fault diagnosis via reinforcement learning (Eng. Appl. Artif. Intell., 2025) paper
[Methodology] Integrates RAG and RL for fault diagnosis without labeled data. (Note: the domain is primarily industrial fault diagnosis, not clinical.)
-
The Helicobacter pylori AI-clinician harnesses artificial intelligence to personalise H. pylori treatment recommendations (Nat. Commun., 2025) paper
An AI-clinician system that personalizes antibiotic treatment plans for H. pylori infection, significantly improving eradication rates.
-
Continual contrastive reinforcement learning: Towards stronger agent for environment-aware fault diagnosis of aero-engines through long-term optimization under highly imbalance scenarios (Eng. Appl. Artif. Intell., 2025) paper
[Methodology] A reinforcement learning agent for diagnosing aero-engine faults. (Note: the domain is industrial engineering, not clinical.)
-
Integration of Multi-Source Medical Data for Medical Diagnosis Question Answering (IEEE Access, 2024) paper
Proposes a method to integrate heterogeneous medical data sources (text, structured data) to answer diagnostic questions more accurately.
-
Stage-Aware Hierarchical Attentive Relational Network for Diagnosis Prediction (IEEE JBHI, 2023) paper
A hierarchical network that captures the stage-wise progression of diseases from EHR data for precise diagnosis prediction.
-
RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models (arXiv, 2024) paper
Enhances the factuality of medical VLMs by retrieving and grounding responses in reliable multimodal evidence during generation.
-
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering (ACL Findings, 2024) paper
Establishes a benchmark specifically designed to evaluate the clinical reading comprehension and long-term knowledge recall capabilities of LLMs.
-
PIPA: A Unified Evaluation Protocol for Diagnosing Interactive Planning Agents (arXiv, 2025) paper
(Also listed in Planning) A protocol evaluating how agents manage diagnostic history and plan information-gathering steps in interactive scenarios.
-
EMRs2CSP: Mining Clinical Status Pathway from Electronic Medical Records (ACL Findings, 2025) paper
(Also listed in Planning) Mines Clinical Status Pathways (CSP) to represent the temporal progression and memory of patient states from EHRs.
-
Medical Graph RAG: Evidence-based Medical Large Language Model via Graph Retrieval-Augmented Generation (ACL, 2025) paper
Enhances LLM memory by integrating a medical knowledge graph into the RAG process, ensuring generation is grounded in structured evidence.
-
MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot (arXiv, 2025) paper
Uses knowledge graph-elicited reasoning to optimize the retrieval component, providing a more robust memory mechanism for healthcare copilots.
-
CardioTRAP: Design of a Retrieval Augmented System (RAG) for Clinical Data in Cardiology (IEEE, 2025) paper
Designs a specialized RAG system for cardiology that effectively retrieves and utilizes patient-specific clinical data (memory) for decision support.
-
CLI-RAG: A Retrieval-Augmented Framework for Clinically Structured and Context Aware Text Generation with LLMs (arXiv, 2025) paper
A RAG framework capable of handling structured clinical data and maintaining context awareness during long-text generation.
-
HI-DR: Exploiting Health Status-Aware Attention and an EHR Graph+ for Effective Medication Recommendation (AAAI, 2025) paper
Utilizes a health status-aware attention mechanism and an enhanced EHR graph to capture patient history memory for precise medication recommendation.
-
Listening to Patients: Detecting and Mitigating Patient Misreport in Medical Dialogue System (ACL Findings, 2025) paper
(Also listed in Planning) Focuses on verifying the reliability of patient-provided information (memory of symptoms) during medical dialogues.
-
End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning (arXiv, 2025) paper
(Also listed in Planning) Trains an agentic system where the retrieval (memory) and reasoning components are optimized end-to-end for traceability.
-
CardioTRAP: Design of a Retrieval Augmented System (RAG) for Clinical Data in Cardiology (IEEE Access, 2025) paper
Designs a specialized RAG system tailored for cardiology that retrieves and processes patient-specific clinical data to support cardiologist decision-making.
-
CLI-RAG: A Retrieval-Augmented Framework for Clinically Structured and Context Aware Text Generation with LLMs (arXiv, 2025) paper
Introduces a RAG framework capable of handling the complex structure and context of clinical texts, enabling more accurate medical report generation.
-
HI-DR: Exploiting Health Status-Aware Attention and an EHR Graph+ for Effective Medication Recommendation (AAAI, 2025) paper
(Also listed in Memory) Uses an action-oriented recommendation engine that leverages health status-aware attention and EHR graphs to prescribe medications.
-
Medical Graph RAG: Evidence-based Medical Large Language Model via Graph Retrieval-Augmented Generation (ACL, 2025) paper
(Also listed in Memory) Enhances the retrieval action by utilizing a medical knowledge graph to ground LLM generations in structured, evidence-based facts.
-
MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot (arXiv, 2025) paper
Optimizes the retrieval action through knowledge graph-elicited reasoning, improving the relevance and accuracy of information provided by healthcare copilots.
-
KPL: Training-Free Medical Knowledge Mining of Vision-Language Models (arXiv, 2025) paper
Proposes a training-free method to actively mine and extract medical knowledge hidden within pre-trained Vision-Language Models.
-
End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning (arXiv, 2025) paper
Trains an agentic system to perform end-to-end diagnostic actions where every reasoning step is traceable to a specific retrieved document.
-
SearchRAG: Can Search Engines Be Helpful for LLM-based Medical Question Answering? (arXiv, 2025) paper
Investigates the utility of integrating commercial search engine actions into the RAG pipeline to supplement internal knowledge bases for medical QA.
-
Enhancing medical information retrieval: Re-engineering the tala-med search engine for improved performance and flexibility (BMC Med. Inform. Decis. Mak., 2025) paper
Details the re-engineering of the 'tala-med' search engine, optimizing its architecture for more flexible and high-performance medical information retrieval.
-
Designing a Distributed LLM-Based Search Engine as a Foundation for Agent Discovery (IEEE, 2025) paper
Proposes a distributed architecture for LLM-based search that serves as a foundational layer for autonomous agents to discover and access medical knowledge.
-
How the Algorithmic Transparency of Search Engines Influences Health Anxiety: The Mediating Effects of Trust in Online Health Information Search (CHI, 2025) paper
A user study analyzing how the transparency of search engine algorithms affects user trust and health anxiety during online health information seeking.
-
Transforming Medical Data Access: The Role and Challenges of Recent Language Models in SQL Query Automation (MIPRO, 2024) paper
Evaluates the capability of LLMs to automate SQL query generation (Text-to-SQL), facilitating easier access to medical databases for non-technical users.
-
Improving Interactive Diagnostic Ability of a Large Language Model Agent Through Clinical Experience Learning (arXiv, 2025) paper
(Also listed in Self-evolution) Enhances the agent's diagnostic actions by allowing it to learn from simulated clinical experiences and feedback.
-
Designing VR Simulation System for Clinical Communication Training with LLMs-Based Embodied Conversational Agents (arXiv, 2025) paper
Integrates LLM-based embodied agents into a Virtual Reality simulation to train medical students in clinical communication actions.
-
Enhancing Clinical Trial Patient Matching through Knowledge Augmentation and Reasoning with Multi-Agent (arXiv, 2024) paper
Introduces MAKA, a multi-agent framework that improves patient-trial matching by dynamically augmenting criteria with domain knowledge and performing structured reasoning.
-
TeamMedAgents: Enhancing Medical Decision-Making of LLMs Through Structured Teamwork (arXiv, 2025) paper
Integrates the "Big Five" human teamwork components (e.g., leadership, trust) into a multi-agent system to systematically improve medical decision-making.
-
ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World (ACL Findings, 2025) paper, code
Presents a comprehensive suite for aligning and evaluating medical agents across 24 clinical departments, featuring a realistic benchmark (ClinicalBench).
-
The Optimization Paradox in Clinical AI Multi-Agent Systems (arXiv, 2025) paper
Reveals a paradox where systems built from individually optimized "best-of-breed" components underperform due to poor information flow, advocating for end-to-end system validation.
-
EvoAgentX: An Automated Framework for Evolving Agentic Workflows (arXiv, 2025) paper, code
An open-source platform that automates the generation and evolutionary optimization of multi-agent workflows using algorithms like TextGrad and AFlow.
-
MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning (arXiv, 2025) paper, code
Proposes an agent that evolves through "learning by doing," autonomously creating tools and building a knowledge base from its own experiences.
-
ZERA: Zero-init Instruction Evolving Refinement Agent (EMNLP, 2025) paper, code
An automated prompt optimization agent that evolves structured prompts from zero initial instructions using principle-based self-correction.
-
HealthBranches: Synthesizing Clinically-Grounded Question Answering Datasets via Decision Pathways (arXiv, 2025) paper
(Also listed in Planning) A benchmark generation framework that synthesizes QA datasets from clinical decision pathways to test complex reasoning.
-
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence (arXiv, 2025) paper
A comprehensive survey categorizing self-evolving agents by what (model/tool/context), when, and how they evolve, positioning them as a path to ASI.
-
Evolving Collective Cognition in Human-Agent Hybrid Societies: How Agents Form Stances and Boundaries (CogSci, 2025) paper
Investigates the emergence of collective cognition and social boundaries in hybrid societies where humans and self-evolving agents interact.
Your contributions are always welcome! Please contact Xiaoquan Zhi or Chuang Zhao.
If you find this repository useful for your research, please cite our paper:
@article{zhi2025reinventing,
title={Reinventing Clinical Dialogue: Agentic Paradigms for LLM-Enabled Healthcare Communication},
author={ADM Lab},
journal={arXiv preprint arXiv:2512.01453},
year={2025}
}
