AI4Scientist/awesome-autoresearch

Awesome AutoResearch

A curated list of AutoResearch tools and frameworks that can autonomously conduct research, design experiments, analyze data, write papers, and generate scientific discoveries.

AutoResearch represents a new paradigm in scientific discovery in which AI systems perform the complete research lifecycle, from hypothesis generation to peer review, with minimal or no human intervention. These systems use large language models and multi-agent frameworks to accelerate scientific progress across multiple domains.

Open-source Research Agents & Tools

Benchmarks & Evaluation

Benchmarks and evaluation suites for measuring the capabilities of autonomous research agents across tasks.

  • aira-dojo - Meta FAIR's extensible AI research agent development and evaluation framework with isolated code execution.
  • AIRS-Bench - Benchmark by Meta FAIR for quantifying end-to-end AI research abilities of LLM agents.
  • BixBench - Benchmark for LLM-based agents on verifiable computational biology and bioinformatics tasks.
  • DeepResearch Bench - Comprehensive benchmark and leaderboard for evaluating deep research agents.
  • MLGym - Unified framework and benchmark by Meta FAIR for developing and evaluating AI research agents across ML tasks.
  • ScholarEval - Literature-grounded framework for evaluating research ideas with multi-dimensional quality criteria.
  • scienceboard - VM-based benchmark for evaluating multimodal autonomous agents on realistic scientific workflows.

Deep Research Agents

Research-focused agents for iterative retrieval, synthesis, and report generation on complex questions.

  • Auto-Deep-Research - Open-source, cost-efficient alternative to OpenAI's Deep Research with universal LLM compatibility and strong GAIA Benchmark results.
  • ChatPaper - Uses ChatGPT to summarize arXiv papers, with professional translation, paper polishing, peer review analysis, and reviewer response generation.
  • ChatReviewer - Uses ChatGPT to analyze paper strengths/weaknesses, provide improvement suggestions, and auto-generate reviewer responses.
  • DeerFlow - ByteDance's deep research workflow agent combining web search, code execution, and multi-step reasoning for complex research tasks.
  • DeepResearch - Tongyi Deep Research — iterative retrieval-augmented research agent for complex multi-hop questions.
  • DeepResearchAgent - Skywork AI's deep research agent for thorough multi-source web research and report synthesis.
  • GPT Researcher - Autonomous agent that conducts deep online research on any topic, producing detailed, factual reports with citations.
  • local-deep-research - Local-first deep research agent reaching ~95% on SimpleQA with local LLMs, integrating arXiv, PubMed, Semantic Scholar, Wikipedia, and 10+ sources with encrypted storage.
  • Open Deep Research - Fully open-source deep research agent with multi-model support, multi-search API, MCP integration, and built-in report generation.
  • OpenResearcher - Fully open training and inference pipeline for long-horizon deep research, releasing a 30B-A3B model that surpasses GPT-4.1 and Claude Opus 4 on BrowseComp-Plus.

End-to-End Research Systems

Autonomous systems that cover the full research lifecycle from ideation through experimentation and paper writing.

  • AgentLaboratory - End-to-end autonomous research workflow using LLM agents to assist with literature reviews, experiments, and report writing.
  • AI-Researcher - Autonomous scientific innovation system with dedicated research and paper agents for idea generation, experiment execution, and paper writing.
  • AI-Scientist - First comprehensive system for fully automatic scientific discovery, enabling LLMs to autonomously generate ideas, run experiments, and write papers.
  • AI-Scientist-v2 - Workshop-level automated scientific discovery via agentic tree search; builds on AI-Scientist with improved experimental control.
  • ARIS - Lightweight Markdown-only research workflow with cross-model review loops, idea discovery, and experiment automation.
  • auto-research - Autonomous generalist scientist framework for fully automated research agents from literature reviews to experiments and writing.
  • autoresearch - Karpathy's autonomous overnight research loop where agents iteratively edit/train/evaluate a compact LLM setup under a fixed 5-minute experiment budget.
  • AutoResearchClaw - Fully autonomous & self-evolving research from idea to paper using a multi-agent debate pipeline with self-healing and citation verification.
  • Biomni - Stanford's general-purpose biomedical AI agent that autonomously executes research tasks across biology and medicine, combining LLM reasoning, retrieval, and tool/code use.
  • claude-scholar - Semi-automated academic research assistant covering ideation → coding → experiments → writing → publication using Claude Code or Codex CLI.
  • Curie - AI-agent framework for automated and rigorous scientific experimentation with end-to-end automation from hypothesis formulation to result interpretation.
  • DATAGEN - AI-driven multi-agent research assistant automating hypothesis generation, data analysis, visualization, and report writing via LangChain and LangGraph.
  • DeepScientist - Local-first autonomous research studio with Findings Memory and Bayesian optimization orchestrating baseline reproduction → branched experiments → LaTeX paper drafts.
  • EvoScientist - Self-evolving AI Scientists with a six-agent team and persistent memory for autonomous iterative research exploration.
  • Idea2Paper - End-to-end pipeline that takes a research idea and autonomously generates a complete paper draft.
  • InternAgent - Shanghai AI Lab's unified agentic framework for long-horizon autonomous discovery across physics, biology, earth, and life sciences.
  • nano-scientist - Budget-driven autonomous research agent that turns a topic into a technical report with a plan-execute-review loop and PDF output pipeline.
  • NanoResearch - Lightweight end-to-end research automation agent with minimal setup requirements.
  • nanoresearch - Tri-level co-evolving multi-agent research automation system with chat-style interaction and field-agnostic workflows.
  • OmniScientist - AI Scientist ecosystem covering idea generation, experiment design, and paper writing as a holistic blueprint for autonomous research.
  • pi-autoresearch - Extension for the pi agent that enables autonomous experiment loops to benchmark ideas, keep improvements, and revert regressions.
  • QUIT - Human-in-the-loop research automation platform with an artifact-driven Query-Understand-Implement-Tell pipeline and optional end-to-end mode.
  • Robin - Multi-agent system by FutureHouse for automating scientific discovery, demonstrated on real-world biomedical research tasks (Nature 2026).
  • Virtual Lab - AI agent team that autonomously designs novel SARS-CoV-2 nanobodies, demonstrating end-to-end wet-lab–integrated scientific discovery.
  • Virtual-Scientists - ACL 2025 project featuring many-heads multi-agent scientific idea generation for diverse hypothesis exploration.

Experiment & Code Agent Infrastructure

General-purpose coding and experiment agents that serve as the "hands" of auto-research pipelines.

  • AIDE - AI-driven exploration in the space of code via agentic tree search; delivers 4× more Kaggle medals than the best linear agent.
  • Aider - AI pair programming in the terminal with multi-file edits and git integration; widely used as the coding backbone in research pipelines.
  • AutoGPT - One of the earliest autonomous AI agent frameworks with workflow blocks, benchmarking suite, and support for 300+ models.
  • MLE-agent - Intelligent companion for ML engineering and research integrating arXiv and Papers with Code for automated planning and debugging.
  • OpenHands - AI-driven software development platform where autonomous agents edit files, run commands, and browse the web; 72% on SWE-Bench Verified.
  • PaperBanana - Reference-driven multi-agent framework for automated academic illustration with 5 specialized agents producing publication-quality diagrams.
  • SWE-agent - Princeton's LLM-based software engineering agent that fixes real GitHub issues; built by the team behind the SWE-Bench benchmark.

Experiment & Data Automation

Tools that automate experiment execution, data workflows, and closed-loop empirical research pipelines.

  • agentic-data-scientist - Multi-agent framework for data science workflows with separated planning and execution phases.
  • autora - Automated research assistant for closed-loop empirical research with autonomous experiment design and data analysis.
  • expflow - Experiment workflow orchestration toolkit with CLI-driven training, hyperparameter optimization, and observability integrations.
  • RD-Agent - Microsoft's LLM-agent framework for autonomous R&D, covering data science, quant finance, and research-driven software development.
  • Simply - Minimal JAX research codebase by Google DeepMind designed for agents to read code, propose ideas, run experiments, and iterate.

Literature & Knowledge Synthesis

Tools focused on searching, synthesizing, and reasoning over scientific literature.

  • OpenScholar - Retrieval-augmented language model that synthesizes scientific literature by grounding responses in relevant paper evidence.
  • PaperQA2 - High-accuracy retrieval-augmented QA over scientific PDFs, demonstrating superhuman synthesis of scientific knowledge.
  • STORM - Stanford system for synthesizing Wikipedia-style long-form articles through multi-perspective question asking and retrieval.

Research Platforms

Integrated platforms that host AI agents for collaborative research, publishing, and scientific workflows.

  • aiXiv - Multi-agent preprint server for human, AI and robot scientists with dual-track review and auto-agents ecosystem.
  • AutoSci - Agentic AI research platform that automates the full pipeline from paper ingestion and knowledge management to manuscript drafting.

RL Training Infrastructure

Core RL environments and training infrastructure for building and scaling autonomous research agents.

  • aviary - Language-agent gym with challenging scientific tasks and research-oriented environments.
  • Gym - NVIDIA NeMo environment library for evaluating and improving models/agents with multiple backend support.
  • NVIDIA NeMo RL - Scalable RL toolkit for efficient model reinforcement, including GRPO and large-scale training workflows.
  • tiny-scientist - Lightweight modular framework for building research agents with tool integration and controllable execution.

Scientific Writing & Skills

Frameworks and skill collections that support scientific writing, reasoning, and reusable research capabilities.

  • AI-Research-SKILLs - 86 skills across 22 categories covering the full AI research lifecycle: literature review, idea generation, experimentation, and paper authoring.
  • autoresearch-skill - Cross-platform LLM skill set (Claude Code/Codex/Gemini) that runs experiment-evaluate-iterate autoresearch loops from natural-language goals.
  • claude-scientific-skills - Comprehensive collection of 140 ready-to-use scientific skills for Claude across biology, chemistry, medicine, and more.
  • claude-scientific-writer - AI-powered scientific writing assistant for automated research paper generation and technical documentation.
  • FactReview - Evidence-grounded ML paper review system that tags claim verdicts, positions literature, and supports execution-based verification.
  • happy-figure-skill - Claude Code skill for generating publication-quality research figures with automated chart creation and styling.
  • PaperOrchestra - Multi-agent framework for automated research paper writing from raw ideas and experiment logs to submission-ready LaTeX drafts.
  • Researcher - AI-powered research assistant for automated research workflows.
  • scientific-agent-skills - 133 ready-to-use scientific skills across bioinformatics, drug discovery, clinical research, medical imaging, and materials science.

Surveys, Guides & Tutorials

Educational resources, surveys, and practical guides for learning autonomous research concepts and workflows.

  • Autonomous-Agents - Daily-updated curated collection of research papers on autonomous LLM agents covering multi-agent systems, scientific computing, robotics, and more.
  • awesome-ai-for-science - Curated list of AI tools, libraries, papers, datasets, and frameworks for scientific discovery across physics, chemistry, biology, and materials.
  • awesome-autoresearch - Curated index of autonomous improvement loops, research agents, and autoresearch-style systems inspired by Karpathy's autoresearch.
  • Awesome-Deep-Research - Curated collection of deep research agents — industry products, open-source implementations, 70+ recent papers, and benchmarks through early 2026.
  • learn-auto-research - Educational repository for learning and practicing autonomous research workflows.

ResearchClaw & OpenClaw Ecosystem

Agent-native research tools built on or inspired by the OpenClaw framework, covering research platforms, IDEs, skill libraries, and multi-agent research pipelines.

Research Platforms & IDEs

OpenClaw-based platforms and IDE experiences for end-to-end scientific workflows and agent orchestration.

  • Dr. Claw - AI research IDE with 100+ skills, structured dashboard (Survey → Ideation → Experiment → Publication), auto-research one-click execution, and multi-agent support.
  • openclaw-agents - One-command setup for 9 specialized research agents with Paper Pipeline, Brainstorm, Daily Digest, and Rebuttal workflows built in.
  • PaperClaw - OpenClaw skill for generating topic-specific expert agents for paper search, review, and critique workflows.
  • Prismer - End-to-end research platform with PDF reading, Jupyter, LaTeX, code execution, and citation verification.
  • Research Claw - Self-hosted academic assistant by nanoAgentTeam for paper management, literature search, deadline tracking, and LaTeX/Overleaf sync.
  • ResearchClaw (Noietch) - OpenClaw-based research assistant by Noietch with structured paper management and annotation workflows.
  • ResearchClaw (ymx10086) - Local-first Research OS with claims/evidence graph, experiment tracking, paper management, skills, and multi-channel access.
  • sciClaw - Paired-scientist agent (Go/PicoClaw runtime) with Telegram/Discord interface, PubMed integration, reproducible experiment logging, skill library, and multi-provider LLM support.
  • ScienceClaw - Science-focused OpenClaw variant for structured scientific research workflows.
  • Scientify - Continuous-metabolism research system that tracks papers, evolves hypotheses, runs validation experiments, and proactively pushes updates.

Skill Libraries & Extensions

Reusable OpenClaw skill ecosystems for domain expertise, automation, and tool augmentation.

  • LabClaw - 240 OpenClaw skills for biology, pharmacology, medicine, literature, and visualization.
  • OpenClaw-Medical-Skills - 869 medical/biomedical skills spanning clinical, genomics, and drug discovery domains.

Commercial Platforms

Full Research Automation

Commercial systems focused on autonomous end-to-end scientific research execution and acceleration.

  • Analemma - Fully autonomous research system for end-to-end scientific research automation.
  • Co-Scientist - Google's AI co-scientist system for hypothesis generation, experimental design, and literature synthesis, demonstrated across multiple biomedical domains (Nature 2026).
  • DeepScientist - AI platform for accelerating scientific research and discovery.
  • Edison Scientific - Autonomous AI scientist platform (Kosmos) for end-to-end research automation.
  • FutureHouse - AI platform building autonomous systems to accelerate scientific discovery.

Literature Discovery & Search

Commercial tools for discovering, searching, and synthesizing scientific literature at scale.

  • AI Researcher - AI-powered research assistant for literature review and research synthesis.
  • AiraXiv - AI research platform for automated paper discovery and analysis.
  • Consensus - AI search engine that finds and summarizes scientific research papers.
  • Elicit - AI research assistant that automates literature review workflows.

Paper Review & Management

Commercial platforms for paper understanding, review workflows, and research knowledge management.

  • IBM Watson Discovery - Enterprise AI platform for intelligent document understanding and search.
  • paper2skills - AI-powered platform that converts research papers into actionable skills.
  • PaperReview - AI-powered platform for automated paper review and feedback.
  • SciSpace - AI copilot for research paper reading, writing, and understanding.

Self-driving Labs

Automated laboratory platforms and initiatives that integrate robotics, AI, and closed-loop experimentation.

  • Acceleration Consortium - University of Toronto-led initiative advancing self-driving labs for materials acceleration and autonomous experimentation.
  • Emerald Cloud Lab - Cloud-based robotic laboratory platform for programmable and remotely executed experiments.
  • IBM RXN for Chemistry - AI platform for chemistry synthesis planning and autonomous lab workflow integration.
  • Strateos - Remote robotic lab platform for automated experiment execution and high-throughput biology workflows.

Contributing

Contributions are welcome! Please feel free to submit a pull request to add more AutoResearch tools to this awesome list.
