We maintain a curated collection of papers exploring the path towards Foundation Agents, with a focus on formulating the core concepts and navigating the research landscape.
⌛️ Coming soon: Version 2! We're continuously compiling and updating cutting-edge insights. Feel free to suggest any related work you find valuable!
✨✨✨ Advances and Challenges in Foundation Agents (Paper)
Table of Contents
- Core Components of Intelligent Agents
- Self-Enhancement in Intelligent Agents
- Collaborative and Evolutionary Intelligent Systems
- Building Safe and Beneficial AI
- ReFT: Reasoning with Reinforced Fine-Tuning, arxiv 2024, [paper] [code]
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning, arxiv 2025, [paper] [code]
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning, arxiv 2025, [paper] [code]
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, NeurIPS 2022, [paper] [code]
- Voyager: An Open-Ended Embodied Agent with Large Language Models, arxiv 2023, [paper] [code]
- Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023, [paper] [code]
- ReAct meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training, arxiv 2024, [paper] [code]
- Generative Agents: Interactive Simulacra of Human Behavior, ACM UIST 2023, [paper] [code]
- CLIP: Learning Transferable Visual Models from Natural Language Supervision, ICML 2021, [paper] [code]
- LLaVA: Visual Instruction Tuning, NeurIPS 2023, [paper] [code]
- CogVLM: Visual Expert for Pretrained Language Models, NeurIPS 2024, [paper] [code]
- Qwen2-Audio Technical Report, arxiv 2024, [paper] [code]
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning, arxiv 2025, [paper] [code]
- Sky-T1: Train Your Own o1 Preview Model Within $450, 2025, [paper] [code]
- Open Thoughts, 2025, [paper] [code]
- LIMO: Less is More for Reasoning, arxiv 2025, [paper] [code]
- STaR: Bootstrapping Reasoning with Reasoning, arxiv 2022, [paper] [code]
- ReST: Reinforced Self-Training for Language Modeling, arxiv 2023, [paper] [code]
- OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models, arxiv 2024, [paper] [code]
- LLaMA-Berry: Pairwise Optimization for o1-like Olympiad-level Mathematical Reasoning, arxiv 2024, [paper] [code]
- RAGEN: Training Agents by Reinforcing Reasoning, arxiv 2025, [paper] [code]
- Open-R1, 2024, [paper] [code]
- Inner Monologue: Embodied Reasoning through Planning with Language Models, CoRL 2023, [paper] [code]
- Self-Refine: Iterative Refinement with Self-Feedback, NeurIPS 2023, [paper] [code]
- Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023, [paper] [code]
- ExpeL: LLM Agents Are Experiential Learners, AAAI 2024, [paper] [code]
- AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning, arxiv 2024, [paper] [code]
- ReAct meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training, arxiv 2024, [paper] [code]
- ReAct: Synergizing Reasoning and Acting in Language Models, arxiv 2022, [paper] [code]
- Markov Chain of Thought for Efficient Mathematical Reasoning, arxiv 2024, [paper] [code]
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models, NeurIPS 2023, [paper] [code]
- Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models, ICML 2024, [paper] [code]
- Reasoning via Planning (RAP): Improving Language Models with World Models, EMNLP 2023, [paper] [code]
- Graph of Thoughts: Solving Elaborate Problems with Large Language Models, AAAI 2024, [paper] [code]
- Path of Thoughts: Extracting and Following Paths for Robust Relational Reasoning with Large Language Models, arxiv 2024, [paper] [code]
- On the Diagram of Thought, arxiv 2024, [paper] [code]
- Self-Consistency Improves Chain of Thought Reasoning in Language Models, ICLR 2023, [paper] [code]
- Self-Refine: Iterative Refinement with Self-Feedback, NeurIPS 2023, [paper] [code]
- Progressive-Hint Prompting Improves Reasoning in Large Language Models, arxiv 2023, [paper] [code]
- On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks, arxiv 2024, [paper] [code]
- Chain-of-Verification Reduces Hallucination in Large Language Models, ICLR 2024 Workshop, [paper] [code]
- MathPrompter: Mathematical Reasoning Using Large Language Models, ACL 2023, [paper] [code]
- LLMs Can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought, arxiv 2024, [paper] [code]
- Physics Reasoner: Knowledge-Augmented Reasoning for Solving Physics Problems with Large Language Models, COLING 2025, [paper] [code]
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, NeurIPS 2022, [paper] [code]
- Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models, ICLR 2024, [paper] [code]
- Ask Me Anything: A Simple Strategy for Prompting Language Models, arxiv 2022, [paper] [code]
- Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources, arxiv 2023, [paper] [code]
- Self-Explained Keywords Empower Large Language Models for Code Generation, arxiv 2024, [paper] [code]
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning, arxiv 2025, [paper] [code]
- Claude 3.7 Sonnet, 2025, [paper] [code]
- OpenAI o1 System Card, arxiv 2024, [paper] [code]
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking, arxiv 2024, [paper] [code]
- Chain of Continuous Thought (Coconut): Training Large Language Models to Reason in a Continuous Latent Space, arxiv 2024, [paper] [code]
- Describe, Explain, Plan and Select (DEPS): Interactive Planning with Large Language Models, arxiv 2023, [paper] [code]
- ProgPrompt: Generating Situated Robot Task Plans Using Large Language Models, ICRA 2023, [paper] [code]
- ADAPT: As-Needed Decomposition and Planning with Language Models, arxiv 2023, [paper] [code]
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models, NeurIPS 2023, [paper] [code]
- Reasoning via Planning (RAP): Improving Language Models with World Models, EMNLP 2023, [paper] [code]
- TravelPlanner: A Benchmark for Real-World Planning with Language Agents, ICML 2024, [paper] [code]
- PDDL—The Planning Domain Definition Language, 1998, [paper] [code]
- Mind2Web: Towards a Generalist Agent for the Web, NeurIPS 2023, [paper] [code]
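
Many of the reasoning entries above (Chain-of-Thought, Self-Consistency) reduce to one recipe: sample several reasoning chains and majority-vote over the final answers. A minimal Python sketch, with the LLM call abstracted as a `generate` callable (a hypothetical stand-in, not any specific API):

```python
from collections import Counter

def self_consistency(question, generate, n_samples=10):
    """Sample several chain-of-thought completions and majority-vote
    the final answers. `generate` is any callable prompt -> text; we
    assume completions end with a line 'Answer: <value>'."""
    prompt = f"Q: {question}\nLet's think step by step."
    answers = []
    for _ in range(n_samples):
        chain = generate(prompt)  # sampled at temperature > 0 in practice
        answers.append(chain.rsplit("Answer:", 1)[-1].strip())
    return Counter(answers).most_common(1)[0][0]
```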
- RecAgent (Wang et al., 2023)
- CoPS (Zhou et al., 2024)
- MemoryBank (Zhong et al., 2024)
- Memory Sandbox (Huang et al., 2023)
- VideoAgent (Fan et al., 2024)
- WorldGPT (Ge et al., 2024)
- Agent S (Agashe et al., 2024)
- OS-Copilot (Wu et al., 2024)
- MuLan (Li et al., 2024)
- MemGPT (Packer et al., 2023)
- KARMA (Wang et al., 2024)
- LSFS (Shi et al., 2024)
- OSCAR (Wang et al., 2024)
- RCI (Kim et al., 2023)
- Generative Agent (Park et al., 2023)
- RLP (Fischer et al., 2023)
- CALYPSO (Zhu et al., 2023)
- HiAgent (Hu et al., 2024)
- AriGraph (Anokhin et al., 2024)
- RecAgent (Wang et al., 2023)
- HippoRAG (Gutierrez et al., 2024)
- MobileGPT (Lee et al., 2023)
- MemoryBank (Zhong et al., 2024)
- Episodic Verbalization (Barmann et al., 2024)
- MrSteve (Park et al., 2024)
- AAG (Roth et al., 2024)
- Cradle (Tan et al., 2024)
- JARVIS-1 (Wang et al., 2024)
- LARP (Yan et al., 2023)
- HiAgent (Hu et al., 2024)
- LMAgent (Liu et al., 2024)
- ReadAgent (Lee et al., 2024)
- M²WF (Wang et al., 2025)
- ExpeL (Zhao et al., 2024)
- MindOS (Hu et al., 2025)
- Vanschoren et al. (2018)
- Hou et al. (2024)
- AgentCoord (Pan et al., 2024)
- MS (Gao et al., 2024)
- GraphVideoAgent (Chu et al., 2025)
- A-MEM (Xu et al., 2025)
- Ali et al. (2024)
- Optimus-1 (Li et al., 2024)
- Optimus-2 (Li et al., 2025)
- JARVIS-1 (Wang et al., 2024)
- Agent S (Agashe et al., 2024)
- OSCAR (Wang et al., 2024)
- R2D2 (Huang et al., 2025)
- Mobile-Agent-E (Wang et al., 2025)
- SummEdits (Laban et al., 2023)
- SCM (Wang et al., 2023)
- Healthcare Copilot (Ren et al., 2024)
- Wang et al. (2023)
- Knowagent (Zhu et al., 2024)
- AoTD (Shi et al., 2024)
- LDPD (Liu et al., 2024)
- Sub-goal Distillation (Hashemzadeh et al., 2024)
- MAGDi (Chen et al., 2024)
- Lyfe Agent (Kaiya et al., 2023)
- TiM (Liu et al., 2023)
- MemoryBank (Zhong et al., 2024)
- S³ (Gao et al., 2023)
- Hou et al. (2024)
- HippoRAG (Gutierrez et al., 2024)
- TradingGPT (Li et al., 2023)
- LongMemEval (Wu et al., 2024)
- SeCom (Pan et al., 2025)
- Product Keys (Lample et al., 2019)
- OSAgent (Xu et al., 2024)
- Bahdanau et al. (2014)
- Hou et al. (2024)
- Hopfield Networks (Demircigil et al., 2017; Ramsauer et al., 2020)
- Neural Turing Machines (Graves et al., 2014)
- MemoryLLM (Wang et al., 2024)
- SELF-PARAM (Wang et al., 2024)
- MemoRAG (Qian et al., 2024)
- TTT-Layer (Sun et al., 2024)
- Titans (Behrouz et al., 2024)
- R³Mem (Wang et al., 2025)
- RAGLAB (Zhang et al., 2024)
- Adaptive Retrieval (Mallen et al., 2023)
- Atlas (Farahani et al., 2024)
- Yuan et al. (2025)
- RMT (Bulatov et al., 2022, 2023)
- AutoCompressor (Chevalier et al., 2023)
- ICAE (Ge et al., 2023)
- Gist (Mu et al., 2024)
- CompAct (Yoon et al., 2024)
- Lamini (Li et al., 2024)
- Memoria (Park et al., 2023)
- PEER (He et al., 2024)
- Ding et al. (2024)
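
The memory systems cited above differ in detail, but most combine a bounded short-term context with a persistent long-term store plus retrieval. A toy sketch under those assumptions, using keyword-overlap retrieval in place of the embedding search real systems use:

```python
from collections import deque

class AgentMemory:
    """Toy hierarchical memory: a bounded short-term buffer plus a
    long-term store searched by naive keyword overlap. Real designs
    (e.g., MemGPT- or MemoryBank-style) use embeddings, decay
    schedules, and summarization instead."""

    def __init__(self, short_term_size: int = 8):
        self.short_term = deque(maxlen=short_term_size)  # recent turns
        self.long_term: list[str] = []                   # persistent notes

    def observe(self, event: str) -> None:
        if len(self.short_term) == self.short_term.maxlen:
            # Oldest short-term item "consolidates" into long-term memory.
            self.long_term.append(self.short_term[0])
        self.short_term.append(event)

    def recall(self, query: str, k: int = 3) -> list[str]:
        q = set(query.lower().split())
        scored = sorted(self.long_term,
                        key=lambda m: len(q & set(m.lower().split())),
                        reverse=True)
        return scored[:k]
```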
- BERT (Devlin et al., 2018)
- RoBERTa (Liu et al., 2019)
- ALBERT (Lan et al., 2019)
- ResNet (He et al., 2016)
- DETR (Carion et al., 2020)
- Grounding DINO 1.5 (Ren et al., 2024)
- ViViT (Arnab et al., 2021)
- VideoMAE (Tong et al., 2022)
- FastSpeech 2 (Ren et al., 2020)
- Seamless (Barrault et al., 2023)
- wav2vec 2.0 (Baevski et al., 2020)
- Visual ChatGPT (Wu et al., 2023)
- HuggingGPT (Shen et al., 2024)
- MM-REACT (Yang et al., 2023)
- ViperGPT (Suris et al., 2023)
- AudioGPT (Huang et al., 2024)
- LLaVA-Plus (Liu et al., 2025)
- CLIP (Radford et al., 2021)
- ALIGN (Jia et al., 2021)
- DALL·E 3 (Betker et al., 2023)
- VisualBERT (Li et al., 2019)
- VideoCLIP (Xu et al., 2021)
- Phenaki (Villegas et al., 2022)
- Make-A-Video (Singer et al., 2022)
- Wav2CLIP (Wu et al., 2022)
- VATT (Akbari et al., 2021)
- AudioCLIP (Guzhov et al., 2022)
- CLIP-Forge (Sanghi et al., 2022)
- Point-E (Nichol et al., 2022)
- MiniGPT-v2 (Chen et al., 2023)
- LLaVA-NeXT (Liu et al., 2024)
- CogVLM2 (Hong et al., 2024)
- Qwen2-VL (Wang et al., 2024)
- Emu2 (Sun et al., 2024)
- TinyGPT-V (Yuan et al., 2023)
- MobileVLM (Chu et al., 2023)
- MiniCPM-V (Yao et al., 2024)
- OmniParser (Lu et al., 2024)
- CLIPort (Shridhar et al., 2022)
- RT-1 (Brohan et al., 2022)
- MOO (Stone et al., 2023)
- PerAct (Shridhar et al., 2023)
- Diffusion Policy (Chi et al., 2023)
- PaLM-E (Driess et al., 2023)
- MultiPLY (Hong et al., 2024)
- Audio Flamingo (Kong et al., 2024)
- SpeechVerse (Das et al., 2024)
- UniAudio 1.5 (Yang et al., 2024)
- Qwen2-Audio (Chu et al., 2024)
- Audio-LLM (Li et al., 2024)
- Mini-Omni (Xie et al., 2024)
- SpeechGPT (Zhang et al., 2023)
- ONE-PEACE (Wang et al., 2023)
- PandaGPT (Su et al., 2023)
- Macaw-LLM (Lyu et al., 2023)
- LanguageBind (Zhu et al., 2023)
- UnIVAL (Shukor et al., 2023)
- X-LLM (Chen et al., 2023)
- PointLLM (Xu et al., 2025)
- MiniGPT-3D (Tang et al., 2024)
- NExT-GPT (Wu et al., 2023)
- Unified-IO 2 (Lu et al., 2024)
- CoDi-2 (Tang et al., 2024)
- ModaVerse (Wang et al., 2024)
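
Several of the alignment-pretraining entries above (CLIP, ALIGN, VideoCLIP, AudioCLIP) share one training objective: a symmetric contrastive loss over paired embeddings. A minimal NumPy sketch of that loss; the encoders and batch construction are omitted:

```python
import numpy as np

def log_softmax(x):
    # Numerically stable row-wise log-softmax.
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss over a batch of paired
    image/text embeddings, shape (batch, dim); matching pairs share
    a row index."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature       # cosine similarities
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits)[idx, idx].mean()
    loss_t2i = -log_softmax(logits.T)[idx, idx].mean()
    return float((loss_i2t + loss_t2i) / 2)

rng = np.random.default_rng(0)
print(clip_style_loss(rng.normal(size=(4, 8)), rng.normal(size=(4, 8))))
```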
- DINO-WM [358]: Video World Models on Pre-trained Visual Features Enable Zero-Shot Planning, arxiv 2024, [paper] [code]
- SAPIEN [351]: A Simulated Part-based Interactive Environment, CVPR 2020, [paper] [code]
- MuZero [349]: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model, Nature 2020, [paper] [code]
- GR-2 [357]: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation, arxiv 2024, [paper] [code]
- COAT [356]: Discovery of the Hidden World with Large Language Models, arxiv 2024, [paper] [code]
- AutoManual [108]: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning, arxiv 2024, [paper] [code]
- PILCO [355]: A Model-Based and Data-Efficient Approach to Policy Search, ICML 2011, [paper] [code]
- ActRe [49]: ReAct meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training, arxiv 2024, [paper] [code]
- World Models [348]: World Models, NeurIPS 2018, [paper] [code]
- Dreamer [350]: Dream to Control: Learning Behaviors by Latent Imagination, ICLR 2020, [paper] [code]
- Diffusion WM [353]: Diffusion for World Modeling: Visual Details Matter in Atari, arxiv 2024, [paper] [code]
- GQN [354]: Neural Scene Representation and Rendering, Science 2018, [paper] [code]
- Daydreamer [352]: World Models for Physical Robot Learning, CoRL 2023, [paper] [code]
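
A common thread in the world-model entries above is planning inside a learned dynamics model rather than the real environment. A toy sketch of random-shooting model-predictive control, with a hypothetical linear `dynamics` stub standing in for a learned latent model:

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamics(state, action):
    """Stand-in learned model s' = f(s, a); Dreamer/MuZero-style agents
    use a latent network trained from experience instead."""
    return state + 0.1 * action  # toy linear world

def reward(state):
    return -float(np.linalg.norm(state))  # goal: reach the origin

def plan_by_random_shooting(state, horizon=10, n_candidates=256):
    """Score random action sequences inside the model and return the
    first action of the best imagined rollout (simple MPC, no gradients)."""
    best_seq, best_ret = None, -np.inf
    for _ in range(n_candidates):
        seq = rng.uniform(-1, 1, size=(horizon, state.shape[0]))
        s, ret = state.copy(), 0.0
        for a in seq:
            s = dynamics(s, a)
            ret += reward(s)
        if ret > best_ret:
            best_ret, best_seq = ret, seq
    return best_seq[0]

print(plan_by_random_shooting(np.array([1.0, -2.0])))
```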
- ReAct: Synergizing Reasoning and Acting in Language Models, ICLR 2023, [paper] [code]
- AutoGPT: Build, Deploy, and Run AI Agents, GitHub, [code]
- Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023, [paper] [code]
- LLM+P: Empowering Large Language Models with Optimal Planning Proficiency, arXiv 2023, [paper] [code]
- MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework, ICLR 2024, [paper] [code]
- ChatDev: Communicative Agents for Software Development, ACL 2024, [paper] [code]
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering, NeurIPS 2024, [paper] [code]
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents, arXiv 2024, [paper] [code]
- Generative Agents: Interactive Simulacra of Human Behavior, UIST 2023, [paper] [code]
- AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation, COLM 2024, [paper] [code]
- MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge, NeurIPS 2022, [paper] [code]
- Voyager: An Open-Ended Embodied Agent with Large Language Models, TMLR 2024, [paper] [code]
- SwarmBrain: Embodied agent for real-time strategy game StarCraft II via large language models, arXiv 2024, [paper] [code]
- JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models, NeurIPS 2025, [paper] [code]
- MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action, arXiv 2023, [paper] [code]
- ViperGPT: Visual Inference via Python Execution for Reasoning, ICCV 2023, [paper] [code]
- Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models, arXiv 2023, [paper] [code]
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023, [paper] [code]
- WebGPT: Browser-assisted question-answering with human feedback, arXiv 2021, [paper] [blog]
- WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents, NeurIPS 2022, [paper] [code]
- A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis, ICLR 2024, [paper]
- Mind2Web: Towards a Generalist Agent for the Web, NeurIPS 2023, [paper] [code]
- Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception, arXiv 2024, [paper] [code]
- AppAgent: Multimodal Agents as Smartphone Users, arXiv 2023, [paper] [code]
- UFO: A UI-Focused Agent for Windows OS Interaction, arXiv 2024, [paper] [code]
- OmniParser for Pure Vision Based GUI Agent, arXiv 2024, [paper] [code]
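
Several frameworks in the list above follow the ReAct pattern: the model alternates Thought/Action lines, the runtime executes the named tool, and the observation is appended back to the transcript. A minimal sketch of that control loop, with `llm` and `tools` as caller-supplied stand-ins:

```python
import re

def react_loop(question, llm, tools, max_steps=6):
    """Minimal ReAct-style loop: the model emits 'Thought:'/'Action:'
    lines, the runtime executes the named tool and feeds back an
    'Observation:'. `llm` is any callable prompt -> text; `tools`
    maps a tool name to a function of one string argument."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # expected to end with one Action line
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action: (\w+)\[(.*?)\]", step)
        if not match:
            continue
        name, arg = match.groups()
        result = tools.get(name, lambda a: f"unknown tool {name}")(arg)
        transcript += f"Observation: {result}\n"
    return "no answer within budget"
```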
- A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?, arXiv 2024, [paper] [Handbook]
- Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search, arXiv 2025, [paper]
- EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing, arXiv 2025, [paper] [code]
- NL2SQL-Bugs: A Benchmark for Detecting Semantic Errors in NL2SQL Translation, arXiv 2025, [paper] [code]
- nvBench 2.0: A Benchmark for Natural Language to Visualization under Ambiguity, arXiv 2025, [paper] [code]
- The Dawn of Natural Language to SQL: Are We Fully Ready?, VLDB 2024, [paper] [code]
- Are Large Language Models Good Statisticians?, NeurIPS 2024, [paper] [code]
- UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models, EMNLP 2022, [paper] [code]
- Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments, ACL 2023, [paper] [code]
- Can LLM Already Serve as A Database Interface? A Big Bench for Large-Scale Database Grounded Text-to-SQLs, NeurIPS 2023, [paper] [project]
- Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows, ICLR 2025, [paper] [code]
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments, EMNLP 2024, [paper] [code]
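
The NL2SQL systems above share a common core: serialize the schema into the prompt and ask for a single grounded query. A minimal sketch of that prompt assembly; real pipelines add few-shot examples, value linking, and execution-based verification on top:

```python
def nl2sql_prompt(schema: dict[str, list[str]], question: str) -> str:
    """Assemble a schema-grounded text-to-SQL prompt from a toy
    {table: [columns]} description of the database."""
    ddl = "\n".join(
        f"CREATE TABLE {table} ({', '.join(cols)});"
        for table, cols in schema.items()
    )
    return (
        "Given the database schema:\n"
        f"{ddl}\n"
        f"Write one SQLite query answering: {question}\n"
        "Return only the SQL."
    )

print(nl2sql_prompt({"orders": ["id", "user_id", "total"],
                     "users": ["id", "name"]},
                    "Which user has the highest total order value?"))
```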
- RT-1: Robotics Transformer for Real-World Control at Scale, RSS 2023, [paper] [project]
- RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control, CoRL 2023, [paper] [project]
- Open X-Embodiment: Robotic Learning Datasets and RT-X Models, arXiv 2023, [paper] [project]
- GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation, arXiv 2024, [paper] [project]
- π0: A Vision-Language-Action Flow Model for General Robot Control, arXiv 2024, [paper]
- Do As I Can, Not As I Say: Grounding Language in Robotic Affordances, CoRL 2022, [paper] [project]
- VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models, CoRL 2023, [paper] [code]
- EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought, NeurIPS 2023, [paper] [project]
- CoT: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, NeurIPS 2022, [paper]
- ReAct: Synergizing Reasoning and Acting in Language Models, arXiv 2022, [paper] [project]
- Auto-CoT: Automatic Chain of Thought Prompting in Large Language Models, ICLR 2023, [paper] [code]
- ToT: Tree of Thoughts: Deliberate Problem Solving with Large Language Models, NeurIPS 2023, [paper] [code]
- GoT: Graph of Thoughts: Solving Elaborate Problems with Large Language Models, AAAI 2024, [paper] [code]
- LearnAct: Empowering Large Language Model Agents through Action Learning, arXiv 2024, [paper] [code]
- CoA: Improving Multi-Agent Debate with Sparse Communication Topology, arXiv 2024, [paper]
- Least-to-Most: Least-to-Most Prompting Enables Complex Reasoning in Large Language Models, ICLR 2023, [paper]
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023, [paper] [code]
- Plan-and-Solve: Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models, ACL 2023, [paper] [code]
- ProgPrompt: Generating Situated Robot Task Plans Using Large Language Models, ICRA 2023, [paper] [project]
- Generative Agents: Interactive Simulacra of Human Behavior, UIST 2023, [paper] [code]
- MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework, ICLR 2024, [paper] [code]
- ChatDev: Communicative Agents for Software Development, ACL 2024, [paper] [code]
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering, arXiv 2024, [paper] [project]
- Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023, [paper] [code]
- Self-Refine: Iterative Refinement with Self-Feedback, NeurIPS 2023, [paper] [code]
- GPTSwarm: Language Agents as Optimizable Graphs, ICML 2024, [paper] [project]
- RT-1: Robotics Transformer for Real-World Control at Scale, arXiv 2022, [paper] [project]
- RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control, arXiv 2023, [paper] [project]
- RT-X: Open X-Embodiment: Robotic Learning Datasets and RT-X Models, arXiv 2023, [paper] [project]
- GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation, arXiv 2024, [paper] [project]
- LAM: Large Action Models: From Inception to Implementation, arXiv 2024, [paper] [code]
- CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation, arXiv 2024, [paper] [project]
- RT-H: Action Hierarchies Using Language, arXiv 2024, [paper] [project]
- OpenVLA: An Open-Source Vision-Language-Action Model, arXiv 2024, [paper] [project]
- π0: A Vision-Language-Action Flow Model for General Robot Control, arXiv 2024, [paper] [project]
- UniAct: Universal Actions for Enhanced Embodied Foundation Models, CVPR 2025, [paper] [code]
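
Among the prompting strategies above, Tree of Thoughts generalizes chain-of-thought into search: expand partial "thoughts", score them, and keep a beam. A sketch of the BFS variant, with toy `expand`/`score` functions where a real system would make LLM propose/value calls:

```python
def tree_of_thoughts_bfs(root, expand, score, beam=3, depth=3):
    """Breadth-first ToT search: expand each partial thought into
    candidate continuations, keep the `beam` highest-scoring states,
    and return the best state found."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for state in frontier for child in expand(state)]
        if not candidates:
            break
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return max(frontier, key=score)

# Toy usage: "thoughts" are digit strings; search for a high digit sum.
expand = lambda s: [s + d for d in "0123456789"]
score = lambda s: sum(map(int, s)) if s else 0
print(tree_of_thoughts_bfs("", expand, score, beam=2, depth=4))
```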
- RLHF: Training language models to follow instructions with human feedback, NeurIPS 2022, [paper]
- DPO: Direct preference optimization: Your language model is secretly a reward model, NeurIPS 2023, [paper]
- RLFP: Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own, CoRL 2024, [paper] [project]
- ELLM: Guiding pretraining in reinforcement learning with large language models, ICML 2023, [paper] [code]
- GenSim: Generating robotic simulation tasks via large language models, arXiv 2023, [paper] [project]
- LEA: Reinforcement learning-based recommender systems with large language models for state reward and action modeling, ACM 2024, [paper]
- MLAQ: Empowering LLM Agents with Zero-Shot Optimal Decision-Making through Q-learning, ICLR 2025, [paper]
- KALM: Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts, NeurIPS 2024, [paper] [project]
- When2Ask: Enabling intelligent interactions between an agent and an LLM: A reinforcement learning approach, RLC 2024, [paper]
- Eureka: Human-level reward design via coding large language models, ICLR 2024, [paper] [project]
- ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL, arXiv 2024, [paper] [project]
- LLaRP: Large Language Models as Generalizable Policies for Embodied Tasks, ICLR 2024, [paper] [project]
- GPTSwarm: Language Agents as Optimizable Graphs, ICML 2024, [paper] [project]
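
The RLHF entries above start from a reward model trained on human preference pairs with a Bradley-Terry objective. A minimal PyTorch sketch of that pairwise loss; the tensors here are stand-ins for reward-model outputs:

```python
import torch
import torch.nn.functional as F

def reward_model_loss(r_chosen: torch.Tensor,
                      r_rejected: torch.Tensor) -> torch.Tensor:
    """Pairwise Bradley-Terry loss used to train RLHF reward models
    (InstructGPT-style): maximize the log-odds that the preferred
    response scores higher than the rejected one."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy check: scalar reward scores for 4 preference pairs.
chosen = torch.tensor([1.2, 0.3, 2.0, -0.1])
rejected = torch.tensor([0.4, 0.5, 1.0, -1.0])
print(reward_model_loss(chosen, rejected))
```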
- InstructGPT (Ouyang et al., 2022)
- DRO (Richemond et al., 2024)
- sDPO (Kim et al., 2024)
- ΨPO (Azar et al., 2024)
- β-DPO (Wu et al., 2025)
- ORPO (Hong et al., 2024)
- DNO (Rosset et al., 2024)
- f-DPO (Wang et al., 2023)
- Xu et al., 2023
- Rafailov et al., 2024
- PAFT (Pentyala et al., 2024)
- SimPO (Meng et al., 2025)
- LiPO (Liu et al., 2024)
- RRHF (Yuan et al., 2023)
- PRO (Song et al., 2024)
- D²O (Duan et al., 2024)
- NPO (Zhang et al., 2024)
- Ahmadian et al., 2024
- CPO (Xu et al., 2024)
- NLHF (Munos et al., 2023)
- Swamy et al., 2024
- InstructGPT (Ouyang et al., 2022)
- DRO (Richemond et al., 2024)
- β-DPO (Wu et al., 2025)
- ORPO (Hong et al., 2024)
- PAFT (Pentyala et al., 2024)
- SimPO (Meng et al., 2025)
- NLHF (Munos et al., 2023)
- Swamy et al., 2024
- f-DPO (Wang et al., 2023)
- Pathak et al., 2017
- Pathak et al., 2019
- Plan2Explore (Sekar et al., 2020)
- LIIR (Du et al., 2019)
- CURIOUS (Colas et al., 2019)
- Skew-Fit (Pong et al., 2019)
- DISCERN (Hassani et al., 2021)
- Yuan et al., 2024
- KTO (Ethayarajh et al., 2024)
- Yuan et al., 2024
- Burda et al., 2018
- Ton et al., 2024
- VIME (Houthooft et al., 2016)
- EMI (Kim et al., 2018)
- MAX (Shyam et al., 2019)
- KTO (Ethayarajh et al., 2024)
- d-RLAIF (Lee et al., 2023)
- Bai et al., 2022
- Xiong et al., 2023
- Dong et al., 2024
- TDPO (Zeng et al., 2024)
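
Most of the preference-optimization variants above descend from the DPO objective, which replaces the explicit reward model with a margin over a frozen reference policy. A PyTorch sketch of the standard loss, taking sequence log-probabilities as inputs:

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization objective (Rafailov et al., 2023):
    push the policy's log-prob margin over the reference model toward
    the human preference; beta controls deviation from the reference."""
    policy_margin = logp_chosen - logp_rejected   # log pi(y_w|x) - log pi(y_l|x)
    ref_margin = ref_logp_chosen - ref_logp_rejected
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Toy check with sequence log-probabilities for two preference pairs.
print(dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -9.0]),
               torch.tensor([-13.0, -9.2]), torch.tensor([-13.5, -9.1])))
```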
- Prompt Optimization in Multi-Step Tasks (PROMST): Integrating Human Feedback and Preference Alignment, EMNLP 2024, [paper]
- StraGo: Harnessing Strategic Guidance for Prompt Optimization, EMNLP 2024, [paper]
- Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers, ICLR 2024, [paper]
- Large Language Models Are Human-Level Prompt Engineers, ICLR 2023, [paper]
- Automatic Prompt Optimization with "Gradient Descent" and Beam Search, EMNLP 2023, [paper]
- GPTSwarm: Language Agents as Optimizable Graphs, ICML 2024, [paper]
- Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution, ICML 2024, [paper]
- Teaching Large Language Models to Self-Debug, ICLR 2024, [paper]
- Large Language Models as Optimizers, ICLR 2024, [paper]
- DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines, ICLR 2024, [paper]
- Prompt Engineering a Prompt Engineer, Findings of ACL 2024, [paper]
- Prompt Optimization in Multi-Step Tasks (PROMST): Integrating Human Feedback and Preference Alignment, EMNLP 2024, [paper]
- StraGo: Harnessing Strategic Guidance for Prompt Optimization, EMNLP 2024, [paper]
- Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs, EMNLP 2024, [paper]
- Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs, NeurIPS 2024, [paper]
- Optimizing Generative AI by Backpropagating Language Model Feedback, Nature 2025, [paper]
- Are Large Language Models Good Prompt Optimizers?, arXiv 2024, [paper]
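
Stripped of their specifics, most optimizers above (APE, OPRO, Promptbreeder) share a propose-evaluate-keep loop over prompts. A toy greedy version, with `mutate` and `evaluate` as stand-ins for the LLM rewriter and the task metric:

```python
import random

def optimize_prompt(seed_prompt, mutate, evaluate, iterations=20):
    """Greedy prompt search: propose rewrites of the current best
    prompt and keep whichever scores highest on a held-out metric.
    `mutate` would be an LLM asked to rewrite the prompt; `evaluate`
    runs the downstream task."""
    best, best_score = seed_prompt, evaluate(seed_prompt)
    for _ in range(iterations):
        candidate = mutate(best)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy usage: reward prompts that mention "step by step".
mutate = lambda p: p + random.choice([" Think step by step.", " Be concise."])
evaluate = lambda p: p.count("step by step")
print(optimize_prompt("Answer the question.", mutate, evaluate))
```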
- An Explanation of In-context Learning as Implicit Bayesian Inference, ICLR 2022, [paper]
- Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?, EMNLP 2022, [paper]
- What Can Transformers Learn In-Context? A Case Study of Simple Function Classes, NeurIPS 2022, [paper]
- What Learning Algorithm Is In-Context Learning? Investigations with Linear Models, ICLR 2023, [paper]
- Transformers Learn In-Context by Gradient Descent, ICML 2023, [paper]
- Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression, NeurIPS 2024, [paper]
- Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023, [paper]
- Self-Refine: Iterative Refinement with Self-Feedback, NeurIPS 2023, [paper]
- ReAct: Synergizing Reasoning and Acting in Language Models, ICLR 2023, [paper]
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models, NeurIPS 2023, [paper]
- Voyager: An Open-Ended Embodied Agent with Large Language Models, TMLR 2024, [paper]
- Let's Verify Step by Step, ICLR 2024, [paper]
- MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework, ICLR 2024, [paper]
- CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society, NeurIPS 2023, [paper]
- ChatDev: Communicative Agents for Software Development, ACL 2024, [paper]
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023, [paper]
- Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation, COLM 2024, [paper]
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking, CoRR 2024, [paper]
- Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning, ICLR 2024, [paper]
- Extracting Prompts by Inverting LLM Outputs, ACL 2024, [paper]
- Aligning Large Language Models via Self-Steering Optimization, arXiv 2024, [paper]
- Are Large Language Models Good Statisticians?, NeurIPS 2024, [paper]
- nvBench 2.0: A Benchmark for Natural Language to Visualization under Ambiguity, arXiv 2025, [paper]
- SRAG: Structured Retrieval-Augmented Generation for Multi-Entity Question Answering over Wikipedia Graph, arXiv 2025, [paper]
- Fine-Grained Retrieval-Augmented Generation for Visual Question Answering, arXiv 2025, [paper]
- xLAM: A Family of Large Action Models to Empower AI Agent Systems, arXiv 2024, [paper]
- Automated Design of Agentic Systems, arXiv 2024, [paper]
- LIRE: Listwise Reward Enhancement for Preference Alignment, ACL 2024, [paper]
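
The self-improvement entries above (Self-Refine, Reflexion) center on one loop: draft, self-critique, revise. A minimal sketch with a single `llm` callable playing all three roles; real setups use dedicated feedback and refinement prompts:

```python
def self_refine(task, llm, max_rounds=3):
    """Self-Refine-style loop: draft an answer, ask the same model for
    feedback, and revise until the critic is satisfied or the round
    budget runs out. `llm` is any callable prompt -> text."""
    draft = llm(f"Task: {task}\nWrite an initial answer.")
    for _ in range(max_rounds):
        feedback = llm(f"Task: {task}\nAnswer: {draft}\n"
                       "Give concrete feedback, or say DONE if it is good.")
        if "DONE" in feedback:
            break
        draft = llm(f"Task: {task}\nAnswer: {draft}\n"
                    f"Feedback: {feedback}\nRewrite the answer.")
    return draft
```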
- Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers, arXiv 2024, [paper]
- SciAgents: Automating Scientific Discovery Through Bioinspired Multi-Agent Intelligent Graph Reasoning, Advanced Materials 2024, [paper]
- Genesis: Towards the Automation of Systems Biology Research, arXiv 2024, [paper]
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery, arXiv 2024, [paper]
- Agent Laboratory: Using LLM Agents as Research Assistants, arXiv 2025, [paper]
- ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning, arXiv 2025, [paper]
- ChemOS 2.0: An orchestration architecture for chemical self-driving laboratories, Matter 2024, [paper]
- Towards an AI co-scientist, arXiv 2025, [paper]
- Autonomous mobile robots for exploratory synthetic chemistry, Nature 2024, [paper]
- Delocalized, asynchronous, closed-loop discovery of organic laser emitters, Science 2024, [paper]
- The Virtual Lab: AI Agents Design New SARS-CoV-2 Nanobodies with Experimental Validation, bioRxiv 2024, [paper]
- Solving olympiad geometry without human demonstrations, Nature 2024, [paper]
- Toward a Team of AI-made Scientists for Scientific Discovery from Gene Expression Data, arXiv 2024, [paper]
- Data Interpreter: An LLM Agent For Data Science, arXiv 2024, [paper]
- RECONCILE (Chen et al., 2023)
- LLM-Game-Agent (Lan et al., 2023)
- BattleAgentBench (Wang et al., 2024)
- Generative Agents (Park et al., 2023)
- Agent Hospital (Li et al., 2024)
- MedAgents (Tang et al., 2024)
- MEDCO (Wei et al., 2024)
- MetaGPT (Hong et al., 2023)
- ChatDev (Qian et al., 2024)
- Agent Laboratory (Schmidgall et al., 2025)
- The Virtual Lab (Swanson et al., 2024)
- CoELA (Zhang et al., 2023)
- VillagerAgent (Dong et al., 2024)
- LLM-Coordination (Agashe et al., 2024)
- MetaGPT (Hong et al., 2023)
- ChatDev (Qian et al., 2024)
- Generative Agents (Park et al., 2023)
- S-Agents (Chen et al., 2024)
- SciAgents (Ghafarollahi et al., 2024)
- AppAgent (Zhang et al., 2023)
- MetaGPT (Hong et al., 2023)
- AgentBench (Liu et al., 2023)
- VAB (Liu et al., 2024)
- TaskWeaver (Qiao et al., 2024)
- HULA (Takerngsaksiri et al., 2025)
- MCP (Anthropic)
- Agora (Marro et al., 2024)
- IoA (Chen et al., 2024)
- MEDCO (Wei et al., 2024)
- Agent Hospital (Li et al., 2024)
- Welfare Diplomacy (Mukobi et al., 2023)
- MedAgents (Tang et al., 2024)
- DyLAN (Liu et al., 2023)
- GPTSwarm (Zhuge et al., 2024)
- CodeR (Chen et al., 2024)
- OASIS (Yang et al., 2024)
- Agent Laboratory (Schmidgall et al., 2025)
- The Virtual Lab (Swanson et al., 2024)
- OASIS (Yang et al., 2024)
- Generative Agents (Park et al., 2023)
- Welfare Diplomacy (Mukobi et al., 2023)
- LLM-Game-Agent (Lan et al., 2023)
- BattleAgentBench (Wang et al., 2024)
- MEDCO (Wei et al., 2024)
- Agent Hospital (Li et al., 2024)
- MedAgents (Tang et al., 2024)
- S-Agents (Chen et al., 2024)
- Dittos (Leong et al., 2024)
- PRELUDE (Gao et al., 2024)
- Generative Agents (Park et al., 2023)
- Welfare Diplomacy (Mukobi et al., 2023)
- LLM-Game-Agent (Lan et al., 2023)
- BattleAgentBench (Wang et al., 2024)
- Agent Hospital (Li et al., 2024)
- Agent Laboratory (Schmidgall et al., 2025)
- MEDCO (Wei et al., 2024)
- MBPP (Austin et al., 2021)
- HotpotQA (Yang et al., 2018)
- MATH (Hendrycks et al., 2021)
- SVAMP (Patel et al., 2021)
- MultiArith (Roy & Roth, 2015)
- Collab-Overcooked (Sun et al., 2025)
- REALM-Bench (Geng et al., 2025)
- PARTNR (Chang et al., 2024)
- VillagerBench (Dong et al., 2024)
- AutoArena (Zhao et al., 2024)
- MultiagentBench (Zhu et al., 2025)
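
Many of the multi-agent systems and benchmarks above boil down to agents exchanging answers over rounds before a judge or a vote resolves them. A toy debate loop, with each agent a callable prompt -> text:

```python
def debate(question, agents, rounds=2):
    """Minimal multi-agent debate: each agent sees the others' latest
    answers and revises its own; a separate judge call would normally
    pick the winner. `agents` is a list of callables prompt -> text."""
    answers = [agent(f"Question: {question}") for agent in agents]
    for _ in range(rounds):
        new_answers = []
        for i, agent in enumerate(agents):
            peers = "\n".join(a for j, a in enumerate(answers) if j != i)
            new_answers.append(agent(
                f"Question: {question}\nOther agents said:\n{peers}\n"
                "Update your answer if they found flaws in yours."))
        answers = new_answers
    return answers
```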
- Jailbreak attacks and defenses against large language models: A survey, arXiv 2024, [paper]
- Universal and transferable adversarial attacks on aligned language models, arXiv 2023, [paper]
- Boosting jailbreak attack with momentum, arXiv 2024, [paper]
- Improved techniques for optimization-based jailbreaking on large language models, arXiv 2024, [paper]
- Jailbreak Instruction-Tuned LLMs via end-of-sentence MLP Re-weighting, arXiv 2024, [paper]
- Open the Pandora's Box of LLMs: Jailbreaking LLMs through Representation Engineering, arXiv 2024, [paper]
- DROJ: A Prompt-Driven Attack against Large Language Models, arXiv 2024, [paper]
- AutoDAN: Generating stealthy jailbreak prompts on aligned large language models, arXiv 2023, [paper]
- POEX: Policy Executable Embodied AI Jailbreak Attacks, arXiv 2024, [paper]
- Jailbroken: How does LLM safety training fail?, NeurIPS 2023, [paper]
- Jailbreaking black box large language models in twenty queries, arXiv 2023, [paper]
- Jailbreaking large language models against moderation guardrails via cipher characters, NeurIPS 2024, [paper]
- Visual adversarial examples jailbreak aligned large language models, AAAI 2024, [paper]
- POEX: Policy Executable Embodied AI Jailbreak Attacks, arXiv 2024, [paper]
- AutoDAN: Generating stealthy jailbreak prompts on aligned large language models, arXiv 2023, [paper]
- GUARD: Role-playing to generate natural-language jailbreakings to test guideline adherence of large language models, arXiv 2024, [paper]
- Heuristic-Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models, arXiv 2024, [paper]
- RT-Attack: Jailbreaking text-to-image models via random token, arXiv 2024, [paper]
- Not what you've signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection, AISec@CCS 2023, [paper]
- Automatic and universal prompt injection attacks against large language models, arXiv 2024, [paper]
- Optimization-based prompt injection attack to LLM-as-a-judge, CCS 2024, [paper]
- Benchmarking indirect prompt injections in tool-integrated large language model agents, arXiv 2024, [paper]
- Trust No AI: Prompt Injection Along The CIA Security Triad, arXiv 2024, [paper]
- Empirical analysis of large vision-language models against goal hijacking via visual prompt injection, arXiv 2024, [paper]
- Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition, arXiv 2024, [paper]
- Ignore this title and HackAPrompt: Exposing systemic vulnerabilities of LLMs through a global prompt hacking competition, EMNLP 2023, [paper]
- Not what you've signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection, AISec@CCS 2023, [paper]
- HijackRAG: Hijacking Attacks against Retrieval-Augmented Large Language Models, arXiv 2025, [paper]
- Backdoored Retrievers for Prompt Injection Attacks on Retrieval Augmented Generation of Large Language Models, arXiv 2024, [paper]
- Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems, arXiv 2024, [paper]
- Adversarial search engine optimization for large language models, arXiv 2024, [paper]
- Survey of hallucination in natural language generation, ACM Computing Surveys 2023, [paper]
- A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions, arXiv 2023, [paper]
- DELUCIONQA: Detecting Hallucinations in Domain-specific Question Answering, Findings of EMNLP 2023, [paper]
- Deficiency of large language models in finance: An empirical examination of hallucination, Failure Modes Workshop @ NeurIPS 2023, [paper]
- MetaGPT: Meta Programming for Multi-Agent Collaborative Framework, ICLR 2024, [paper]
- Hallucination is inevitable: An innate limitation of large language models, arXiv 2024, [paper]
- ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models, arXiv 2024, [paper]
- Truth-Aware Context Selection: Mitigating the Hallucinations of Large Language Models Being Misled by Untruthful Contexts, arXiv 2024, [paper]
- Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis, arXiv 2024, [paper]
- HaluEval-Wild: Evaluating Hallucinations of Language Models in the Wild, arXiv 2024, [paper]
- Analyzing and Mitigating Object Hallucination in Large Vision-Language Models, ICLR 2024, [paper]
- Mitigating object hallucination in large vision-language models via classifier-free guidance, arXiv 2024, [paper]
- When Large Language Models contradict humans? Large Language Models' Sycophantic Behaviour, arXiv 2023, [paper]
- HallusionBench: an advanced diagnostic suite for entangled language hallucination and visual illusion in large vision-language models, CVPR 2024, [paper]
- DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models, arXiv 2024, [paper]
- AI alignment: A comprehensive survey, arXiv 2023, [paper]
- Specification Gaming: The Flip Side of AI Ingenuity, DeepMind Blog 2020, [paper]
- The alignment problem from a deep learning perspective, arXiv 2022, [paper]
- Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!, arXiv 2024, [paper]
- Agent Alignment in Evolving Social Norms, arXiv 2024, [paper]
- Model Merging and Safety Alignment: One Bad Model Spoils the Bunch, arXiv 2024, [paper]
- Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment, arXiv 2023, [paper]
- Assessing the brittleness of safety alignment via pruning and low-rank modifications, arXiv 2024, [paper]
- AI alignment: A comprehensive survey, arXiv 2023, [paper]
- Fine-tuning aligned language models compromises safety, even when users do not intend to!, arXiv 2023, [paper]
- Fundamental limitations of alignment in large language models, arXiv 2023, [paper]
- Weight poisoning attacks on pre-trained models, ACL 2020, [paper]
- BadEdit: Backdooring large language models by model editing, arXiv 2024, [paper]
- The philosopher's stone: Trojaning plugins of large language models, arXiv 2023, [paper]
- Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm, arXiv 2024, [paper]
- Poisoned ChatGPT finds work for idle hands: Exploring developers’ coding practices with insecure suggestions from poisoned AI models, IEEE S&P 2024, [paper]
- Secret Collusion Among Generative AI Agents, arXiv 2024, [paper]
- Exploiting the Vulnerability of Large Language Models via Defense-Aware Architectural Backdoor, arXiv 2024, [paper]
- Poisoning language models during instruction tuning, ICML 2023, [paper]
- AgentPoison: Red-teaming LLM agents via poisoning memory or knowledge bases, NeurIPS 2024, [paper]
- Poison-RAG: Adversarial Data Poisoning Attacks on Retrieval-Augmented Generation in Recommender Systems, arXiv 2025, [paper]
- PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning, arXiv 2024, [paper]
- The dark side of human feedback: Poisoning large language models via user inputs, arXiv 2024, [paper]
- Scaling laws for data poisoning in LLMs, arXiv 2024, [paper]
- Talk too much: Poisoning large language models under token limit, arXiv 2024, [paper]
- Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data, arXiv 2024, [paper]
- Sleeper agents: Training deceptive LLMs that persist through safety training, arXiv 2024, [paper]
- WIPI: A new web threat for LLM-driven web agents, arXiv 2024, [paper]
- Exploring backdoor attacks against large language model-based decision making, arXiv 2024, [paper]
- When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations, arXiv 2024, [paper]
- Backdooring instruction-tuned large language models with virtual prompt injection, NAACL 2024, [paper]
- Membership inference attacks against machine learning models, IEEE S&P 2017, [paper]
- The secret sharer: Evaluating and testing unintended memorization in neural networks, USENIX Security 2019, [paper]
- Label-only membership inference attacks, ICML 2021, [paper]
- Practical membership inference attacks against fine-tuned large language models via self-prompt calibration, arXiv 2023, [paper]
- Membership inference attacks from first principles, IEEE S&P 2022, [paper]
- Membership inference attacks on machine learning: A survey, ACM Computing Surveys 2022, [paper]
- Extracting training data from large language models, USENIX Security 2021, [paper]
- Special characters attack: Toward scalable training data extraction from large language models, arXiv 2024, [paper]
- Ethicist: Targeted training data extraction through loss smoothed soft prompting and calibrated confidence estimation, arXiv 2023, [paper]
- Language model inversion, arXiv 2023, [paper]
- Privacy risks of general-purpose language models, IEEE S&P 2020, [paper]
- Quantifying memorization across neural language models, arXiv 2022, [paper]
- Stealing part of a production language model, arXiv 2024, [paper]
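
Several membership-inference entries above build on one observation: models tend to assign lower loss to their training members. A sketch of the classic loss-threshold attack, useful for privacy auditing, with `loss_fn` a caller-supplied per-example loss:

```python
def loss_threshold_mia(loss_fn, example, threshold: float) -> bool:
    """Classic loss-threshold membership inference: flag `example` as
    a likely training member if the model's loss on it falls below a
    threshold calibrated on known non-members. Defenses such as
    DP-SGD and deduplication aim to shrink exactly this loss gap."""
    return loss_fn(example) < threshold

def calibrate_threshold(loss_fn, known_nonmembers, quantile=0.05):
    """Pick the threshold as a low quantile of non-member losses so
    the attack's false-positive rate is roughly `quantile`."""
    losses = sorted(loss_fn(x) for x in known_nonmembers)
    return losses[max(0, int(quantile * len(losses)) - 1)]
```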
- Ignore previous prompt: Attack techniques for language models, TSRML@NeurIPS 2022, [paper]
- Prompt Stealing Attacks Against Text-to-Image Generation Models, USENIX Security 2024, [paper]
- Safeguarding System Prompts for LLMs, arXiv 2024, [paper]
- InputSnatch: Stealing Input in LLM Services via Timing Side-Channel Attacks, arXiv 2024, [paper]
- Effective prompt extraction from language models, arXiv 2023, [paper]
- Last one standing: A comparative analysis of security and privacy of soft prompt tuning, LoRA, and in-context learning, arXiv 2023, [paper]
- LLM app store analysis: A vision and roadmap, ACM TOSEM 2024, [paper]
- PRSA: Prompt reverse stealing attacks against large language models, arXiv 2024, [paper]
- Prompt Leakage effect and defense strategies for multi-turn LLM interactions, arXiv 2024, [paper]
- Investigating the prompt leakage effect and black-box defenses for multi-turn LLM interactions, arXiv 2024, [paper]
- Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models, arXiv 2024, [paper]
- PLeak: Prompt leaking attacks against large language model applications, CCS 2024, [paper]
- Stealing User Prompts from Mixture of Experts, arXiv 2024, [paper]
- Extracting Prompts by Inverting LLM Outputs, arXiv 2024, [paper]
- An LLM can Fool Itself: A Prompt-Based Adversarial Attack, arXiv 2023, [paper]
- Revisiting Character-level Adversarial Attacks for Language Models, ICML 2024, [paper]
- Hard prompts made easy: Gradient-based discrete optimization for prompt tuning and discovery, NeurIPS 2023, [paper]
- Universal and transferable adversarial attacks on aligned language models, arXiv 2023, [paper]
- Image hijacks: Adversarial images can control generative models at runtime, arXiv 2023, [paper]
- Image-based Multimodal Models as Intruders: Transferable Multimodal Attacks on Video-based MLLMs, arXiv 2025, [paper]
- Dissecting Adversarial Robustness of Multimodal LM Agents, ICLR 2025, [paper]
- Poltergeist: Acoustic adversarial machine learning against cameras and computer vision, IEEE S&P 2021, [paper]
- Inaudible adversarial perturbation: Manipulating the recognition of user speech in real time, arXiv 2023, [paper]
- The Silent Manipulator: A Practical and Inaudible Backdoor Attack against Speech Recognition Systems, ACM Multimedia 2023, [paper]
- Enrollment-stage backdoor attacks on speaker recognition systems via adversarial ultrasound, IEEE IoT Journal 2023, [paper]
- UltraBD: Backdoor attack against automatic speaker verification systems via adversarial ultrasound, ICPADS 2023, [paper]
- DolphinAttack: Inaudible voice commands, CCS 2017, [paper]
- A Survey on Adversarial Robustness of LiDAR-based Machine Learning Perception in Autonomous Vehicles, arXiv 2024, [paper]
- Rocking drones with intentional sound noise on gyroscopic sensors, USENIX Security 2015, [paper]
- Adversarial attacks on multi-agent communication, ICCV 2021, [paper]
- GPS location spoofing attack detection for enhancing the security of autonomous vehicles, IEEE VTC-Fall 2021, [paper]
- Grounding large language models in interactive environments with online reinforcement learning, ICML 2023, [paper]
- Bias and fairness in large language models: A survey, Computational Linguistics 2024, [paper]
- Domain generalization using causal matching, ICML 2021, [paper]
- GEM: Glare or gloom, I can still see you—End-to-end multi-modal object detection, IEEE RA-L 2021, [paper]
- NPHardEval: Dynamic benchmark on reasoning ability of large language models via complexity classes, arXiv 2023, [paper]
- Modeling opinion misperception and the emergence of silence in online social system, PLOS ONE 2024, [paper]
- Bridging the domain gap for multi-agent perception, ICRA 2023, [paper]
- Cooperative and competitive biases for multi-agent reinforcement learning, arXiv 2021, [paper]
- Model-agnostic multi-agent perception framework, ICRA 2023, [paper]
- Mutual influence between language and perception in multi-agent communication games, PLOS Computational Biology 2022, [paper]
- A new era in LLM security: Exploring security concerns in real-world LLM-based systems, arXiv 2024, [paper]
- WIPI: A new web threat for LLM-driven web agents, arXiv 2024, [paper]
- Identifying the risks of LM agents with an LM-emulated sandbox, arXiv 2023, [paper]
- Not what you've signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection, AISec@CCS 2023, [paper]
- Benchmarking indirect prompt injections in tool-integrated large language model agents, arXiv 2024, [paper]
- Identifying the risks of LM agents with an LM-emulated sandbox, arXiv 2023, [paper]
- ToolSword: Unveiling safety issues of large language models in tool learning across three stages, arXiv 2024, [paper]
- Benchmarking indirect prompt injections in tool-integrated large language model agents, arXiv 2024, [paper]
- AgentPoison: Red-teaming LLM agents via poisoning memory or knowledge bases, NeurIPS 2024, [paper]
- ConfusedPilot: Confused deputy risks in RAG-based LLMs, arXiv 2024, [paper]
- PoisonedRAG: Knowledge corruption attacks to retrieval-augmented generation of large language models, arXiv 2024, [paper]
- Machine against the RAG: Jamming retrieval-augmented generation with blocker documents, arXiv 2024, [paper]
- BadRAG: Identifying vulnerabilities in retrieval augmented generation of large language models, arXiv 2024, [paper]
- TrojanRAG: Retrieval-Augmented Generation Can Be Backdoor Driver in Large Language Models, arXiv 2024, [paper]
- Whispers in Grammars: Injecting Covert Backdoors to Compromise Dense Retrieval Systems, arXiv 2024, [paper]
- Autonomous vehicles: Sophisticated attacks, safety issues, challenges, open topics, blockchain, and future directions, JCP 2023, [paper]
- Engineering challenges ahead for robot teamwork in dynamic environments, Applied Sciences 2020, [paper]
- On GPS spoofing of aerial platforms: a review of threats, challenges, methodologies, and future research directions, PeerJ Computer Science 2021, [paper]
- Security and privacy in cyber-physical systems: A survey, IEEE Communications Surveys & Tutorials 2017, [paper]
- Adversarial objects against LiDAR-based autonomous driving systems, arXiv 2019, [paper]
- Learning to walk in the real world with minimal human effort, arXiv 2020, [paper]
- Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science, arXiv 2024, [paper]
- A new era in LLM security: Exploring security concerns in real-world LLM-based systems, arXiv 2024, [paper]
- Demystifying RCE vulnerabilities in LLM-integrated apps, CCS 2024, [paper]
- WIPI: A new web threat for LLM-driven web agents, arXiv 2024, [paper]
- Application of large language models to DDoS attack detection, SPCPS 2023, [paper]
- Coercing LLMs to do and reveal (almost) anything, arXiv 2024, [paper]
- Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science, arXiv 2024, [paper]
- EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage, arXiv 2024, [paper]
- AdvWeb: Controllable Black-Box Attacks on VLM-Powered Web Agents, arXiv 2024, [paper]
- AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection, arXiv 2025, [paper]
- Multi-Agent Risks from Advanced AI, arXiv 2025, [paper]
- Hoodwinked: Deception and cooperation in a text-based game for language models, arXiv 2023, [paper]
- Attacking deep reinforcement learning with decoupled adversarial policy, IEEE TDSC 2022, [paper]
- Secure consensus of multi-agent systems under denial-of-service attacks, Asian Journal of Control 2023, [paper]
- A Perfect Collusion Benchmark: How can AI agents be prevented from colluding with information-theoretic undetectability?, Multi-Agent Security Workshop @ NeurIPS 2023, [paper]