Commit 0e4ed70

[chore] daily pipeline
1 parent 6357a5c commit 0e4ed70

201 files changed

Lines changed: 12689 additions & 36 deletions

archive/20260605/recommend/arxiv_papers_20260605.standard.json

Lines changed: 453 additions & 0 deletions
Large diffs are not rendered by default.

archive/carryover.json

Lines changed: 3 additions & 3 deletions
@@ -1,10 +1,10 @@
 {
-  "generated_at": "2026-06-04T21:47:30.140124+00:00",
-  "updated_date": "20260604",
+  "generated_at": "2026-06-05T20:52:32.189968+00:00",
+  "updated_date": "20260605",
   "carryover_days": 9,
   "tag_states": {
     "continual": {
-      "updated_date": "20260604",
+      "updated_date": "20260605",
       "carryover_days": 9,
       "items": []
     }
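The hunk above only advances timestamps, so one thing a reviewer might want to confirm is that every tag state stays in step with the top-level fields. The following is a minimal sketch of such a check, not part of the pipeline: the function name `check_carryover` is hypothetical, and `SAMPLE` mirrors only the fields visible in the diff, closing the braces that the hunk cuts off.

```python
import json


def check_carryover(doc: dict) -> list[str]:
    """Return tags whose updated_date or carryover_days differ from the top level."""
    stale = []
    for tag, state in doc.get("tag_states", {}).items():
        if (
            state.get("updated_date") != doc.get("updated_date")
            or state.get("carryover_days") != doc.get("carryover_days")
        ):
            stale.append(tag)
    return stale


# Assumed sample: the post-commit values shown in the diff, nothing more.
SAMPLE = json.loads("""
{
  "generated_at": "2026-06-05T20:52:32.189968+00:00",
  "updated_date": "20260605",
  "carryover_days": 9,
  "tag_states": {
    "continual": {"updated_date": "20260605", "carryover_days": 9, "items": []}
  }
}
""")

print(check_carryover(SAMPLE))
```

An empty list means all tag states agree with the top-level `updated_date` and `carryover_days`; any tag left on the previous date would be listed by name.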
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
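+---
+title: "Skill is Not One-Size-Fits-All: Model-Aware Skill Alignment for LLM Agents"
+title_zh: 技能并非一刀切:面向LLM智能体的模型感知技能对齐
+authors: "Jianxiang Yu, Jiapeng Zhu, Bochen Lin, Qier Cui, Zichen Ding, Xiang Li"
+date: 2026-05-29
+pdf: "https://arxiv.org/pdf/2605.30723v1"
+tags: ["query:continual"]
+score: 6.0
+evidence: A model-aware skill alignment framework that evolves skill representations for different agent backbones to improve adaptability
+tldr: Skill effectiveness depends heavily on the agent model, and the same skill can harm a different backbone. This paper proposes MASA, a model-aware skill alignment framework that adapts skill representations through a hierarchical skill evolution pipeline without modifying agent weights. Across multiple benchmarks, MASA improves agents' task completion rates, which suggests that per-model skill adaptation is an effective way for agents to adapt efficiently to dynamic environments and is relevant to continual agent self-improvement.
+source: arxiv
+selection_source: fresh_fetch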
+figures_json: "[{\"url\": \"assets/figures/arxiv/2605.30723v1/fig-001.webp\", \"caption\": \"\", \"page\": 0, \"index\": 1, \"width\": 788, \"height\": 549, \"label\": \"Figure\"}, {\"url\": \"assets/figures/arxiv/2605.30723v1/fig-002.webp\", \"caption\": \"\", \"page\": 0, \"index\": 2, \"width\": 1635, \"height\": 762, \"label\": \"Figure\"}, {\"url\": \"assets/figures/arxiv/2605.30723v1/fig-003.webp\", \"caption\": \"\", \"page\": 0, \"index\": 3, \"width\": 785, \"height\": 610, \"label\": \"Figure\"}, {\"url\": \"assets/figures/arxiv/2605.30723v1/fig-004.webp\", \"caption\": \"\", \"page\": 0, \"index\": 4, \"width\": 762, \"height\": 531, \"label\": \"Figure\"}, {\"url\": \"assets/figures/arxiv/2605.30723v1/fig-005.webp\", \"caption\": \"\", \"page\": 0, \"index\": 5, \"width\": 1635, \"height\": 2192, \"label\": \"Figure\"}]"
+tables_json: "[{\"url\": \"assets/tables/arxiv/2605.30723v1/table-001.webp\", \"caption\": \"\", \"page\": 0, \"index\": 1, \"width\": 1607, \"height\": 856, \"label\": \"Table\"}, {\"url\": \"assets/tables/arxiv/2605.30723v1/table-002.webp\", \"caption\": \"\", \"page\": 0, \"index\": 2, \"width\": 1584, \"height\": 826, \"label\": \"Table\"}, {\"url\": \"assets/tables/arxiv/2605.30723v1/table-003.webp\", \"caption\": \"\", \"page\": 0, \"index\": 3, \"width\": 657, \"height\": 807, \"label\": \"Table\"}, {\"url\": \"assets/tables/arxiv/2605.30723v1/table-004.webp\", \"caption\": \"\", \"page\": 0, \"index\": 4, \"width\": 644, \"height\": 261, \"label\": \"Table\"}, {\"url\": \"assets/tables/arxiv/2605.30723v1/table-005.webp\", \"caption\": \"\", \"page\": 0, \"index\": 5, \"width\": 1597, \"height\": 794, \"label\": \"Table\"}, {\"url\": \"assets/tables/arxiv/2605.30723v1/table-006.webp\", \"caption\": \"\", \"page\": 0, \"index\": 6, \"width\": 1597, \"height\": 731, \"label\": \"Table\"}, {\"url\": \"assets/tables/arxiv/2605.30723v1/table-007.webp\", \"caption\": \"\", \"page\": 0, \"index\": 7, \"width\": 1387, \"height\": 849, \"label\": \"Table\"}, {\"url\": \"assets/tables/arxiv/2605.30723v1/table-008.webp\", \"caption\": \"\", \"page\": 0, \"index\": 8, \"width\": 1090, \"height\": 739, \"label\": \"Table\"}, {\"url\": \"assets/tables/arxiv/2605.30723v1/table-009.webp\", \"caption\": \"\", \"page\": 0, \"index\": 9, \"width\": 1116, \"height\": 576, \"label\": \"Table\"}, {\"url\": \"assets/tables/arxiv/2605.30723v1/table-010.webp\", \"caption\": \"\", \"page\": 0, \"index\": 10, \"width\": 1646, \"height\": 1062, \"label\": \"Table\"}]"
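+motivation: A model-aware skill alignment framework that evolves skill representations for different agent backbones to improve adaptability.
+method: For the method and implementation details, see the abstract and the full text.
+result: For results and comparisons, see the abstract and the full text.
+conclusion: Overall, the work demonstrates effectiveness on the stated tasks and offers reusable ideas or tools.
+---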
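+
+## Summary
+LLM agents increasingly retrieve externally curated skills, that is, procedural instructions fetched at decision time, to perform better on long-horizon interactive tasks. Existing skill libraries are usually treated as model-agnostic and reuse the same skill wording across backbones whose capabilities and behaviors differ widely. Our controlled experiments across several model scales, however, show that skill effectiveness depends strongly on the model: a skill that helps one backbone can hurt another. Motivated by this observation, we propose MASA (Model-Aware Skill Alignment), a framework that adapts skills to each target backbone without modifying agent weights. MASA works in two stages: (1) a hierarchical skill evolution pipeline that uses hill climbing and UCB-driven tree search to iteratively rewrite general and task-specific skills under the guidance of environment feedback and model capability profiles; and (2) a lightweight model-conditioned skill rewriter, trained on the evolution trajectories, that reproduces the adaptation in a single forward pass. Experiments on three interactive environments and four backbones show that MASA consistently achieves the best overall performance, improving on the strongest baseline by up to 25.8 points. The learned rewriter also generalizes to unseen tasks and environments with no additional search, consistently outperforming a much larger teacher LLM at a small fraction of the inference cost.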
+
+## Abstract
+LLM agents increasingly retrieve externally curated skills (procedural instructions retrieved at decision time) to improve performance on long-horizon interactive tasks. Existing skill libraries are typically treated as model-agnostic, reusing the same skill formulations across backbones with substantially different capacities and behaviors. However, our controlled experiments across multiple model scales show that skill effectiveness is strongly model-dependent: a skill that benefits one backbone can harm another. Motivated by this observation, we propose MASA (Model-Aware Skill Alignment), a framework that adapts skills to each target backbone without modifying agent weights. MASA operates in two stages: (1) a hierarchical skill evolution pipeline that iteratively rewrites general and task-specific skills using hill climbing and UCB-driven tree search, guided by environment feedback and model capability profiles; and (2) a lightweight model-conditioned skill rewriter trained on evolution trajectories to reproduce the adaptation in a single forward pass. Experiments across three interactive environments and four backbones show that MASA consistently achieves the best overall performance, with gains of up to 25.8 points over the strongest baseline. The learned rewriter further generalizes to unseen tasks and environments without additional search, consistently outperforming a much larger teacher LLM at a fraction of the inference cost.
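The abstract names UCB-driven tree search but gives no details of how candidates are scored. For orientation only, here is a generic UCB1 child-selection sketch; it is not MASA's procedure. The `Node` structure, the `ucb1_select` and `backup` helpers, and the exploration constant `c` are all illustrative assumptions, with each node standing in for one candidate skill rewrite and the reward standing in for environment feedback.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One candidate rewrite in a search tree (hypothetical structure)."""
    name: str
    visits: int = 0
    total_reward: float = 0.0
    children: list["Node"] = field(default_factory=list)


def ucb1_select(parent: Node, c: float = math.sqrt(2)) -> Node:
    """Pick the child maximizing mean reward + c * sqrt(ln(N) / n)."""
    # Unvisited children go first so every candidate is evaluated at least once.
    for child in parent.children:
        if child.visits == 0:
            return child
    log_n = math.log(max(parent.visits, 1))
    return max(
        parent.children,
        key=lambda ch: ch.total_reward / ch.visits
        + c * math.sqrt(log_n / ch.visits),
    )


def backup(path: list[Node], reward: float) -> None:
    """Propagate one observed reward along the path from root to leaf."""
    for node in path:
        node.visits += 1
        node.total_reward += reward
```

The exploration term shrinks as a child accumulates visits, so a rarely tried rewrite with a slightly lower mean reward can still be chosen over a heavily sampled one; setting `c` to 0 reduces the rule to greedy selection on mean reward.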