Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution

Authors: Xucong Wang, Ziyu Ma, Shidong Yang, Tongwen Huang, Pengkun Wang, Yong Wang, Xiangxiang Chu (USTC AMAP, Alibaba) | Year: 2026 | arXiv: 2606.10917

2. Research Background

LLM agent learning is limited by two problems: (1) inefficient interaction feedback, since traditional reinforcement learning usually provides only a sparse final reward; and (2) static training environments, since the training data is fixed and cannot be steered toward targeted practice on specific failure modes.

The core insight of Role-Agent: an LLM already has enough world knowledge to simulate environment dynamics, and it can also analyze its own failures, so it can actively choose its own practice tasks.

4. Experimental Results

Role-Agent is evaluated on multiple agent benchmarks covering programming, navigation, and knowledge question answering.
It yields an average gain of over 4% over strong baselines.
The process reward from World-In-Agent (WIA) is especially effective on long-horizon tasks.
The failure-mode retrieval in Agent-In-World (AIW) effectively concentrates practice on known weaknesses.

Report generated: 2026-06-11 | Paper source: arXiv:2606.10917

Original abstract: Although Large Language Model (LLM) agents have demonstrated strong performance on complex tasks, their learning is often limited by inefficient interaction feedback and static training environments, which hinder broader generalization. To address these limitations, this paper introduces Role-Agent, a framework that harnesses a single LLM to function concurrently as both the agent and the environment, enabling a bootstrapped co-evolution. Role-Agent comprises two synergistic components: World-In-Agent (WIA) and Agent-In-World (AIW). In WIA, the LLM acts as the agent and predicts future states after each action; the alignment between predicted and actual states is then used as a process reward, encouraging environment-aware reasoning. In AIW, the LLM analyzes failure modes from failed trajectories and retrieves tasks with similar failure patterns, thereby reshaping the training data distribution for targeted practice. Experiments on multiple benchmarks show that Role-Agent consistently improves performance, yielding an average gain of over 4% over strong baselines.

PDF link: https://arxiv.org/pdf/2606.10917v1
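
To make the two mechanisms named in the abstract more concrete, here is a minimal Python sketch. It is not the paper's implementation; every name in it (`state_alignment`, `process_rewards`, `Task`, `retrieve_similar_tasks`) is hypothetical. It makes two loud assumptions: the alignment between predicted and actual states is approximated by token-level Jaccard overlap, and failure modes are given as plain string tags, whereas in Role-Agent the LLM itself produces the failure analysis. The summary does not specify either detail.

```python
from dataclasses import dataclass


def state_alignment(predicted: str, actual: str) -> float:
    """Token-level Jaccard overlap between two state descriptions.

    ASSUMPTION: a stand-in for whatever alignment measure the paper uses.
    """
    p = set(predicted.lower().split())
    a = set(actual.lower().split())
    if not p and not a:
        return 1.0  # two empty descriptions are trivially aligned
    return len(p & a) / len(p | a)


def process_rewards(predicted_states, actual_states):
    """WIA-style idea: one dense reward per step, instead of a single
    sparse final reward, based on how well each predicted next state
    matches the state actually observed."""
    return [state_alignment(p, a) for p, a in zip(predicted_states, actual_states)]


@dataclass(frozen=True)
class Task:
    name: str
    failure_tags: frozenset  # ASSUMPTION: failure modes as plain string tags


def retrieve_similar_tasks(failure_tags, pool, k=2):
    """AIW-style idea: pick the k tasks in the pool whose known failure
    patterns overlap most with the failure modes of a failed trajectory."""
    scored = [(len(failure_tags & task.failure_tags), task) for task in pool]
    scored = [(score, task) for score, task in scored if score > 0]
    # Highest overlap first; ties broken by task name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1].name))
    return [task for _, task in scored[:k]]


# Usage: two steps of a toy trajectory, then retrieval from a toy pool.
rewards = process_rewards(
    ["door is open", "key in hand"],
    ["door is open", "key on table"],
)
pool = [
    Task("nav-1", frozenset({"wrong_room"})),
    Task("code-7", frozenset({"off_by_one", "missing_import"})),
    Task("qa-3", frozenset({"off_by_one"})),
]
practice = retrieve_similar_tasks(frozenset({"off_by_one"}), pool)
print(rewards, [task.name for task in practice])
```

In this toy run the first step is predicted exactly and the second only partially, so the second reward is lower; retrieval then returns only the tasks tagged with the observed failure mode. A real system would presumably replace the string overlap and tag matching with LLM-based judgments.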