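Chapter 7  Surveys & Further Reading

Authoritative surveys, courses and tutorials, evaluation benchmarks, and curated lists, organized by topic · compiled from live Chrome searches of arXiv / GitHub

Note: this list was compiled on 2026-06-27 from live browser searches of arXiv and GitHub, with every link checked individually. Many entries are 2025–2026 preprints (arXiv IDs beginning 25xx/26xx), and some have not yet been peer reviewed; before citing, go back to the original paper and confirm the version and the exact claims. Entries follow the pattern "topic → title + one-line summary + link" so that you can read selectively.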
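Because the entries below cite papers by bare arXiv identifier (for example 2309.07864), a small helper can turn an identifier into a link and a submission month, which makes the "confirm the version" step easier. The following is a minimal sketch, assuming the post-2007 arXiv identifier scheme (YYMM.NNNN or YYMM.NNNNN with an optional vN suffix); the function name `parse_arxiv_id` and the returned dictionary layout are illustrative choices, not part of any library.

```python
import re

# Post-2007 arXiv identifiers: YYMM.NNNN or YYMM.NNNNN, optionally followed by vN.
# An optional "arXiv:" prefix is tolerated for convenience.
_ARXIV_ID = re.compile(r"^(?:arXiv:)?(\d{2})(\d{2})\.(\d{4,5})(v\d+)?$", re.IGNORECASE)


def parse_arxiv_id(raw: str) -> dict:
    """Split an arXiv identifier into year, month, version, and abstract URL."""
    match = _ARXIV_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"not a new-style arXiv identifier: {raw!r}")
    yy, mm, number, version = match.groups()
    if not 1 <= int(mm) <= 12:
        raise ValueError(f"month out of range in identifier: {raw!r}")
    bare = f"{yy}{mm}.{number}"
    return {
        "id": bare,
        "year": 2000 + int(yy),   # assumption: new-style IDs only (2007 onward)
        "month": int(mm),
        "version": version,       # None when the list gives no version
        "abs_url": f"https://arxiv.org/abs/{bare}{version or ''}",
    }


if __name__ == "__main__":
    info = parse_arxiv_id("2309.07864")
    print(info["abs_url"], info["year"], info["month"])
```

An identifier without a version suffix resolves to the latest version on arXiv, so recording the vN you actually read is the safer habit when citing a preprint.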

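7.1  General / Agent Architecture and Design Patterns (Surveys)

The Rise and Potential of LLM Based Agents: A Survey (Xi et al., 2023). An authoritative survey of the field that systematically covers agent construction, applications, and challenges. arXiv:2309.07864
A Survey on Large Language Model based Autonomous Agents (Wang et al., 2023). A classic survey from three angles: construction, application, and evaluation. arXiv:2308.11432
From Question Answering to Task Completion: A Survey on Agent System and Harness Design (2026). Surveys system and harness design for the shift from "answering questions" to "completing tasks." arXiv:2606.20683
A Two-Dimensional Framework for AI Agent Design Patterns (2026). Unifies 28 named design patterns under "cognitive function × execution topology." arXiv:2605.13850
Agent System Operations: Categorization, Challenges, and Future Directions (2026). A taxonomy and set of challenges from the "AgentOps" operations perspective. arXiv:2606.01581
Critique of Agent Model (Eric Xing et al., 2026). A critical reflection on what "agent" and "agency" mean. arXiv:2606.23991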

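7.2  Reasoning (Surveys)

The Periodic Table of LLM Reasoning (2026). Organizes reasoning paradigms, methods, and failure modes into a structured "table of elements." arXiv:2606.11470
AI for Mathematical Reasoning: An Integrated Survey (2026). A unified survey of language models, neuro-symbolic methods, and verifiable discovery (47 pages). arXiv:2606.08728
Generate, Filter, Control, Replay: A Survey of Rollout Strategies for LLM RL (2026). Surveys the RL rollout strategies behind reasoning models. arXiv:2605.02913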

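7.3  Memory / Tools / Skills / Environments (Surveys)

From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms (ACL 2026 Findings). Surveys how agent memory mechanisms have evolved. arXiv:2605.06716
A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications (2026). Covers the taxonomy, techniques, and applications of "agent skills." arXiv:2605.07358
Agentic Environment Engineering for LLMs: A Survey (2026, 63 pages). A systematic survey of environment modeling, synthesis, evaluation, and applications. arXiv:2606.12191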

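7.4  Multi-Agent Systems (Surveys)

Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems (2026). A survey built around three themes: collaboration, failure attribution, and self-evolution. arXiv:2605.14892
Toward Human-Centered Multi-Agent Systems (2026). Brings cognition, culture, values, and cooperation into multi-agent design. arXiv:2606.08274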

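7.5  Evaluation (Surveys)

From Holistic Evaluation to Structured Criteria: Rubrics Across the Evolving LLM Landscape (2026). Surveys the move from holistic evaluation to structured rubrics. arXiv:2606.08625
For the benchmarks themselves, see 7.10 Evaluation Benchmarks below; for evaluation challenges and metrics, see Chapter 4.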

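7.6  Safety / Trustworthiness / Privacy / Provenance (Surveys)

Towards Trustworthy Agentic AI: A Comprehensive Survey of Safety, Robustness, Privacy, and System Security (2026). A four-dimensional survey of trustworthy agentic AI. arXiv:2605.23989
Agents That Know Too Much: A Data-Centric Survey of Privacy in LLM Agents (2026). A data-centric survey of privacy in agents. arXiv:2606.26627
From Agent Traces to Trust: A Survey of Evidence Tracing and Execution Provenance (2026). Covers evidence chains in agent traces and execution provenance. arXiv:2606.04990
Toward Securing AI Agents Like Operating Systems (2026). Applies operating-system thinking to securing agents. arXiv:2605.14932
Security in the Fine-Tuning Lifecycle of LLMs (2026). Threats, defenses, and evaluation across the fine-tuning lifecycle. arXiv:2605.25073
For engineering practice, cross-check against the OWASP Top 10 for LLM Applications. genai.owasp.org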

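7.7  Systems / Economics / Efficiency (Surveys)

Token Economics for LLM Agents (2026). Studies the token cost of agents from both computational and economic perspectives. arXiv:2605.09104
Networking-Aware Energy Efficiency in Agentic AI Inference: A Survey (2026). Network-aware energy efficiency for agentic inference. arXiv:2604.07857
Model-Native Computing Architecture (2026). Envisions a "model-native" system architecture through the lens of computer architecture. arXiv:2606.00288
From Accuracy to Auditability: A Survey of Determinism in Financial AI Systems (2026). A survey of determinism for reproducibility and auditability. arXiv:2605.23955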

7.8  Domain Applications (Surveys)

Agentic Trading: When LLM Agents Meet Financial Markets (2026, 59 pages). A survey of agents for financial trading. arXiv:2605.19337
AI-IoT-Robotics Integration: Survey of Frameworks, Emerging Trends... (IEEE IoT-J, 2026). Emphasizes the convergence of edge SLMs and LLMs (echoing Chapter 5). arXiv:2606.01015
Integrating Graphs, LLMs, and Agents: Reasoning and Retrieval (2026). Reasoning and retrieval with graphs, LLMs, and agents combined. arXiv:2604.15951

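7.9  Courses and Tutorials

Structured courses

Berkeley "Advanced Large Language Model Agents" MOOC. A graduate-level open course covering LLM fundamentals, reasoning, tools, multi-agent systems, and safety, with industry guest lectures. F24 · Sp25 (advanced)
Hugging Face AI Agents Course. Four units that go from concepts to hands-on work with smolagents/LangGraph/LlamaIndex, plus a final project. GitHub
Microsoft AI Agents for Beginners. An open-source course on tool integration, RAG, design patterns, multi-agent systems, and production deployment. GitHub
Multi AI Agent Systems with crewAI (DeepLearning.AI). An introductory CrewAI course. Course page

Concept guides (official / classic)

Anthropic: Building Effective Agents (workflows vs. agents, plus five patterns). Official site
OpenAI: A Practical Guide to Building Agents (a 34-page practical handbook). PDF
Lilian Weng: LLM Powered Autonomous Agents (the foundational blog post on the three pillars of planning, memory, and tools). Blog post
Prompt Engineering Guide (DAIR.AI). Website

Framework tutorials

LangGraph / LangChain (stateful graph orchestration) · LlamaIndex (RAG / workflows) · AutoGen (conversational multi-agent) · CrewAI (role-based teams) · smolagents (code-as-action) · OpenAI Agents SDK (Handoffs / Guardrails) · Haystack (agentic pipelines) · AutoGPT. For a framework comparison, see Chapter 3, 6.3.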

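7.10  Evaluation Benchmarks (Quick Reference)

Benchmark	What it measures	Scale / format
GAIA	General assistant: reasoning + retrieval + tools + multimodality	~450 short-answer questions, three difficulty levels
AgentBench	LLM-as-Agent across multiple domains	8 environments (OS/SQL/Web/games…)
WebArena / VisualWebArena	Real web interaction (text / multimodal)	4 self-hosted sites, 800+ / 910 tasks
OSWorld	Real desktop / web GUI operation	350+ tasks, execution-based scoring
SWE-bench (Verified/Lite/MM)	Real-world code fixes	2k+ GitHub issues (Verified: 500)
ToolBench	Real API tool calls	16k+ RapidAPI APIs
τ-bench / τ²-bench	Tools + user interaction + policy (pass^k)	Retail / airline (incl. dual-control)
BrowseComp	Web retrieval of hard-to-locate information	1,200+ human-written questions
GPQA / AIME	Graduate-level science / math reasoning	448 questions / competition problems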
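The pass^k metric listed for τ-bench is easy to confuse with pass@k, so a worked example may help. As commonly described, pass^k is the probability that all k independent trials of a task succeed, a reliability measure, whereas pass@k asks whether at least one of k succeeds. The sketch below assumes the standard combinatorial estimator, the average over tasks of C(c, k) / C(n, k), where c is the number of successful trials out of n; the function name and the sample numbers are illustrative, so check the estimator against the τ-bench paper before relying on it.

```python
from math import comb


def pass_hat_k(successes_per_task: list[int], n: int, k: int) -> float:
    """Estimate pass^k: the chance that k trials drawn from n all succeed.

    successes_per_task[i] is c_i, the number of successful trials (out of n)
    for task i. The estimate is the mean over tasks of C(c_i, k) / C(n, k).
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if not successes_per_task:
        raise ValueError("need at least one task")
    total = 0.0
    for c in successes_per_task:
        if not 0 <= c <= n:
            raise ValueError("each success count must be between 0 and n")
        total += comb(c, k) / comb(n, k)  # comb(c, k) is 0 when c < k
    return total / len(successes_per_task)


if __name__ == "__main__":
    # Hypothetical results: three tasks, four trials each.
    counts = [4, 2, 0]
    for k in (1, 2, 4):
        print(k, round(pass_hat_k(counts, n=4, k=k), 4))
```

With these made-up counts, pass^1 is 0.5 but pass^4 falls to about 0.33, because only the first task succeeds on every trial. That widening gap is the inconsistency the metric is designed to expose.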

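7.11  Curated Lists (Continuously Updated Sources)

artnitolog/awesome-agent-learning. A curated collection of courses, guides, framework tutorials, and benchmarks (the source for the course section of this page). GitHub
e2b-dev/awesome-ai-agents. A large list of autonomous agents and frameworks. GitHub
kaushikb11/awesome-llm-agents. A curated selection of LLM agent frameworks. GitHub
mlabonne/llm-course. An LLM learning roadmap (fundamentals / scientist / engineer). GitHub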

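To keep tracking the field, the task "pull the latest agent surveys from arXiv each week and summarize them" can be run as a scheduled job: an automatic search every Monday morning that produces an incremental list of new entries.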
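One way to approach such a weekly job is to build a query against the public arXiv API and keep only the identifiers not seen in earlier runs. The following is a minimal sketch under stated assumptions: it uses the documented `export.arxiv.org/api/query` endpoint with its `search_query`, `sortBy`, `sortOrder`, `start`, and `max_results` parameters, while the keyword choice, the function names, and the idea of storing seen identifiers in a set are illustrative. Fetching, parsing the Atom response, and the scheduling itself (for example a cron entry) are left out.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_survey_query(keywords: list[str], max_results: int = 50) -> str:
    """Build an arXiv API URL that lists the newest papers matching all keywords."""
    # Each keyword is searched as a phrase across all fields and joined with AND.
    terms = " AND ".join(f'all:"{kw}"' for kw in keywords)
    params = {
        "search_query": terms,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def incremental(fetched_ids: list[str], seen_ids: set[str]) -> list[str]:
    """Return the fetched identifiers not seen in earlier runs, in original order."""
    fresh = []
    for arxiv_id in fetched_ids:
        if arxiv_id not in seen_ids and arxiv_id not in fresh:
            fresh.append(arxiv_id)
    return fresh


if __name__ == "__main__":
    print(build_survey_query(["LLM agent", "survey"], max_results=5))
    # Hypothetical run: two of three fetched identifiers are already known.
    print(incremental(["2606.20683", "2605.13850", "2606.01581"],
                      {"2605.13850", "2606.01581"}))
```

Sorting by submission date in descending order means a small `max_results` is usually enough for a weekly run, and the `incremental` step keeps the output limited to entries not already on the list.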
