正谈春风 posted on 2025-9-5 14:58:00

Complete Tutorial: Agent Prompt Engineering: How to Make Agents More "Obedient" (A Practical Guide)

      <style>pre { white-space: pre !important; word-wrap: normal !important; overflow-x: auto !important; display: block !important; font-family: "Consolas", "Monaco", "Courier New", monospace !important; font-size: 14px !important; line-height: 1.6 !important; padding: 16px !important; margin: 16px 0 !important; background-color: rgba(248, 248, 248, 1) !important; border: 1px solid rgba(225, 228, 232, 1) !important; border-radius: 6px !important; tab-size: 4 !important; -moz-tab-size: 4 !important; max-width: 100% !important; box-sizing: border-box !important }
code { font-family: "Consolas", "Monaco", "Courier New", monospace !important; font-size: 14px !important; white-space: pre !important; word-wrap: normal !important; word-break: normal !important; overflow-wrap: normal !important; display: inline !important; background: rgba(0, 0, 0, 0) !important; border: none !important; padding: 0 !important; margin: 0 !important; line-height: inherit !important }
pre code { background: rgba(0, 0, 0, 0) !important; border: 0 !important; border-radius: 0 !important; display: block !important; line-height: 1.6 !important; margin: 0 !important; max-width: none !important; overflow: visible !important; padding: 0 !important; white-space: pre !important; word-wrap: normal !important; word-break: normal !important; color: inherit !important }
.token.comment, .token.prolog, .token.doctype, .token.cdata { color: rgba(112, 128, 144, 1) !important; font-style: italic !important }
.token.punctuation { color: rgba(153, 153, 153, 1) !important }
.token.atrule, .token.attr-value, .token.keyword { color: rgba(0, 119, 170, 1) !important; font-weight: bold !important }
.token.function, .token.class-name { color: rgba(221, 74, 104, 1) !important; font-weight: bold !important }
.token.selector, .token.attr-name, .token.string, .token.char, .token.builtin, .token.inserted { color: rgba(102, 153, 0, 1) !important }
.token.property, .token.tag, .token.boolean, .token.number, .token.constant, .token.symbol, .token.deleted { color: rgba(153, 0, 85, 1) !important }
.cnblogs-markdown pre, .cnblogs-post-body pre { white-space: pre !important; word-wrap: normal !important; overflow-x: auto !important; background-color: rgba(248, 248, 248, 1) !important; border: 1px solid rgba(225, 228, 232, 1) !important; border-radius: 6px !important; padding: 16px !important; margin: 16px 0 !important }
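pre, pre, pre { white-space: pre !important; word-wrap: normal !important; overflow-x: auto !important }</style><div class="htmledit_views atom-one-dark" id="content_views">
<p>As large language models (LLMs) grow more capable, they are no longer just text generators: they can act as "agents" that plan, call tools, keep memory, and learn autonomously. Getting such agents to understand our intent precisely, execute tasks efficiently according to plan, and make sound decisions in complex environments is a major engineering challenge. This is where agent prompt engineering comes in.</p>
<p>This article covers the core ideas, key techniques, and practical tips of agent prompt engineering, with the goal of helping you build AI agents that are more "obedient", more reliable, and more capable. It works through the basic structure of a prompt, chain-of-thought (CoT) construction, tool use, memory mechanisms, and feedback and iteration.</p>
<p>I. Agent Prompt Engineering: Core Ideas and Goals</p>
<p>What is agent prompt engineering?</p>
<p>Agent prompt engineering is the practice of designing, optimizing, and managing the text prompts that direct an LLM agent to carry out specific tasks, follow specific rules, use specific tools, or interact with a specific environment. It goes beyond handing the LLM a simple instruction: a carefully designed prompt steers the model toward the desired behavior and capabilities.</p>
<p>Core goals:</p>
<p>Clear instructions and intent: the agent accurately understands what the user or the framework needs.</p>
<p>Control of complex behavior: the agent is guided through complex, multi-step tasks (planning, reasoning, acting).</p>
<p>Effective tool use: the agent correctly selects and calls tools and parses their inputs and outputs.</p>
<p>Reliability and accuracy: fewer hallucinations and mistaken actions, and output closer to what is expected.</p>
<p>Controllability: the agent's behavioral boundaries, output format, and reasoning process are defined so that it "obeys".</p>
<p>Efficiency: a higher task success rate and faster completion.</p>
<p>II. Core Elements of an Agent Prompt</p>
<p>A typical agent prompt contains the following key elements:</p>
<p>Role Setting:</p>
<p>Purpose: give the agent a specific identity, knowledge domain, or behavioral style.</p>
<p>Example: "You are an experienced software engineer who specializes in Python development." / "You are a rigorous legal advisor providing consulting services."</p>
<p>Task Description:</p>
<p>Purpose: state clearly and specifically what the agent has to accomplish.</p>
<p>Example: "Based on the requirements the user provides, generate Python code that solves the problem." / "Analyze the following contract clauses and identify potential risks."</p>
<p>Input Information:</p>
<p>Purpose: provide all the context, content, or user input the agent needs to complete the task.</p>
<p>Example: the user's requirement description, raw data, conversation history.</p>
<p>Thought Process / Reasoning:</p>
<p>Purpose: guide the agent to think step by step, decompose the problem, and reason toward decisions. This is the key to making the agent more "obedient".</p>
<p>Methods: Chain-of-Thought (CoT), Tree-of-Thought (ToT), and similar techniques.</p>
<p>Tools / Action Space:</p>
<p>Purpose: tell the agent which external tools it can use, along with each tool's name, function, input parameters, and output format.</p>
<p>Example:</p>
<p>[</p>
<p>{</p>
<p>"name": "search",</p>
<p>"description": "Searches the internet for general knowledge.",</p>
<p>"parameters": {</p>
<p>"type": "object",</p>
<p>"properties": {</p>
<p>"query": {"type": "string", "description": "The query to search for."}</p>
<p>},</p>
<p>"required": 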
["query"]</p>
<p>}</p>
<p>},</p>
<p>{</p>
<p>"name": "calculator",</p>
<p>"description": "Evaluates mathematical expressions.",</p>
<p>"parameters": {</p>
<p>"type": "object",</p>
<p>"properties": {</p>
<p>"expression": {"type": "string", "description": "The mathematical expression."}</p>
<p>},</p>
<p>"required": ["expression"]</p>
<p>}</p>
<p>}</p>
<p>]</p>
<p>Output Format:</p>
<p>Purpose: specify the format of the agent's final output (JSON, Markdown, a specific structure, and so on) so that downstream systems can parse it.</p>
<p>Example: "Respond in JSON with the fields "thought", "action", "action_input", and "observation"."</p>
<p>Constraints &amp; Rules:</p>
<p>Purpose: set boundaries on the agent's behavior to prevent out-of-scope or inappropriate actions.</p>
<p>Example: "Do not use external tools for non-factual queries." / "Generated code must be compatible with Python 3.9+." / "Answers must not exceed 1000 characters."</p>
<p>III. Agent Prompt Engineering Strategies and Techniques</p>
<p>1. Chain-of-Thought (CoT) Prompting</p>
<p>Core idea: encourage the LLM to output a series of intermediate reasoning steps before it answers. This can substantially improve accuracy on complex reasoning tasks (such as arithmetic, commonsense reasoning, and symbolic manipulation).</p>
<p>Original prompt (no CoT): A + B * C = ?</p>
<p>CoT prompt: A + B * C = ?</p>
<p>Thought: Let's break this down. The order of operations (PEMDAS/BODMAS) says we do multiplication before addition. So, first calculate B * C, then add A.</p>
<p>Calculation: Let A=2, B=3, C=4. B * C = 3 * 4 = 12. Then A + 12 = 2 + 12 = 14.</p>
<p>Answer: 14</p>
<p>Example (CoT guidance in an agent prompt):</p>
<p>You are a helpful assistant that needs to plan actions to fulfill user requests.</p>
<p>You have access to the following tools:</p>
<p>{tools}</p>
<p></p>
<p>When you are asked to do something, you should think step by step.</p>
<p>First, think about what you need to do. 
What tools do you need to call?</p>
<p>Second, decide the input for each tool.</p>
<p>Then, execute the tool and observe the output.</p>
<p>Finally, use the observation to answer the user or take the next step.</p>
<p></p>
<p>User: What is the weather in Paris and what is the current population of France?</p>
<p></p>
<p>Thought:</p>
<p>Okay, I need to find two pieces of information: the weather in Paris and the population of France.</p>
<p>I have two tools: 'search' and 'calculator'.</p>
<p>The 'search' tool seems appropriate for both queries.</p>
<p>First, I will search for the weather in Paris.</p>
<p>Then, I will search for the population of France.</p>
<p>Finally, I will combine the results and present them to the user.</p>
<p></p>
<p>Action:</p>
<p>```json</p>
<p>{{</p>
<p>"action": "search",</p>
<p>"action_input": {{</p>
<p>"query": "weather in Paris"</p>
<p>}}</p>
<p>}}</p>
<p>```</p>
<p>Key points:</p>
<p>Explicitly ask the agent to "think step by step".</p>
<p>Provide a "Thought:" prefix so that the agent writes out its reasoning.</p>
<p>Follow it with an "Action:" prefix that leads the agent to emit the tool call.</p>
<p>2. Tool Use / Function Calling</p>
<p>An agent's power lies in its ability to interact with the outside world, and tools are the medium of that interaction.</p>
<p>What the prompt needs:</p>
<p>Clear tool descriptions: each tool's name, description, and parameters (including type, properties, and required) must be accurate.</p>
<p>Example: a description of the search tool:</p>
<p>{</p>
<p>"name": "search",</p>
<p>"description": "Searches the internet for general knowledge.",</p>
<p>"parameters": {</p>
<p>"type": "object",</p>
<p>"properties": {</p>
<p>"query": {</p>
<p>"type": "string",</p>
<p>"description": "The search query string."</p>
<p>}</p>
<p>},</p>
<p>"required": ["query"]</p>
<p>}</p>
<p>}</p>
<p>Expected call format: the LLM must output a structured representation (such as JSON) containing action (the tool name) and action_input (the tool arguments).</p>
<p>Example (agent prompt structure):</p>
<p>You are an AI assistant with access to the following tools.</p>
<p></p>
<p>Tools:</p>
<p>{tool_code}</p>
<p></p>
<p>You are a helpful assistant. Respond to user questions by calling the actions from the tools.</p>
<p>You must use the JSON format for your response. 
The response must be in the following format:</p>
<p>{tool_call_format}</p>
<p></p>
<p>User: What is 2 plus 2?</p>
<p></p>
<p>Thought:</p>
<p>The user is asking for a simple calculation. I should use the calculator tool.</p>
<p>The expression to evaluate is "2 + 2".</p>
<p></p>
<p>Action:</p>
<p>```json</p>
<p>{{</p>
<p>"action": "calculator",</p>
<p>"action_input": {{</p>
<p>"expression": "2 + 2"</p>
<p>}}</p>
<p>}}</p>
<p>```</p>
<p>The {tool_code} and {tool_call_format} placeholders are filled in dynamically by your agent framework (such as LangChain, LlamaIndex, or AutoGen).</p>
<p>3. Memory</p>
<p>Pain point: the agent cannot remember a long conversation history or key intermediate decisions.</p>
<p>Prompt-level solutions:</p>
<p>Short-term memory: add the recent conversation history, the agent's reasoning, tool outputs, and similar information to the prompt.</p>
<p>Long-term memory / knowledge base:</p>
<p>Vector databases: embed important information (key decisions, user preferences, knowledge snippets) and store it in a vector database, retrieve the relevant items by similarity search, and add them to the prompt.</p>
<p>Summary memory: periodically summarize an overly long conversation history, keeping the key information and reducing the prompt's token count.</p>
<p>Example (adding short-term memory and summary memory):</p>
<p>System: You are an AI assistant.</p>
<p>User: Tell me about the capital of France.</p>
<p>Assistant: The capital of France is Paris. It's a beautiful city known for its art, fashion, and gastronomy.</p>
<p>User: Now tell me about its population.</p>
<p>Assistant (Memory Summary): User previously asked about the capital of France, which is Paris.</p>
<p>User: Now tell me about its population.</p>
<p>Thought:</p>
<p>The user is asking for the population of Paris. I know from the previous turn that:</p>
<p>Summary: User previously asked about the capital of France, which is Paris.</p>
<p>I have a 'search' tool that can help me find this information.</p>
<p>Action:</p>
<p>{{</p>
<p>"action": "search",</p>
<p>"action_input": {{</p>
<p>"query": "population of Paris"</p>
<p>}}</p>
<p>}}</p>
<p>4. 
Feedback and Iteration (Feedback Loop)</p>
<p>Pain point: the agent can make mistakes and execute tasks inaccurately.</p>
<p>Prompt-level solutions:</p>
<p>Self-correction: add a "Review" or "Self-Critique" step to the agent's reasoning flow.</p>
<p>External feedback: submit the agent's output to another LLM instance for evaluation, or check it against known facts, then feed the evaluation back to the agent.</p>
<p>Example (adding a self-reflection step):</p>
<p>User: Write a Python function to calculate the factorial of a number.</p>
<p>Thought:</p>
<p>User wants a Python function for factorial.</p>
<p>I need to define a function that takes an integer, calculates its factorial, and returns it.</p>
<p>I can implement this recursively or iteratively. Iterative is often more efficient and avoids recursion depth limits for large numbers.</p>
<p>I will use an iterative approach.</p>
<p>Action:</p>
<p>{{</p>
<p>"action": "python_interpreter",</p>
<p>"action_input": {{</p>
<p>"code": "def factorial(n):\n    if n &lt; 0:\n        return 'Factorial not defined for negative numbers'\n    elif n == 0:\n        return 1\n    else:\n        result = 1\n        for i in range(1, n + 1):\n            result *= i\n        return result\nprint(factorial(5))"</p>
<p>}}</p>
<p>}}</p>
<p>Observation:</p>
<p>120</p>
<p>Thought:</p>
<p>The previous step was to execute the factorial function with input 5.</p>
<p>The output was 120, which is the correct factorial of 5.</p>
<p>The code handles non-negative integers correctly.</p>
<p>However, it might be good to add some docstrings to the function for better usability and to explicitly mention that it returns an error message for negative input.</p>
<p>I should output the function definition with docstrings.</p>
<p>Action:</p>
<p>{{</p>
<p>"action": "python_interpreter",</p>
<p>"action_input": {{</p>
<p>"code": "def factorial(n):\n    '''Calculates the factorial of a non-negative integer iteratively.'''\n    if n &lt; 0:\n        return 'Factorial not defined for negative numbers'\n    elif n == 0:\n        return 1\n    else:\n        result = 1\n        for i in range(1, n + 1):\n            result *= i\n        return result\n\nprint(factorial(5)) # Example 
usage"</p>
<p>}}</p>
<p>}}</p>
<p>Key points:</p>
<p>Introduce words such as "Review" or "Self-Critique" into the Thought: step.</p>
<p>Have the agent decide from the Observation whether the next step is an "Answer" (finish) or an "Action" (continue executing, or correct a mistake).</p>
<p>IV. Organizing Agent Prompts: Templates and Frameworks</p>
<p>To manage and reuse agent prompts more effectively, you will usually rely on prompt templates and agent frameworks.</p>
<p>Prompt templates: structure the prompt and use placeholders (such as {user_input}, {tools}, {memory}, {instructions}) to fill in content dynamically.</p>
<p>Agent frameworks:</p>
<p>LangChain: provides a full framework for building agents, including AgentExecutor, tool wrappers, memory modules, and prompt templates, which greatly simplifies agent construction.</p>
<p>LlamaIndex: focuses on data indexing and retrieval, and also provides agent-building capabilities.</p>
<p>AutoGen: a more general multi-agent collaboration framework in which each agent can have its own prompt and tools.</p>
<p>Example (the idea behind a LangChain agent prompt template):</p>
<p>from langchain_core.prompts import PromptTemplate</p>
<p></p>
<p>template = """</p>
<p>You are a helpful AI assistant. You have access to the following tools:</p>
<p></p>
<p>{tools}</p>
<p></p>
<p>Use the following format:</p>
<p>User: the input question you must answer</p>
<p>Thought: I need to use a tool to help me answer the question.</p>
<p>Action: The action to take, should be one of [{tool_names}]</p>
<p>Action Input: The input to the action</p>
<p>Observation: The result of the action</p>
<p>... 
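(this Thought/Action/Action Input/Observation can repeat N times)</p>
<p>Thought: I now have enough information to answer the question.</p>
<p>Final Answer: the final answer to the user</p>
<p></p>
<p>User: {input}</p>
<p>{agent_scratchpad}</p>
<p>"""</p>
<p></p>
<p>prompt = PromptTemplate.from_template(template)</p>
<p></p>
<p># {input} carries the user's question; agent_scratchpad will be filled by the AgentExecutor with previous thoughts and actions.</p>
<p>V. Advanced Techniques in Agent Prompt Engineering</p>
<p>Few-shot prompting: include a few high-quality input-output examples in the prompt so that the LLM learns the desired behavior pattern.</p>
<p>Constitutional AI: define a set of behavioral principles (a "constitution"); before producing output, the agent checks whether it violates them and revises its output if it does.</p>
<p>Prompt chaining / graph: decompose the agent's task into a series of interdependent prompts that form a task graph.</p>
<p>Tool augmentation: combine tool use with retrieval-augmented generation (RAG) so that the agent retrieves information before it calls tools.</p>
<p>Agent orchestration: design multiple agents that cooperate on a complex task, each responsible for one part.</p>
<p>VI. Summary: Making Agents More "Obedient"</p>
<p>Making an AI agent more "obedient" comes down to communicating your instructions and expectations clearly, in a structured way, and along several dimensions:</p>
<p>Define the role: specify the agent's identity and the limits of its capabilities.</p>
<p>Decompose the task: break complex tasks down and use CoT to guide the agent step by step.</p>
<p>Describe the tools in detail: provide accurate, complete tool descriptions and specify the output format.</p>
<p>Build memory: make sure the agent retains key information, using long-term memory where needed.</p>
<p>Introduce feedback: let the agent check and correct itself, or accept external feedback.</p>
<p>Use templates and frameworks: improve prompt reuse and make prompts easier to manage.</p>
<p>Iterate continuously: keep refining the prompt based on how the agent performs.</p>
<p>Agent prompt engineering is both an art and a science. Through continued practice and experimentation, you will find the most effective prompts for unlocking the potential of AI agents.</p></div><br><br>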
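<p>The tool-use step described in the Tool Use / Function Calling section can be sketched roughly as follows. This is a minimal illustration, not how any of the frameworks named above implement it: TOOLS, parse_action, run_tool, and the restricted calculator are hypothetical helpers invented for this example. It assumes the model replies with an "Action:" prefix followed by a JSON object holding "action" and "action_input", as in the article's prompts.</p>

```python
import ast
import json
import operator

# Arithmetic operators the hypothetical calculator tool is allowed to apply.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

FENCE = "`" * 3  # built at runtime so this listing contains no literal fence


def calculator(expression: str) -> str:
    """Evaluate +, -, *, / arithmetic by walking the AST instead of calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return str(_eval(ast.parse(expression, mode="eval").body))


# Hypothetical registry mapping tool names (as described in the prompt) to callables.
TOOLS = {"calculator": calculator}


def parse_action(model_output: str) -> dict:
    """Extract the JSON tool call that follows the 'Action:' prefix."""
    _, _, tail = model_output.partition("Action:")
    tail = tail.replace(FENCE + "json", "").replace(FENCE, "").strip()
    call = json.loads(tail)
    if "action" not in call or "action_input" not in call:
        raise ValueError("tool call must contain 'action' and 'action_input'")
    return call


def run_tool(call: dict) -> str:
    """Dispatch a parsed tool call and return the text to use as the Observation."""
    tool = TOOLS.get(call["action"])
    if tool is None:
        return f"Error: unknown tool {call['action']!r}"
    try:
        return tool(**call["action_input"])
    except Exception as exc:  # return the error so the agent can correct itself
        return f"Error: {exc}"


# A stand-in for what the model might return for "What is 2 plus 2?".
model_output = (
    "Thought:\nThe user is asking for a simple calculation.\n\n"
    "Action:\n" + FENCE + "json\n"
    '{"action": "calculator", "action_input": {"expression": "2 + 2"}}\n' + FENCE
)

observation = run_tool(parse_action(model_output))
print("Observation:", observation)
```

<p>Returning errors as observation text, rather than raising, is one way to support the feedback loop discussed earlier: the agent sees the failure in its next turn and can adjust the call.</p>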
Source: https://www.cnblogs.com/wzzkaifa/p/19075401