Complete Tutorial: Agent Prompt Engineering: How to Make Agents More “Obedient” (A Practical Guide)


正谈春风 posted on 2025-9-5 14:58:00


<style>pre { white-space: pre !important; word-wrap: normal !important; overflow-x: auto !important; display: block !important; font-family: "Consolas", "Monaco", "Courier New", monospace !important; font-size: 14px !important; line-height: 1.6 !important; padding: 16px !important; margin: 16px 0 !important; background-color: rgba(248, 248, 248, 1) !important; border: 1px solid rgba(225, 228, 232, 1) !important; border-radius: 6px !important; tab-size: 4 !important; -moz-tab-size: 4 !important; max-width: 100% !important; box-sizing: border-box !important }
code { font-family: "Consolas", "Monaco", "Courier New", monospace !important; font-size: 14px !important; white-space: pre !important; word-wrap: normal !important; word-break: normal !important; overflow-wrap: normal !important; display: inline !important; background: rgba(0, 0, 0, 0) !important; border: none !important; padding: 0 !important; margin: 0 !important; line-height: inherit !important }
pre code { background: rgba(0, 0, 0, 0) !important; border: 0 !important; border-radius: 0 !important; display: block !important; line-height: 1.6 !important; margin: 0 !important; max-width: none !important; overflow: visible !important; padding: 0 !important; white-space: pre !important; word-wrap: normal !important; word-break: normal !important; color: inherit !important }
.token.comment, .token.prolog, .token.doctype, .token.cdata { color: rgba(112, 128, 144, 1) !important; font-style: italic !important }
.token.punctuation { color: rgba(153, 153, 153, 1) !important }
.token.atrule, .token.attr-value, .token.keyword { color: rgba(0, 119, 170, 1) !important; font-weight: bold !important }
.token.function, .token.class-name { color: rgba(221, 74, 104, 1) !important; font-weight: bold !important }
.token.selector, .token.attr-name, .token.string, .token.char, .token.builtin, .token.inserted { color: rgba(102, 153, 0, 1) !important }
.token.property, .token.tag, .token.boolean, .token.number, .token.constant, .token.symbol, .token.deleted { color: rgba(153, 0, 85, 1) !important }
.cnblogs-markdown pre, .cnblogs-post-body pre { white-space: pre !important; word-wrap: normal !important; overflow-x: auto !important; background-color: rgba(248, 248, 248, 1) !important; border: 1px solid rgba(225, 228, 232, 1) !important; border-radius: 6px !important; padding: 16px !important; margin: 16px 0 !important }
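pre, pre, pre { white-space: pre !important; word-wrap: normal !important; overflow-x: auto !important }</style><div class="htmledit_views atom-one-dark" id="content_views"><p>As large language models (LLMs) have rapidly grown more capable, they are no longer just text generators: they can be given the abilities of an agent, with complex behaviors such as planning, tool calling, memory, and autonomous learning. Getting these agents to understand our intent precisely, carry out tasks efficiently and according to plan, and make sound decisions in complex environments is, however, a major engineering challenge. This is where agent prompt engineering comes in.</p>
<p>This article covers the core ideas, key techniques, and practical tips of agent prompt engineering, with the goal of helping you build AI agents that are more “obedient”, more reliable, and more capable. It works through the basic structure of a prompt, building chains of thought (CoT), tool use, memory mechanisms, and feedback and iteration, as a practical guide to agent prompt engineering.</p>
<p>I. Agent Prompt Engineering: Core Ideas and Goals</p>
<p>What is agent prompt engineering?</p>
<p>Agent prompt engineering is the practice of designing, optimizing, and managing the text prompts that direct an LLM agent to perform specific tasks, follow specific rules, use specific tools, or interact with a specific environment. It is more than handing the LLM a simple instruction: carefully designed prompts “steer” the model toward the desired pattern of behavior and range of capabilities.</p>
<p>Core goals:</p>
<p>Clear instructions and intent: the agent accurately understands what the user or the framework needs.</p>
<p>Control of complex behavior: guide the agent through complex, multi-step tasks (such as Planning, Reasoning, Acting).</p>
<p>Effective tool use: ensure the agent can correctly select and call tools and parse their inputs and outputs.</p>
<p>Reliability and accuracy: reduce hallucinations and mistaken actions so that output matches expectations more closely.</p>
<p>Controllability: set the agent's behavioral boundaries, output format, and reasoning process so that it “obeys”.</p>
<p>Efficiency: raise the agent's task success rate and the efficiency with which it completes tasks.</p>
<p>II. Core Elements of an Agent Prompt</p>
<p>A typical agent prompt usually contains the following key elements:</p>
<p>Role Setting:</p>
<p>Purpose: give the agent a specific identity, domain of knowledge, or style of behavior.</p>
<p>Example: "You are an experienced software engineer who specializes in Python development." / "You are a meticulous legal advisor providing consulting services."</p>
<p>Task Description:</p>
<p>Purpose: state clearly and specifically what the agent must accomplish.</p>
<p>Example: "Based on the requirements the user provides, generate Python code that solves the problem." / "Analyze the following contract clauses and identify potential risks."</p>
<p>Input Information:</p>
<p>Purpose: supply all the relevant context, content, or user input the agent needs to complete the task.</p>
<p>Example: the user's requirement description, raw data, or conversation history.</p>
<p>Thought Process / Reasoning:</p>
<p>Purpose: guide the agent to think step by step, break the problem down, and reason toward decisions. This is the key to making the agent “obey”.</p>
<p>Methods: Chain-of-Thought (CoT), Tree-of-Thought (ToT), and similar techniques.</p>
<p>Tools / Action Space:</p>
<p>Purpose: tell the agent which external tools it can use, along with each tool's name, function, input parameters, and output format.</p>
<p>Example:</p>
<p>[</p>
<p>{</p>
<p>"name": "search",</p>
<p>"description": "Searches the internet for general knowledge.",</p>
<p>"parameters": {</p>
<p>"type": "object",</p>
<p>"properties": {</p>
<p>"query": {"type": "string", "description": "The query to search for."}</p>
<p>},</p>
<p>"required": ["query"]</p>
<p>}</p>
<p>},</p>
<p>{</p>
<p>"name": "calculator",</p>
<p>"description": "Evaluates mathematical expressions.",</p>
<p>"parameters": {</p>
<p>"type": "object",</p>
<p>"properties": {</p>
<p>"expression": {"type": "string", "description": "The mathematical expression."}</p>
<p>},</p>
<p>"required": ["expression"]</p>
<p>}</p>
<p>}</p>
<p>]</p>
<p>Output Format:</p>
<p>Purpose: specify the format of the agent's final output (JSON, Markdown, a specific structure, etc.) so that downstream systems can parse it.</p>
<p>Example: "Respond in JSON with the fields 'thought', 'action', 'action_input', and 'observation'."</p>
<p>Constraints and Rules:</p>
<p>Purpose: set boundaries on the agent's behavior to prevent overreach or inappropriate actions.</p>
<p>Example: "Do not use external tools for non-factual queries." / "Generated code must be compatible with Python 3.9+." / "Answers must not exceed 1000 characters."</p>
<p>III. Agent Prompt Engineering Strategies and Techniques</p>
<p>1. Chain-of-Thought (CoT) Prompting</p>
<p>Core idea: encourage the LLM to write out a series of intermediate reasoning steps before answering the question. This can substantially improve accuracy on complex reasoning tasks such as arithmetic, commonsense reasoning, and symbolic manipulation.</p>
<p>Original prompt (no CoT): A + B * C = ?</p>
<p>CoT prompt: A + B * C = ?</p>
<p>Thought: Let's break this down. The order of operations (PEMDAS/BODMAS) says we do multiplication before addition. So, first calculate B * C, then add A.</p>
<p>Calculation: Let A=2, B=3, C=4. B * C = 3 * 4 = 12. Then A + 12 = 2 + 12 = 14.</p>
<p>Answer: 14</p>
<p>Example (CoT guidance in an agent prompt):</p>
<p>You are a helpful assistant that needs to plan actions to fulfill user requests.</p>
<p>You have access to the following tools:</p>
<p>{tools}</p>
<p></p>
<p>When you are asked to do something, you should think step by step.</p>
<p>First, think about what you need to do. 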
What tools do you need to call?</p>
<p>Second, decide the input for each tool.</p>
<p>Then, execute the tool and observe the output.</p>
<p>Finally, use the observation to answer the user or take the next step.</p>
<p></p>
<p>User: What is the weather in Paris and what is the current population of France?</p>
<p></p>
<p>Thought:</p>
<p>Okay, I need to find two pieces of information: the weather in Paris and the population of France.</p>
<p>I have two tools: 'search' and 'calculator'.</p>
<p>The 'search' tool seems appropriate for both queries.</p>
<p>First, I will search for the weather in Paris.</p>
<p>Then, I will search for the population of France.</p>
<p>Finally, I will combine the results and present them to the user.</p>
<p></p>
<p>Action:</p>
<p>```json</p>
<p>{{</p>
<p>"action": "search",</p>
<p>"action_input": {{</p>
<p>"query": "weather in Paris"</p>
<p>}}</p>
<p>}}</p>
<p>```</p>
<p>Key points:</p>
<p>Explicitly ask the agent to "think step by step".</p>
<p>Provide a "Thought:" prefix so the agent writes out its reasoning.</p>
<p>Follow it with an "Action:" prefix to prompt the agent to emit a tool call.</p>
<p>2. Tool Use / Function Calling</p>
<p>An agent's power lies in its ability to interact with the outside world, and tools are the medium for that interaction.</p>
<p>What matters in the prompt:</p>
<p>Clear tool descriptions: each tool's name, description, and parameters (including their type, properties, and required fields) must be accurate.</p>
<p>Example: describing the search tool:</p>
<p>{</p>
<p>"name": "search",</p>
<p>"description": "Searches the internet for general knowledge.",</p>
<p>"parameters": {</p>
<p>"type": "object",</p>
<p>"properties": {</p>
<p>"query": {</p>
<p>"type": "string",</p>
<p>"description": "The search query string."</p>
<p>}</p>
<p>},</p>
<p>"required": ["query"]</p>
<p>}</p>
<p>}</p>
<p>Expected call format: the LLM must output a structured representation (such as JSON) containing action (the tool name) and action_input (the tool arguments).</p>
<p>Example (agent prompt structure):</p>
<p>You are an AI assistant with access to the following tools.</p>
<p></p>
<p>Tools:</p>
<p>{tool_code}</p>
<p></p>
<p>You are a helpful assistant. Respond to user questions by calling the actions from the tools.</p>
<p>You must use the JSON format for your response. 
The response must be in the following format:</p>
<p>{tool_call_format}</p>
<p></p>
<p>User: What is 2 plus 2?</p>
<p></p>
<p>Thought:</p>
<p>The user is asking for a simple calculation. I should use the calculator tool.</p>
<p>The expression to evaluate is "2 + 2".</p>
<p></p>
<p>Action:</p>
<p>```json</p>
<p>{{</p>
<p>"action": "calculator",</p>
<p>"action_input": {{</p>
<p>"expression": "2 + 2"</p>
<p>}}</p>
<p>}}</p>
<p>```</p>
<p>The {tool_code} and {tool_call_format} placeholders are filled in dynamically by your agent framework (for example LangChain, LlamaIndex, or AutoGen).</p>
<p>3. Memory</p>
<p>Pain point: the agent cannot remember long conversation histories or key intermediate decisions.</p>
<p>Prompt-level solutions:</p>
<p>Short-term memory: add the recent conversation history, the agent's reasoning, tool outputs, and similar information to the prompt.</p>
<p>Long-term memory / knowledge base:</p>
<p>Vector databases: embed important information (key decisions, user preferences, knowledge snippets) and store it in a vector database, retrieve the relevant items by similarity search, and add them to the prompt.</p>
<p>Summary memory: periodically summarize an overly long conversation history, keeping the key information and reducing the prompt's token count.</p>
<p>Example (adding short-term memory and summary memory):</p>
<p>System: You are an AI assistant.</p>
<p>User: Tell me about the capital of France.</p>
<p>Assistant: The capital of France is Paris. It's a beautiful city known for its art, fashion, and gastronomy.</p>
<p>User: Now tell me about its population.</p>
<p>Assistant (Memory Summary): User previously asked about the capital of France, which is Paris.</p>
<p>User: Now tell me about its population.</p>
<p>Thought:</p>
<p>The user is asking for the population of Paris. I know from the previous turn that:</p>
<p>Summary: User previously asked about the capital of France, which is Paris.</p>
<p>I have a 'search' tool that can help me find this information.</p>
<p>Action:</p>
<p>{{</p>
<p>"action": "search",</p>
<p>"action_input": {{</p>
<p>"query": "population of Paris"</p>
<p>}}</p>
<p>}}</p>
<p>4. 
Feedback and Iteration (Feedback Loop)</p>
<p>Pain point: the agent may make mistakes or carry out a step inaccurately.</p>
<p>Prompt-level solutions:</p>
<p>Self-correction: introduce a "Review" or "Self-Critique" step into the agent's reasoning flow.</p>
<p>External feedback: submit the agent's output to another LLM instance for evaluation, or compare it against known facts, then feed the evaluation result back to the agent.</p>
<p>Example (adding a self-reflection step):</p>
<p>User: Write a Python function to calculate the factorial of a number.</p>
<p>Thought:</p>
<p>User wants a Python function for factorial.</p>
<p>I need to define a function that takes an integer, calculates its factorial, and returns it.</p>
<p>I can implement this recursively or iteratively. Iterative is often more efficient and avoids recursion depth limits for large numbers.</p>
<p>I will use an iterative approach.</p>
<p>Action:</p>
<p>{{</p>
<p>"action": "python_interpreter",</p>
<p>"action_input": {{</p>
<p>"code": "def factorial(n):\n    if n < 0:\n        return 'Factorial not defined for negative numbers'\n    elif n == 0:\n        return 1\n    else:\n        result = 1\n        for i in range(1, n + 1):\n            result *= i\n        return result\nprint(factorial(5))"</p>
<p>}}</p>
<p>}}</p>
<p>Observation:</p>
<p>120</p>
<p>Thought:</p>
<p>The previous step was to execute the factorial function with input 5.</p>
<p>The output was 120, which is the correct factorial of 5.</p>
<p>The code handles non-negative integers correctly.</p>
<p>However, it might be good to add some docstrings to the function for better usability and to explicitly mention that it returns an error message for negative input.</p>
<p>I should output the function definition with docstrings.</p>
<p>Action:</p>
<p>{{</p>
<p>"action": "python_interpreter",</p>
<p>"action_input": {{</p>
<p>"code": "def factorial(n):\n    '''Calculates the factorial of a non-negative integer iteratively. Returns an error message string for negative input.'''\n    if n < 0:\n        return 'Factorial not defined for negative numbers'\n    elif n == 0:\n        return 1\n    else:\n        result = 1\n        for i in range(1, n + 1):\n            result *= i\n        return result\n\nprint(factorial(5))  # Example usage"</p>
<p>}}</p>
<p>}}</p>
<p>Key points:</p>
<p>In the Thought: step, introduce cue words such as “Review”, 
“Self-Critique”, and the like.</p>
<p>Have the agent decide from the Observation whether the next step is "Answer" (finish) or "Action" (continue executing or correct course).</p>
<p>IV. Organizing Agent Prompts: Templates and Frameworks</p>
<p>To manage and reuse agent prompts more effectively, you usually rely on prompt templates and agent frameworks.</p>
<p>Prompt Templates: structure the prompt and use placeholders (such as {user_input}, {tools}, {memory}, {instructions}) to fill in content dynamically.</p>
<p>Agent Frameworks:</p>
<p>LangChain: provides a full framework for building agents, including AgentExecutor, tool wrappers, memory modules, and prompt templates, which greatly simplifies agent construction.</p>
<p>LlamaIndex: focuses on data indexing and retrieval, and also provides agent-building capabilities.</p>
<p>AutoGen: a more general multi-agent collaboration framework in which each agent can have its own prompt and tools.</p>
<p>Example (LangChain agent prompt template, conceptual):</p>
<p>from langchain_core.prompts import PromptTemplate</p>
<p></p>
<p>template = """</p>
<p>You are a helpful AI assistant. You have access to the following tools:</p>
<p></p>
<p>{tools}</p>
<p></p>
<p>Use the following format:</p>
<p>User: the input question you must answer</p>
<p>Thought: I need to use a tool to help me answer the question.</p>
<p>Action: The action to take, should be one of [{tool_names}]</p>
<p>Action Input: The input to the action</p>
<p>Observation: The result of the action</p>
<p>... 
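(this Thought/Action/Action Input/Observation can repeat N times)</p>
<p>Thought: I now have enough information to answer the question.</p>
<p>Final Answer: the final answer to the user</p>
<p></p>
<p>User: {input}</p>
<p>{agent_scratchpad}</p>
<p>"""</p>
<p></p>
<p>prompt = PromptTemplate.from_template(template)</p>
<p></p>
<p># {input} carries the user's question; agent_scratchpad will be filled by the AgentExecutor with previous thoughts and actions.</p>
<p>V. Advanced Techniques in Agent Prompt Engineering</p>
<p>Few-shot Prompting: include a few high-quality input-output examples in the prompt so the LLM learns the desired pattern of behavior.</p>
<p>Constitutional AI: define a set of behavioral principles (a constitution); before producing output, the agent checks whether it violates them and revises its output if so.</p>
<p>Prompt Chaining / Graph: decompose the agent's task into a series of interdependent prompts that form a task graph.</p>
<p>Tool Augmentation: combine with retrieval-augmented generation (RAG) so the agent retrieves relevant information before calling a tool.</p>
<p>Agent Orchestration: design multiple agents that collaborate on a complex task, each responsible for one part.</p>
<p>VI. Summary: Making Agents More “Obedient”</p>
<p>Making an AI agent more “obedient” comes down to communicating your instructions and expectations clearly, in a structured way, and along several dimensions:</p>
<p>Define the role: specify the agent's identity and the boundaries of its capabilities.</p>
<p>Decompose the task: break complex tasks down and guide the agent to think step by step with CoT.</p>
<p>Describe tools in detail: provide accurate, complete tool descriptions and specify the output format.</p>
<p>Build memory: make sure the agent can remember key information, using long-term memory where needed.</p>
<p>Introduce feedback: let the agent check and correct itself, or accept external feedback.</p>
<p>Use templates and frameworks: improve prompt reuse and manageability.</p>
<p>Iterate continuously: keep refining the prompt based on how the agent performs.</p>
<p>Agent prompt engineering is both an art and a science. Through continued practice and experimentation, you will discover how to use the most effective prompts to unlock the potential of AI agents.</p></div><br><br>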
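The article asks the agent to emit a JSON object with "action" and "action_input", but does not show the code that consumes it. The block below is a minimal sketch of that consuming side, not the method of any particular framework. The names run_action and TOOLS, the stubbed search function, and the restricted arithmetic evaluator are all assumptions made for illustration; only the tool names "search" and "calculator" and the two JSON fields come from the article.

```python
import ast
import json
import operator

# Arithmetic operators the illustrative calculator is allowed to evaluate.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval_node(node):
    """Evaluate a restricted arithmetic AST (numbers, + - * /, unary minus)."""
    if isinstance(node, ast.Expression):
        return _eval_node(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_node(node.operand)
    raise ValueError("unsupported expression")


def calculator(expression):
    """Evaluate a simple arithmetic expression without calling eval()."""
    return _eval_node(ast.parse(expression, mode="eval"))


def search(query):
    """Hypothetical stub: a real agent would call a search API here."""
    return f"(stub) no results for: {query}"


# Hypothetical registry mapping tool names from the prompt to callables.
TOOLS = {"calculator": calculator, "search": search}


def run_action(model_output):
    """Parse one JSON action emitted by the model and run the named tool.

    Returns an observation string to append to the prompt. Failures are
    returned as observations too, so the model can see them and retry.
    """
    try:
        call = json.loads(model_output)
        name = call["action"]
        args = call["action_input"]
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        return f"Error: could not parse action ({exc})"
    tool = TOOLS.get(name)
    if tool is None:
        return f"Error: unknown tool '{name}'"
    try:
        return str(tool(**args))
    except Exception as exc:  # report tool failures back to the model
        return f"Error: {exc}"


if __name__ == "__main__":
    print(run_action('{"action": "calculator", "action_input": {"expression": "2 + 2"}}'))
```

Returning errors as observation text instead of raising is one possible design choice: it gives the self-correction step described in the feedback section something concrete to react to. Note that the doubled braces in the article's prompt examples look like template escapes, so the model's actual output would contain single braces, which is what this sketch parses.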
Source: https://www.cnblogs.com/wzzkaifa/p/19075401

