性凉薄 posted on 2026-2-24 11:51:00

Running the agent demo locally

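<p>The agent demo from last time relied on Hugging Face for both the vector database embeddings and the model, so it needed internet access to work. This time both run locally, which allows offline agent application development and is safer and more convenient.</p>
<p>This was also done with Cursor.</p>
<p>The steps are: (1) install Ollama, (2) pull a model, (3) run the model, (4) modify the program, and (5) run the program.</p>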
<p>1. Follow the official documentation: https://docs.ollama.com/</p>
<p>2. ollama pull qwen2.5:7b</p>
<p>3. ollama run qwen2.5:7b</p>
<p>4. Modify the program as follows:</p>
<div class="cnblogs_code">
<pre><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> agent.py</span>
<span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> os
</span><span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> requests
</span><span style="color: rgba(0, 0, 255, 1)">from</span> langchain_community.document_loaders <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> TextLoader
</span><span style="color: rgba(0, 0, 255, 1)">from</span> langchain_text_splitters <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> RecursiveCharacterTextSplitter
</span><span style="color: rgba(0, 0, 255, 1)">from</span> langchain_community.vectorstores <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> Chroma
</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> from langchain_community.embeddings import HuggingFaceEmbeddings</span>
<span style="color: rgba(0, 0, 255, 1)">from</span> langchain_community.embeddings <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> OllamaEmbeddings
</span><span style="color: rgba(0, 0, 255, 1)">from</span> langchain_core.prompts <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> ChatPromptTemplate
</span><span style="color: rgba(0, 0, 255, 1)">from</span> langchain_core.runnables <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> RunnablePassthrough
</span><span style="color: rgba(0, 0, 255, 1)">from</span> langchain_core.output_parsers <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> StrOutputParser

</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 1. Load and split the document</span>
loader = TextLoader(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">password-cn.txt</span><span style="color: rgba(128, 0, 0, 1)">"</span>, encoding=<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">utf-8</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
docs </span>=<span style="color: rgba(0, 0, 0, 1)"> loader.load()
text_splitter </span>= RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50<span style="color: rgba(0, 0, 0, 1)">)
splits </span>=<span style="color: rgba(0, 0, 0, 1)"> text_splitter.split_documents(docs)

</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 2. Create the vector database (embeddings are computed automatically when it is built)</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> vectorstore = Chroma.from_documents(</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)">   documents=splits,</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)">   embedding=HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> )</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Replace the import at the top (HuggingFaceEmbeddings with OllamaEmbeddings)</span>


<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Replace the code that creates the vectorstore</span>
embedding =<span style="color: rgba(0, 0, 0, 1)"> OllamaEmbeddings(
    model</span>=<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">qwen2.5:7b</span><span style="color: rgba(128, 0, 0, 1)">"</span>  <span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Use the name of a model actually installed in Ollama, e.g. "qwen2.5:7b"</span>
<span style="color: rgba(0, 0, 0, 1)">)

vectorstore </span>=<span style="color: rgba(0, 0, 0, 1)"> Chroma.from_documents(
    documents</span>=<span style="color: rgba(0, 0, 0, 1)">splits,
    embedding</span>=<span style="color: rgba(0, 0, 0, 1)">embedding
)
retriever </span>=<span style="color: rgba(0, 0, 0, 1)"> vectorstore.as_retriever()

</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 3. Set up the local LLM</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Use Ollama; if it is unavailable, tell the user how to install it</span>
<span style="color: rgba(0, 0, 255, 1)">try</span><span style="color: rgba(0, 0, 0, 1)">:
    </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Check whether the Ollama service is reachable</span>
    response = requests.get(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">http://localhost:11434/api/tags</span><span style="color: rgba(128, 0, 0, 1)">"</span>, timeout=2<span style="color: rgba(0, 0, 0, 1)">)
    </span><span style="color: rgba(0, 0, 255, 1)">if</span> response.status_code == 200<span style="color: rgba(0, 0, 0, 1)">:
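      </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Get the list of installed models</span>
      models_data =<span style="color: rgba(0, 0, 0, 1)"> response.json()
      available_models </span>= [m[<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">name</span><span style="color: rgba(128, 0, 0, 1)">"</span>] <span style="color: rgba(0, 0, 255, 1)">for</span> m <span style="color: rgba(0, 0, 255, 1)">in</span> models_data.get(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">models</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">, [])]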
      
      </span><span style="color: rgba(0, 0, 255, 1)">if</span> <span style="color: rgba(0, 0, 255, 1)">not</span><span style="color: rgba(0, 0, 0, 1)"> available_models:
            </span><span style="color: rgba(0, 0, 255, 1)">raise</span> RuntimeError(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">No models are installed in Ollama</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
      
      </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Preferred models, in priority order</span>
      preferred_models = [<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">qwen2.5:7b</span><span style="color: rgba(128, 0, 0, 1)">"</span>, <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">qwen2.5:14b</span><span style="color: rgba(128, 0, 0, 1)">"</span>, <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">llama3.2:3b</span><span style="color: rgba(128, 0, 0, 1)">"</span>, <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">llama3:latest</span><span style="color: rgba(128, 0, 0, 1)">"</span>, <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">mistral:7b</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">]
      
      </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Pick a model: the first installed entry of preferred_models, otherwise the first available model</span>
      selected_model =<span style="color: rgba(0, 0, 0, 1)"> None
      </span><span style="color: rgba(0, 0, 255, 1)">for</span> preferred <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> preferred_models:
            </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Check for an exact match</span>
            <span style="color: rgba(0, 0, 255, 1)">if</span> preferred <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> available_models:
                selected_model </span>=<span style="color: rgba(0, 0, 0, 1)"> preferred
                </span><span style="color: rgba(0, 0, 255, 1)">break</span>
            <span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Check for a match on the base name (e.g. preferred="llama3:latest" matches available="llama3")</span>
            preferred_base = preferred.split(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">:</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)[0]
            </span><span style="color: rgba(0, 0, 255, 1)">for</span> available <span style="color: rgba(0, 0, 255, 1)">in</span><span style="color: rgba(0, 0, 0, 1)"> available_models:
                </span><span style="color: rgba(0, 0, 255, 1)">if</span> available.startswith(preferred_base + <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">:</span><span style="color: rgba(128, 0, 0, 1)">"</span>) <span style="color: rgba(0, 0, 255, 1)">or</span> available ==<span style="color: rgba(0, 0, 0, 1)"> preferred_base:
                  selected_model </span>=<span style="color: rgba(0, 0, 0, 1)"> available
                  </span><span style="color: rgba(0, 0, 255, 1)">break</span>
            <span style="color: rgba(0, 0, 255, 1)">if</span><span style="color: rgba(0, 0, 0, 1)"> selected_model:
                </span><span style="color: rgba(0, 0, 255, 1)">break</span>
      
      <span style="color: rgba(0, 0, 255, 1)">if</span> <span style="color: rgba(0, 0, 255, 1)">not</span><span style="color: rgba(0, 0, 0, 1)"> selected_model:
            selected_model </span>=<span style="color: rgba(0, 0, 0, 1)"> available_models[0]
      
      </span><span style="color: rgba(0, 0, 255, 1)">from</span> langchain_community.chat_models <span style="color: rgba(0, 0, 255, 1)">import</span><span style="color: rgba(0, 0, 0, 1)"> ChatOllama
      llm </span>=<span style="color: rgba(0, 0, 0, 1)"> ChatOllama(
            model</span>=<span style="color: rgba(0, 0, 0, 1)">selected_model,
            base_url</span>=<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">http://localhost:11434</span><span style="color: rgba(128, 0, 0, 1)">"</span>,  <span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> default Ollama address</span>
<span style="color: rgba(0, 0, 0, 1)">      )
      </span><span style="color: rgba(0, 0, 255, 1)">print</span>(f<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">✓ Using local Ollama model: {selected_model}</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
      </span><span style="color: rgba(0, 0, 255, 1)">if</span> selected_model <span style="color: rgba(0, 0, 255, 1)">not</span> <span style="color: rgba(0, 0, 255, 1)">in</span> preferred_models[:3]:  <span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> not one of the top 3 recommended models</span>
            <span style="color: rgba(0, 0, 255, 1)">print</span>(f<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">💡 Tip: install qwen2.5:7b for better Chinese support</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
            </span><span style="color: rgba(0, 0, 255, 1)">print</span>(f<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">   Run: ollama pull qwen2.5:7b</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
    </span><span style="color: rgba(0, 0, 255, 1)">else</span><span style="color: rgba(0, 0, 0, 1)">:
      </span><span style="color: rgba(0, 0, 255, 1)">raise</span> ConnectionError(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">Ollama service did not respond</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 0, 255, 1)">except</span><span style="color: rgba(0, 0, 0, 1)"> requests.exceptions.ConnectionError as e:
    </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> The Ollama service is unavailable</span>
    error_msg = <span style="color: rgba(128, 0, 0, 1)">"""</span><span style="color: rgba(128, 0, 0, 1)">
╔═══════════════════════════════════════════════════════════════╗
║ ❌ Ollama is not running or not installed!                    ║
╚═══════════════════════════════════════════════════════════════╝

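Please follow these steps:

1️⃣ Install Ollama:
   • Download it from https://ollama.ai
   • Or use Homebrew: brew install ollama

2️⃣ Start the Ollama service (it usually starts automatically after installation):
   ollama serve

3️⃣ Download a model (choose one):
   ollama pull qwen2.5:7b    # recommended: good Chinese support (about 4.7GB)
   # or
   ollama pull llama3.2:3b   # smaller and faster (about 2GB)

4️⃣ Run the application again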

═══════════════════════════════════════════════════════════════
</span><span style="color: rgba(128, 0, 0, 1)">"""</span>
    <span style="color: rgba(0, 0, 255, 1)">print</span><span style="color: rgba(0, 0, 0, 1)">(error_msg)
    </span><span style="color: rgba(0, 0, 255, 1)">raise</span> RuntimeError(<span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">The Ollama service is unavailable. Please install and start Ollama first.</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">)
</span><span style="color: rgba(0, 0, 255, 1)">except</span><span style="color: rgba(0, 0, 0, 1)"> requests.exceptions.Timeout as e:
    error_msg </span>= <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">❌ Timed out connecting to the Ollama service; make sure Ollama is running</span><span style="color: rgba(128, 0, 0, 1)">"</span>
    <span style="color: rgba(0, 0, 255, 1)">print</span><span style="color: rgba(0, 0, 0, 1)">(error_msg)
    </span><span style="color: rgba(0, 0, 255, 1)">raise</span><span style="color: rgba(0, 0, 0, 1)"> RuntimeError(error_msg)
</span><span style="color: rgba(0, 0, 255, 1)">except</span><span style="color: rgba(0, 0, 0, 1)"> Exception as e:
    </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Other errors (e.g. the model does not exist)</span>
    error_msg = f<span style="color: rgba(128, 0, 0, 1)">"""</span><span style="color: rgba(128, 0, 0, 1)">
╔═══════════════════════════════════════════════════════════════╗
║ ❌ Error while initializing the Ollama model                  ║
╚═══════════════════════════════════════════════════════════════╝

Error: {str(e)}

Please make sure that:
1. The Ollama service is running
2. At least one model is installed (run: ollama pull &lt;model name&gt;)

═══════════════════════════════════════════════════════════════
</span><span style="color: rgba(128, 0, 0, 1)">"""</span>
    <span style="color: rgba(0, 0, 255, 1)">print</span><span style="color: rgba(0, 0, 0, 1)">(error_msg)
    </span><span style="color: rgba(0, 0, 255, 1)">raise</span>

<span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 4. Build the RAG chain (Retrieval-Augmented Generation)</span>
template = <span style="color: rgba(128, 0, 0, 1)">"""</span><span style="color: rgba(128, 0, 0, 1)">
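You are a personal advisor. Answer the question based on the context below.
If you do not know the answer, say "The available documents do not answer this question" and do not make anything up.

Context:
{context}

Question:
{question}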
</span><span style="color: rgba(128, 0, 0, 1)">"""</span><span style="color: rgba(0, 0, 0, 1)">
prompt </span>=<span style="color: rgba(0, 0, 0, 1)"> ChatPromptTemplate.from_template(template)

rag_chain </span>=<span style="color: rgba(0, 0, 0, 1)"> (
    {</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">context</span><span style="color: rgba(128, 0, 0, 1)">"</span>: retriever, <span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(128, 0, 0, 1)">question</span><span style="color: rgba(128, 0, 0, 1)">"</span><span style="color: rgba(0, 0, 0, 1)">: RunnablePassthrough()}
    </span>|<span style="color: rgba(0, 0, 0, 1)"> prompt
    </span>|<span style="color: rgba(0, 0, 0, 1)"> llm
    </span>|<span style="color: rgba(0, 0, 0, 1)"> StrOutputParser()
)

</span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> 5. Provide the function to call</span><span style="color: rgba(0, 128, 0, 1)">
#</span><span style="color: rgba(0, 128, 0, 1)"> Gradio ChatInterface passes (message, history) as two arguments by default</span>
<span style="color: rgba(0, 0, 255, 1)">def</span> ask_question(message: str, history=None) -&gt;<span style="color: rgba(0, 0, 0, 1)"> str:
    </span><span style="color: rgba(0, 128, 0, 1)">#</span><span style="color: rgba(0, 128, 0, 1)"> Only the current user question (message) matters here; history is ignored</span>
    <span style="color: rgba(0, 0, 255, 1)">return</span> rag_chain.invoke(message)</pre>
</div>
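<p>One note on the chain in agent.py: the retriever's output is passed into {context} directly, so the prompt receives the string form of a list of document objects. A common LangChain pattern is to join the chunk texts first. The sketch below is an assumption layered on the listing, not part of the original program; <code>format_docs</code> and the <code>Doc</code> stand-in class are illustrative names, and <code>Doc</code> only mimics the <code>page_content</code> attribute of a LangChain document so the snippet runs without any dependencies.</p>

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (only page_content is used)."""
    page_content: str


def format_docs(docs) -> str:
    """Join the text of the retrieved chunks, separated by blank lines."""
    return "\n\n".join(d.page_content for d in docs)


# In agent.py the first step of the chain could then become (sketch, not run here):
#   {"context": retriever | format_docs, "question": RunnablePassthrough()}

if __name__ == "__main__":
    print(format_docs([Doc("first chunk"), Doc("second chunk")]))
```

<p>With this helper the model sees plain text rather than object representations, which tends to keep the prompt shorter and easier to read.</p>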
<p>5. python app.py</p>
<p>For the complete app.py code, see the previous post.</p>
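<p>Before launching the app, it can help to confirm that the Ollama service is reachable and that the pulled model is listed. The following is a minimal standard-library sketch, assuming the default address http://localhost:11434 used in agent.py and the <code>models</code>/<code>name</code> layout of the /api/tags response; the function names <code>installed_models</code> and <code>fetch_installed_models</code> are illustrative and do not appear in the original code.</p>

```python
import json
import urllib.request

# Default local Ollama address, the same one agent.py checks.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def installed_models(payload: dict) -> list:
    """Return the model names found in an /api/tags response body."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def fetch_installed_models(url: str = OLLAMA_TAGS_URL, timeout: float = 2.0) -> list:
    """Query the local Ollama service and return the installed model names."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return installed_models(json.load(resp))


if __name__ == "__main__":
    # Offline demonstration with a hand-written payload in the assumed shape.
    sample = {"models": [{"name": "qwen2.5:7b"}, {"name": "llama3.2:3b"}]}
    print(installed_models(sample))
```

<p>Calling <code>fetch_installed_models()</code> with Ollama running should list whatever has been pulled; if the service is down, <code>urlopen</code> raises an error instead.</p>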
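<p>The model-selection logic in agent.py can also be pulled out into a small pure function, which makes it testable without a running Ollama service. This is a sketch under that assumption; <code>select_model</code> is an illustrative name, and the behavior mirrors the listing: exact match first, then a match on the base name before the colon, then the first available model.</p>

```python
def select_model(available, preferred):
    """Pick the first preferred model that is installed, else the first available one."""
    if not available:
        raise RuntimeError("No models are installed in Ollama")
    for name in preferred:
        if name in available:              # exact match, tag included
            return name
        base = name.split(":")[0]          # "llama3:latest" -> "llama3"
        for candidate in available:
            if candidate == base or candidate.startswith(base + ":"):
                return candidate
    return available[0]                    # nothing preferred is installed


if __name__ == "__main__":
    preferred = ["qwen2.5:7b", "qwen2.5:14b", "llama3.2:3b"]
    print(select_model(["mistral:7b", "qwen2.5:14b"], preferred))
```

<p>Note that the base-name match can return a different tag than the one requested: with only qwen2.5:14b installed, a preference for qwen2.5:7b still resolves to qwen2.5:14b.</p>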
Source: https://www.cnblogs.com/jicing/p/19632993