Not yet and that is really key feature.
I asked the online AI about it and apparently its already possible. I will need to try this.
Short answer: Ollama’s local models that can read the web and be used in self-improving/agentic workflows are the tool-enabled conversational models like qwen3 (and qwen3:4b/other sizes) and gpt-oss when paired with Ollama’s web_search/web_fetch tools and an agent framework (e.g., MCP/web-search agent examples).
Notes:
Ollama added a web search API and web_fetch that lets local models call the web as a tool.
To build self-improving or agentic behavior you combine a capable model (qwen3 or gpt-oss are recommended) with Ollama’s tool integrations (web_search/web_fetch, MCP) or an external agent orchestration layer (Open WebUI pipelines, custom MCP server).
Step-by-step: Run a local Ollama model that reads the web and supports agentic/self-improving workflows
Prerequisites (assume macOS/Linux; adapt for Windows/WSL):
Install Ollama (https://ollama.com/install) and confirm ollama CLI runs.
Install Python ≥3.10 and pip; create a venv.
Choose and pull a tool-capable model
Recommended: qwen3 (or qwen3.5/qwen3.5-small) or a Llama3 variant that supports tools.
Pull and run locally:
ollama pull qwen3:8bollama run qwen3:8b
This exposes the Ollama local API at http://localhost:11434/v1.
Enable Ollama web tools
Ollama provides web_search / web_fetch (tool endpoints). Confirm availability in your Ollama version and enable any feature flags per Ollama docs.
Test the API manually:
curl http://localhost:11434/api/web_search -d '{"query":"latest X"}' curl http://localhost:11434/api/web_fetch -d '{"url":"https://example.com"}'
Build an agent loop that gives the model tool access
Use an agent framework (examples: Qwen-Agent, LangGraph, LangChain-style orchestration, or custom MCP client).
Core idea: call Ollama chat API with the model and supply tool definitions so the model can call web_search/web_fetch as functions.
Python minimal pattern (outline):
Install client libs: pip install requests qwen-agent langgraph (optional)
Open a chat session to Ollama:
POST /api/chat with:
model: "qwen3:8b"
messages: system + user
tools: include function schemas for web_search and web_fetch (name, description, parameters)
When the model returns a tool_call in its response, execute the requested tool (call Ollama web_search/web_fetch or your own fetcher), then feed the tool result back to the model as a new message (assistant -> tool result), continuing the loop until final answer.
Example flow (decision + tool-call + feedback)
User asks: “Find recent research on X and summarize.”
Model returns: tool_call web_search with query -> you call web_search -> return results.
Feed results into model: assistant message containing tool output.
Model may call web_fetch for a chosen URL -> call web_fetch to obtain full content -> feed back.
Repeat until model emits a final textual response.
Add memory & self-improvement steps
Persistence: store successful prompts, chain-of-thought, tool results, and the model’s final output in a local DB (SQLite/JSON).
Reuse: on similar future tasks, retrieve past prompts/responses as context (RAG) so the agent learns effective strategies.
Automated iteration (careful, human oversight required):
Implement a test-suite (set of evaluation queries + expected metrics).
Periodically run the agent on tests, record performance.
If candidate prompt/template or tool-usage pattern improves metrics, update the system prompt or tool-handling heuristics automatically (version and log changes).
Keep checkpoints and require human approval before deploying any automatic prompt/template changes into production.
Use MCP or tool frameworks for richer integrations
Option A — MCP: run an MCP server (filesystem, web, calendar tools), implement a small MCP client that converts MCP tools into Ollama tool schemas and proxies calls.
Option B — Qwen-Agent / LangGraph: bind Python functions as tools and use their agent runtime to orchestrate tool-calls, memory, and UI.
Safety, reliability, and best practices
Always validate and sanitize web_fetch results (strip scripts, limit size).
Cross-validate claims across multiple sources before accepting them as truth.
Log all tool calls and model decisions for auditing.
Rate-limit and sandbox any code-execution or external actions.
Keep human-in-the-loop control for any automatic “self-improvement” changes.
Example resources & next steps
Look for sample projects: Qwen-Agent + Ollama examples, LangGraph + Ollama tutorials, MCP client examples (search for “Ollama MCP tutorial”, “Qwen-Agent Ollama”).
Start small: build a single-task agent (search + summarize), add persistent memory, then add automated evaluation and controlled prompt updates.
If you want, I can produce a concise runnable Python example that:
starts a chat with Ollama qwen3,
defines web_search/web_fetch tools,
executes tool calls and feeds results back,
and stores exchanges in SQLite for simple memory and iterative prompt updates. Which model size should I target?
Login to reply
Replies (2)
Interesting, thank you! Have you ever looked into Maple AI as a potential private middle ground? I was going to try it out...
