

Login to reply
Replies (9)
Got #Bitcoin?
Be free with freedom tech. GNU/Linux, BSD, Nostr, Torrents, OpenSource, local AI
Which local AI do you recommend trying out? I've used a couple, but they were somewhat limited...
I run different models using Ollama
Its very easy to install and then add/run the models.
I have used DeepSeek, Qwen and some more specialized isolated in a virtual machine with lots of ram and cpu.

Ollama
Ollama is the easiest way to automate your work using open models, while keeping your data safe.
there are definitely other good ways to do it too but I need to do more research and tests
Are there any local models you know of with the capability to do web searches to fill knowledge gaps?
Not yet and that is really key feature.
I asked the online AI about it and apparently its already possible. I will need to try this.
Short answer: Ollama’s local models that can read the web and be used in self-improving/agentic workflows are the tool-enabled conversational models like qwen3 (and qwen3:4b/other sizes) and gpt-oss when paired with Ollama’s web_search/web_fetch tools and an agent framework (e.g., MCP/web-search agent examples).
Notes:
Ollama added a web search API and web_fetch that lets local models call the web as a tool.
To build self-improving or agentic behavior you combine a capable model (qwen3 or gpt-oss are recommended) with Ollama’s tool integrations (web_search/web_fetch, MCP) or an external agent orchestration layer (Open WebUI pipelines, custom MCP server).
Step-by-step: Run a local Ollama model that reads the web and supports agentic/self-improving workflows
Prerequisites (assume macOS/Linux; adapt for Windows/WSL):
Install Ollama (https://ollama.com/install) and confirm ollama CLI runs.
Install Python ≥3.10 and pip; create a venv.
Choose and pull a tool-capable model
Recommended: qwen3 (or qwen3.5/qwen3.5-small) or a Llama3 variant that supports tools.
Pull and run locally:
ollama pull qwen3:8bollama run qwen3:8b
This exposes the Ollama local API at http://localhost:11434/v1.
Enable Ollama web tools
Ollama provides web_search / web_fetch (tool endpoints). Confirm availability in your Ollama version and enable any feature flags per Ollama docs.
Test the API manually:
curl http://localhost:11434/api/web_search -d '{"query":"latest X"}' curl http://localhost:11434/api/web_fetch -d '{"url":"https://example.com"}'
Build an agent loop that gives the model tool access
Use an agent framework (examples: Qwen-Agent, LangGraph, LangChain-style orchestration, or custom MCP client).
Core idea: call Ollama chat API with the model and supply tool definitions so the model can call web_search/web_fetch as functions.
Python minimal pattern (outline):
Install client libs: pip install requests qwen-agent langgraph (optional)
Open a chat session to Ollama:
POST /api/chat with:
model: "qwen3:8b"
messages: system + user
tools: include function schemas for web_search and web_fetch (name, description, parameters)
When the model returns a tool_call in its response, execute the requested tool (call Ollama web_search/web_fetch or your own fetcher), then feed the tool result back to the model as a new message (assistant -> tool result), continuing the loop until final answer.
Example flow (decision + tool-call + feedback)
User asks: “Find recent research on X and summarize.”
Model returns: tool_call web_search with query -> you call web_search -> return results.
Feed results into model: assistant message containing tool output.
Model may call web_fetch for a chosen URL -> call web_fetch to obtain full content -> feed back.
Repeat until model emits a final textual response.
Add memory & self-improvement steps
Persistence: store successful prompts, chain-of-thought, tool results, and the model’s final output in a local DB (SQLite/JSON).
Reuse: on similar future tasks, retrieve past prompts/responses as context (RAG) so the agent learns effective strategies.
Automated iteration (careful, human oversight required):
Implement a test-suite (set of evaluation queries + expected metrics).
Periodically run the agent on tests, record performance.
If candidate prompt/template or tool-usage pattern improves metrics, update the system prompt or tool-handling heuristics automatically (version and log changes).
Keep checkpoints and require human approval before deploying any automatic prompt/template changes into production.
Use MCP or tool frameworks for richer integrations
Option A — MCP: run an MCP server (filesystem, web, calendar tools), implement a small MCP client that converts MCP tools into Ollama tool schemas and proxies calls.
Option B — Qwen-Agent / LangGraph: bind Python functions as tools and use their agent runtime to orchestrate tool-calls, memory, and UI.
Safety, reliability, and best practices
Always validate and sanitize web_fetch results (strip scripts, limit size).
Cross-validate claims across multiple sources before accepting them as truth.
Log all tool calls and model decisions for auditing.
Rate-limit and sandbox any code-execution or external actions.
Keep human-in-the-loop control for any automatic “self-improvement” changes.
Example resources & next steps
Look for sample projects: Qwen-Agent + Ollama examples, LangGraph + Ollama tutorials, MCP client examples (search for “Ollama MCP tutorial”, “Qwen-Agent Ollama”).
Start small: build a single-task agent (search + summarize), add persistent memory, then add automated evaluation and controlled prompt updates.
If you want, I can produce a concise runnable Python example that:
starts a chat with Ollama qwen3,
defines web_search/web_fetch tools,
executes tool calls and feeds results back,
and stores exchanges in SQLite for simple memory and iterative prompt updates. Which model size should I target?
Interesting, thank you! Have you ever looked into Maple AI as a potential private middle ground? I was going to try it out...
I think they are doing good work with ecnryption and privacy.
