Aman Sachan

Posted on Jun 17

How Sentience ships 60+ AI tools in one local desktop app — without locking you into one provider

#python #ai #opensource #desktop

I wanted Cursor's UX and Zo Computer's tool breadth in a desktop app I could close the laptop lid on. So I built Sentience — a PySide6 desktop AI assistant that runs entirely on your machine, brings its own browser, its own email client, its own voice controller, and exposes 60+ tool functions to the model.

The hard part was never the tools. The hard part was making 60+ tool schemas work identically across Groq, OpenAI, Anthropic, and a local Ollama instance — without writing a provider-specific tool dispatcher and without forcing the user to think about which model they happen to be on today.

This is the part of the codebase I'm actually proud of, and it's the part nobody ships as a tutorial.

The constraint

OpenAI's /v1/chat/completions format is now a de facto standard. Groq implements it. Ollama implements it. Localai implements it. So three out of four providers I wanted to support "just work" with one HTTP call — if you're willing to accept the OpenAI tool schema as ground truth.

Anthropic doesn't. The Messages API uses:

A separate system field instead of a system message in the array
x-api-key instead of Authorization: Bearer
anthropic-version: 2023-06-01 as a required header
A different tool_use / tool_result content block format on the response side

So the question is: do you write two tool dispatchers and two execution paths, or do you write a thin adapter that lets you keep one unified tool list and one dispatcher?

I chose the adapter. The whole provider layer is 60 lines.

The provider registry

PROVIDERS = {
    "groq": {
        "name": "Groq",
        "base_url": "https://api.groq.com/openai/v1",
        "models": ["llama-3.3-70b-versatile", "llama-3.1-70b-versatile",
                   "llama-3.2-90b-vision-preview", "mixtral-8x7b-32768"],
        "free_tier": True,
    },
    "openai": {
        "name": "OpenAI",
        "base_url": "https://api.openai.com/v1",
        "models": ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "gpt-3.5-turbo"],
        "free_tier": False,
    },
    "anthropic": {
        "name": "Anthropic",
        "base_url": "https://api.anthropic.com/v1",
        "models": ["claude-3-5-sonnet-20241022", "claude-3-opus-20240229",
                   "claude-3-haiku-20240307"],
        "free_tier": False,
    },
    "ollama": {
        "name": "Ollama (Local)",
        "base_url": os.getenv("OLLAMA_HOST", "http://localhost:11434/v1"),
        "models": ["llama3.2", "llama3.1", "codellama", "mistral", "qwen2.5"],
        "free_tier": True,
    },
}

Free tier is a first-class field, not a comment. The settings dialog lights up the "free" badge for Groq and Ollama, and the README leads with those two for new users.

The two-method `chat()`

The whole class has one entry point and two private methods. The entry point picks a path based on self.provider. That's it.

class AIClient:
    def __init__(self, provider: str, model: str, api_key: str = ""):
        self.provider = provider
        self.model = model
        self.api_key = api_key
        self.config = PROVIDERS.get(provider, PROVIDERS["groq"])

    def chat(self, messages, tools=None):
        if self.provider == "anthropic":
            return self._chat_anthropic(messages, tools)
        return self._chat_openai_compatible(messages, tools)

    def _chat_openai_compatible(self, messages, tools=None):
        headers = {"Content-Type": "application/json"}
        if self.api_key:
            headers["Authorization"] = f"Bearer {self.api_key}"
        payload = {
            "model": self.model,
            "messages": messages,
            "max_tokens": 4096,
            "temperature": 0.7,
        }
        if tools:
            payload["tools"] = tools
            payload["tool_choice"] = "auto"
        try:
            resp = requests.post(
                f"{self.config['base_url']}/chat/completions",
                headers=headers, json=payload, timeout=60,
            )
            resp.raise_for_status()
            return resp.json()
        except Exception as e:
            return {"error": str(e)}

    def _chat_anthropic(self, messages, tools=None):
        headers = {
            "Content-Type": "application/json",
            "x-api-key": self.api_key,
            "anthropic-version": "2023-06-01",
        }
        # System message is a separate field in Messages API
        system_msg = None
        anthropic_messages = []
        for msg in messages:
            if msg["role"] == "system":
                system_msg = msg["content"]
            else:
                anthropic_messages.append(msg)
        payload = {
            "model": self.model,
            "messages": anthropic_messages,
            "max_tokens": 4096,
        }
        if system_msg:
            payload["system"] = system_msg
        if tools:
            payload["tools"] = tools
        try:
            resp = requests.post(...)
            ...

The Anthropic branch has exactly three differences from the OpenAI branch: header names, system-message location, and the path. Everything else — the tools list, the messages array, the max_tokens field — is identical. So the same 60 tool schemas I registered for Groq work on Claude without rewriting a single function definition.

This means I can hand the user a dropdown that says "switch to Claude Sonnet" and the next message the user types goes to Anthropic's API with the exact same tool surface. The model can call read_file, list_directory, browser_navigate, send_email, and oauth_github_login on any of the four providers. The dispatcher doesn't care.

The unified tool list

The tools live in their own modules and are aggregated with a single * spread:

from browser.automation import BROWSER_TOOLS, PLAYWRIGHT_AVAILABLE
from email_agent.client import EMAIL_TOOLS, init_email, execute_email_tool
from oauth_manager.manager import OAUTH_TOOLS, get_oauth_manager, execute_oauth_tool
from voice.controller import VOICE_TOOLS, get_voice_controller, execute_voice_tool
from skills.registry import SKILL_TOOLS, get_skill_registry, execute_skill_tool
from hosting.server import HOSTING_TOOLS, get_hosting_server

TOOLS = [
    {"type": "function", "function": {"name": "read_file", ...}},
    {"type": "function", "function": {"name": "write_file", ...}},
    {"type": "function", "function": {"name": "list_directory", ...}},
    {"type": "function", "function": {"name": "run_command", ...}},
    {"type": "function", "function": {"name": "search_files", ...}},
    *BROWSER_TOOLS,    # browser_navigate, browser_click, browser_screenshot, ...
    *EMAIL_TOOLS,      # email_read_inbox, email_send, email_search
    *OAUTH_TOOLS,      # oauth_google_login, oauth_github_login, oauth_notion_login
    *VOICE_TOOLS,      # voice_listen, voice_speak, voice_set_wake_word
    *SKILL_TOOLS,      # skill_list, skill_load, skill_run
    *HOSTING_TOOLS,    # hosting_start, hosting_stop, hosting_status, hosting_logs
]

Each submodule exports both the schema (BROWSER_TOOLS, EMAIL_TOOLS, etc.) and the executor (execute_browser_tool, execute_email_tool). The schemas are OpenAI function-calling dicts. The executors are the actual Python functions the dispatcher calls when the model invokes the tool.

The dispatcher

def execute_tool(name: str, args: dict, workspace: str) -> dict:
    try:
        if name == "read_file":
            path = Path(args.get("path", ""))
            if not path.is_absolute():
                path = Path(workspace) / path
            if path.exists():
                return {"success": True, "content": path.read_text()[:10000]}
            return {"success": False, "error": "File not found"}

        elif name == "run_command":
            result = subprocess.run(
                args.get("command", ""), shell=True,
                capture_output=True, text=True, timeout=30,
                cwd=args.get("cwd", workspace),
            )
            return {"success": True, "stdout": result.stdout[:5000],
                    "stderr": result.stderr[:5000], "exit_code": result.returncode}

        # ... 60+ branches, each returning {"success": bool, ...}
        elif name.startswith("browser_"):
            return execute_browser_tool(name, args)
        elif name.startswith("email_"):
            return execute_email_tool(name, args)
        elif name.startswith("oauth_"):
            return execute_oauth_tool(name, args)
        elif name.startswith("voice_"):
            return execute_voice_tool(name, args)
        elif name.startswith("skill_"):
            return execute_skill_tool(name, args)
        elif name.startswith("hosting_"):
            return execute_hosting_tool(name, args)
        else:
            return {"success": False, "error": f"Unknown tool: {name}"}
    except Exception as e:
        return {"success": False, "error": str(e)}

Two patterns I want to call out:

Every executor returns {"success": bool, ...}. The model gets a uniform response shape, no matter which tool blew up. The model can then decide to retry, escalate, or just tell the user "I couldn't read that file." This is what makes the system actually usable when one of the 60 tools fails mid-conversation.
All file paths are resolved against the workspace. The model never gets to specify an absolute path that the user didn't authorize. Path(workspace) / path is a tiny line, but it's the line that means "I can run this app on a stranger's laptop and not worry about ~-escape exploits."

What this gives the user

The killer feature, in practice, is the settings dropdown. The user can:

Start on Groq's free tier to test things (llama-3.3-70b-versatile)
Switch to Claude Sonnet when a task needs better reasoning — same 60 tools, same chat history
Drop to local Ollama when they're on a flight with no wifi — same 60 tools, same chat history
Route OAuth flows and browser automation through the same session regardless of which model is talking

The tool dispatcher doesn't know or care which provider is in self.provider. The tool list doesn't change. The chat history doesn't get a special "this is the Anthropic thread" branch. The 800-line main.py has one AIClient and one execute_tool and the rest is PySide6 widgets and message-routing glue.

What I'd do differently

Two things:

The Anthropic tool result format is still different on the response side. When Claude calls a tool, the response uses tool_use blocks and you have to send back tool_result blocks in the next turn. I handle this in the message-routing layer, not the dispatcher. If I were starting over I'd move the tool result translation into the _chat_anthropic method, so the dispatcher could pretend all four providers return identical shapes.
Streaming is half-done. Groq and OpenAI stream identically; Anthropic streams differently (event types, not data-only SSE). The current build buffers the full response and shows it once. That's fine for a 4k-token answer; for a 64k reasoning trace it's not. The fix is the same shape as the chat dispatcher — one streaming method per provider family, one normalized chunk iterator for the UI.

The stack

GUI: PySide6 (Qt for Python) — native widgets, no Electron
Browser: Playwright with stealth patches
Email: imaplib + smtplib from stdlib
Voice: SpeechRecognition + pyttsx3
AI: OpenAI-compatible client for 3/4 providers, native adapter for Anthropic
Skills: per-domain Python modules with manifest + executor
Storage: SQLite for chat history, JSON for settings
BYOK: Bring Your Own Key — Groq free tier, OpenAI, Anthropic, or fully local Ollama

Try it

git clone https://github.com/AmSach/sentience
cd sentience
pip install -r requirements.txt
playwright install chromium
export GROQ_API_KEY=your_key_here   # or OPENAI / ANTHROPIC / OLLAMA_HOST
python src/main.py

For 100% offline use:

ollama pull llama3.2
OLLAMA_HOST=http://localhost:11434 python src/main.py

MIT licensed. PRs welcome — especially on the streaming layer, a new provider, or another tool module (calendar? GitHub? Linear?).

GitHub: https://github.com/AmSach/sentience

Top comments (1)

Alex Shev • Jun 17

The hard part with 60 tools is not the tool count, it is routing and blast radius. A local desktop assistant gets interesting when each tool has a clear contract and the model cannot casually wander from browser to email to filesystem without boundaries. Local-first is a good base, but capability boundaries are what make it usable.