Octorato: an open-source AI agent OS with built-in per-client FinOps

#devops #agents #opensource #ai

Most agent frameworks assume one agent, one app, one bill. The moment you run agents for many clients, two problems appear that no runtime solves for you: you can't prove which client burned which tokens, and nothing stops one client's workspace from leaking into another's. I built Octorato to fix exactly that.

What Octorato is

Octorato is an open-source AI agent operating system: one file-native "brain" — rules, 190+ skills, 180+ specialist agents, all plain markdown under git — that a single operator runs across many sealed client "arms," with per-client token attribution and opt-in budget caps.

It's not a runtime you import. It's the agent's self as files you can read, diff, fork, and own — runtime-agnostic (it runs on Claude Code today).

The octopus model

One brain, many arms. The brain holds the shared self: rules (the constitution), skills (HOW to do things), agents (WHO does them). Each arm is a sealed deployment serving exactly one client. Knowledge flows down (generic skills cascade to every arm) and lessons flow up (anonymized patterns get distilled back into the brain). Like a real octopus, most of the neurons live in the arms, not the head.

Why "file-native" matters

Your agent's identity, skills, and memory normally live trapped inside vendor code and a cloud console — you can't read the whole self, diff a change, or move it. Octorato keeps all of it as plain markdown under version control. Identity becomes diffable, reviewable, portable, and ownable. Text outlives runtimes.

The part nobody else does: FinOps and isolation are the same wall

Because each arm is a sealed cell that no other arm can see, every token an arm spends is attributable to exactly one client by construction. Cellular isolation is per-client FinOps — the wall that seals a client is the wall that meters it. Concretely: per-arm USD rollup (estimated from local session logs at list price), cost-spike alerts, and an opt-in PreToolUse budget gate — wire the hook and set a client's cap in budgets.yaml, and it refuses the tool call (exits non-zero) once the cap is hit.

Gartner predicts over 40% of agentic AI projects will be cancelled by 2027 over unmanaged cost. The boring discipline — attribute every token, cap every client — is what keeps you on the right side of that statistic.

How it compares (honestly)

CrewAI, LangGraph, and AutoGen are excellent Python agent-runtime frameworks: you define agents and graphs in code and they execute in-process. They have far larger communities. Octorato lives at a different layer — the self as files — and its defensible difference is multi-tenant arm isolation plus built-in FinOps, which runtime frameworks don't target. If you're building one app, use a runtime framework. If you're an operator or agency serving many clients from one brain, that's the gap Octorato fills.

Try it

It's MIT-licensed and public: https://github.com/CarlosCaPe/octorato

Read the white paper for the full model, or the FAQ for the short version. Contributions welcome — every contributor is credited.

Top comments (2)

Harjot Singh • May 31

Per-client FinOps baked into an agent OS is a genuinely smart wedge - the moment you run agents on behalf of multiple clients/tenants, "which client's actions burned which tokens" becomes a billing AND a margin survival question, and almost no framework gives you that attribution out of the box. People bolt it on after the first surprise invoice. Making it native is the right call.

The detail I'd dig into: per-client cost attribution is only useful if it's also enforceable - tracking that client X spent $40 is step one; hard per-client budget caps so X can't run you to $4,000 is step two, and that's the part that actually protects you. I obsess over exactly this in Moonshift (a multi-agent pipeline shipping a prompt to a real SaaS) - per-step/per-run cost attribution + ceilings, which is also how a build stays a predictable ~$3 flat. First run's free, no card. Cool project - is the FinOps layer attribution-only, or does it enforce hard per-client spend limits? The enforcement piece is where agent-as-a-service businesses live or die.

Harjot Singh • May 31

Per-client finops built into an agent OS is a genuinely smart wedge. The moment you run agents for multiple clients, "who is this token spend attributable to" becomes a billing and margin question, not just an ops one. Most agent frameworks treat cost as a global blob and you can't answer "is client X actually profitable." Per-client attribution plus budget guards is exactly what turns agent-as-a-service from a money pit into a business. I built cost-attribution and spend caps into Moonshift for the same reason, you can't price what you can't measure per-unit. Are you attributing at the request level, or rolling up by agent/session per client?