Docs — HAAO

Overview

HAAO (Hybrid AI-Agile Orchestrator) is a governance layer for AI software agents. Claude decomposes a plain-language requirement into atomic, testable tickets; cheaper local models implement them; and two human gates keep a person in control of what ships.

The core thesis: frontier models are best used sparingly for high-leverage reasoning (decomposition, audit), not for grinding out every line. Local open-weight coders are now good enough to do the bulk of execution cheaply and privately. HAAO is the orchestration layer that routes the right work to the right model and inserts human judgment where it matters.

Quickstart

Clone the repo and set up a virtual environment:

python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env

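In .env, set CLAUDE_API_KEY, which the Tech Lead uses for decomposition and audit, and point LMSTUDIO_BASE_URL at your local LM Studio server. Then start the orchestrator, check its health endpoint from a second terminal, and run the test suite: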

uvicorn orchestrator.main:app --reload
curl http://127.0.0.1:8000/health
pytest
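
If the server starts but the Tech Lead or the local workers fail later, a missing setting is a likely cause. The sketch below shows one way to fail early on the two variables named above. The Settings class and load_settings function are illustrative names, not part of HAAO, and the URL in the usage lines is only an example value.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    claude_api_key: str
    lmstudio_base_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the two variables from the quickstart; raise if either is unset or empty."""
    env = os.environ if env is None else env
    required = ("CLAUDE_API_KEY", "LMSTUDIO_BASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    # Strip a trailing slash so later path joins do not produce "//".
    return Settings(env["CLAUDE_API_KEY"], env["LMSTUDIO_BASE_URL"].rstrip("/"))


# Usage with an explicit mapping (example values, not real credentials).
settings = load_settings({
    "CLAUDE_API_KEY": "example-key",
    "LMSTUDIO_BASE_URL": "http://127.0.0.1:1234/v1/",
})
print(settings.lmstudio_base_url)
```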

New and non-technical? Start with the Operator's Guide in the repo. Want the design rationale? Read the technical design article.

Roles

HAAO maps Scrum roles onto a hybrid AI workforce. The non-obvious choice is that the human is the Product Owner, not the Scrum Master — process mechanics get automated, product judgment stays human.

Scrum role	Who	Responsibility
Product Owner	You (human)	Define requirements, prioritize, approve the backlog, accept the result.
Tech Lead	Cloud model (Claude)	Decompose into atomic tickets, write machine-verifiable DoD, run technical audit.
Scrum Master	Orchestrator (software)	Dispatch, route, enforce WIP, retry, escalate — automated.
Dev team	Local LLMs (LM Studio)	Read context, write code, run tests, report back.

Atomic tickets

The Atomic Ticket, defined by a JSON Schema, is the handover format between the cloud Tech Lead and a local coder. Three properties make it work:

Machine-readable — the local model parses it without guessing intent.
Self-contained — relevant code is injected directly into the ticket rather than referenced by filename, so a small-active-param model doesn't have to find or remember anything.
Verifiable Definition of Done — the DoD is a set of test commands with expected outcomes, so "done" is a test result, not an opinion.
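
To make the three properties concrete, here is a minimal sketch of what a ticket and its DoD check could look like in Python. The field names (id, goal, context, dod) and the is_done helper are assumptions for illustration; the real JSON Schema in the repo may differ. The "expected outcome" is modeled here as an exit code, which is also an assumption.

```python
import subprocess
import sys
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DoDCheck:
    command: List[str]            # argv of a test command, e.g. ["pytest", "-q"]
    expected_exit_code: int = 0   # assumed form of "expected outcome"


@dataclass
class AtomicTicket:
    id: str
    goal: str
    # Self-contained: source text is injected, keyed by path (hypothetical field).
    context: Dict[str, str] = field(default_factory=dict)
    dod: List[DoDCheck] = field(default_factory=list)


def is_done(ticket: AtomicTicket, cwd: str = ".") -> bool:
    """A ticket is done only if every DoD command exits with its expected code."""
    if not ticket.dod:
        return False  # nothing was verified, so nothing counts as done
    for check in ticket.dod:
        result = subprocess.run(check.command, cwd=cwd, capture_output=True)
        if result.returncode != check.expected_exit_code:
            return False
    return True


# Usage: a stand-in check that runs the current interpreter on a no-op.
ticket = AtomicTicket(
    id="T-1",
    goal="Add an add(a, b) helper",
    context={"app/util.py": "def add(a, b):\n    raise NotImplementedError\n"},
    dod=[DoDCheck([sys.executable, "-c", "pass"])],
)
print(is_done(ticket))
```

Treating an empty DoD as not done is a deliberate choice in this sketch: a ticket with no checks cannot produce a test result, so it cannot satisfy the third property.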

Hybrid cost routing

Work stays local by default, where it incurs no per-token API cost. A retry budget governs self-correction; only when a ticket's local attempts are exhausted does it escalate to the cloud. Cheap machine checks gate the expensive cloud audit, so you never pay a frontier model to read a diff that has not already passed its tests. The metric that matters is cost per accepted ticket.
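
The routing rule can be sketched in a few lines. Everything named here (run_with_budget, RouteResult, the two callables) is hypothetical, and the sketch assumes the retry budget counts total local attempts and that cost per accepted ticket means total cloud spend divided by accepted tickets. HAAO's actual accounting may differ.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RouteResult:
    passed: bool
    handled_by: str       # "local" or "cloud"
    local_attempts: int


def run_with_budget(attempt_local: Callable[[int], bool],
                    escalate_to_cloud: Callable[[], bool],
                    retry_budget: int = 3) -> RouteResult:
    """Try locally up to retry_budget times; call the cloud only after that."""
    for attempt in range(1, retry_budget + 1):
        if attempt_local(attempt):
            return RouteResult(True, "local", attempt)
    # Local attempts are exhausted, so this is the only point that costs money.
    return RouteResult(escalate_to_cloud(), "cloud", retry_budget)


def cost_per_accepted_ticket(total_cloud_cost: float, accepted: int) -> float:
    """Assumed definition: total cloud spend divided by accepted tickets."""
    if accepted <= 0:
        raise ValueError("no accepted tickets yet")
    return total_cloud_cost / accepted


# Usage: a fake local worker that succeeds on its second attempt.
result = run_with_budget(lambda attempt: attempt == 2, lambda: True)
print(result.handled_by, result.local_attempts)
```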

Architecture

        You (Product Owner)
   write prompt │           │ approve / accept
                ▼           ▲
        ┌────────────────────────────────┐
        │   Orchestrator (Scrum Master)  │  state machine · routing · retry
        └───┬───────────┬───────────┬────┘
            │ decompose │ dispatch  │ run tests
            │ + audit   │           │
        ┌───▼────┐  ┌───▼────────┐  ┌▼───────────────┐
        │ Claude │  │ Local LLMs │  │ pytest/npm test│
        │ (Tech  │  │ (LM Studio)│  │ (validation)   │
        │  Lead) │  │  dev team  │  └────────────────┘
        └────────┘  └────────────┘

Stack: Python · FastAPI · SQLite · React · Tailwind · LM Studio (local inference) · Claude API (cloud).

The loop

One requirement flows through a single loop:

Prompt — the PO writes a requirement; the Tech Lead decomposes it into atomic tickets; the PO reviews and approves (Gate 1).
Execute — the orchestrator dispatches each ticket to its local model, which writes code and runs the ticket's tests.
Self-correct — on failure, the worker retries within its retry budget; once the budget is exhausted, the ticket escalates to the Tech Lead.
Audit — the Tech Lead checks the diff against the DoD (automatic).
Accept — the PO accepts or rejects with feedback (Gate 2).
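
The five steps above can be read as a small state machine. The sketch below is one possible encoding, not the orchestrator's actual implementation: the state and event names are invented for illustration, and it leaves out what happens after escalation or a failed audit because the steps above do not specify it.

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"                              # tickets decomposed, awaiting Gate 1
    APPROVED = "approved"                        # Gate 1 passed
    EXECUTING = "executing"                      # a local model is working the ticket
    ESCALATED = "escalated"                      # retry budget exhausted
    AUDITING = "auditing"                        # Tech Lead checks diff against DoD
    AWAITING_ACCEPTANCE = "awaiting_acceptance"  # waiting at Gate 2
    ACCEPTED = "accepted"
    REJECTED = "rejected"


# (current state, event) -> next state; anything not listed is rejected.
TRANSITIONS = {
    (State.DRAFT, "po_approve"): State.APPROVED,
    (State.APPROVED, "dispatch"): State.EXECUTING,
    (State.EXECUTING, "tests_fail"): State.EXECUTING,  # retry within budget
    (State.EXECUTING, "budget_exhausted"): State.ESCALATED,
    (State.EXECUTING, "tests_pass"): State.AUDITING,
    (State.AUDITING, "audit_pass"): State.AWAITING_ACCEPTANCE,
    (State.AWAITING_ACCEPTANCE, "po_accept"): State.ACCEPTED,
    (State.AWAITING_ACCEPTANCE, "po_reject"): State.REJECTED,
}


def advance(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in state {state.name}"
        ) from None


# Usage: one ticket that fails once, then passes tests, audit, and Gate 2.
state = State.DRAFT
for event in ["po_approve", "dispatch", "tests_fail", "tests_pass",
              "audit_pass", "po_accept"]:
    state = advance(state, event)
print(state.name)
```

Keeping the transitions in a table means the two human gates are explicit: the only way out of DRAFT and AWAITING_ACCEPTANCE is an event that starts with po_.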

Enterprise

For regulated, on-prem-leaning, or budget-sensitive organizations, HAAO is designed to run entirely on your own infrastructure. Execution stays local so code never leaves your machines; the cloud model sees only ticket scope and diffs for audit, and escalation to it is gated behind a retry budget.
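
One way to enforce that boundary is an allowlist applied just before anything is sent to the cloud. This is a sketch under assumptions: the field names and build_audit_payload are hypothetical, and it shows only the filtering step, not the API call.

```python
import json

# Hypothetical allowlist: only these keys may appear in a cloud audit request.
AUDIT_FIELDS = ("ticket_id", "scope", "dod", "diff")


def build_audit_payload(record: dict) -> str:
    """Serialize only allowlisted fields; everything else stays local."""
    payload = {key: record[key] for key in AUDIT_FIELDS if key in record}
    return json.dumps(payload, sort_keys=True)


# Usage: repo_path and env are dropped because they are not on the allowlist.
record = {
    "ticket_id": "T-1",
    "scope": "Add an add(a, b) helper",
    "dod": ["pytest -q"],
    "diff": "+def add(a, b):\n+    return a + b\n",
    "repo_path": "/srv/app",
    "env": {"CLAUDE_API_KEY": "example-key"},
}
print(build_audit_payload(record))
```

An allowlist is used here rather than a blocklist so that a newly added field is kept local until someone decides it should be shared.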

Enterprise deployments add SSO and role-based access, policy and guardrail configuration (what agents may touch), bring-your-own inference, and priority support. Contact us to scope a deployment.