Working prototype · real-repo benchmark (n=14)

Run AI coding agents like a Scrum team.

Talk to an agent about what you want built — it files the work, a hybrid AI team delivers it, you watch every step, and it ships as a pull request. A cloud model is your Tech Lead, local (or cloud) LLMs are the dev team, and you stay the Product Owner — two human gates and hybrid cost routing built in.

Try the live demo View on GitHub

Open source · MIT licensed

2human gates

82%local-finish · real repos

$0local execution cost

The problem

AI agents write code well. They're bad at being trusted.

Today's coding agents are opaque autopilots: you approve one giant diff at the end, every token runs through an expensive frontier model, and on a big repo they drift. There's no agile structure, no per-step checkpoint, no cost control.

⛒

Two human gates

You approve the decomposed backlog before any code runs, and accept finished work at the end. Everything between is automatic but visible.

⇄

Hybrid cost routing

Work stays local and free until a retry budget is exhausted, then escalates to the cloud. You don't pay frontier prices to read every diff.

▤

Atomic ticket contracts

Each ticket is a JSON-Schema contract with code injected inline and a machine-verifiable Definition of Done. "Done" is a test result, not an opinion.

↻

Recovers, doesn't just fail

A blocked ticket isn't a dead end — split it into smaller work, clarify scope, change the model and retry, escalate, or abandon. A Decision Center shows exactly what needs you.

◉

Legible by default

Every run streams to an activity log with the diff, retries, cost, the model that did it, and the prompt version behind it — so when behaviour changes, you can see why.

The core idea

Roles, assigned where they belong

The non-obvious choice: the human is the Product Owner, not the Scrum Master. Process mechanics get automated. Product judgment doesn't.

Product Owner

You

Define requirements, approve scope, accept results.

Tech Lead

Cloud model

Decompose into atomic tickets, write DoD, run audits.

Scrum Master

Orchestrator

Dispatch, route, retry, escalate — fully automated.

Dev team

Local LLMs

Read context, write code, run tests, report back.

How it works

From a conversation to a pull request

Chat & decomposeTell the agent what you want; it restates what it heard and files the work as backlog proposals. The Tech Lead breaks each into atomic, testable tickets.

Gate 1 — approve scopeYou review, edit, or exclude proposed tickets before any code runs.

Execute & self-correctLocal or cloud models write code and run the ticket's tests in a sandbox, retrying within budget. Every step streams to the board and the activity log.

AuditThe Tech Lead checks the diff against the Definition of Done — automatically.

Gate 2 — acceptYou accept, or send it back with feedback for a rework loop.

Ship as a pull requestOn accept, HAAO opens a branch and PR to your GitHub or GitLab.

Why HAAO

Not another coding agent — the control plane

	HAAO	Autonomous coding agents (Devin / Copilot Workspace / …)
Human control	Two explicit gates (scope + acceptance)	Single end-of-run diff approval
Cost model	Hybrid: frontier for reasoning, local for execution	Frontier model for everything
Privacy	Local execution option (code stays in-house)	Cloud-only
Mental model	Agile tickets + Definition of Done	Freeform autopilot
Trust posture	Audit verdict + rework loop, reversible	Mostly opaque

See the whole flow in your browser

An interactive simulation of the full loop — no live model calls, no setup.

Open the demo See pricing

Frequently asked questions

What is HAAO?

HAAO (Hybrid AI-Agile Orchestrator) is an open-source governance layer for AI coding agents. A cloud model decomposes and audits, local models execute, and two human gates keep a person in control of scope and what ships.

How is it different from Cursor, Devin, or Copilot Workspace?

Those are coding agents — one model does everything and you approve a diff at the end. HAAO is the control plane around agents: hybrid cost routing (frontier for reasoning, local for execution), two explicit human gates, and tickets as machine-verifiable contracts.

Do I need cloud API access?

The cloud model (Claude) handles decomposition and audit. Execution runs on local models via LM Studio, so the bulk of the work is free and private. Cloud is gated behind a retry budget, not in the hot path.

Is it production-ready?

No — it's a working end-to-end prototype, validated on real open-source repos (see Benchmark: 82% local-finish, 46% one-shot, across small and large files). It now runs an optional parallel worker pool with a dependency graph and conflict detection (default one worker), but it's still early. Treat it as a measured prototype — not production software.

Does it ship code, or just write it?

It ships. When you accept a result, HAAO opens a branch and pull request to your GitHub or GitLab — least-privilege and idempotent (one PR per ticket).

Can I use OpenAI or other cloud models, not just Claude?

Yes. Register multiple cloud providers (Anthropic, OpenAI, OpenRouter and more) with encrypted keys, and assign a specific model per role — or keep everything local.

Is it safe to run AI-written code?

Test/DoD commands run sandboxed with networking disabled and a scoped environment; edits that touch files outside a ticket's declared scope are rejected, and network-egress attempts during tests are audited. Secrets are encrypted at rest and redacted from logs; injected repo content is treated as untrusted. You can also require an API token and, with identity enabled, role-based access plus an append-only audit log.

Is it open source?

Yes, MIT licensed. You can self-host and bring your own keys.