Multi-agent

One agent can run a team of agents.

A single loop has one context and one focus. When a problem is too big for that, the agent splits it across subagents, each its own harness, coordinated by a lead.


Recall the central idea: a harness is a reusable engine. Nothing stops an agent from spawning another instance of that engine to handle a piece of the work. That is all a subagent is: the same harness, run again. Delegation itself is just another tool with a schema: a task description goes in, and a summary comes back.
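To make "delegation is just another tool" concrete, here is a minimal sketch. The schema field names, `run_harness`, `delegate`, and the stub model are all hypothetical stand-ins chosen for illustration; they are not taken from any particular harness or vendor API.

```python
from typing import Callable

# Hypothetical tool schema: field names are illustrative, not a specific API.
DELEGATE_TOOL = {
    "name": "delegate",
    "description": "Hand a well-scoped task to a subagent; get a summary back.",
    "input_schema": {
        "type": "object",
        "properties": {"task": {"type": "string"}},
        "required": ["task"],
    },
}

Model = Callable[[list], str]  # assumed shape: context in, text out


def run_harness(task: str, model: Model) -> str:
    """Stand-in for the harness: a fresh context and a single model call.

    A real harness would loop over tool calls here; this stub only shows
    that every run starts from its own empty window.
    """
    context = [task]
    return model(context)


def delegate(tool_input: dict, model: Model) -> str:
    """Tool handler: a task description goes in, a summary comes back."""
    return run_harness(tool_input["task"], model)


# Stub model so the sketch runs without any external service.
stub_model = lambda context: f"summary of: {context[-1]}"
print(delegate({"task": "check auth.ts for security issues"}, stub_model))
# summary of: check auth.ts for security issues
```

The point of the sketch is that `delegate` calls the same `run_harness` entry point the lead itself would be started with, so a subagent needs no machinery of its own.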

Why split

Three reasons to use more than one agent

Context isolation: each subagent gets its own fresh window, so a big exploration does not pollute the lead's context. Parallelism: several subagents can work at once. Specialization: a subagent can be a focused reviewer, researcher, or tester, with its own tools and prompt.

The lead does not do the detailed work itself. It delegates a well-scoped task, the subagent runs its own loop in its own context, and only the result comes back: a clean summary instead of a thousand lines of intermediate noise.

lead agent

  harness core: context · tool-calling · the loop · execution environment · context management

delegate ↓  ·  results ↑

  • reviewer subagent: its own harness core, its own context
  • researcher subagent: its own harness core, its own context
  • tester subagent: its own harness core, its own context

Each subagent is the same harness, run in its own isolated context. The lead delegates a task, the subagent works it, and only the result comes back, keeping the lead's window clean.

The shapes

More than one way to wire a team

Lead-and-workers, the shape above, is the most common, but it is not the only one. A few patterns recur, and the right one depends on the work.

Common multi-agent orchestration patterns

| Pattern | How it works | Good for |
| --- | --- | --- |
| Lead and workers | A coordinator delegates scoped subtasks to specialists and synthesizes the results. | General-purpose delegation. |
| Parallel fan-out | One large task is split into independent pieces, run at once, then merged. | Breadth: searching, reviewing, or generating many things quickly. |
| Pipeline | Each agent's output feeds the next, like an assembly line. | Staged work, such as research, then draft, then review. |
| Council | Several agents tackle the same problem independently; a judge picks or merges the best answer. | Hard problems where a single attempt is unreliable. |
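The wiring behind three of these patterns can be sketched in a few lines each. This is an illustration under simplifying assumptions: an agent is modeled as a plain function from a task string to a result string, and `make_agent`, `fan_out`, `pipeline`, and `council` are hypothetical names. A real agent would run a full harness loop where the stub merely tags its input.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Agent = Callable[[str], str]  # assumption: one agent = task in, result out


def make_agent(role: str) -> Agent:
    """Stub agent that wraps its input in its role name."""
    return lambda task: f"{role}({task})"


def fan_out(pieces: list, worker: Agent, merge: Callable[[list], str]) -> str:
    """Parallel fan-out: independent pieces run at once, then merge."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(worker, pieces))  # map keeps input order
    return merge(results)


def pipeline(task: str, stages: list) -> str:
    """Pipeline: each stage's output is the next stage's input."""
    result = task
    for stage in stages:
        result = stage(result)
    return result


def council(task: str, members: list, judge: Callable[[list], str]) -> str:
    """Council: every member answers the same task; a judge picks one."""
    answers = [member(task) for member in members]
    return judge(answers)


research, draft, review = (make_agent(r) for r in ("research", "draft", "review"))
print(pipeline("topic", [research, draft, review]))
# review(draft(research(topic)))
print(fan_out(["a.py", "b.py"], review, "; ".join))
# review(a.py); review(b.py)
print(council("q", [make_agent("short"), make_agent("longer")],
              judge=lambda answers: max(answers, key=len)))
# longer(q)
```

Lead and workers is the same idea as `fan_out` with a different worker per subtask and the lead acting as the merge step. The longest-answer judge is only a placeholder; in practice the judge would itself be an agent.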

The tradeoff

More coverage, more overhead

More agents mean more coverage and more parallel progress, but also more coordination and more cost; every subagent is its own full loop. And subagents cannot see each other's work, so the lead has to scope their tasks to be independent. The rule of thumb: reach for subagents when one context cannot hold the problem, or when isolation and specialization are worth the overhead.
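A back-of-the-envelope calculation shows both sides of this tradeoff. Every number below is invented for illustration, and the model is deliberately crude: it counts the tokens that end up in each context window and ignores the fact that a loop re-reads its context on every turn.

```python
def lead_window(base: int, explorations: list, summary: int, delegated: bool) -> int:
    """Tokens that end up in the lead's own context window."""
    if delegated:
        # Only one summary per subagent comes back to the lead.
        return base + summary * len(explorations)
    # Without delegation, every exploration lands in the lead's window.
    return base + sum(explorations)


def total_tokens(base: int, explorations: list, summary: int,
                 delegated: bool, overhead: int) -> int:
    """Tokens across all windows; each subagent pays a fixed setup overhead."""
    lead = lead_window(base, explorations, summary, delegated)
    if not delegated:
        return lead
    return lead + sum(overhead + e for e in explorations)


# Hypothetical workload: three explorations of 30k, 25k, and 40k tokens.
work = [30_000, 25_000, 40_000]
print(lead_window(2_000, work, 200, delegated=False))                  # 97000
print(lead_window(2_000, work, 200, delegated=True))                   # 2600
print(total_tokens(2_000, work, 200, delegated=True, overhead=1_500))  # 102100
```

Under these assumed numbers the lead's window shrinks from 97,000 tokens to 2,600, while the total across all windows rises from 97,000 to 102,100. That is the trade in miniature: a cleaner lead context, paid for with extra loops.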

one delegation

lead delegates → reviewer: "check auth.ts for security issues"

↓ reviewer runs its own loop, its own context

read auth.ts · search "token" · read config.ts

↑ only the result returns

reviewer: "1 issue: a token is logged in plaintext at line 42"

lead continues with just that finding

The lead never sees the reviewer's intermediate steps, only the one-line result. That is context isolation in action.
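The same trace can be replayed as a small sketch. The `reviewer` function is a hypothetical stub: its three steps and its finding are hard-coded from the trace above rather than produced by a model, and the opening user message in the lead's context is invented for the example.

```python
def reviewer(task: str) -> str:
    """Stub reviewer subagent: works in a private context, returns a summary."""
    context = [task]  # fresh window, visible only inside this call
    for step in ("read auth.ts", 'search "token"', "read config.ts"):
        # Intermediate tool output accumulates here and never leaves.
        context.append(f"tool result for: {step}")
    return "1 issue: a token is logged in plaintext at line 42"


# The lead's window before delegating (the user message is hypothetical).
lead_context = ["user: audit the login flow"]

finding = reviewer("check auth.ts for security issues")
lead_context.append(f"reviewer: {finding}")

print(len(lead_context))  # 2
print(lead_context[-1])
# reviewer: 1 issue: a token is logged in plaintext at line 42
```

The reviewer's context grew to four entries, but the lead's grew by exactly one. Because `context` is local to the call, the isolation here comes from scoping rather than from any rule the lead has to follow.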

Multi-agent is the harness applied to itself. The same loop that drives one agent drives a team of them, with a lead that delegates and synthesizes. It is how an agent takes on work too large for a single window to hold.
