Claude Code Agent Teams
Multi-agent orchestration: full handbook
Status: handbook (deep dive)
Last updated: 2026-06-16
One-line verdict: Multiple independent Claude Code sessions that message each other and share a task list: powerful for parallel research/review/build, experimental, and markedly more expensive (≈7×) than a single session. Reach for it when workers need to collaborate, not just report back.
Companion to the Claude Ecosystem paper's §5.7. That page introduces the concept; this is the practical, every-detail guide.
Snapshot
| | |
|---|---|
| What it is | One lead Claude Code session spawns multiple teammate sessions that work in parallel, each in its own context window, communicating via a mailbox and coordinating through a shared task list. |
| Launched | February 2026 (alongside Opus 4.6). |
| Status | Experimental, disabled by default. Real limitations (below). |
| Requires | Claude Code v2.1.32+; Pro / Max / Team / Enterprise plan. |
| Enable | CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 (env or settings.json). |
| Cost | ≈7× tokens vs a single session (scales ~linearly with teammate count). |
The mental model
A single Claude Code session is one worker in one context window. Agent Teams turns that into a squad:
- Team lead — the session you start in. It creates the team, breaks work into tasks, spawns teammates, assigns/synthesizes, and is the only one that manages the team.
- Teammates — full, independent Claude Code sessions, each with its own context window. They don't share the lead's conversation history.
- Shared task list — a backlog of work items with states (pending → in-progress → completed) and dependencies; teammates claim items.
- Mailbox — teammates message each other directly (and the lead), delivered automatically.
The defining idea is that last point: peer-to-peer communication. That is the entire reason Agent Teams exists as something distinct from subagents.
Agent Teams vs Subagents vs worktrees — pick the right tool
| | Subagents | Agent Teams | Git worktrees |
|---|---|---|---|
| What | Helper agents spawned inside one session | Multiple independent sessions coordinating | You run multiple Claude sessions manually |
| Context | Own window; result returns to caller | Own window; fully independent | Separate checkouts/branches |
| Communication | Report back to the main agent only | Teammates message each other | None (manual) |
| Coordination | Main agent manages everything | Shared task list + self-claiming | You coordinate |
| Token cost | Lower (results summarized back) | Higher (≈7×; each is a full instance) | N× whatever you run |
| Best for | Focused tasks where only the result matters | Work needing discussion, challenge, parallel ownership | Isolated parallel branches you drive yourself |
Rule of thumb: if your workers don't need to talk to each other, use subagents (cheaper, simpler). Use Agent Teams only when teammates genuinely need to share findings, challenge each other, or coordinate on a shared backlog. For sequential work, same-file edits, or heavy dependencies, a single session wins.
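The rule of thumb can be written down as a small decision function. This is only a sketch of the reasoning in the paragraph above; the `Workload` class and its field names are hypothetical and not part of Claude Code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a job; field names are illustrative."""
    parallelizable: bool            # can the work be split into independent pieces?
    workers_must_talk: bool         # do workers need to share findings or challenge each other?
    same_file_edits: bool = False   # would several workers edit the same files?
    heavy_dependencies: bool = False


def pick_mode(w: Workload) -> str:
    """Encode the rule of thumb as a decision function."""
    # Sequential work, same-file edits, or heavy dependencies: a single session wins.
    if not w.parallelizable or w.same_file_edits or w.heavy_dependencies:
        return "single session"
    # Parallel work where only the result matters: subagents are cheaper and simpler.
    if not w.workers_must_talk:
        return "subagents"
    # Parallel work where workers genuinely need to collaborate.
    return "agent team"
```

Note the order of the checks: the conditions that rule out parallelism come first, so an agent team is only ever the answer when the cheaper options have been excluded.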
How it works (architecture)
Components: team lead, teammates, shared task list, mailbox.
Local storage (created automatically, removed on cleanup / session end):
- Team config: ~/.claude/teams/{team-name}/config.json holds runtime state (session IDs, tmux pane IDs, a members array of name/agent-id/agent-type). Don't hand-edit or pre-author it; it's overwritten on every state update.
- Task list: ~/.claude/tasks/{team-name}/.
- There is no project-level team config; a .claude/teams/… file in your repo is treated as an ordinary file, not configuration.
Task lifecycle: the lead creates tasks; teammates self-claim the next unblocked one or are assigned explicitly. Claiming uses file locking to avoid races. A task with unresolved dependencies can't be claimed until they complete; when a blocking task finishes, dependents unblock automatically.
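To make the lifecycle concrete, here is a minimal in-memory model of it. It is an illustration under stated assumptions, not how Claude Code stores tasks: the `Task` and `TaskList` names are hypothetical, and a thread lock stands in for the file locking the real feature uses.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    name: str
    depends_on: set = field(default_factory=set)  # names of blocking tasks
    state: str = "pending"                        # pending -> in-progress -> completed
    owner: Optional[str] = None


class TaskList:
    """In-memory stand-in for the shared task list (illustrative only)."""

    def __init__(self, tasks):
        self._tasks = {t.name: t for t in tasks}
        # Assumption: a thread lock plays the role of the on-disk file lock.
        self._lock = threading.Lock()

    def _unblocked(self, task: Task) -> bool:
        # A task is claimable only when every dependency has completed.
        return all(self._tasks[d].state == "completed" for d in task.depends_on)

    def claim_next(self, teammate: str) -> Optional[Task]:
        """Atomically claim the first pending task whose dependencies are done."""
        with self._lock:
            for task in self._tasks.values():
                if task.state == "pending" and self._unblocked(task):
                    task.state, task.owner = "in-progress", teammate
                    return task
            return None

    def complete(self, name: str) -> None:
        # Completing a task implicitly unblocks its dependents on the next claim.
        with self._lock:
            self._tasks[name].state = "completed"
```

Nothing has to "notify" dependents in this model: because the dependency check runs at claim time, a finished blocker makes its dependents claimable automatically.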
Context each teammate loads on spawn: the same project context as a normal
session — CLAUDE.md, MCP servers, skills — plus the spawn prompt from the
lead. The lead's conversation history does not carry over, so the spawn
prompt must contain the task-specific detail.
Communication: messages are delivered automatically (no polling); idle teammates notify the lead when they stop; all agents see task status; you address a teammate by name (one message per recipient — no broadcast).
Setup
1. Check your version:
claude --version # need v2.1.32+
2. Enable the feature (env var, or persist it in settings):
// settings.json
{
"env": { "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1" }
}
3. Choose a display mode (teammateMode in ~/.claude/settings.json):
- in-process (default fallback): all teammates in your main terminal; works in any terminal, no extra setup.
- tmux / split panes: each teammate gets its own pane; needs tmux or iTerm2 (with the it2 CLI + Python API enabled).
- auto (default): split panes if you're in tmux/iTerm2, else in-process.
{ "teammateMode": "in-process" }
Or per-session: claude --teammate-mode in-process.
Split panes are not supported in VS Code's integrated terminal, Windows Terminal, or Ghostty. On those, use in-process mode.
4. Set the default teammate model in /config → Default teammate model.
Teammates do not inherit the lead's /model by default. (Cost tip below:
default them to Sonnet.)
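Steps 1 to 3 can be scripted as a preflight check. The sketch below assumes the version string contains a plain MAJOR.MINOR.PATCH triple (the exact `claude --version` output format is not confirmed here), and the helper names are hypothetical. It only manipulates a settings dictionary; reading and writing the actual settings.json is left to the caller.

```python
import json
import re

MIN_VERSION = (2, 1, 32)  # minimum Claude Code version from the Snapshot table


def parse_version(text: str):
    """Pull the first MAJOR.MINOR.PATCH triple out of a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_agent_teams(version_output: str) -> bool:
    # Tuple comparison is numeric, so 2.1.100 correctly ranks above 2.1.32.
    return parse_version(version_output) >= MIN_VERSION


def enable_agent_teams(settings: dict, teammate_mode: str = "in-process") -> dict:
    """Return a copy of settings with the feature flag and display mode set."""
    merged = json.loads(json.dumps(settings))  # cheap deep copy for JSON-like data
    merged.setdefault("env", {})["CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"] = "1"
    merged["teammateMode"] = teammate_mode
    return merged
```

Comparing versions as integer tuples rather than strings matters here: a string comparison would rank "2.1.9" above "2.1.32". The merge returns a copy so that existing `env` entries survive and the caller's dictionary is left untouched.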
Using it — the workflow
Starting a team
Just describe the task and structure in natural language; the lead creates the team, spawns teammates, and coordinates. Two entry points: you request a team, or Claude proposes one (you confirm — it never spawns a team without approval).
I'm designing a CLI that tracks TODO comments across a codebase. Create an
agent team to explore it from different angles: one teammate on UX, one on
technical architecture, one playing devil's advocate.
Specifying teammates and models
Create a team with 4 teammates to refactor these modules in parallel.
Use Sonnet for each teammate.
Plan-approval gate (for risky work)
Hold teammates in read-only plan mode until the lead approves their plan:
Spawn an architect teammate to refactor the auth module.
Require plan approval before they make any changes.
The lead approves/rejects autonomously — steer it with criteria in your prompt ("only approve plans with test coverage", "reject schema changes").
Talking to teammates directly
Each teammate is a full session you can message directly:
- In-process: Shift+Down cycles teammates; type to message; Enter to view a session; Esc to interrupt; Ctrl+T toggles the task list.
- Split panes: click into a teammate's pane.
Tasks: assign and claim
The lead can assign tasks explicitly ("give the auth task to the architect"), or teammates self-claim the next unassigned, unblocked task when they finish one. Dependencies resolve automatically.
Reusable roles via subagent definitions
Define a role once (project/user/plugin/CLI scope) and reuse it as a teammate:
Spawn a teammate using the security-reviewer agent type to audit src/auth/.
The teammate honors that definition's tools allowlist and model, and
its body is appended to the system prompt. Important caveats:
- The skills and mcpServers frontmatter fields are not applied to teammates; they load skills/MCP from project + user settings like a normal session.
- Team tools (SendMessage, task-management tools) are always available even if tools restricts everything else.
Permissions
Teammates inherit the lead's permission mode at spawn (including
--dangerously-skip-permissions). You can change an individual teammate's mode
after spawning, but not set per-teammate modes at spawn time. Teammate
permission prompts bubble up to the lead — pre-approve common ops in
permission settings to cut friction.
Quality-gate hooks
Enforce rules deterministically (exit code 2 = block + send feedback):
- TeammateIdle: fires when a teammate is about to go idle; block to keep it working (e.g. "tests aren't passing yet").
- TaskCreated: block task creation with feedback.
- TaskCompleted: block marking a task complete (e.g. require green tests).
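A TaskCompleted gate that requires green tests might look roughly like the following. Treat it as a sketch: the only behavior taken from this page is that exit code 2 blocks and sends feedback. That the feedback is read from stderr is an assumption, the test command is whatever your project uses, and the function names are hypothetical.

```python
import subprocess
import sys

BLOCK = 2  # per this page: exit code 2 blocks the action and sends feedback
ALLOW = 0


def gate(tests_passed: bool, test_output: str):
    """Map a test result to (exit code, feedback message)."""
    if tests_passed:
        return ALLOW, ""
    # Keep feedback short: the tail of the log is usually the useful part.
    tail = "\n".join(test_output.strip().splitlines()[-5:])
    return BLOCK, f"Tests are failing; fix them before completing the task.\n{tail}"


def run_hook(test_cmd) -> int:
    """Run the project's test command and return the hook's exit code."""
    result = subprocess.run(test_cmd, capture_output=True, text=True)
    code, message = gate(result.returncode == 0, result.stdout + result.stderr)
    if message:
        # Assumption: the feedback shown to the teammate is taken from stderr.
        print(message, file=sys.stderr)
    return code
```

Used as a hook command, the script would finish with something like `sys.exit(run_hook(["pytest", "-q"]))`, where the pytest invocation is only an example. Keeping `gate` free of side effects makes the block/allow decision easy to unit-test without running a real test suite.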
Shutdown and cleanup
Ask the researcher teammate to shut down # graceful; teammate may decline w/ reason
Clean up the team # removes shared resources
Always clean up via the lead (teammates may leave resources inconsistent). Cleanup fails if teammates are still running — shut them down first. The lead often cleans up on its own when work is done.
Use-case patterns
Parallel code review — distinct lenses, no overlap:
Create an agent team to review PR #142. Spawn three reviewers:
- one on security, one on performance, one validating test coverage.
Have each review and report findings.
Competing hypotheses (adversarial debugging) — the strongest pattern; beats a single agent's anchoring bias:
Users report the app exits after one message. Spawn 5 teammates to investigate
different hypotheses. Have them talk to each other to disprove each other's
theories, like a scientific debate. Update the findings doc with the consensus.
New modules / features: each teammate owns a separate file set.
Cross-layer change: frontend / backend / tests, one teammate each.
Best practices
- Start with research/review, not parallel implementation — clear boundaries, no file conflicts, immediate payoff.
- 3–5 teammates for most work; ~5–6 tasks per teammate. Three focused teammates beat five scattered ones; returns diminish fast.
- Avoid file conflicts — give each teammate a different set of files; two editing the same file overwrite each other.
- Size tasks as self-contained units with a clear deliverable (a function, a test file, a review) — too small wastes coordination, too large risks long unattended drift.
- Front-load context in the spawn prompt (they don't see the lead's history).
- Monitor and steer; don't let a team run unattended. If the lead starts doing the work itself: "wait for your teammates to finish before proceeding."
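The file-conflict rule is easy to check mechanically before spawning anyone. The helper below is a hypothetical pre-spawn check, not a Claude Code feature: it takes a planned mapping of teammate names to file sets and reports every file that more than one teammate would touch.

```python
from collections import defaultdict


def find_overlaps(assignments):
    """Return each file claimed by more than one teammate, with its claimants.

    assignments maps a teammate name to the set of file paths it will edit.
    """
    owners = defaultdict(list)
    for teammate, files in assignments.items():
        for path in files:
            owners[path].append(teammate)
    # Only files with two or more owners are conflicts.
    return {path: sorted(names) for path, names in owners.items() if len(names) > 1}
```

For example, a plan where both the frontend and backend teammates list src/api/types.ts would return that path with both names, which is the cue to move the shared file to a single owner or sequence the two tasks with a dependency.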
Costs — read before you run a fleet
This is the headline trade-off. Official guidance: agent teams use ≈7× the tokens of a standard session (when teammates run in plan mode), because each teammate is a separate instance with its own context window; usage scales ~linearly with team size and runtime.
Practitioner field data (CloudZero, 2026 — treat as estimates):
- No multi-agent discount — 10 agents burn quota ~10× faster. A ~$13/day solo dev becomes ~$130–260/day at 5–10 parallel agents.
- Pro ($20/mo) is impractical — its 5-hour window drains in under an hour with ~5 agents. Max 20× ($200/mo) is the practical floor for regular use.
- Tiered models save ~40% — an Opus lead + Sonnet teammates beats an all-Opus fleet (and a heavier model's tokenizer can emit ~35% more tokens).
- Idle teammates still bleed tokens — clean up promptly.
- Each inter-agent message is a billable round-trip through the model.
- Stale context inflates cost 30–50% — clear between unrelated task batches.
Keeping it manageable (official): default teammates to Sonnet; keep
teams small; keep spawn prompts focused (everything in them is context
from turn one); clean up when done. Track with /usage; set a monthly
cap with /usage-credits (Pro/Max) or workspace spend limits (API).
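For budgeting, the linear-scaling claim can be turned into back-of-envelope arithmetic. This is a rough sketch under stated assumptions: cost scales strictly linearly with agent count, every agent is equally busy, and `teammate_price_ratio` (the teammate model's price relative to the lead's) is a made-up parameter you would fill in from current pricing.

```python
def team_daily_cost(solo_daily: float, agents: int, teammate_price_ratio: float = 1.0) -> float:
    """Estimate daily spend for one lead plus (agents - 1) teammates.

    solo_daily: what one session on the lead's model costs per day.
    teammate_price_ratio: teammate price relative to the lead's model
        (1.0 means the same model; other values are illustrative).
    """
    if agents < 1:
        raise ValueError("need at least the lead")
    teammates = agents - 1
    return solo_daily * (1 + teammates * teammate_price_ratio)


def tiering_saving(agents: int, teammate_price_ratio: float) -> float:
    """Fraction saved versus running every agent on the lead's model."""
    flat = team_daily_cost(1.0, agents)
    tiered = team_daily_cost(1.0, agents, teammate_price_ratio)
    return 1 - tiered / flat
```

With the field figure of about $13/day solo, `team_daily_cost(13.0, 10)` gives $130/day, the low end of the range quoted above. As an illustration of tiering, a five-agent team whose teammates cost half as much as the lead (a hypothetical ratio) saves 40% in this model; the real saving depends on actual prices and on how much work each teammate does.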
Benefits
- True parallelism with collaboration — the only mode where workers share a backlog and challenge each other; surfaces better answers on review/debugging.
- Independent ownership — teammates take separate files/modules cleanly.
- Reuses existing primitives — subagent role definitions, hooks, CLAUDE.md, permissions all carry over; low conceptual overhead if you already use them.
- Human-in-the-loop control — never spawns without approval; plan-approval gates and quality hooks keep risky work bounded.
Limitations / cons (it's experimental)
- Cost — ~7×; the main reason to use it sparingly.
- One team at a time; no nested teams (teammates can't spawn teammates); lead is fixed (no leadership transfer).
- No session resumption for in-process teammates: /resume and /rewind don't restore them; the lead may message teammates that no longer exist (spawn new ones).
- Task status can lag: teammates sometimes fail to mark tasks done, blocking dependents; nudge or fix manually.
- Shutdown can be slow (finishes the current tool call first).
- Permission mode is inherited from the lead at spawn; a per-teammate mode can only be set afterward, not at spawn time.
- Split panes need tmux/iTerm2; unsupported in VS Code/Windows Terminal/Ghostty.
Troubleshooting
- Teammates not appearing: in in-process mode, press Shift+Down to cycle; the task may not have been complex enough for the lead to spawn a team; for split panes, check that which tmux finds tmux, or that the it2 CLI is installed.
- Teammates stop on errors: view their output, give instructions, or spawn a replacement.
- Lead shuts down early — tell it to keep going / wait for teammates.
- Orphaned tmux sessions: tmux ls, then tmux kill-session -t <name>.
- Too many permission prompts: pre-approve common ops before spawning.
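The orphaned-session cleanup can be scripted when there are many sessions to clear. The sketch below uses the standard tmux commands `list-sessions` and `kill-session`; the `prefix` argument is hypothetical, since this page does not say how teammate sessions are named, so check the names with tmux ls before killing anything.

```python
import subprocess


def session_names(list_output: str):
    """Parse `tmux list-sessions -F '#{session_name}'` output, one name per line."""
    return [line.strip() for line in list_output.splitlines() if line.strip()]


def kill_commands(names, prefix: str):
    """Build one kill-session command per session whose name starts with prefix."""
    return [["tmux", "kill-session", "-t", name] for name in names if name.startswith(prefix)]


def kill_orphans(prefix: str, dry_run: bool = True):
    """List sessions and kill those matching prefix; dry_run only reports."""
    listing = subprocess.run(
        ["tmux", "list-sessions", "-F", "#{session_name}"],
        capture_output=True, text=True,
    )
    commands = kill_commands(session_names(listing.stdout), prefix)
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=False)
    return commands
```

The dry-run default is deliberate: the function returns the commands it would run, so you can review them before calling it again with `dry_run=False`.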
How it fits us
- Directly relevant: we're an AI-first shop (the core argument in react-vs-svelte.md), and this is the highest-leverage Claude Code feature for parallel research and review of exactly the kind we do here.
- Start with the cheap, safe patterns — parallel review and competing-hypothesis investigation on a PR or a bug. They show the value without the file-conflict risk of parallel implementation, and map perfectly onto code review and root-cause analysis in long-lived systems.
- Budget consciously — ≈7× cost means it's a deliberate tool, not a default. Default teammates to Sonnet, keep teams at 3–5, and clean up. For most day-to-day work, a single session or subagents remain the right, cheaper choice.
- Pairs with our practices: quality-gate hooks (TaskCompleted requiring green tests) and reusable security-reviewer / test-runner role definitions are how you'd make this auditable and repeatable for safety-adjacent work.
Sources
- Agent Teams — official docs (code.claude.com) — architecture, setup, control, limitations.
- Manage costs — official docs (code.claude.com) — ≈7× token figure, cost-management guidance.
- Claude Code agents in 2026 (CloudZero) — real-world cost/perf field data (estimates).
- Agent Teams setup guide (claudefa.st) — practitioner walkthrough.
- (accessed 2026-06-16; feature is experimental and evolving — re-verify env var, version, and limits before relying on them.)