Cascade AI — Multi-Tier Agent Orchestration

Classification	Example	Route	T2 Managers
Simple	"What is a closure?"	T3	—
Moderate	"Add pagination to the users API"	T2 → T3 ×2	1
Complex	"Refactor auth module to JWT, add tests, open PR"	T1 → T2 ×3 → T3 ×n	3–5
Highly Complex	"Research, benchmark, and document the full auth ecosystem"	T1 → T2 ×5+ → T3 ×n	5+
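
The routing table above can be read as a lookup from classification to tier plan. The sketch below is illustrative only: the `Complexity` type, the `RoutePlan` shape, and `planFor` are hypothetical names rather than part of the Cascade API, and the manager counts mirror the table, using the lower bound where the table gives a range.

```typescript
// Hypothetical types for illustration; not Cascade's actual API.
type Complexity = "simple" | "moderate" | "complex" | "highlyComplex";

interface RoutePlan {
  tiers: Array<"T1" | "T2" | "T3">; // tiers involved, top-down
  minManagers: number;              // lower bound on T2 managers, per the table
}

const ROUTES: Record<Complexity, RoutePlan> = {
  simple:        { tiers: ["T3"],             minManagers: 0 },
  moderate:      { tiers: ["T2", "T3"],       minManagers: 1 },
  complex:       { tiers: ["T1", "T2", "T3"], minManagers: 3 },
  highlyComplex: { tiers: ["T1", "T2", "T3"], minManagers: 5 },
};

// Look up the tier plan for a given classification.
function planFor(c: Complexity): RoutePlan {
  return ROUTES[c];
}
```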

Everything included

Production-grade from day one.

No plugin store to browse. The tools your agents need are already wired in.

New · v0.6

📊

Live Benchmark Auto-Routing

Set any tier to Auto and Cascade picks the best-value model per task — fusing live public benchmark scores with live OpenRouter pricing. Cheap models win trivial work; frontier models win the hard parts.
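
One way to picture "best value" is a score-per-dollar ranking with a quality floor. This is a minimal sketch under that assumption: `ModelQuote`, `pickBestValue`, the model ids, and every number below are made up for illustration, and the scoring rule is not Cascade's actual routing formula.

```typescript
// Hypothetical record combining a benchmark score with a price quote.
interface ModelQuote {
  id: string;
  benchmarkScore: number; // 0..100, higher is better (illustrative scale)
  usdPerMTokens: number;  // price per million tokens, assumed > 0
}

// Among models that meet the quality floor, return the one with the
// highest benchmark score per dollar. Returns undefined if none qualify.
function pickBestValue(models: ModelQuote[], minScore: number): ModelQuote | undefined {
  let best: ModelQuote | undefined;
  for (const m of models) {
    if (m.benchmarkScore < minScore) continue;
    const value = m.benchmarkScore / m.usdPerMTokens;
    if (!best || value > best.benchmarkScore / best.usdPerMTokens) {
      best = m;
    }
  }
  return best;
}

// Made-up quotes for illustration.
const quotes: ModelQuote[] = [
  { id: "small-model", benchmarkScore: 55, usdPerMTokens: 0.2 },
  { id: "mid-model", benchmarkScore: 75, usdPerMTokens: 3 },
  { id: "frontier-model", benchmarkScore: 92, usdPerMTokens: 15 },
];

const forTrivialTask = pickBestValue(quotes, 50); // low floor: cheap model wins
const forHardTask = pickBestValue(quotes, 90);    // high floor: only the frontier model qualifies
```

Raising the quality floor for harder tasks is what lets the expensive model win in this sketch; with a low floor the cheapest adequate model comes out ahead.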

New · v0.8

🤖

Autonomous Mode

Type /auto on for hands-off runs: the plan auto-approves and safe tools run without prompts — while dangerous tools still ask and budget caps stay the hard stop.
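
The rules in that description can be sketched as a small decision function. `GateState`, `Decision`, and `gateToolCall` are hypothetical names, and the behaviour outside autonomous mode (safe tools still prompt) is an assumption made for this sketch.

```typescript
type Decision = "run" | "ask" | "block";

// Hypothetical session state; not Cascade's actual types.
interface GateState {
  autoMode: boolean; // toggled by /auto on
  spentUsd: number;
  budgetUsd: number;
}

function gateToolCall(dangerous: boolean, s: GateState): Decision {
  if (s.spentUsd >= s.budgetUsd) return "block"; // budget cap stays the hard stop
  if (dangerous) return "ask";                   // dangerous tools always prompt
  return s.autoMode ? "run" : "ask";             // assumption: safe tools prompt unless auto mode is on
}
```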

New · v0.7

📋

Boardroom Plan Review

Pause before any worker spawns to review T1's plan — an AI reviewer critiques it, you drop sections inline or add a steering note, and it re-plans until you approve.

New · v0.9

⏯️

Run Resumability

Hit the budget cap on a big task? /continue resumes with a raised budget — files already created persist on disk, so only the remaining work runs. No redo.

New · v0.10.1

👥

Workers Recruit Help

When a worker discovers that its task should fan out, it asks its manager to spawn a bounded number of sibling workers on the fly: dynamic parallelism with no rigid up-front plan and no runaway recursion.
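
A bounded fan-out like this could be enforced with a per-manager quota. The sketch below is one possible shape, not Cascade's implementation: `ManagerQuota`, `WorkerInfo`, and the rule that recruited workers cannot recruit further are all assumptions chosen to show how a bound and a recursion guard might fit together.

```typescript
// Hypothetical worker descriptor.
interface WorkerInfo {
  id: string;
  recruited: boolean; // true if this worker was itself spawned by a recruit request
}

class ManagerQuota {
  private used = 0;
  private readonly maxSiblings: number;

  constructor(maxSiblings: number) {
    this.maxSiblings = maxSiblings;
  }

  // Returns how many sibling workers the manager agrees to spawn.
  requestSiblings(from: WorkerInfo, requested: number): number {
    if (from.recruited) return 0; // assumed guard: recruits cannot recruit again
    const remaining = this.maxSiblings - this.used;
    const granted = Math.max(0, Math.min(requested, remaining));
    this.used += granted;
    return granted;
  }
}
```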

⚡

Live Agent Tree

Watch the T1→T2→T3 hierarchy execute in real time directly in the terminal via ink rendering.

🔒

Permission Escalation

Dangerous tool calls escalate through T2 → T1 → user before executing. Never a silent file delete.
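
The escalation chain can be pictured as a list of approvers tried in order. In this sketch each level may approve, deny, or pass the request upward; that three-way verdict, the `Approver` type, and the default-deny fallback are assumptions for illustration.

```typescript
type Verdict = "approve" | "deny" | "escalate";

// Hypothetical approver: a tier (or the user) that reviews a tool call.
interface Approver {
  name: string;
  review: (tool: string) => Verdict;
}

// Walk the chain until someone gives a final verdict.
function escalate(tool: string, chain: Approver[]): { approved: boolean; decidedBy: string } {
  for (const approver of chain) {
    const verdict = approver.review(tool);
    if (verdict !== "escalate") {
      return { approved: verdict === "approve", decidedBy: approver.name };
    }
  }
  // Nobody decided: refuse rather than run silently.
  return { approved: false, decidedBy: "default-deny" };
}

const chain: Approver[] = [
  { name: "T2", review: () => "escalate" },
  { name: "T1", review: () => "escalate" },
  { name: "user", review: (tool) => (tool === "file_delete" ? "deny" : "approve") },
];

const outcome = escalate("file_delete", chain);
```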

🔄

Provider Failover

Rate-limit hit? Cascade auto-switches providers with exponential backoff. Zero config required.
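
As a rough illustration of failover with exponential backoff, the sketch below tries each provider in turn and waits longer after each rate-limit error. `RateLimitError`, `backoffMs`, `callWithFailover`, and the default delays are hypothetical; Cascade's real retry policy is not documented here.

```typescript
// Hypothetical error type a provider client might throw on HTTP 429.
class RateLimitError extends Error {}

// Delay doubles with each attempt, up to a cap.
function backoffMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

type Provider<T> = (prompt: string) => Promise<T>;

async function callWithFailover<T>(
  providers: Provider<T>[],
  prompt: string,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  let attempt = 0;
  for (const provider of providers) {
    try {
      return await provider(prompt);
    } catch (err) {
      // Only rate limits trigger a switch; other errors surface immediately.
      if (!(err instanceof RateLimitError)) throw err;
      lastError = err;
      await sleep(backoffMs(attempt));
      attempt += 1;
    }
  }
  throw lastError ?? new Error("no providers configured");
}
```

Passing `sleep` as a parameter keeps the function easy to test, since a test can substitute a no-op delay.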

🛠️

Full Tool Suite

Shell, file CRUD, git, GitHub/GitLab PRs, Playwright browser automation, PDF creation, code interpreter.

🌐

Web Dashboard

React + ReactFlow live topology graph, session browser, cost tracker, JWT auth, WebSocket updates.

🔌

MCP Support

Connect any Model Context Protocol server. Its tools become available to every T3 worker automatically.

💰

Per-Tier Cost Breakdown & Budget Control

Every result exposes costByTier, tokensByTier, and percentage attribution. Set a live session budget with /budget set 0.50 — Cascade warns you at 80% spend (configurable via warnAtPct) and stops new tasks the moment the cap is hit, with no config-file edits required.
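
The warn-then-stop behaviour can be sketched as a small tracker. Only the 80% default and the `warnAtPct` name come from the text above; the `SessionBudget` class and its methods are hypothetical.

```typescript
type BudgetStatus = "ok" | "warn" | "stopped";

class SessionBudget {
  private spentUsd = 0;
  private readonly capUsd: number;
  private readonly warnAtPct: number;

  constructor(capUsd: number, warnAtPct = 80) {
    this.capUsd = capUsd;
    this.warnAtPct = warnAtPct;
  }

  // Add the cost of a finished call and report where the session stands.
  record(costUsd: number): BudgetStatus {
    this.spentUsd += costUsd;
    return this.status();
  }

  status(): BudgetStatus {
    if (this.spentUsd >= this.capUsd) return "stopped"; // hard stop: no new tasks
    if (this.spentUsd >= (this.capUsd * this.warnAtPct) / 100) return "warn";
    return "ok";
  }
}

const budget = new SessionBudget(1.0); // comparable to /budget set 1.00
const afterFirst = budget.record(0.5);   // below the warning threshold
const afterSecond = budget.record(0.35); // past 80% of the cap
const afterThird = budget.record(0.25);  // past the cap
```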

⌨️

Guided Setup Wizard

A first-run TUI collects API keys for every provider, including multiple Azure deployments and custom OpenAI-compatible endpoints. It then fetches live model lists so you can assign T1/T2/T3 models yourself or let Cascade Auto decide.

🖥️

Claude Code-Style CLI

Redesigned terminal UI with a top status bar showing live tier models and cost, a compact agent tree for T1→T2→T3 progress, and a keyboard hint bar — all purpose-built for Cascade's multi-tier hierarchy.

🎛️

Interactive Model Picker

Run /model inside the REPL for a three-step picker — provider → tier → model — with Auto at every step. Arrow keys, Tab, j/k and number keys all work; selections write .cascade/config.json and hot-swap the live router, no restart required.

⛔

Task Cancellation via AbortSignal

Pass an AbortSignal to cascade.run() to stop any in-progress run mid-flight. All active tiers (T1 → T2 → T3) halt at the next safe checkpoint before the next LLM call — no mid-stream interruptions, no orphaned agents. A run:cancelled event fires with partial output so you can still surface what was produced. Prevents runaway token spend on long tasks.
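
The checkpoint behaviour described above can be illustrated without calling Cascade at all. This sketch uses the standard `AbortController` and checks `signal.aborted` before each step; `runWithCheckpoints` and `RunResult` are hypothetical names, and the steps are stand-ins for LLM calls.

```typescript
interface RunResult {
  cancelled: boolean;
  partialOutput: string[];
}

// Runs steps in order, checking the signal before each one (the "safe checkpoint").
// A step that has already started is allowed to finish.
function runWithCheckpoints(steps: Array<() => string>, signal: AbortSignal): RunResult {
  const partialOutput: string[] = [];
  for (const step of steps) {
    if (signal.aborted) {
      return { cancelled: true, partialOutput };
    }
    partialOutput.push(step());
  }
  return { cancelled: false, partialOutput };
}

const controller = new AbortController();
const result = runWithCheckpoints(
  [
    () => "plan drafted",
    () => {
      controller.abort(); // cancellation arrives while this step is running
      return "module refactored";
    },
    () => "tests written", // never starts
  ],
  controller.signal,
);
```

Here `result.cancelled` is true and `result.partialOutput` holds the two steps that completed, which mirrors the idea of surfacing partial output on cancellation.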

cascade.config.json

// .cascade/config.json
{
  "version": "1.0",
  "providers": [
    { "type": "anthropic",
      "apiKey": "sk-ant-..." },
    { "type": "ollama" }
  ],
  "models": {
    "t1": "claude-opus-4",
    "t2": "claude-sonnet-4",
    "t3": "llama3.2:3b"
  },
  "tools": {
    "shellBlocklist": ["rm -rf"],
    "requireApprovalFor": ["shell"]
  }
}
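
To show how the `tools` section of that config might be applied, here is a small sketch. The `shellBlocklist` and `requireApprovalFor` field names come from the config above; the `checkShellCommand` helper, the `ToolPolicy` type, and the substring matching rule are assumptions, since the page does not say how Cascade matches blocklist entries.

```typescript
// Shape of the "tools" section shown in the config above.
interface ToolsConfig {
  shellBlocklist: string[];
  requireApprovalFor: string[];
}

type ToolPolicy = "blocked" | "needs-approval" | "allowed";

// Assumption: a command is blocked if it contains any blocklist entry as a substring.
function checkShellCommand(command: string, tools: ToolsConfig): ToolPolicy {
  if (tools.shellBlocklist.some((pattern) => command.includes(pattern))) {
    return "blocked";
  }
  if (tools.requireApprovalFor.includes("shell")) {
    return "needs-approval";
  }
  return "allowed";
}

const toolsConfig: ToolsConfig = {
  shellBlocklist: ["rm -rf"],
  requireApprovalFor: ["shell"],
};
```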

SDK usage

Embed in any Node.js project.

Cascade exposes a first-class TypeScript SDK. Bring your own approval flow, stream tokens to any UI, or wire it into a CI pipeline.

Full TypeScript types for every option and result

Token-by-token streaming via callback

Custom approval callbacks for tool gating

Per-tier cost & token breakdown — costByTier, tokensByTier, and percentage attribution in every result

Live budget management — /budget set <$amount> caps session spend at runtime; /budget shows a visual spend bar; proactive warning fires at 80% (configurable warnAtPct) before the hard stop

runCascade, createCascade, streamCascade — three entry points

example.ts

import { streamCascade } from 'cascade-ai';

await streamCascade(
  'Refactor auth module to use JWT, add tests, open a PR',
  (token) => process.stdout.write(token),
  {
    workspacePath: '/my/project',
    approvalCallback: async (req) => {
      console.log(`Allow ${req.toolName}?`);
      return { approved: true, always: false };
    },
  }
);

Agents that think in tiers.

Three tiers. One coherent output.

Administrator

Complexity determines the tier count.

One command away.
