The Claude Agent SDK — Claude Code

What it is#

The Claude Agent SDK is the programmatic counterpart to Claude Code. It exposes the same agent loop — read, plan, call tools, emit text — as a library you embed in your own Python or TypeScript program. Where Claude Code is the interactive CLI, the SDK is what you reach for when the caller is code, not a person.

Same model, same tools, same conventions. The SDK ships file tools (Read, Edit, Write), shell tools (Bash), search tools (Grep, Glob), web tools (WebFetch, WebSearch), task management (Task, TodoWrite), and the same hook surface (PreToolUse, PostToolUse, UserPromptSubmit, Stop). You can register custom tools, customize the system prompt, plug in MCP servers, and stream responses — all from inside a regular program.

The mental model: Claude Code is a CLI app built on the agent loop. The Agent SDK is the same agent loop, exposed for you to build a different app on. Backend service that reviews every PR? Agent SDK. Long-running ETL job that uses Claude as a step? Agent SDK. CLI for your own product that uses Claude to talk to a custom toolset? Agent SDK.

When to use it#

Reach for the SDK when:

The caller is code, not a person. A web backend, a worker queue, a scheduled job, a CI step. Anything where there is no terminal and no human to type.
You want to embed agent behaviour in your own product. Your CLI has its own UX; underneath, an agent loop drives some of the work.
You need long-running or scheduled work. A nightly ETL that uses Claude to clean up data; a worker that processes a queue of issues; a service that responds to webhooks.
You want custom tools. Functions in your own codebase that Claude can call as if they were built-in tools.
You want structured output that downstream code parses. The SDK lets you specify response formats, stream tokens, and capture intermediate tool calls.
The control flow is non-linear. Branching, retries, fan-out across many agents, intermediate human-in-the-loop steps wired through your own UI.

Don’t reach for the SDK when:

A human will be doing the work interactively. That’s Claude Code.
A shell-script wrapper around claude --print is enough. Don’t graduate to the SDK until you’ve outgrown headless mode.
The work is one-shot. Don’t build a service for something you’ll run once.

Claude Code (CLI). Interactive, human-in-the-loop. Permission prompts, status line, slash commands. Best when a person is steering. Headless mode covers most “run from a script” cases without leaving the CLI.

Agent SDK. Library you embed. Same agent loop, your code in control. You handle stdin/stdout, you wire the tools, you decide when to stop. Best when the caller is a program and you need first-class control over the loop.

How it works#

The loop, exposed#

Internally, both Claude Code and the SDK run the same loop: send messages and tool schemas to the model, parse tool_use blocks, execute them, send results back, repeat until the model emits a stop_reason of end_turn. The SDK just exposes that loop as a function call.

In TypeScript:

import { query } from "@anthropic-ai/claude-agent-sdk";

const result = query({
  prompt: "Summarise README.md in three bullets.",
  options: {
    cwd: process.cwd(),
    allowedTools: ["Read", "Grep"],
  },
});

for await (const message of result) {
  if (message.type === "assistant") {
    console.log(message.text);
  }
}

In Python:

from claude_agent_sdk import query

async for message in query(
    prompt="Summarise README.md in three bullets.",
    options={"cwd": ".", "allowed_tools": ["Read", "Grep"]},
):
    if message.type == "assistant":
        print(message.text)

The SDK streams an async iterable of messages — assistant turns, tool calls, tool results, system events — and the calling code decides what to do with each.

Shared conventions with Claude Code#

The SDK and Claude Code share these conventions, which means anything you learn in one transfers:

Tool schemas. Same Read, Edit, Write, Bash, Grep, Glob, WebFetch, WebSearch, Task, TodoWrite. Same input shapes and same return shapes.
Settings layering. The SDK picks up the same ~/.claude/settings.json (and project-scoped) settings the CLI uses, including hooks and permissions, unless you override them.
MCP servers. Same MCP wire protocol. An MCP server you build for Claude Code works in the SDK and vice versa.
Hooks. PreToolUse, PostToolUse, UserPromptSubmit, Stop. Same payload shapes. You can also register hooks programmatically in the SDK without putting them in settings.json.
CLAUDE.md auto-loading. If you point the SDK at a working directory, the project’s CLAUDE.md is loaded as it would be in Claude Code.

This is the practical reason to invest in Claude Code conventions even if you’ll later build with the SDK: every habit transfers.

Sub-agents in the SDK#

The SDK exposes the Task tool, so sub-agents work the same way: the main agent dispatches a brief, the sub-agent runs in its own context, returns a summary. Custom sub-agents authored as markdown files in .claude/agents/ are picked up automatically.

You can also register sub-agents programmatically:

const result = query({
  prompt: "Find all open PRs and triage them.",
  options: {
    agents: {
      "pr-triager": {
        description: "Triages a single PR for risk and priority.",
        prompt: "You are a triage specialist...",
        tools: ["Bash", "Read"],
        model: "claude-opus-4-7",
      },
    },
  },
});

The main agent now sees pr-triager as a dispatchable sub-agent.

Custom tools#

Beyond the built-ins, you can register your own tools — functions in your codebase that Claude can call as if they were native. A TypeScript example:

import { tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

const lookupCustomer = tool(
  "lookup_customer",
  "Look up a customer by ID and return their plan and signup date.",
  { id: z.string() },
  async ({ id }) => {
    const row = await db.customers.findById(id);
    return { content: [{ type: "text", text: JSON.stringify(row) }] };
  }
);

const mcpServer = createSdkMcpServer({
  name: "billing-internal",
  version: "1.0.0",
  tools: [lookupCustomer],
});

The tool now appears in the agent’s tool list with the description as its schema doc. The model decides when to call it; you decide what it does. This is the entry point for “Claude that knows about your product”.

Streaming and message types#

The SDK exposes messages as an async stream. Message types:

assistant — the model’s text turn.
tool_use — the model is calling a tool.
tool_result — a tool has returned.
user — anything injected by the SDK harness or your code.
result — the final outcome of the conversation, with token counts and stop reason.

This means your application code can update UI, write to logs, surface intermediate progress, or cancel mid-stream — all natively.

Configuration#

Installation#

TypeScript:

npm install @anthropic-ai/claude-agent-sdk

Python:

pip install claude-agent-sdk

Both require ANTHROPIC_API_KEY in the environment, or one of the supported managed-credential paths. The SDK respects the same auth conventions as the Claude Code CLI.

Options worth knowing about#

Option	Effect
`cwd`	Working directory for file/Bash tools. Often `process.cwd()` or a temporary repo checkout.
`allowedTools`	Tool allowlist for the entire query. Lock down what the agent can do.
`permissionMode`	`default` / `acceptEdits` / `bypassPermissions` / `plan`. The CLI’s permission modes, exposed programmatically.
`systemPrompt`	Append additional system instructions.
`agents`	Inline sub-agent definitions (in addition to anything in `.claude/agents/`).
`hooks`	Programmatic hooks — callbacks instead of subprocesses. Same lifecycle events as the CLI.
`mcpServers`	MCP server connections to register for the query.
`model`	Override the model for this query.
`maxTurns`	Cap on agent loop iterations — useful for cost control in batch jobs.

Most of these can also be set via the settings layering. Programmatic overrides win — useful when one batch job needs bypassPermissions and the rest of your codebase shouldn’t.

Examples#

A nightly issue-triager#

import { query } from "@anthropic-ai/claude-agent-sdk";

async function triageOpenIssues() {
  const result = query({
    prompt: `
      Fetch open issues with no label via \`gh issue list --search "is:open no:label"\`.
      For each, propose a label set and a one-line summary. Output a JSON array.
    `,
    options: {
      cwd: "/repos/my-product",
      allowedTools: ["Bash", "Read"],
      permissionMode: "acceptEdits",
      maxTurns: 50,
    },
  });

  let lastText = "";
  for await (const m of result) {
    if (m.type === "assistant") lastText = m.text;
  }
  return JSON.parse(lastText);
}

Wire this to cron. Every morning you wake up to a JSON array of triage suggestions ready for review.

A worker that responds to webhook events#

from claude_agent_sdk import query

async def handle_pr_opened(pr_url: str):
    async for m in query(
        prompt=f"Review {pr_url} — risk, blast radius, recommended action.",
        options={
            "cwd": "/repos/my-product",
            "allowed_tools": ["Bash", "Read", "Grep"],
            "max_turns": 30,
        },
    ):
        if m.type == "assistant":
            await post_review_comment(pr_url, m.text)

Every PR-opened webhook spawns a worker. The worker runs an agent that reviews and posts back. No human in the loop — but the same agent loop you’d run interactively.

A CLI with a custom tool#

A CLI for a billing product that lets a support engineer ask Claude things like “what plan is customer X on?” — with Claude calling into the product’s database directly:

import { query, tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

const lookupCustomer = tool(
  "lookup_customer",
  "Look up a customer by ID. Returns plan, signup date, last login.",
  { id: z.string() },
  async ({ id }) => {
    const row = await billing.customers.findById(id);
    return { content: [{ type: "text", text: JSON.stringify(row) }] };
  },
);

const mcp = createSdkMcpServer({
  name: "billing",
  version: "1.0.0",
  tools: [lookupCustomer],
});

async function run(userQuestion: string) {
  for await (const m of query({
    prompt: userQuestion,
    options: {
      mcpServers: { billing: mcp },
      systemPrompt: "You are a support assistant for the billing product.",
    },
  })) {
    if (m.type === "assistant") process.stdout.write(m.text);
  }
}

Now the engineer types lookup customer 1234 and Claude calls lookup_customer natively.

Programmatic hooks#

Instead of subprocess hooks via settings, register callbacks in your code:

const result = query({
  prompt: "Run the test suite and summarise failures.",
  options: {
    hooks: {
      PreToolUse: async (event) => {
        if (event.tool === "Bash" && event.input.command.includes("rm -rf /")) {
          return { allow: false, message: "Refused by SDK guard." };
        }
        return { allow: true };
      },
      PostToolUse: async (event) => {
        await metrics.increment(`tool_use.${event.tool}`);
        return {};
      },
    },
  },
});

The CLI hook surface and the SDK hook surface map onto each other: the SDK callback signature mirrors the JSON payload the subprocess hook receives. Anything you’d put in a subprocess hook can be a callback instead — useful when you don’t want a separate executable on disk.

Gotchas#

The SDK uses your API budget directly. Every query is API calls and tokens. A misbehaving loop with no maxTurns cap can run up an expensive bill quickly. Always set maxTurns in batch contexts.
bypassPermissions is dangerous. It’s the right mode for many automated jobs — but it means the agent can run any allowed tool without prompting. Combine it with a tight allowedTools and PreToolUse hooks.
The async iterable must be consumed. Drop the iterator and the underlying request may stay open. Either consume to the end or call the cancellation method.
Streamed text is not finalised. Partial assistant text arrives as deltas; if you concatenate, you get the full message at the end of the stream. If your downstream parser needs a complete payload, wait for the result message.
MCP tools have a startup cost. Each MCP server is a connection. Booting many servers per query adds latency; pool or persist them when you can.
Hook callbacks block the loop. A callback that takes 2 seconds adds 2 seconds to every matching event. Keep callbacks fast or do the heavy work asynchronously.
cwd is load-bearing. File and Bash tools resolve paths relative to cwd. Forget to set it and the agent operates in your service’s process cwd — which is rarely what you want.
Errors during tool execution are surfaced to the model, not raised. A failed tool call shows up as a tool_result with error content. The model usually recovers gracefully, but you don’t get an exception to catch unless you intercept via a hook.

Why the SDK and Claude Code share so much

The pragmatic reason is that Claude Code itself is implemented on top of the same agent loop the SDK exposes. The tool schemas, the permission model, the hook contract, the MCP integration, the CLAUDE.md loader — all live in a shared substrate. The CLI is one app built on it; the SDK is the substrate handed to you. This is why investments in Claude Code conventions (custom slash commands, custom sub-agents, hooks, MCP servers) transfer almost lossless to SDK-based agents: they’re literally the same primitives. The reverse is also true — anything you build with the SDK that wraps a custom tool can later be exposed to Claude Code via an MCP server.