Build production AI agents with Claude Code as a library
Note
The Claude Code SDK has been renamed to the Claude Agent SDK. If you're migrating from the old SDK, see the Migration Guide.
Build AI agents that autonomously read files, run commands, search the web, edit code, and more. The Agent SDK gives you the same tools, agent loop, and context management that power Claude Code, programmable in Python and TypeScript.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions


async def main():
    async for message in query(
        prompt="Find and fix the bug in auth.py",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Edit", "Bash"]),
    ):
        print(message)  # Claude reads the file, finds the bug, edits it


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Find and fix the bug in auth.py",
options: { allowedTools: ["Read", "Edit", "Bash"] }
})) {
console.log(message); // Claude reads the file, finds the bug, edits it
}
The Agent SDK includes built-in tools for reading files, running commands, and editing code, so your agent can start working immediately without you implementing tool execution. Dive into the quickstart or explore real agents built with the SDK:
Build a bug-fixing agent in minutes
Email assistant, research agent, and more
npm install @anthropic-ai/claude-agent-sdk
pip install claude-agent-sdk
Get an API key from the Console, then set it as an environment variable:
export ANTHROPIC_API_KEY=your-api-key
The SDK also supports authentication via third-party API providers:
- Bedrock: set the CLAUDE_CODE_USE_BEDROCK=1 environment variable and configure AWS credentials
- Vertex AI: set the CLAUDE_CODE_USE_VERTEX=1 environment variable and configure Google Cloud credentials
- Azure AI Foundry: set the CLAUDE_CODE_USE_FOUNDRY=1 environment variable and configure Azure credentials

See the setup guides for Bedrock, Vertex AI, or Azure AI Foundry for details.
Note
Unless previously approved, Anthropic does not allow third party developers to offer claude.ai login or rate limits for their products, including agents built on the Claude Agent SDK. Please use the API key authentication methods described in this document instead.
This example creates an agent that lists files in your current directory using built-in tools.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions


async def main():
    async for message in query(
        prompt="What files are in this directory?",
        options=ClaudeAgentOptions(allowed_tools=["Bash", "Glob"]),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "What files are in this directory?",
options: { allowedTools: ["Bash", "Glob"] }
})) {
if ("result" in message) console.log(message.result);
}
Ready to build? Follow the Quickstart to create an agent that finds and fixes bugs in minutes.
Everything that makes Claude Code powerful is available in the SDK:
Your agent can read files, run commands, and search codebases out of the box. Key tools include:
| Tool | What it does |
|---|---|
| Read | Read any file in the working directory |
| Write | Create new files |
| Edit | Make precise edits to existing files |
| Bash | Run terminal commands, scripts, git operations |
| Monitor | Watch a background script and react to each output line as an event |
| Glob | Find files by pattern (**/*.ts, src/**/*.py) |
| Grep | Search file contents with regex |
| WebSearch | Search the web for current information |
| WebFetch | Fetch and parse web page content |
| AskUserQuestion | Ask the user clarifying questions with multiple choice options |
This example creates an agent that searches your codebase for TODO comments:
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions


async def main():
    async for message in query(
        prompt="Find all TODO comments and create a summary",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Glob", "Grep"]),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Find all TODO comments and create a summary",
options: { allowedTools: ["Read", "Glob", "Grep"] }
})) {
if ("result" in message) console.log(message.result);
}
Run custom code at key points in the agent lifecycle. SDK hooks use callback functions to validate, log, block, or transform agent behavior.
Available hooks: PreToolUse, PostToolUse, Stop, SessionStart, SessionEnd, UserPromptSubmit, and more.
This example logs all file changes to an audit file:
import asyncio
from datetime import datetime
from claude_agent_sdk import query, ClaudeAgentOptions, HookMatcher


async def log_file_change(input_data, tool_use_id, context):
    # Append one audit line for each file touched by Edit or Write
    file_path = input_data.get("tool_input", {}).get("file_path", "unknown")
    with open("./audit.log", "a") as f:
        f.write(f"{datetime.now()}: modified {file_path}\n")
    return {}


async def main():
    async for message in query(
        prompt="Refactor utils.py to improve readability",
        options=ClaudeAgentOptions(
            permission_mode="acceptEdits",
            hooks={
                "PostToolUse": [
                    HookMatcher(matcher="Edit|Write", hooks=[log_file_change])
                ]
            },
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())
import { query, HookCallback } from "@anthropic-ai/claude-agent-sdk";
import { appendFile } from "fs/promises";
const logFileChange: HookCallback = async (input) => {
const filePath = (input as any).tool_input?.file_path ?? "unknown";
await appendFile("./audit.log", `${new Date().toISOString()}: modified ${filePath}\n`);
return {};
};
for await (const message of query({
prompt: "Refactor utils.py to improve readability",
options: {
permissionMode: "acceptEdits",
hooks: {
PostToolUse: [{ matcher: "Edit|Write", hooks: [logFileChange] }]
}
}
})) {
if ("result" in message) console.log(message.result);
}
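Because a hook is an ordinary async callback, it can be exercised on its own before it is wired into an agent. The sketch below calls the same kind of audit hook directly with a hand-built payload. The payload shape, the `toolu_example` ID, and the `make_audit_hook` helper are assumptions made for illustration; the SDK itself is not imported or run here.

```python
import asyncio
import os
import tempfile
from datetime import datetime


def make_audit_hook(log_path):
    """Build a hook callback that appends one line per modified file (hypothetical helper)."""

    async def log_file_change(input_data, tool_use_id, context):
        file_path = input_data.get("tool_input", {}).get("file_path", "unknown")
        with open(log_path, "a") as f:
            f.write(f"{datetime.now().isoformat()}: modified {file_path}\n")
        return {}  # same empty return value as the example above

    return log_file_change


async def demo():
    log_path = os.path.join(tempfile.mkdtemp(), "audit.log")
    hook = make_audit_hook(log_path)
    # Hand-built stand-in for the data a PostToolUse hook receives (assumed shape)
    fake_input = {"tool_name": "Edit", "tool_input": {"file_path": "utils.py"}}
    result = await hook(fake_input, "toolu_example", None)
    with open(log_path) as f:
        return result, f.read()


result, log_text = asyncio.run(demo())
print(log_text, end="")
```

Passing the log path in through a factory keeps the callback easy to point at a temporary file during tests.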
Spawn specialized agents to handle focused subtasks. Your main agent delegates work, and subagents report back with results.
Define custom agents with specialized instructions. Include Agent in allowedTools since subagents are invoked via the Agent tool:
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition


async def main():
    async for message in query(
        prompt="Use the code-reviewer agent to review this codebase",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep", "Agent"],
            agents={
                "code-reviewer": AgentDefinition(
                    description="Expert code reviewer for quality and security reviews.",
                    prompt="Analyze code quality and suggest improvements.",
                    tools=["Read", "Glob", "Grep"],
                )
            },
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Use the code-reviewer agent to review this codebase",
options: {
allowedTools: ["Read", "Glob", "Grep", "Agent"],
agents: {
"code-reviewer": {
description: "Expert code reviewer for quality and security reviews.",
prompt: "Analyze code quality and suggest improvements.",
tools: ["Read", "Glob", "Grep"]
}
}
}
})) {
if ("result" in message) console.log(message.result);
}
Messages from within a subagent's context include a parent_tool_use_id field, letting you track which messages belong to which subagent execution.
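One way to use that field is to bucket messages by their parent ID as they arrive. The following is a minimal sketch that uses `SimpleNamespace` objects as stand-ins for SDK messages; the `text` attribute, the `toolu_A` ID, and the assumption that main-agent messages carry `None` are illustrative, not taken from the SDK reference.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_subagent(messages):
    """Bucket messages by parent_tool_use_id; the None bucket holds main-agent messages."""
    groups = defaultdict(list)
    for message in messages:
        parent_id = getattr(message, "parent_tool_use_id", None)
        groups[parent_id].append(message)
    return dict(groups)


# Stand-in messages (hypothetical shape, for illustration only)
messages = [
    SimpleNamespace(text="planning", parent_tool_use_id=None),
    SimpleNamespace(text="reading files", parent_tool_use_id="toolu_A"),
    SimpleNamespace(text="review notes", parent_tool_use_id="toolu_A"),
    SimpleNamespace(text="summary", parent_tool_use_id=None),
]

groups = group_by_subagent(messages)
for parent_id, items in groups.items():
    print(parent_id, [m.text for m in items])
```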
Connect to external systems via the Model Context Protocol: databases, browsers, APIs, and hundreds more.
This example connects the Playwright MCP server to give your agent browser automation capabilities:
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions


async def main():
    async for message in query(
        prompt="Open example.com and describe what you see",
        options=ClaudeAgentOptions(
            mcp_servers={
                "playwright": {"command": "npx", "args": ["@playwright/mcp@latest"]}
            }
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Open example.com and describe what you see",
options: {
mcpServers: {
playwright: { command: "npx", args: ["@playwright/mcp@latest"] }
}
}
})) {
if ("result" in message) console.log(message.result);
}
Control exactly which tools your agent can use. Allow safe operations, block dangerous ones, or require approval for sensitive actions.
Note
For interactive approval prompts and the AskUserQuestion tool, see Handle approvals and user input.
This example creates a read-only agent that can analyze but not modify code. allowed_tools pre-approves Read, Glob, and Grep.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions


async def main():
    async for message in query(
        prompt="Review this code for best practices",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep"],
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Review this code for best practices",
options: {
allowedTools: ["Read", "Glob", "Grep"]
}
})) {
if ("result" in message) console.log(message.result);
}
Maintain context across multiple exchanges. Claude remembers files read, analysis done, and conversation history. Resume sessions later, or fork them to explore different approaches.
This example captures the session ID from the first query, then resumes to continue with full context:
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage, ResultMessage


async def main():
    session_id = None

    # First query: capture the session ID
    async for message in query(
        prompt="Read the authentication module",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Glob"]),
    ):
        if isinstance(message, SystemMessage) and message.subtype == "init":
            session_id = message.data["session_id"]

    # Resume with full context from the first query
    async for message in query(
        prompt="Now find all places that call it",  # "it" = auth module
        options=ClaudeAgentOptions(resume=session_id),
    ):
        if isinstance(message, ResultMessage):
            print(message.result)


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
let sessionId: string | undefined;
// First query: capture the session ID
for await (const message of query({
prompt: "Read the authentication module",
options: { allowedTools: ["Read", "Glob"] }
})) {
if (message.type === "system" && message.subtype === "init") {
sessionId = message.session_id;
}
}
// Resume with full context from the first query
for await (const message of query({
prompt: "Now find all places that call it", // "it" = auth module
options: { resume: sessionId }
})) {
if ("result" in message) console.log(message.result);
}
The SDK also supports Claude Code's filesystem-based configuration. To use these features, set setting_sources=["project"] (Python) or settingSources: ['project'] (TypeScript) in your options.
| Feature | Description | Location |
|---|---|---|
| Skills | Specialized capabilities defined in Markdown | .claude/skills/*/SKILL.md |
| Slash commands | Custom commands for common tasks | .claude/commands/*.md |
| Memory | Project context and instructions | CLAUDE.md or .claude/CLAUDE.md |
| Plugins | Extend with custom commands, agents, and MCP servers | Programmatic via plugins option |
The Claude Platform offers multiple ways to build with Claude. Here's how the Agent SDK fits in:
The Anthropic Client SDK gives you direct API access: you send prompts and implement tool execution yourself. The Agent SDK gives you Claude with built-in tool execution.
With the Client SDK, you implement a tool loop. With the Agent SDK, Claude handles it:
# Client SDK: You implement the tool loop
response = client.messages.create(...)
while response.stop_reason == "tool_use":
    result = your_tool_executor(response.tool_use)
    response = client.messages.create(tool_result=result, **params)

# Agent SDK: Claude handles tools autonomously
async for message in query(prompt="Fix the bug in auth.py"):
    print(message)
// Client SDK: You implement the tool loop
let response = await client.messages.create({ ...params });
while (response.stop_reason === "tool_use") {
const result = yourToolExecutor(response.tool_use);
response = await client.messages.create({ tool_result: result, ...params });
}
// Agent SDK: Claude handles tools autonomously
for await (const message of query({ prompt: "Fix the bug in auth.py" })) {
console.log(message);
}
Same capabilities, different interface:
| Use case | Best choice |
|---|---|
| Interactive development | CLI |
| CI/CD pipelines | SDK |
| Custom applications | SDK |
| One-off tasks | CLI |
| Production automation | SDK |
Many teams use both: CLI for daily development, SDK for production. Workflows translate directly between them.
View the full changelog for SDK updates, bug fixes, and new features:
If you encounter bugs or issues with the Agent SDK:
For partners integrating the Claude Agent SDK, use of Claude branding is optional. When referencing Claude in your product:
Allowed:
Not permitted:
Your product should maintain its own branding and not appear to be Claude Code or any Anthropic product. For questions about branding compliance, contact the Anthropic sales team.
Use of the Claude Agent SDK is governed by Anthropic's Commercial Terms of Service, including when you use it to power products and services that you make available to your own customers and end users, except to the extent a specific component or dependency is covered by a different license as indicated in that component's LICENSE file.
Get started with the Python or TypeScript Agent SDK to build AI agents that work autonomously
Use the Agent SDK to build an AI agent that reads your code, finds bugs, and fixes them, all without manual intervention.
What you'll do:
Create a new directory for this quickstart:
mkdir my-agent && cd my-agent
For your own projects, you can run the SDK from any folder; it will have access to files in that directory and its subdirectories by default.
Install the Agent SDK package for your language:
npm install @anthropic-ai/claude-agent-sdk
uv is a fast Python package manager that handles virtual environments automatically:
uv init && uv add claude-agent-sdk
Create a virtual environment first, then install:
python3 -m venv .venv && source .venv/bin/activate
pip3 install claude-agent-sdk
Get an API key from the Claude Console, then create a .env file in your project directory:
ANTHROPIC_API_KEY=your-api-key
The SDK also supports authentication via third-party API providers:
- Bedrock: set the CLAUDE_CODE_USE_BEDROCK=1 environment variable and configure AWS credentials
- Vertex AI: set the CLAUDE_CODE_USE_VERTEX=1 environment variable and configure Google Cloud credentials
- Azure AI Foundry: set the CLAUDE_CODE_USE_FOUNDRY=1 environment variable and configure Azure credentials

See the setup guides for Bedrock, Vertex AI, or Azure AI Foundry for details.
Note
Unless previously approved, Anthropic does not allow third party developers to offer claude.ai login or rate limits for their products, including agents built on the Claude Agent SDK. Please use the API key authentication methods described in this document instead.
This quickstart walks you through building an agent that can find and fix bugs in code. First, you need a file with some intentional bugs for the agent to fix. Create utils.py in the my-agent directory and paste the following code:
def calculate_average(numbers):
total = 0
for num in numbers:
total += num
return total / len(numbers)
def get_user_name(user):
return user["name"].upper()
This code has two bugs:
- calculate_average([]) crashes with division by zero
- get_user_name(None) crashes with a TypeError

Create agent.py if you're using the Python SDK, or agent.ts for TypeScript:
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, ResultMessage


async def main():
    # Agentic loop: streams messages as Claude works
    async for message in query(
        prompt="Review utils.py for bugs that would cause crashes. Fix any issues you find.",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Edit", "Glob"],  # Tools Claude can use
            permission_mode="acceptEdits",  # Auto-approve file edits
        ),
    ):
        # Print human-readable output
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text"):
                    print(block.text)  # Claude's reasoning
                elif hasattr(block, "name"):
                    print(f"Tool: {block.name}")  # Tool being called
        elif isinstance(message, ResultMessage):
            print(f"Done: {message.subtype}")  # Final result


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
// Agentic loop: streams messages as Claude works
for await (const message of query({
prompt: "Review utils.py for bugs that would cause crashes. Fix any issues you find.",
options: {
allowedTools: ["Read", "Edit", "Glob"], // Tools Claude can use
permissionMode: "acceptEdits" // Auto-approve file edits
}
})) {
// Print human-readable output
if (message.type === "assistant" && message.message?.content) {
for (const block of message.message.content) {
if ("text" in block) {
console.log(block.text); // Claude's reasoning
} else if ("name" in block) {
console.log(`Tool: ${block.name}`); // Tool being called
}
}
} else if (message.type === "result") {
console.log(`Done: ${message.subtype}`); // Final result
}
}
This code has three main parts:
query: the main entry point that creates the agentic loop. It returns an async iterator, so you use async for to stream messages as Claude works. See the full API in the Python or TypeScript SDK reference.
prompt: what you want Claude to do. Claude figures out which tools to use based on the task.
options: configuration for the agent. This example uses allowedTools to pre-approve Read, Edit, and Glob, and permissionMode: "acceptEdits" to auto-approve file changes. Other options include systemPrompt, mcpServers, and more. See all options for Python or TypeScript.
The async for loop keeps running as Claude thinks, calls tools, observes results, and decides what to do next. Each iteration yields a message: Claude's reasoning, a tool call, a tool result, or the final outcome. The SDK handles the orchestration (tool execution, context management, retries) so you just consume the stream. The loop ends when Claude finishes the task or hits an error.
The message handling inside the loop filters for human-readable output. Without filtering, you'd see raw message objects including system initialization and internal state, which is useful for debugging but noisy otherwise.
Note
This example uses streaming to show progress in real-time. If you don't need live output (e.g., for background jobs or CI pipelines), you can collect all messages at once. See Streaming vs. single-turn mode for details.
Your agent is ready. Run it with the following command:
python3 agent.py
npx tsx agent.ts
After running, check utils.py. You'll see defensive code handling empty lists and null users. Your agent autonomously read utils.py to understand the code, identified the inputs that crash it, and edited the file to handle them.

This is what makes the Agent SDK different: Claude executes tools directly instead of asking you to implement them.
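For reference, here is one plausible shape for the repaired utils.py. This is a hand-written sketch, not captured agent output: Claude's actual edits vary from run to run, and the fallback values chosen here (0.0 for an empty list, None for a missing user) are assumptions.

```python
def calculate_average(numbers):
    """Return the mean of numbers, or 0.0 when the sequence is empty."""
    if not numbers:
        return 0.0  # assumed fallback; raising ValueError is another reasonable fix
    return sum(numbers) / len(numbers)


def get_user_name(user):
    """Return the upper-cased user name, or None when it is unavailable."""
    name = user.get("name") if isinstance(user, dict) else None
    return name.upper() if isinstance(name, str) else None


print(calculate_average([]))
print(get_user_name(None))
print(get_user_name({"name": "ada"}))
```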
Note
If you see "API key not found", make sure you've set the ANTHROPIC_API_KEY environment variable in your .env file or shell environment. See the full troubleshooting guide for more help.
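A quick way to narrow down this error is to check which authentication settings the current process can actually see. The helper below is a hypothetical diagnostic, not part of the SDK. It inspects only the environment mapping passed to it and does not read a .env file, and the order in which it reports settings is an arbitrary choice for this sketch.

```python
import os

# Provider flags named earlier in this guide
PROVIDER_FLAGS = (
    "CLAUDE_CODE_USE_BEDROCK",
    "CLAUDE_CODE_USE_VERTEX",
    "CLAUDE_CODE_USE_FOUNDRY",
)


def describe_auth(env):
    """Return the name of the first auth-related setting found in env, or None."""
    if env.get("ANTHROPIC_API_KEY"):
        return "ANTHROPIC_API_KEY"
    for flag in PROVIDER_FLAGS:
        if env.get(flag) == "1":
            return flag
    return None


found = describe_auth(os.environ)
if found:
    print(f"Auth setting found: {found}")
else:
    print("No auth setting found; set ANTHROPIC_API_KEY or a provider flag")
```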
Now that your agent is set up, try some different prompts:

- "Add docstrings to all functions in utils.py"
- "Add type hints to all functions in utils.py"
- "Create a README.md documenting the functions in utils.py"

You can modify your agent's behavior by changing the options. Here are a few examples:
Add web search capability:
options = ClaudeAgentOptions(
allowed_tools=["Read", "Edit", "Glob", "WebSearch"], permission_mode="acceptEdits"
)
const _ = {
options: {
allowedTools: ["Read", "Edit", "Glob", "WebSearch"],
permissionMode: "acceptEdits"
}
};
Give Claude a custom system prompt:
options = ClaudeAgentOptions(
allowed_tools=["Read", "Edit", "Glob"],
permission_mode="acceptEdits",
system_prompt="You are a senior Python developer. Always follow PEP 8 style guidelines.",
)
const _ = {
options: {
allowedTools: ["Read", "Edit", "Glob"],
permissionMode: "acceptEdits",
systemPrompt: "You are a senior Python developer. Always follow PEP 8 style guidelines."
}
};
Run commands in the terminal:
options = ClaudeAgentOptions(
allowed_tools=["Read", "Edit", "Glob", "Bash"], permission_mode="acceptEdits"
)
const _ = {
options: {
allowedTools: ["Read", "Edit", "Glob", "Bash"],
permissionMode: "acceptEdits"
}
};
With Bash enabled, try: "Write unit tests for utils.py, run them, and fix any failures"
Tools control what your agent can do:
| Tools | What the agent can do |
|---|---|
| Read, Glob, Grep | Read-only analysis |
| Read, Edit, Glob | Analyze and modify code |
| Read, Edit, Bash, Glob, Grep | Full automation |
Permission modes control how much human oversight you want:
| Mode | Behavior | Use case |
|---|---|---|
| acceptEdits | Auto-approves file edits and common filesystem commands, asks for other actions | Trusted development workflows |
| dontAsk | Denies anything not in allowedTools | Locked-down headless agents |
| auto (TypeScript only) | A model classifier approves or denies each tool call | Autonomous agents with safety guardrails |
| bypassPermissions | Runs every tool without prompts | Sandboxed CI, fully trusted environments |
| default | Requires a canUseTool callback to handle approval | Custom approval flows |
The example above uses acceptEdits mode, which auto-approves file operations so the agent can run without interactive prompts. If you want to prompt users for approval, use default mode and provide a canUseTool callback that collects user input. For more control, see Permissions.
Now that you've created your first agent, learn how to extend its capabilities and tailor it to your use case:
Understand the message lifecycle, tool execution, context window, and architecture that power your SDK agents.
The Agent SDK lets you embed Claude Code's autonomous agent loop in your own applications. The SDK is a standalone package that gives you programmatic control over tools, permissions, cost limits, and output. You don't need the Claude Code CLI installed to use it.
When you start an agent, the SDK runs the same execution loop that powers Claude Code: Claude evaluates your prompt, calls tools to take action, receives the results, and repeats until the task is complete. This page explains what happens inside that loop so you can build, debug, and optimize your agents effectively.
Every agent session follows the same cycle:
1. The SDK sends your prompt to Claude and yields a SystemMessage with subtype "init" containing session metadata.
2. Claude evaluates the prompt and responds. The SDK yields an AssistantMessage containing the text and any tool call requests.
3. The SDK executes the requested tools and feeds the results back to Claude. This repeats until Claude responds without tool calls.
4. The SDK yields a final AssistantMessage with the text response (no tool calls), followed by a ResultMessage with the final text, token usage, cost, and session ID.

A quick question ("what files are here?") might take one or two turns of calling Glob and responding with the results. A complex task ("refactor the auth module and update the tests") can chain dozens of tool calls across many turns, reading files, editing code, and running tests, with Claude adjusting its approach based on each result.
A turn is one round trip inside the loop: Claude produces output that includes tool calls, the SDK executes those tools, and the results feed back to Claude automatically. This happens without yielding control back to your code. Turns continue until Claude produces output with no tool calls, at which point the loop ends and the final result is delivered.
Consider what a full session might look like for the prompt "Fix the failing tests in auth.ts".
First, the SDK sends your prompt to Claude and yields a SystemMessage with the session metadata. Then the loop begins:
1. Claude calls Bash to run npm test. The SDK yields an AssistantMessage with the tool call, executes the command, then yields a UserMessage with the output (three failures).
2. Claude calls Read on auth.ts and auth.test.ts. The SDK returns the file contents and yields an AssistantMessage.
3. Claude calls Edit to fix auth.ts, then calls Bash to re-run npm test. All three tests pass. The SDK yields an AssistantMessage.
4. Claude responds with text only, with no further tool calls. The SDK yields a final AssistantMessage with this text, then a ResultMessage with the same text plus cost and usage.

That was four turns: three with tool calls, one final text-only response.
You can cap the loop with max_turns / maxTurns, which counts tool-use turns only. For example, max_turns=2 in the loop above would have stopped before the edit step. You can also use max_budget_usd / maxBudgetUsd to cap turns based on a spend threshold.
Without limits, the loop runs until Claude finishes on its own, which is fine for well-scoped tasks but can run long on open-ended prompts ("improve this codebase"). Setting a budget is a good default for production agents. See Turns and budget below for the option reference.
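To make the turn accounting concrete, here is a toy model of the loop, replayed against the four-turn walkthrough above. It is a sketch of the counting rule only, not the SDK's implementation: the `run_loop` function and its scripted input are hypothetical, and the exact moment at which the real SDK checks the limit is an assumption.

```python
def run_loop(scripted_responses, max_turns=None):
    """Toy agent loop. Each scripted response is a list of tool names; [] is text-only.

    Returns (result_subtype, tool_use_turns_completed).
    """
    tool_turns = 0
    for tools in scripted_responses:
        if not tools:
            return "success", tool_turns  # a text-only response ends the loop
        if max_turns is not None and tool_turns >= max_turns:
            return "error_max_turns", tool_turns  # limit reached before running tools
        tool_turns += 1  # stand-in for executing the tools and feeding results back
    raise ValueError("script ended without a text-only response")


# The walkthrough above: run tests, read files, edit and re-run, final answer
script = [["Bash"], ["Read", "Read"], ["Edit", "Bash"], []]

print(run_loop(script))               # ('success', 3)
print(run_loop(script, max_turns=2))  # ('error_max_turns', 2)
```

With `max_turns=2` the model stops when the third tool-use turn (the edit) is requested, which matches the behavior described above.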
As the loop runs, the SDK yields a stream of messages. Each message carries a type that tells you what stage of the loop it came from. The five core types are:
- SystemMessage: session lifecycle events. The subtype field distinguishes them: "init" is the first message (session metadata), and "compact_boundary" fires after compaction. In TypeScript, the compact boundary is its own SDKCompactBoundaryMessage type rather than a subtype of SDKSystemMessage.
- AssistantMessage: emitted after each Claude response, including the final text-only one. Contains text content blocks and tool call blocks from that turn.
- UserMessage: emitted after each tool execution with the tool result content sent back to Claude. Also emitted for any user inputs you stream mid-loop.
- StreamEvent: only emitted when partial messages are enabled. Contains raw API streaming events (text deltas, tool input chunks). See Stream responses.
- ResultMessage: the last message, always. Contains the final text result, token usage, cost, and session ID. Check the subtype field to determine whether the task succeeded or hit a limit. See Handle the result.

These five types cover the full agent loop lifecycle in both SDKs. The TypeScript SDK also yields additional observability events (hook events, tool progress, rate limits, task notifications) that provide extra detail but are not required to drive the loop. See the Python message types reference and TypeScript message types reference for the complete lists.
Which messages you handle depends on what you're building:
- Handle ResultMessage to get the output, cost, and whether the task succeeded or hit a limit.
- Handle AssistantMessage to see what Claude is doing each turn, including which tools it called.
- Enable partial messages (include_partial_messages in Python, includePartialMessages in TypeScript) to get StreamEvent messages in real time. See Stream responses in real-time.

How you check message types depends on the SDK:
- Python: check message types with isinstance against classes imported from claude_agent_sdk (for example, isinstance(message, ResultMessage)).
- TypeScript: check the type string field (for example, message.type === "result"). AssistantMessage and UserMessage wrap the raw API message in a .message field, so content blocks are at message.message.content, not message.content.

import asyncio
from claude_agent_sdk import query, AssistantMessage, ResultMessage


async def main():
    async for message in query(prompt="Summarize this project"):
        if isinstance(message, AssistantMessage):
            print(f"Turn completed: {len(message.content)} content blocks")
        if isinstance(message, ResultMessage):
            if message.subtype == "success":
                print(message.result)
            else:
                print(f"Stopped: {message.subtype}")


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({ prompt: "Summarize this project" })) {
if (message.type === "assistant") {
console.log(`Turn completed: ${message.message.content.length} content blocks`);
}
if (message.type === "result") {
if (message.subtype === "success") {
console.log(message.result);
} else {
console.log(`Stopped: ${message.subtype}`);
}
}
}
Tools give your agent the ability to take action. Without tools, Claude can only respond with text. With tools, Claude can read files, run commands, search code, and interact with external services.
The SDK includes the same tools that power Claude Code:
| Category | Tools | What they do |
|---|---|---|
| File operations | Read, Edit, Write | Read, modify, and create files |
| Search | Glob, Grep | Find files by pattern, search content with regex |
| Execution | Bash | Run shell commands, scripts, git operations |
| Web | WebSearch, WebFetch | Search the web, fetch and parse pages |
| Discovery | ToolSearch | Dynamically find and load tools on-demand instead of preloading all of them |
| Orchestration | Agent, Skill, AskUserQuestion, TodoWrite | Spawn subagents, invoke skills, ask the user, track tasks |
Beyond built-in tools, you can:
Claude determines which tools to call based on the task, but you control whether those calls are allowed to execute. You can auto-approve specific tools, block others entirely, or require approval for everything. Three options work together to determine what runs:
- allowed_tools / allowedTools auto-approves listed tools. A read-only agent with ["Read", "Glob", "Grep"] in its allowed tools list runs those tools without prompting. Tools not listed are still available but require permission.
- disallowed_tools / disallowedTools blocks listed tools, regardless of other settings. See Permissions for the order that rules are checked before a tool runs.
- permission_mode / permissionMode controls what happens to tools that aren't covered by allow or deny rules. See Permission mode for available modes.

You can also scope individual tools with rules like "Bash(npm:*)" to allow only specific commands. See Permissions for the full rule syntax.
When a tool is denied, Claude receives a rejection message as the tool result and typically attempts a different approach or reports that it couldn't proceed.
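The interaction of the three options can be pictured as a small decision function. The sketch below is a simplified model based only on the descriptions in this guide, not the SDK's actual rule engine: the `decide` function, the `FILE_EDIT_TOOLS` set, and the check order are assumptions, and it omits scoped rules such as "Bash(npm:*)", the filesystem commands that acceptEdits also approves, and the plan and auto modes.

```python
FILE_EDIT_TOOLS = {"Edit", "Write"}  # assumption: what counts as a file edit here


def decide(tool, allowed=(), disallowed=(), mode="default", has_callback=False):
    """Return 'allow', 'deny', or 'ask' for one tool call (simplified model)."""
    if tool in disallowed:
        return "deny"  # deny rules win regardless of other settings
    if tool in allowed or mode == "bypassPermissions":
        return "allow"
    if mode == "acceptEdits" and tool in FILE_EDIT_TOOLS:
        return "allow"
    if mode == "dontAsk":
        return "deny"  # never prompts; anything not pre-approved is denied
    if mode in ("default", "acceptEdits"):
        # Uncovered tools go to the approval callback; without one they are denied
        return "ask" if has_callback else "deny"
    raise ValueError(f"mode not modeled in this sketch: {mode}")


print(decide("Read", allowed=["Read", "Glob", "Grep"]))
print(decide("Bash", allowed=["Bash"], disallowed=["Bash"]))
print(decide("Edit", mode="acceptEdits"))
print(decide("Bash", mode="default", has_callback=True))
```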
When Claude requests multiple tool calls in a single turn, both SDKs can run them concurrently or sequentially depending on the tool. Read-only tools (like Read, Glob, Grep, and MCP tools marked as read-only) can run concurrently. Tools that modify state (like Edit, Write, and Bash) run sequentially to avoid conflicts.
Custom tools default to sequential execution. To enable parallel execution for a custom tool, mark it as read-only in its annotations: readOnly in TypeScript or readOnlyHint in Python.
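The scheduling idea can be sketched with plain asyncio: batches of read-only calls run together, while state-changing calls run one at a time. This is an illustration of the concept rather than the SDK's scheduler. The `run_tool` stub, the `READ_ONLY` set, and the choice to batch only consecutive read-only calls are assumptions.

```python
import asyncio
from itertools import groupby

READ_ONLY = {"Read", "Glob", "Grep"}  # assumed set for this sketch


async def run_tool(name, log):
    """Stub tool: records when it starts and ends instead of doing real work."""
    log.append(f"start {name}")
    await asyncio.sleep(0.01)
    log.append(f"end {name}")
    return f"{name} result"


async def execute_calls(calls):
    """Run consecutive read-only calls concurrently and everything else sequentially."""
    log, results = [], []
    for is_read_only, batch in groupby(calls, key=lambda c: c in READ_ONLY):
        batch = list(batch)
        if is_read_only:
            # gather starts every call in the batch before any of them finishes
            results += await asyncio.gather(*(run_tool(c, log) for c in batch))
        else:
            for call in batch:
                results.append(await run_tool(call, log))
    return results, log


results, log = asyncio.run(execute_calls(["Read", "Grep", "Edit", "Bash"]))
print(results)
print(log)
```

In the log, both read-only calls start before either ends, while Edit finishes before Bash starts.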
You can limit how many turns the loop takes, how much it costs, how deeply Claude reasons, and whether tools require approval before running. All of these are fields on ClaudeAgentOptions (Python) / Options (TypeScript).
| Option | What it controls | Default |
|---|---|---|
| Max turns (max_turns / maxTurns) | Maximum tool-use round trips | No limit |
| Max budget (max_budget_usd / maxBudgetUsd) | Maximum cost before stopping | No limit |
When either limit is hit, the SDK returns a ResultMessage with a corresponding error subtype (error_max_turns or error_max_budget_usd). See Handle the result for how to check these subtypes and ClaudeAgentOptions / Options for syntax.
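A caller might branch on those subtypes as in the sketch below. `FakeResult` is a hypothetical stand-in with just the two fields used here, so the example runs without the SDK; only the subtype strings come from this guide, and the suggested follow-up actions are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in for the SDK's ResultMessage (only the fields used here)."""
    subtype: str
    result: Optional[str] = None


def summarize(message):
    """Turn a result message into a one-line status for logs or a UI."""
    if message.subtype == "success":
        return f"OK: {message.result}"
    if message.subtype == "error_max_turns":
        return "Stopped: turn limit reached; consider raising max_turns"
    if message.subtype == "error_max_budget_usd":
        return "Stopped: budget limit reached; consider raising max_budget_usd"
    return f"Stopped: {message.subtype}"  # any other subtype is passed through


print(summarize(FakeResult("success", "All tests pass")))
print(summarize(FakeResult("error_max_turns")))
```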
The effort option controls how much reasoning Claude applies. Lower effort levels use fewer tokens per turn and reduce cost. Not all models support the effort parameter. See Effort for which models support it.
| Level | Behavior | Good for |
|---|---|---|
| "low" | Minimal reasoning, fast responses | File lookups, listing directories |
| "medium" | Balanced reasoning | Routine edits, standard tasks |
| "high" | Thorough analysis | Refactors, debugging |
| "max" | Maximum reasoning depth | Multi-step problems requiring deep analysis |
If you don't set effort, the Python SDK leaves the parameter unset and defers to the model's default behavior. The TypeScript SDK defaults to "high".
Note
effort trades latency and token cost for reasoning depth within each response. Extended thinking is a separate feature that produces visible chain-of-thought blocks in the output. They are independent: you can set effort: "low" with extended thinking enabled, or effort: "max" without it.
Use lower effort for agents doing simple, well-scoped tasks (like listing files or running a single grep) to reduce cost and latency. effort is set at the top-level query options, not per-subagent.
The permission mode option (permission_mode in Python, permissionMode in TypeScript) controls whether the agent asks for approval before using tools:
| Mode | Behavior |
|---|---|
| "default" | Tools not covered by allow rules trigger your approval callback; no callback means deny |
| "acceptEdits" | Auto-approves file edits and common filesystem commands (mkdir, touch, mv, cp, etc.); other Bash commands follow default rules |
| "plan" | No tool execution; Claude produces a plan for review |
| "dontAsk" | Never prompts. Tools pre-approved by permission rules run, everything else is denied |
| "auto" (TypeScript only) | Uses a model classifier to approve or deny each tool call. See Auto mode for availability and behavior |
| "bypassPermissions" | Runs all allowed tools without asking. Cannot be used when running as root on Unix. Use only in isolated environments where the agent's actions cannot affect systems you care about |
For interactive applications, use "default" with a tool approval callback to surface approval prompts. For autonomous agents on a dev machine, "acceptEdits" auto-approves file edits and common filesystem commands (mkdir, touch, mv, cp, etc.) while still gating other Bash commands behind allow rules. Reserve "bypassPermissions" for CI, containers, or other isolated environments. See Permissions for full details.
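The recommendation above can be written down as a small decision helper. This is only a sketch of that guidance: the function name and its two boolean inputs are hypothetical, and how you determine "interactive" or "isolated" is up to your application.

```python
def choose_permission_mode(interactive: bool, isolated: bool) -> str:
    """Pick a permission mode following the guidance in the paragraph above.

    interactive: a person is available to answer approval prompts.
    isolated: the agent runs in CI, a container, or a similar sandbox.
    """
    if interactive:
        # Pair this with a tool approval callback to surface prompts.
        return "default"
    if isolated:
        # Only safe where the agent cannot affect systems you care about.
        return "bypassPermissions"
    # Autonomous agent on a dev machine: edits are auto-approved, while
    # other Bash commands stay gated behind allow rules.
    return "acceptEdits"


print(choose_permission_mode(interactive=False, isolated=False))  # acceptEdits
```

The returned string would be passed as permission_mode (Python) or permissionMode (TypeScript).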
If you don't set model, the SDK uses Claude Code's default, which depends on your authentication method and subscription. Set it explicitly (for example, model="claude-sonnet-4-6") to pin a specific model or to use a smaller model for faster, cheaper agents. See models for available IDs.
The context window is the total amount of information available to Claude during a session. It does not reset between turns within a session. Everything accumulates: the system prompt, tool definitions, conversation history, tool inputs, and tool outputs. Content that stays the same across turns (system prompt, tool definitions, CLAUDE.md) is automatically prompt cached, which reduces cost and latency for repeated prefixes.
Here's how each component affects context in the SDK:
| Source | When it loads | Impact |
|---|---|---|
| System prompt | Every request | Small fixed cost, always present |
| CLAUDE.md files | Session start, when settingSources is enabled | Full content in every request (but prompt-cached, so only the first request pays full cost) |
| Tool definitions | Every request | Each tool adds its schema; use MCP tool search to load tools on-demand instead of all at once |
| Conversation history | Accumulates over turns | Grows with each turn: prompts, responses, tool inputs, tool outputs |
| Skill descriptions | Session start (with setting sources enabled) | Short summaries; full content loads only when invoked |
Large tool outputs consume significant context. Reading a big file or running a command with verbose output can use thousands of tokens in a single turn. Context accumulates across turns, so longer sessions with many tool calls build up significantly more context than short ones.
When the context window approaches its limit, the SDK automatically compacts the conversation: it summarizes older history to free space, keeping your most recent exchanges and key decisions intact. The SDK emits a message with type: "system" and subtype: "compact_boundary" in the stream when this happens (in Python this is a SystemMessage; in TypeScript it is a separate SDKCompactBoundaryMessage type).
Compaction replaces older messages with a summary, so specific instructions from early in the conversation may not be preserved. Persistent rules belong in CLAUDE.md (loaded via settingSources) rather than in the initial prompt, because CLAUDE.md content is re-injected on every request.
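To notice compaction from Python, one option is to watch the stream for the boundary message described above. The sketch below uses a stand-in class rather than the SDK's SystemMessage, and it assumes only that the message exposes a subtype attribute; treat both function names as illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class StubSystemMessage:
    """Stand-in for the SDK's SystemMessage; assumes a .subtype string and a .data dict."""
    subtype: str
    data: dict = field(default_factory=dict)


def is_compact_boundary(message) -> bool:
    """True when a streamed message marks a compaction boundary."""
    return getattr(message, "subtype", None) == "compact_boundary"


def count_compactions(messages) -> int:
    """Count how many times the conversation was compacted in a stream."""
    return sum(1 for message in messages if is_compact_boundary(message))


stream = [StubSystemMessage("init"), StubSystemMessage("compact_boundary"), "other"]
print(count_compactions(stream))  # 1
```

A nonzero count is a hint that early instructions may have been summarized away, which is a good moment to check that persistent rules live in CLAUDE.md.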
You can customize compaction behavior in several ways:
- PreCompact hook: Run custom logic before compaction occurs, for example to archive the full transcript. The hook receives a trigger field (manual or auto). See hooks.
- Manual compaction: Send /compact as a prompt string to trigger compaction on demand. (Slash commands sent this way are SDK inputs, not CLI-only shortcuts. See slash commands in the SDK.)
- Summary instructions: Add a section to your project's CLAUDE.md telling the compactor what to preserve. The header name isn't special; use any clear label. For example:
# Summary instructions
When summarizing this conversation, always preserve:
- The current task objective and acceptance criteria
- File paths that have been read or modified
- Test results and error messages
- Decisions made and the reasoning behind them
A few strategies for long-running agents:
- Use the tools field on AgentDefinition to scope subagents to the minimum set of tools they need.
- When MCP servers contribute many tool definitions, the ToolSearch tool can help by loading tools on demand instead of preloading all of them. See MCP tool search for configuration.
- Set effort to "low" for agents that only need to read files or list directories. This reduces token usage and cost.

For a detailed breakdown of per-feature context costs, see Understand context costs.
Each interaction with the SDK creates or continues a session. Capture the session ID from ResultMessage.session_id (available in both SDKs) to resume later. The TypeScript SDK also exposes it as a direct field on the init SystemMessage; in Python it's nested in SystemMessage.data.
When you resume, the full context from previous turns is restored: files that were read, analysis that was performed, and actions that were taken. You can also fork a session to branch into a different approach without modifying the original.
See Session management for the full guide on resume, continue, and fork patterns.
Note
In Python, ClaudeSDKClient handles session IDs automatically across multiple calls. See the Python SDK reference for details.
When the loop ends, the ResultMessage tells you what happened and gives you the output. The subtype field (available in both SDKs) is the primary way to check termination state.
| Result subtype | What happened | result field available? |
|---|---|---|
| success | Claude finished the task normally | Yes |
| error_max_turns | Hit the maxTurns limit before finishing | No |
| error_max_budget_usd | Hit the maxBudgetUsd limit before finishing | No |
| error_during_execution | An error interrupted the loop (for example, an API failure or cancelled request) | No |
| error_max_structured_output_retries | Structured output validation failed after the configured retry limit | No |
The result field (the final text output) is only present on the success variant, so always check the subtype before reading it. All result subtypes carry total_cost_usd, usage, num_turns, and session_id so you can track cost and resume even after errors. In Python, total_cost_usd and usage are typed as optional and may be None on some error paths, so guard before formatting them. See Tracking costs and usage for details on interpreting the usage fields.
The result also includes a stop_reason field (string | null in TypeScript, str | None in Python) indicating why the model stopped generating on its final turn. Common values are end_turn (model finished normally), max_tokens (hit the output token limit), and refusal (the model declined the request). On error result subtypes, stop_reason carries the value from the last assistant response before the loop ended. To detect refusals, check stop_reason === "refusal" (TypeScript) or stop_reason == "refusal" (Python). See SDKResultMessage (TypeScript) or ResultMessage (Python) for the full type.
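The checks described above (subtype first, then guarded cost, then stop_reason) can be collected into one helper. This is a sketch against a stand-in dataclass, not the SDK's ResultMessage; it assumes only the field names mentioned in this section, and describe_result is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubResult:
    """Stand-in carrying the ResultMessage fields discussed above."""
    subtype: str
    session_id: str
    num_turns: int
    result: Optional[str] = None
    total_cost_usd: Optional[float] = None
    stop_reason: Optional[str] = None


def describe_result(msg) -> str:
    """Summarize a result without reading fields that may be absent."""
    # total_cost_usd may be None on some error paths, so guard before formatting.
    cost = f"${msg.total_cost_usd:.4f}" if msg.total_cost_usd is not None else "n/a"
    if getattr(msg, "stop_reason", None) == "refusal":
        status = "refused"
    elif msg.subtype == "success":
        # result is only read on the success variant.
        status = f"ok: {msg.result}"
    elif msg.subtype in ("error_max_turns", "error_max_budget_usd"):
        status = f"limit hit ({msg.subtype}); resumable via session {msg.session_id}"
    else:
        status = f"stopped: {msg.subtype}"
    return f"{status} [turns={msg.num_turns}, cost={cost}]"


print(describe_result(StubResult("success", "s1", 3, result="done", total_cost_usd=0.0123)))
# ok: done [turns=3, cost=$0.0123]
```

Because the helper only reads attributes, the same function should work on a real result message as long as the attribute names match the ones documented here.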
Hooks are callbacks that fire at specific points in the loop: before a tool runs, after it returns, when the agent finishes, and so on. Some commonly used hooks are:
| Hook | When it fires | Common uses |
|---|---|---|
| PreToolUse | Before a tool executes | Validate inputs, block dangerous commands |
| PostToolUse | After a tool returns | Audit outputs, trigger side effects |
| UserPromptSubmit | When a prompt is sent | Inject additional context into prompts |
| Stop | When the agent finishes | Validate the result, save session state |
| SubagentStart / SubagentStop | When a subagent spawns or completes | Track and aggregate parallel task results |
| PreCompact | Before context compaction | Archive full transcript before summarizing |
Hooks run in your application process, not inside the agent's context window, so they don't consume context. Hooks can also short-circuit the loop: a PreToolUse hook that rejects a tool call prevents it from executing, and Claude receives the rejection message instead.
Both SDKs support all the events above. The TypeScript SDK includes additional events that Python does not yet support. See Control execution with hooks for the complete event list, per-SDK availability, and the full callback API.
This example combines the key concepts from this page into a single agent that fixes failing tests. It configures the agent with allowed tools (auto-approved so the agent runs autonomously), project settings, and safety limits on turns and reasoning effort. As the loop runs, it captures the session ID for potential resumption, handles the final result, and prints the total cost.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage


async def run_agent():
    session_id = None
    async for message in query(
        prompt="Find and fix the bug causing test failures in the auth module",
        options=ClaudeAgentOptions(
            allowed_tools=[
                "Read",
                "Edit",
                "Bash",
                "Glob",
                "Grep",
            ],  # Listing tools here auto-approves them (no prompting)
            setting_sources=[
                "project"
            ],  # Load CLAUDE.md, skills, hooks from current directory
            max_turns=30,  # Prevent runaway sessions
            effort="high",  # Thorough reasoning for complex debugging
        ),
    ):
        # Handle the final result
        if isinstance(message, ResultMessage):
            session_id = message.session_id  # Save for potential resumption
            if message.subtype == "success":
                print(f"Done: {message.result}")
            elif message.subtype == "error_max_turns":
                # Agent ran out of turns. Resume with a higher limit.
                print(f"Hit turn limit. Resume session {session_id} to continue.")
            elif message.subtype == "error_max_budget_usd":
                print("Hit budget limit.")
            else:
                print(f"Stopped: {message.subtype}")
            if message.total_cost_usd is not None:
                print(f"Cost: ${message.total_cost_usd:.4f}")


asyncio.run(run_agent())

import { query } from "@anthropic-ai/claude-agent-sdk";

let sessionId: string | undefined;

for await (const message of query({
  prompt: "Find and fix the bug causing test failures in the auth module",
  options: {
    allowedTools: ["Read", "Edit", "Bash", "Glob", "Grep"], // Listing tools here auto-approves them (no prompting)
    settingSources: ["project"], // Load CLAUDE.md, skills, hooks from current directory
    maxTurns: 30, // Prevent runaway sessions
    effort: "high" // Thorough reasoning for complex debugging
  }
})) {
  // Save the session ID to resume later if needed
  if (message.type === "system" && message.subtype === "init") {
    sessionId = message.session_id;
  }
  // Handle the final result
  if (message.type === "result") {
    if (message.subtype === "success") {
      console.log(`Done: ${message.result}`);
    } else if (message.subtype === "error_max_turns") {
      // Agent ran out of turns. Resume with a higher limit.
      console.log(`Hit turn limit. Resume session ${sessionId} to continue.`);
    } else if (message.subtype === "error_max_budget_usd") {
      console.log("Hit budget limit.");
    } else {
      console.log(`Stopped: ${message.subtype}`);
    }
    console.log(`Cost: $${message.total_cost_usd.toFixed(4)}`);
  }
}
Now that you understand the loop, here's where to go depending on what you're building:
For the broader conceptual picture of the agentic loop (not SDK-specific), see How Claude Code works.
Load project instructions, skills, hooks, and other Claude Code features into your SDK agents.
The Agent SDK is built on the same foundation as Claude Code, which means your SDK agents have access to the same filesystem-based features: project instructions (CLAUDE.md and rules), skills, hooks, and more.
By default, the SDK loads no filesystem settings. Your agent runs in isolation mode with only what you pass programmatically. To load CLAUDE.md, skills, or filesystem hooks, set settingSources to tell the SDK where to look.
For a conceptual overview of what each feature does and when to use it, see Extend Claude Code.
The setting sources option (setting_sources in Python, settingSources in TypeScript) controls which filesystem-based settings the SDK loads. Without it, your agent won't discover skills, CLAUDE.md files, or project-level hooks.
This example loads both user-level and project-level settings by setting settingSources to ["user", "project"]:
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, ResultMessage


async def main():
    async for message in query(
        prompt="Help me refactor the auth module",
        options=ClaudeAgentOptions(
            # "user" loads from ~/.claude/, "project" loads from ./.claude/ in cwd.
            # Together they give the agent access to CLAUDE.md, skills, hooks, and
            # permissions from both locations.
            setting_sources=["user", "project"],
            allowed_tools=["Read", "Edit", "Bash"],
        ),
    ):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text"):
                    print(block.text)
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(f"\nResult: {message.result}")


asyncio.run(main())

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Help me refactor the auth module",
  options: {
    // "user" loads from ~/.claude/, "project" loads from ./.claude/ in cwd.
    // Together they give the agent access to CLAUDE.md, skills, hooks, and
    // permissions from both locations.
    settingSources: ["user", "project"],
    allowedTools: ["Read", "Edit", "Bash"]
  }
})) {
  if (message.type === "assistant") {
    for (const block of message.message.content) {
      if (block.type === "text") console.log(block.text);
    }
  }
  if (message.type === "result" && message.subtype === "success") {
    console.log(`\nResult: ${message.result}`);
  }
}
Each source loads settings from a specific location, where <cwd> is the working directory you pass via the cwd option (or the process's current directory if unset). For the full type definition, see SettingSource (TypeScript) or SettingSource (Python).
| Source | What it loads | Location |
|---|---|---|
| "project" | Project CLAUDE.md, .claude/rules/*.md, project skills, project hooks, project settings.json | <cwd>/.claude/ and each parent directory up to the filesystem root (stopping when a .claude/ is found or no more parents exist) |
| "user" | User CLAUDE.md, ~/.claude/rules/*.md, user skills, user settings | ~/.claude/ |
| "local" | CLAUDE.local.md (gitignored), .claude/settings.local.json | <cwd>/ |
To match the full Claude Code CLI behavior, use ["user", "project", "local"].
Warning
The cwd option determines where the SDK looks for project settings. If neither cwd nor any of its parent directories contains a .claude/ folder, project-level features won't load. Auto memory (the ~/.claude/projects/<project>/memory/ directory that Claude Code uses to persist notes across interactive sessions) is a CLI-only feature and is never loaded by the SDK.
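When project-level features fail to load, it can help to check which .claude/ directory the lookup described above would reach from a given cwd. The sketch below approximates that rule with pathlib (start at cwd, walk up, stop at the first .claude/ directory). It is a debugging aid under that assumption, not the SDK's own resolution code, and find_project_claude_dir is a hypothetical name.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_project_claude_dir(cwd: str) -> Optional[Path]:
    """Return the nearest .claude/ directory at or above cwd, or None."""
    start = Path(cwd).resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".claude"
        if candidate.is_dir():
            return candidate
    return None


# Demo: create <tmp>/repo/.claude and look it up from a nested directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "repo" / ".claude").mkdir(parents=True)
    nested = root / "repo" / "src" / "auth"
    nested.mkdir(parents=True)
    found = find_project_claude_dir(str(nested))

print(found == root / "repo" / ".claude")  # True
```

If the helper returns None for the directory you pass as cwd, that would be consistent with project settings not loading.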
CLAUDE.md files and .claude/rules/*.md files give your agent persistent context about your project: coding conventions, build commands, architecture decisions, and instructions. When settingSources includes "project" (as in the example above), the SDK loads these files into context at session start. The agent then follows your project conventions without you repeating them in every prompt.
| Level | Location | When loaded |
|---|---|---|
| Project (root) | <cwd>/CLAUDE.md or <cwd>/.claude/CLAUDE.md | settingSources includes "project" |
| Project rules | <cwd>/.claude/rules/*.md | settingSources includes "project" |
| Project (parent dirs) | CLAUDE.md files in directories above cwd | settingSources includes "project", loaded at session start |
| Project (child dirs) | CLAUDE.md files in subdirectories of cwd | settingSources includes "project", loaded on demand when the agent reads a file in that subtree |
| Local (gitignored) | <cwd>/CLAUDE.local.md | settingSources includes "local" |
| User | ~/.claude/CLAUDE.md | settingSources includes "user" |
| User rules | ~/.claude/rules/*.md | settingSources includes "user" |
All levels are additive: if both project and user CLAUDE.md files exist, the agent sees both. There is no hard precedence rule between levels; if instructions conflict, the outcome depends on how Claude interprets them. Write non-conflicting rules, or state precedence explicitly in the more specific file ("These project instructions override any conflicting user-level defaults").
Tip
You can also inject context directly via systemPrompt without using CLAUDE.md files. See Modify system prompts. Use CLAUDE.md when you want the same context shared between interactive Claude Code sessions and your SDK agents.
For how to structure and organize CLAUDE.md content, see Manage Claude's memory.
Skills are markdown files that give your agent specialized knowledge and invocable workflows. Unlike CLAUDE.md (which loads every session), skills load on demand. The agent receives skill descriptions at startup and loads the full content when relevant.
To use skills in the SDK, set settingSources so the agent discovers skill files from the filesystem. The Skill tool is enabled by default when you don't specify allowedTools. If you are using an allowedTools allowlist, include "Skill" explicitly.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage


async def main():
    # Skills in .claude/skills/ are discovered automatically
    # when settingSources includes "project"
    async for message in query(
        prompt="Review this PR using our code review checklist",
        options=ClaudeAgentOptions(
            setting_sources=["user", "project"],
            allowed_tools=["Skill", "Read", "Grep", "Glob"],
        ),
    ):
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(message.result)


asyncio.run(main())

import { query } from "@anthropic-ai/claude-agent-sdk";

// Skills in .claude/skills/ are discovered automatically
// when settingSources includes "project"
for await (const message of query({
  prompt: "Review this PR using our code review checklist",
  options: {
    settingSources: ["user", "project"],
    allowedTools: ["Skill", "Read", "Grep", "Glob"]
  }
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
Note
Skills must be created as filesystem artifacts (.claude/skills/<name>/SKILL.md). The SDK does not have a programmatic API for registering skills. See Agent Skills in the SDK for full details.
For more on creating and using skills, see Agent Skills in the SDK.
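Since skills are plain files, a setup script can scaffold one at the path pattern given above. The sketch below writes .claude/skills/<name>/SKILL.md under a project root. The YAML frontmatter with name and description fields is an assumption about the SKILL.md format, and write_skill and the sample skill are hypothetical; check the skills guide for the authoritative layout.

```python
import tempfile
from pathlib import Path


def write_skill(project_root: Path, name: str, description: str, body: str) -> Path:
    """Create <project_root>/.claude/skills/<name>/SKILL.md and return its path."""
    skill_dir = project_root / ".claude" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    # Assumed format: frontmatter with a name and a short description,
    # followed by the full instructions that load when the skill is invoked.
    skill_file.write_text(
        f"---\nname: {name}\ndescription: {description}\n---\n\n{body}\n",
        encoding="utf-8",
    )
    return skill_file


# Demo in a temporary directory so nothing touches a real project.
with tempfile.TemporaryDirectory() as tmp:
    skill_path = write_skill(
        Path(tmp),
        "code-review",
        "Checklist for reviewing pull requests",
        "1. Check tests.\n2. Check naming.",
    )
    relative = skill_path.relative_to(tmp).as_posix()
    content = skill_path.read_text(encoding="utf-8")

print(relative)  # .claude/skills/code-review/SKILL.md
```

With a file like this in place and settingSources including "project", the agent would receive the description at startup and load the body on demand.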
The SDK supports two ways to define hooks, and they run side by side:
- Filesystem hooks: defined in settings.json, loaded when settingSources includes the relevant source. These are the same hooks you'd configure for interactive Claude Code sessions.
- Programmatic hooks: callbacks passed directly to query. These run in your application process and can return structured decisions. See Control execution with hooks.

Both types execute during the same hook lifecycle. If you already have hooks in your project's .claude/settings.json and you set settingSources: ["project"], those hooks run automatically in the SDK with no extra configuration.
Hook callbacks receive the tool input and return a decision dict. Returning {} (an empty dict) means allow the tool to proceed. Returning {"decision": "block", "reason": "..."} prevents execution and the reason is sent to Claude as the tool result. See the hooks guide for the full callback signature and return types.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, HookMatcher, ResultMessage


# PreToolUse hook callback. Positional args:
#   input_data: HookInput dict with tool_name, tool_input, hook_event_name
#   tool_use_id: str | None, the ID of the tool call being intercepted
#   context: HookContext, carries session metadata
async def audit_bash(input_data, tool_use_id, context):
    command = input_data.get("tool_input", {}).get("command", "")
    if "rm -rf" in command:
        return {"decision": "block", "reason": "Destructive command blocked"}
    return {}  # Empty dict: allow the tool to proceed


async def main():
    # Filesystem hooks from .claude/settings.json run automatically
    # when settingSources loads them. You can also add programmatic hooks:
    async for message in query(
        prompt="Refactor the auth module",
        options=ClaudeAgentOptions(
            setting_sources=["project"],  # Loads hooks from .claude/settings.json
            hooks={
                "PreToolUse": [
                    HookMatcher(matcher="Bash", hooks=[audit_bash]),
                ]
            },
        ),
    ):
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(message.result)


asyncio.run(main())

import { query, type HookInput, type HookJSONOutput } from "@anthropic-ai/claude-agent-sdk";

// PreToolUse hook callback. HookInput is a discriminated union on
// hook_event_name, so narrowing on it gives TypeScript the right
// tool_input shape for this event.
const auditBash = async (input: HookInput): Promise<HookJSONOutput> => {
  if (input.hook_event_name !== "PreToolUse") return {};
  const toolInput = input.tool_input as { command?: string };
  if (toolInput.command?.includes("rm -rf")) {
    return { decision: "block", reason: "Destructive command blocked" };
  }
  return {}; // Empty object: allow the tool to proceed
};

// Filesystem hooks from .claude/settings.json run automatically
// when settingSources loads them. You can also add programmatic hooks:
for await (const message of query({
  prompt: "Refactor the auth module",
  options: {
    settingSources: ["project"], // Loads hooks from .claude/settings.json
    hooks: {
      PreToolUse: [{ matcher: "Bash", hooks: [auditBash] }]
    }
  }
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
| Hook type | Best for |
|---|---|
| Filesystem (settings.json) | Sharing hooks between CLI and SDK sessions. Supports "command" (shell scripts), "http" (POST to an endpoint), "prompt" (LLM evaluates a prompt), and "agent" (spawns a verifier agent). These fire in the main agent and any subagents it spawns. |
| Programmatic (callbacks in query) | Application-specific logic; returning structured decisions; in-process integration. Scoped to the main session only. |
Note
The TypeScript SDK supports additional hook events beyond Python, including SessionStart, SessionEnd, TeammateIdle, and TaskCompleted. See the hooks guide for the full event compatibility table.
For full details on programmatic hooks, see Control execution with hooks. For filesystem hook syntax, see Hooks.
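A substring check such as "rm -rf" in command misses variants like rm -fr. The sketch below is one way to extend a programmatic PreToolUse callback to a small list of regular expressions. It keeps the callback signature and the return shapes described in this section; the pattern list, guard_bash, and the check helper are illustrative assumptions and far from a complete safety filter.

```python
import asyncio
import re

# Hypothetical deny list: (compiled pattern, human-readable label).
BLOCKED_PATTERNS = [
    # rm with both r and f in the same flag group, in either order.
    (re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+"), "recursive force delete"),
    # git push with --force (also catches --force-with-lease).
    (re.compile(r"\bgit\s+push\b.*--force\b"), "force push"),
]


async def guard_bash(input_data, tool_use_id, context):
    """PreToolUse-style callback: block commands that match the deny list."""
    command = input_data.get("tool_input", {}).get("command", "")
    for pattern, label in BLOCKED_PATTERNS:
        if pattern.search(command):
            return {"decision": "block", "reason": f"Blocked: {label}"}
    return {}  # Empty dict: allow the tool to proceed


def check(command: str) -> dict:
    """Run the callback outside the SDK, for quick local testing."""
    return asyncio.run(guard_bash({"tool_input": {"command": command}}, None, None))


print(check("rm -fr build"))  # {'decision': 'block', 'reason': 'Blocked: recursive force delete'}
print(check("ls -la"))        # {}
```

Because the callback is an ordinary coroutine, it can be unit tested like this before it is registered through a HookMatcher on the Bash tool.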
The Agent SDK gives you access to several ways to extend your agent's behavior. If you're unsure which to use, this table maps common goals to the right approach.
| You want to... | Use | SDK surface |
|---|---|---|
| Set project conventions your agent always follows | CLAUDE.md | settingSources: ["project"] loads it automatically |
| Give the agent reference material it loads when relevant | Skills | settingSources + allowedTools: ["Skill"] |
| Run a reusable workflow (deploy, review, release) | User-invocable skills | settingSources + allowedTools: ["Skill"] |
| Delegate an isolated subtask to a fresh context (research, review) | Subagents | agents parameter + allowedTools: ["Agent"] |
| Coordinate multiple Claude Code instances with shared task lists and direct inter-agent messaging | Agent teams | Not directly configured via SDK options. Agent teams are a CLI feature where one session acts as the team lead, coordinating work across independent teammates |
| Run deterministic logic on tool calls (audit, block, transform) | Hooks | hooks parameter with callbacks, or shell scripts loaded via settingSources |
| Give Claude structured tool access to an external service | MCP | mcpServers parameter |
Tip
Subagents versus agent teams: Subagents are ephemeral and isolated: fresh conversation, one task, summary returned to parent. Agent teams coordinate multiple independent Claude Code instances that share a task list and message each other directly. Agent teams are a CLI feature. See What subagents inherit and the agent teams comparison for details.
Every feature you enable adds to your agent's context window. For per-feature costs and how these features layer together, see Extend Claude Code.
How sessions persist agent conversation history, and when to use continue, resume, and fork to return to a prior run.
A session is the conversation history the SDK accumulates while your agent works. It contains your prompt, every tool call the agent made, every tool result, and every response. The SDK writes it to disk automatically so you can return to it later.
Returning to a session means the agent has full context from before: files it already read, analysis it already performed, decisions it already made. You can ask a follow-up question, recover from an interruption, or branch off to try a different approach.
Note
Sessions persist the conversation, not the filesystem. To snapshot and revert file changes the agent made, use file checkpointing.
This guide covers how to pick the right approach for your app, the SDK interfaces that track sessions automatically, how to capture session IDs and use resume and fork manually, and what to know about resuming sessions across hosts.
How much session handling you need depends on your application's shape. Session management comes into play when you send multiple prompts that should share context. Within a single query call, the agent already takes as many turns as it needs, and permission prompts and AskUserQuestion are handled in-loop (they don't end the call).
| What you're building | What to use |
|---|---|
| One-shot task: single prompt, no follow-up | Nothing extra. One query call handles it. |
| Multi-turn chat in one process | ClaudeSDKClient (Python) or continue: true (TypeScript). The SDK tracks the session for you with no ID handling. |
| Pick up where you left off after a process restart | continue_conversation=True (Python) / continue: true (TypeScript). Resumes the most recent session in the directory, no ID needed. |
| Resume a specific past session (not the most recent) | Capture the session ID and pass it to resume. |
| Try an alternative approach without losing the original | Fork the session. |
| Stateless task, don't want anything written to disk (TypeScript only) | Set persistSession: false. The session exists only in memory for the duration of the call. Python always persists to disk. |
Continue, resume, and fork are option fields you set on query (ClaudeAgentOptions in Python, Options in TypeScript).
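Continue and resume both pick up an existing session and add to it. The difference is how they find that session: continue picks the most recent session in the current directory, so there is no ID to track, while resume takes a specific session ID that you captured earlier.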
Fork is different: it creates a new session that starts with a copy of the original's history. The original stays unchanged. Use fork to try a different direction while keeping the option to go back.
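One way to keep the three approaches straight is a helper that turns a mode into the option fields to set. The sketch below returns keyword arguments intended for ClaudeAgentOptions. continue_conversation and resume appear in this guide; fork_session is assumed here as the Python name of the fork flag, and session_kwargs itself is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def session_kwargs(mode: str, session_id: Optional[str] = None) -> Dict[str, Any]:
    """Map a session mode to option fields.

    mode: "new", "continue", "resume", or "fork".
    session_id: required for "resume" and "fork".
    """
    if mode == "new":
        return {}
    if mode == "continue":
        # Most recent session in the current directory; no ID needed.
        return {"continue_conversation": True}
    if mode in ("resume", "fork"):
        if not session_id:
            raise ValueError(f"{mode!r} requires a session ID")
        kwargs: Dict[str, Any] = {"resume": session_id}
        if mode == "fork":
            # Assumed field name: branch into a new session instead of
            # appending to the original.
            kwargs["fork_session"] = True
        return kwargs
    raise ValueError(f"unknown mode: {mode!r}")


print(session_kwargs("continue"))     # {'continue_conversation': True}
print(session_kwargs("fork", "abc"))  # {'resume': 'abc', 'fork_session': True}
```

The dict would then be unpacked next to your other options, for example ClaudeAgentOptions(allowed_tools=[...], **session_kwargs("resume", session_id)).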
Both SDKs offer an interface that tracks session state for you across calls, so you don't pass IDs around manually. Use these for multi-turn conversations within a single process.
ClaudeSDKClient handles session IDs internally. Each call to client.query automatically continues the same session. Call client.receive_response to iterate over the messages for the current query. The client must be used as an async context manager.
This example runs two queries against the same client. The first asks the agent to analyze a module; the second asks it to refactor that module. Because both calls go through the same client instance, the second query has full context from the first without any explicit resume or session ID:
import asyncio
from claude_agent_sdk import (
    ClaudeSDKClient,
    ClaudeAgentOptions,
    AssistantMessage,
    ResultMessage,
    TextBlock,
)


def print_response(message):
    """Print only the human-readable parts of a message."""
    if isinstance(message, AssistantMessage):
        for block in message.content:
            if isinstance(block, TextBlock):
                print(block.text)
    elif isinstance(message, ResultMessage):
        cost = (
            f"${message.total_cost_usd:.4f}"
            if message.total_cost_usd is not None
            else "N/A"
        )
        print(f"[done: {message.subtype}, cost: {cost}]")


async def main():
    options = ClaudeAgentOptions(
        allowed_tools=["Read", "Edit", "Glob", "Grep"],
    )
    async with ClaudeSDKClient(options=options) as client:
        # First query: client captures the session ID internally
        await client.query("Analyze the auth module")
        async for message in client.receive_response():
            print_response(message)
        # Second query: automatically continues the same session
        await client.query("Now refactor it to use JWT")
        async for message in client.receive_response():
            print_response(message)


asyncio.run(main())
See the Python SDK reference for details on when to use ClaudeSDKClient vs the standalone query function.
The stable TypeScript SDK (the query function used throughout these docs, sometimes called V1) doesn't have a session-holding client object like Python's ClaudeSDKClient. Instead, pass continue: true on each subsequent query call and the SDK picks up the most recent session in the current directory. No ID tracking required.
This example makes two separate query calls. The first creates a fresh session; the second sets continue: true, which tells the SDK to find and resume the most recent session on disk. The agent has full context from the first call:
import { query } from "@anthropic-ai/claude-agent-sdk";

// First query: creates a new session
for await (const message of query({
  prompt: "Analyze the auth module",
  options: { allowedTools: ["Read", "Glob", "Grep"] }
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}

// Second query: continue: true resumes the most recent session
for await (const message of query({
  prompt: "Now refactor it to use JWT",
  options: {
    continue: true,
    allowedTools: ["Read", "Edit", "Write", "Glob", "Grep"]
  }
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
Note
There's also a V2 preview of the TypeScript SDK that provides createSession with a send / stream pattern, closer to Python's ClaudeSDKClient in feel. V2 is unstable and its APIs may change; the rest of this documentation uses the stable V1 query function.
Resume and fork require a session ID. Read it from the session_id field on the result message (ResultMessage in Python, SDKResultMessage in TypeScript), which is present on every result regardless of success or error. In TypeScript the ID is also available earlier as a direct field on the init SystemMessage; in Python it's nested inside SystemMessage.data.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage


async def main():
    session_id = None
    async for message in query(
        prompt="Analyze the auth module and suggest improvements",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep"],
        ),
    ):
        if isinstance(message, ResultMessage):
            session_id = message.session_id
            if message.subtype == "success":
                print(message.result)
    print(f"Session ID: {session_id}")
    return session_id


session_id = asyncio.run(main())

import { query } from "@anthropic-ai/claude-agent-sdk";

let sessionId: string | undefined;

for await (const message of query({
  prompt: "Analyze the auth module and suggest improvements",
  options: { allowedTools: ["Read", "Glob", "Grep"] }
})) {
  if (message.type === "result") {
    sessionId = message.session_id;
    if (message.subtype === "success") {
      console.log(message.result);
    }
  }
}

console.log(`Session ID: ${sessionId}`);
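If you want the ID before the result arrives in Python, the init system message is the other place to look. The sketch below is duck-typed and runs against stand-in objects rather than SDK classes. It assumes the nested key inside SystemMessage.data is named session_id, which this section does not spell out, and extract_session_id is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Optional


def extract_session_id(message) -> Optional[str]:
    """Return a session ID from a result message or an init system message."""
    # Result messages expose session_id as a direct attribute.
    direct = getattr(message, "session_id", None)
    if direct:
        return direct
    # Python init messages: the ID is nested in .data (key name assumed).
    data = getattr(message, "data", None)
    if getattr(message, "subtype", None) == "init" and isinstance(data, dict):
        return data.get("session_id")
    return None


# Stand-ins for an init SystemMessage and a ResultMessage.
init_message = SimpleNamespace(subtype="init", data={"session_id": "abc-123"})
result_message = SimpleNamespace(subtype="success", session_id="abc-123")

print(extract_session_id(init_message))    # abc-123
print(extract_session_id(result_message))  # abc-123
```

Reading the ID from the result message remains the simpler path, since that field is documented for both SDKs.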
Pass a session ID to resume to return to that specific session. The agent picks up with full context from wherever the session left off. Common reasons to resume:
- The earlier run stopped with error_max_turns or error_max_budget_usd (see Handle the result); resume with a higher limit.

This example resumes the session from Capture the session ID with a follow-up prompt. Because you're resuming, the agent already has the prior analysis in context:
# Earlier session analyzed the code; now build on that analysis
async for message in query(
    prompt="Now implement the refactoring you suggested",
    options=ClaudeAgentOptions(
        resume=session_id,
        allowed_tools=["Read", "Edit", "Write", "Glob", "Grep"],
    ),
):
    if isinstance(message, ResultMessage) and message.subtype == "success":
        print(message.result)

// Earlier session analyzed the code; now build on that analysis
for await (const message of query({
  prompt: "Now implement the refactoring you suggested",
  options: {
    resume: sessionId,
    allowedTools: ["Read", "Edit", "Write", "Glob", "Grep"]
  }
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
Tip
If a resume call returns a fresh session instead of the expected history, the most common cause is a mismatched cwd. Sessions are stored under ~/.claude/projects/<encoded-cwd>/*.jsonl, where <encoded-cwd> is the absolute working directory with every non-alphanumeric character replaced by - (so /Users/me/proj becomes -Users-me-proj). If your resume call runs from a different directory, the SDK looks in the wrong place. The session file also needs to exist on the current machine.
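The path rule in this tip can be checked before calling resume. The following is a minimal sketch that recomputes the expected transcript location from a working directory and a session ID, using only the rule stated above. The names encode_cwd and session_file are hypothetical helpers, not SDK functions; the home parameter exists only to make the function easy to test, and treating "alphanumeric" as ASCII letters and digits is an assumption.

```python
import re
from pathlib import Path
from typing import Optional


def encode_cwd(cwd: str) -> str:
    """Replace every non-alphanumeric character with '-' (rule from the tip above).

    Assumption: "alphanumeric" means ASCII letters and digits.
    """
    return re.sub(r"[^A-Za-z0-9]", "-", cwd)


def session_file(cwd: str, session_id: str, home: Optional[Path] = None) -> Path:
    """Return where the transcript for session_id is expected to live.

    cwd should be the absolute working directory the session ran in.
    """
    root = (home or Path.home()) / ".claude" / "projects"
    return root / encode_cwd(cwd) / f"{session_id}.jsonl"


# Hypothetical session ID, for illustration only
expected = session_file("/Users/me/proj", "abc123")
if not expected.exists():
    print(f"No transcript at {expected}; check cwd and machine before resuming")
```

Running a check like this from the directory where you intend to resume makes a cwd mismatch visible before the agent starts.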
Forking creates a new session that starts with a copy of the original's history but diverges from that point. The fork gets its own session ID; the original's ID and history stay unchanged. You end up with two independent sessions you can resume separately.
Note
Forking branches the conversation history, not the filesystem. If a forked agent edits files, those changes are real and visible to any session working in the same directory. To branch and revert file changes, use file checkpointing.
This example builds on Capture the session ID: you've already analyzed an auth module in session_id and want to explore OAuth2 without losing the JWT-focused thread. The first block forks the session and captures the fork's ID (forked_id); the second block resumes the original session_id to continue down the JWT path. You now have two session IDs pointing at two separate histories:
# Fork: branch from session_id into a new session
forked_id = None
async for message in query(
    prompt="Instead of JWT, implement OAuth2 for the auth module",
    options=ClaudeAgentOptions(
        resume=session_id,
        fork_session=True,
    ),
):
    if isinstance(message, ResultMessage):
        forked_id = message.session_id  # The fork's ID, distinct from session_id
        if message.subtype == "success":
            print(message.result)
print(f"Forked session: {forked_id}")

# Original session is untouched; resuming it continues the JWT thread
async for message in query(
    prompt="Continue with the JWT approach",
    options=ClaudeAgentOptions(resume=session_id),
):
    if isinstance(message, ResultMessage) and message.subtype == "success":
        print(message.result)
// Fork: branch from sessionId into a new session
let forkedId: string | undefined;
for await (const message of query({
prompt: "Instead of JWT, implement OAuth2 for the auth module",
options: {
resume: sessionId,
forkSession: true
}
})) {
if (message.type === "system" && message.subtype === "init") {
forkedId = message.session_id; // The fork's ID, distinct from sessionId
}
if (message.type === "result" && message.subtype === "success") {
console.log(message.result);
}
}
console.log(`Forked session: ${forkedId}`);
// Original session is untouched; resuming it continues the JWT thread
for await (const message of query({
prompt: "Continue with the JWT approach",
options: { resume: sessionId }
})) {
if (message.type === "result" && message.subtype === "success") {
console.log(message.result);
}
}
Session files are local to the machine that created them. To resume a session on a different host (CI workers, ephemeral containers, serverless), you have two options:
Copy ~/.claude/projects/<encoded-cwd>/<session-id>.jsonl from the first run and restore it to the same path on the new host before calling resume. The cwd must match.
Both SDKs expose functions for enumerating sessions on disk and reading their messages: listSessions and getSessionMessages in TypeScript, list_sessions and get_session_messages in Python. Use them to build custom session pickers, cleanup logic, or transcript viewers.
Both SDKs also expose functions for looking up and mutating individual sessions: get_session_info, rename_session, and tag_session in Python, and getSessionInfo, renameSession, and tagSession in TypeScript. Use them to organize sessions by tag or give them human-readable titles.
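For the option of restoring a session file on a new host, described earlier in this section, the copy step might look like the sketch below. It assumes you have already transferred the .jsonl file to the new machine by some means of your own (artifact storage, a mounted volume); restore_session is a hypothetical helper, not an SDK function, and the directory encoding follows the rule given in the resume tip.

```python
import re
import shutil
from pathlib import Path


def restore_session(saved_file: Path, cwd: str, target_home: Path) -> Path:
    """Copy a saved transcript to the path resume is expected to read on this host.

    cwd must be the same absolute working directory the original run used.
    """
    # Same encoding rule as the session store: non-alphanumerics become "-"
    encoded = re.sub(r"[^A-Za-z0-9]", "-", cwd)
    dest_dir = target_home / ".claude" / "projects" / encoded
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Keep the original file name, which is <session-id>.jsonl
    return Path(shutil.copy2(saved_file, dest_dir / saved_file.name))
```

After the copy, the resume call still has to run with the matching cwd, otherwise the SDK looks under a different encoded directory.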
ClaudeAgentOptions: Full session option reference for Python
Options: Full session option reference for TypeScript
Understanding the two input modes for Claude Agent SDK and when to use each
The Claude Agent SDK supports two distinct input modes for interacting with agents: streaming input mode and single message input.
This guide explains the differences, benefits, and use cases for each mode to help you choose the right approach for your application.
Streaming input mode is the preferred way to use the Claude Agent SDK. It provides full access to the agent's capabilities and enables rich, interactive experiences.
It allows the agent to operate as a long-lived process that takes in user input, handles interruptions, surfaces permission requests, and manages the session.
sequenceDiagram
participant App as Your Application
participant Agent as Claude Agent
participant Tools as Tools/Hooks
participant FS as Environment/<br/>File System
App->>Agent: Initialize with AsyncGenerator
activate Agent
App->>Agent: Yield Message 1
Agent->>Tools: Execute tools
Tools->>FS: Read files
FS-->>Tools: File contents
Tools->>FS: Write/Edit files
FS-->>Tools: Success/Error
Agent-->>App: Stream partial response
Agent-->>App: Stream more content...
Agent->>App: Complete Message 1
App->>Agent: Yield Message 2 + Image
Agent->>Tools: Process image & execute
Tools->>FS: Access filesystem
FS-->>Tools: Operation results
Agent-->>App: Stream response 2
App->>Agent: Queue Message 3
App->>Agent: Interrupt/Cancel
Agent->>App: Handle interruption
Note over App,Agent: Session stays alive
Note over Tools,FS: Persistent file system<br/>state maintained
deactivate Agent
Attach images directly to messages for visual analysis and understanding
Send multiple messages that process sequentially, with ability to interrupt
Full access to all tools and custom MCP servers during the session
Use lifecycle hooks to customize behavior at various points
See responses as they're generated, not just final results
Maintain conversation context across multiple turns naturally
import { query } from "@anthropic-ai/claude-agent-sdk";
import { readFile } from "fs/promises";
async function* generateMessages() {
// First message
yield {
type: "user" as const,
message: {
role: "user" as const,
content: "Analyze this codebase for security issues"
}
};
// Wait for conditions or user input
await new Promise((resolve) => setTimeout(resolve, 2000));
// Follow-up with image
yield {
type: "user" as const,
message: {
role: "user" as const,
content: [
{
type: "text",
text: "Review this architecture diagram"
},
{
type: "image",
source: {
type: "base64",
media_type: "image/png",
data: await readFile("diagram.png", "base64")
}
}
]
}
};
}
// Process streaming responses
for await (const message of query({
prompt: generateMessages(),
options: {
maxTurns: 10,
allowedTools: ["Read", "Grep"]
}
})) {
if (message.type === "result") {
console.log(message.result);
}
}
from claude_agent_sdk import (
    ClaudeSDKClient,
    ClaudeAgentOptions,
    AssistantMessage,
    TextBlock,
)
import asyncio
import base64


async def streaming_analysis():
    async def message_generator():
        # First message
        yield {
            "type": "user",
            "message": {
                "role": "user",
                "content": "Analyze this codebase for security issues",
            },
        }
        # Wait for conditions
        await asyncio.sleep(2)
        # Follow-up with image
        with open("diagram.png", "rb") as f:
            image_data = base64.b64encode(f.read()).decode()
        yield {
            "type": "user",
            "message": {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Review this architecture diagram"},
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/png",
                            "data": image_data,
                        },
                    },
                ],
            },
        }

    # Use ClaudeSDKClient for streaming input
    options = ClaudeAgentOptions(max_turns=10, allowed_tools=["Read", "Grep"])
    async with ClaudeSDKClient(options) as client:
        # Send streaming input
        await client.query(message_generator())
        # Process responses
        async for message in client.receive_response():
            if isinstance(message, AssistantMessage):
                for block in message.content:
                    if isinstance(block, TextBlock):
                        print(block.text)


asyncio.run(streaming_analysis())
Single message input is simpler but more limited.
Use single message input when:
Warning
Single message input mode does not support:
import { query } from "@anthropic-ai/claude-agent-sdk";
// Simple one-shot query
for await (const message of query({
prompt: "Explain the authentication flow",
options: {
maxTurns: 1,
allowedTools: ["Read", "Grep"]
}
})) {
if (message.type === "result") {
console.log(message.result);
}
}
// Continue conversation with session management
for await (const message of query({
prompt: "Now explain the authorization process",
options: {
continue: true,
maxTurns: 1
}
})) {
if (message.type === "result") {
console.log(message.result);
}
}
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
import asyncio


async def single_message_example():
    # Simple one-shot query using query() function
    async for message in query(
        prompt="Explain the authentication flow",
        options=ClaudeAgentOptions(max_turns=1, allowed_tools=["Read", "Grep"]),
    ):
        if isinstance(message, ResultMessage):
            print(message.result)

    # Continue conversation with session management
    async for message in query(
        prompt="Now explain the authorization process",
        options=ClaudeAgentOptions(continue_conversation=True, max_turns=1),
    ):
        if isinstance(message, ResultMessage):
            print(message.result)


asyncio.run(single_message_example())
Surface Claude's approval requests and clarifying questions to users, then return their decisions to the SDK.
While working on a task, Claude sometimes needs to check in with users. It might need permission before deleting files, or need to ask which database to use for a new project. Your application needs to surface these requests to users so Claude can continue with their input.
Claude requests user input in two situations: when it needs permission to use a tool (like deleting files or running commands), and when it has clarifying questions (via the AskUserQuestion tool). Both trigger your canUseTool callback, which pauses execution until you return a response. This is different from normal conversation turns where Claude finishes and waits for your next message.
For clarifying questions, Claude generates the questions and options. Your role is to present them to users and return their selections. You can't add your own questions to this flow; if you need to ask users something yourself, do that separately in your application logic.
This guide shows you how to detect each type of request and respond appropriately.
Pass a canUseTool callback in your query options. The callback fires whenever Claude needs user input, receiving the tool name and input as arguments:
async def handle_tool_request(tool_name, input_data, context):
    # Prompt user and return allow or deny
    ...


options = ClaudeAgentOptions(can_use_tool=handle_tool_request)
async function handleToolRequest(toolName, input, options) {
// options includes { signal: AbortSignal, suggestions?: PermissionUpdate[] }
// Prompt user and return allow or deny
}
const options = { canUseTool: handleToolRequest };
The callback fires in two cases:
Claude wants to use a tool that isn't auto-approved. Check tool_name for the tool (e.g., "Bash", "Write").
Claude calls the AskUserQuestion tool. Check if tool_name == "AskUserQuestion" to handle it differently. If you specify a tools array, include AskUserQuestion for this to work. See Handle clarifying questions for details.
Note
To automatically allow or deny tools without prompting users, use hooks instead. Hooks execute before canUseTool and can allow, deny, or modify requests based on your own logic. You can also use the PermissionRequest hook to send external notifications (Slack, email, push) when Claude is waiting for approval.
Once you've passed a canUseTool callback in your query options, it fires when Claude wants to use a tool that isn't auto-approved. Your callback receives three arguments:
| Argument | Description |
|---|---|
| toolName | The name of the tool Claude wants to use (e.g., "Bash", "Write", "Edit") |
| input | The parameters Claude is passing to the tool. Contents vary by tool. |
| options (TS) / context (Python) | Additional context including optional suggestions (proposed PermissionUpdate entries to avoid re-prompting) and a cancellation signal. In TypeScript, signal is an AbortSignal; in Python, the signal field is reserved for future use. See ToolPermissionContext for Python. |
The input object contains tool-specific parameters. Common examples:
| Tool | Input fields |
|---|---|
| Bash | command, description, timeout |
| Write | file_path, content |
| Edit | file_path, old_string, new_string |
| Read | file_path, offset, limit |
See the SDK reference for complete input schemas: Python | TypeScript.
You can display this information to the user so they can decide whether to allow or reject the action, then return the appropriate response.
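As a rough illustration of turning those input fields into something a user can read, the sketch below formats a one-line summary per tool. describe_tool_request is a hypothetical helper, not part of the SDK; it relies only on the field names listed in the table above and falls back to the raw input for any other tool.

```python
def describe_tool_request(tool_name: str, input_data: dict) -> str:
    """Build a short, human-readable summary of a tool request."""
    if tool_name == "Bash":
        summary = f"Run command: {input_data.get('command', '')}"
        if input_data.get("description"):
            summary += f" ({input_data['description']})"
        return summary
    if tool_name in ("Write", "Edit", "Read"):
        verb = {"Write": "Write to", "Edit": "Edit", "Read": "Read"}[tool_name]
        return f"{verb} file: {input_data.get('file_path', '')}"
    # Other tools: show the raw input rather than guessing at its fields
    return f"{tool_name}: {input_data}"


print(describe_tool_request("Bash", {"command": "ls /tmp", "description": "List temp files"}))
# Run command: ls /tmp (List temp files)
```

A summary like this can be shown in the approval prompt in place of the raw input dictionary.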
The following example asks Claude to create and delete a test file. When Claude attempts each operation, the callback prints the tool request to the terminal and prompts for y/n approval.
import asyncio
from claude_agent_sdk import ClaudeAgentOptions, ResultMessage, query
from claude_agent_sdk.types import (
    HookMatcher,
    PermissionResultAllow,
    PermissionResultDeny,
    ToolPermissionContext,
)


async def can_use_tool(
    tool_name: str, input_data: dict, context: ToolPermissionContext
) -> PermissionResultAllow | PermissionResultDeny:
    # Display the tool request
    print(f"\nTool: {tool_name}")
    if tool_name == "Bash":
        print(f"Command: {input_data.get('command')}")
        if input_data.get("description"):
            print(f"Description: {input_data.get('description')}")
    else:
        print(f"Input: {input_data}")
    # Get user approval
    response = input("Allow this action? (y/n): ")
    # Return allow or deny based on user's response
    if response.lower() == "y":
        # Allow: tool executes with the original (or modified) input
        return PermissionResultAllow(updated_input=input_data)
    else:
        # Deny: tool doesn't execute, Claude sees the message
        return PermissionResultDeny(message="User denied this action")


# Required workaround: dummy hook keeps the stream open for can_use_tool
async def dummy_hook(input_data, tool_use_id, context):
    return {"continue_": True}


async def prompt_stream():
    yield {
        "type": "user",
        "message": {
            "role": "user",
            "content": "Create a test file in /tmp and then delete it",
        },
    }


async def main():
    async for message in query(
        prompt=prompt_stream(),
        options=ClaudeAgentOptions(
            can_use_tool=can_use_tool,
            hooks={"PreToolUse": [HookMatcher(matcher=None, hooks=[dummy_hook])]},
        ),
    ):
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(message.result)


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
import * as readline from "readline";
// Helper to prompt user for input in the terminal
function prompt(question: string): Promise<string> {
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout
});
return new Promise((resolve) =>
rl.question(question, (answer) => {
rl.close();
resolve(answer);
})
);
}
for await (const message of query({
prompt: "Create a test file in /tmp and then delete it",
options: {
canUseTool: async (toolName, input) => {
// Display the tool request
console.log(`\nTool: ${toolName}`);
if (toolName === "Bash") {
console.log(`Command: ${input.command}`);
if (input.description) console.log(`Description: ${input.description}`);
} else {
console.log(`Input: ${JSON.stringify(input, null, 2)}`);
}
// Get user approval
const response = await prompt("Allow this action? (y/n): ");
// Return allow or deny based on user's response
if (response.toLowerCase() === "y") {
// Allow: tool executes with the original (or modified) input
return { behavior: "allow", updatedInput: input };
} else {
// Deny: tool doesn't execute, Claude sees the message
return { behavior: "deny", message: "User denied this action" };
}
}
}
})) {
if ("result" in message) console.log(message.result);
}
Note
In Python, can_use_tool requires streaming mode and a PreToolUse hook that returns {"continue_": True} to keep the stream open. Without this hook, the stream closes before the permission callback can be invoked.
This example uses a y/n flow where any input other than y is treated as a denial. In practice, you might build a richer UI that lets users modify the request, provide feedback, or redirect Claude entirely. See Respond to tool requests for all the ways you can respond.
Your callback returns one of two response types:
| Response | Python | TypeScript |
|---|---|---|
| Allow | PermissionResultAllow(updated_input=...) | { behavior: "allow", updatedInput } |
| Deny | PermissionResultDeny(message=...) | { behavior: "deny", message } |
When allowing, pass the tool input (original or modified). When denying, provide a message explaining why. Claude sees this message and may adjust its approach.
from claude_agent_sdk.types import PermissionResultAllow, PermissionResultDeny
# Allow the tool to execute
return PermissionResultAllow(updated_input=input_data)
# Block the tool
return PermissionResultDeny(message="User rejected this action")
// Allow the tool to execute
return { behavior: "allow", updatedInput: input };
// Block the tool
return { behavior: "deny", message: "User rejected this action" };
Beyond allowing or denying, you can modify the tool's input or provide context that helps Claude adjust its approach:
The user approves the action as-is. Pass through the input from your callback unchanged and the tool executes exactly as Claude requested.
async def can_use_tool(tool_name, input_data, context):
    print(f"Claude wants to use {tool_name}")
    approved = await ask_user("Allow this action?")
    if approved:
        return PermissionResultAllow(updated_input=input_data)
    return PermissionResultDeny(message="User declined")
canUseTool: async (toolName, input) => {
console.log(`Claude wants to use ${toolName}`);
const approved = await askUser("Allow this action?");
if (approved) {
return { behavior: "allow", updatedInput: input };
}
return { behavior: "deny", message: "User declined" };
};
The user approves but wants to modify the request first. You can change the input before the tool executes. Claude sees the result but isn't told you changed anything. Useful for sanitizing parameters, adding constraints, or scoping access.
async def can_use_tool(tool_name, input_data, context):
    if tool_name == "Bash":
        # User approved, but scope all commands to sandbox
        sandboxed_input = {**input_data}
        sandboxed_input["command"] = input_data["command"].replace(
            "/tmp", "/tmp/sandbox"
        )
        return PermissionResultAllow(updated_input=sandboxed_input)
    return PermissionResultAllow(updated_input=input_data)
canUseTool: async (toolName, input) => {
if (toolName === "Bash") {
// User approved, but scope all commands to sandbox
const sandboxedInput = {
...input,
command: input.command.replace("/tmp", "/tmp/sandbox")
};
return { behavior: "allow", updatedInput: sandboxedInput };
}
return { behavior: "allow", updatedInput: input };
};
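The string replacement above is intentionally simple; one side effect is that a command already pointing at /tmp/sandbox would be rewritten to /tmp/sandbox/sandbox. For tools that take a single file_path (Write, Edit, Read), a path-aware rewrite is one possible alternative. The sketch below is an assumption-laden illustration, not SDK behavior: scope_to_sandbox is a hypothetical helper, it handles POSIX paths only, and it normalizes paths lexically without resolving symlinks.

```python
import posixpath
from pathlib import PurePosixPath


def scope_to_sandbox(file_path: str, sandbox: str = "/tmp/sandbox") -> str:
    """Re-root a path under sandbox unless it already points inside it."""
    root = PurePosixPath(sandbox)
    # Relative paths are taken relative to the sandbox; an absolute
    # file_path replaces the sandbox prefix in posixpath.join.
    joined = posixpath.join(sandbox, file_path)
    # Collapse "." and ".." so "/tmp/sandbox/../etc" cannot pass as inside
    path = PurePosixPath(posixpath.normpath(joined))
    if path == root or root in path.parents:
        return str(path)
    # Strip the leading "/" so the path nests under the sandbox
    return str(root / path.relative_to(path.anchor))


print(scope_to_sandbox("/etc/passwd"))         # /tmp/sandbox/etc/passwd
print(scope_to_sandbox("/tmp/sandbox/a.txt"))  # /tmp/sandbox/a.txt
```

Inside a callback, the returned value would replace file_path in a copy of the input before it is passed back as the updated input.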
The user doesn't want this action to happen. Block the tool and provide a message explaining why. Claude sees this message and may try a different approach.
async def can_use_tool(tool_name, input_data, context):
    approved = await ask_user(f"Allow {tool_name}?")
    if not approved:
        return PermissionResultDeny(message="User rejected this action")
    return PermissionResultAllow(updated_input=input_data)
canUseTool: async (toolName, input) => {
const approved = await askUser(`Allow ${toolName}?`);
if (!approved) {
return {
behavior: "deny",
message: "User rejected this action"
};
}
return { behavior: "allow", updatedInput: input };
};
The user doesn't want this specific action, but has a different idea. Block the tool and include guidance in your message. Claude will read this and decide how to proceed based on your feedback.
async def can_use_tool(tool_name, input_data, context):
    if tool_name == "Bash" and "rm" in input_data.get("command", ""):
        # User doesn't want to delete, suggest archiving instead
        return PermissionResultDeny(
            message="User doesn't want to delete files. They asked if you could compress them into an archive instead."
        )
    return PermissionResultAllow(updated_input=input_data)
canUseTool: async (toolName, input) => {
if (toolName === "Bash" && input.command.includes("rm")) {
// User doesn't want to delete, suggest archiving instead
return {
behavior: "deny",
message:
"User doesn't want to delete files. They asked if you could compress them into an archive instead."
};
}
return { behavior: "allow", updatedInput: input };
};
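The substring check in these examples keeps the code short, but it also matches commands that merely contain the letters "rm", such as npm run format. If that matters for your application, a token-based check is one option. The following sketch uses Python's standard shlex module; looks_like_delete is a hypothetical helper, and it is a heuristic only: it still flags a quoted argument that happens to be the word rm, and it knows nothing about other ways of deleting files.

```python
import shlex
from pathlib import PurePosixPath


def looks_like_delete(command: str) -> bool:
    """Return True if any shell token is the rm executable (matched by basename)."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    lexer.commenters = ""  # do not drop anything after "#"
    try:
        tokens = list(lexer)
    except ValueError:
        # Unbalanced quotes: be conservative and treat the command as risky
        return True
    return any(PurePosixPath(token).name == "rm" for token in tokens)


print(looks_like_delete("npm run format"))  # False
print(looks_like_delete("ls;rm -rf build"))  # True
```

Setting punctuation_chars makes the lexer split on shell operators such as ; and &&, so chained commands are inspected token by token.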
For a complete change of direction (not just a nudge), use streaming input to send Claude a new instruction directly. This bypasses the current tool request and gives Claude entirely new instructions to follow.
When Claude needs more direction on a task with multiple valid approaches, it calls the AskUserQuestion tool. This triggers your canUseTool callback with toolName set to AskUserQuestion. The input contains Claude's questions as multiple-choice options; display them to the user and return their selections.
Tip
Clarifying questions are especially common in plan mode, where Claude explores the codebase and asks questions before proposing a plan. This makes plan mode ideal for interactive workflows where you want Claude to gather requirements before making changes.
The following steps show how to handle clarifying questions:
Pass a canUseTool callback in your query options. By default, AskUserQuestion is available. If you specify a tools array to restrict Claude's capabilities (for example, a read-only agent with only Read, Glob, and Grep), include AskUserQuestion in that array. Otherwise, Claude won't be able to ask clarifying questions:
async for message in query(
    prompt="Analyze this codebase",
    options=ClaudeAgentOptions(
        # Include AskUserQuestion in your tools list
        tools=["Read", "Glob", "Grep", "AskUserQuestion"],
        can_use_tool=can_use_tool,
    ),
):
    print(message)
for await (const message of query({
prompt: "Analyze this codebase",
options: {
// Include AskUserQuestion in your tools list
tools: ["Read", "Glob", "Grep", "AskUserQuestion"],
canUseTool: async (toolName, input) => {
// Handle clarifying questions here
}
}
})) {
console.log(message);
}
In your callback, check if toolName equals AskUserQuestion to handle it differently from other tools:
async def can_use_tool(tool_name: str, input_data: dict, context):
    if tool_name == "AskUserQuestion":
        # Your implementation to collect answers from the user
        return await handle_clarifying_questions(input_data)
    # Handle other tools normally
    return await prompt_for_approval(tool_name, input_data)
canUseTool: async (toolName, input) => {
if (toolName === "AskUserQuestion") {
// Your implementation to collect answers from the user
return handleClarifyingQuestions(input);
}
// Handle other tools normally
return promptForApproval(toolName, input);
};
The input contains Claude's questions in a questions array. Each question has a question (the text to display), options (the choices), and multiSelect (whether multiple selections are allowed):
{
"questions": [
{
"question": "How should I format the output?",
"header": "Format",
"options": [
{ "label": "Summary", "description": "Brief overview" },
{ "label": "Detailed", "description": "Full explanation" }
],
"multiSelect": false
},
{
"question": "Which sections should I include?",
"header": "Sections",
"options": [
{ "label": "Introduction", "description": "Opening context" },
{ "label": "Conclusion", "description": "Final summary" }
],
"multiSelect": true
}
]
}
See Question format for full field descriptions.
Present the questions to the user and collect their selections. How you do this depends on your application: a terminal prompt, a web form, a mobile dialog, etc.
Build the answers object as a record where each key is the question text and each value is the selected option's label:
| From the question object | Use as |
|---|---|
| question field (e.g., "How should I format the output?") | Key |
| Selected option's label field (e.g., "Summary") | Value |
For multi-select questions, join multiple labels with ", ". If you support free-text input, use the user's custom text as the value.
return PermissionResultAllow(
    updated_input={
        "questions": input_data.get("questions", []),
        "answers": {
            "How should I format the output?": "Summary",
            "Which sections should I include?": "Introduction, Conclusion",
        },
    }
)
return {
behavior: "allow",
updatedInput: {
questions: input.questions,
answers: {
"How should I format the output?": "Summary",
"Which sections should I include?": "Introduction, Conclusion"
}
}
};
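The hard-coded answers above would normally come from your UI. As a minimal sketch of that mapping step, the function below builds the answers object from a set of user selections. build_answers and the shape of selections (a list of 0-based option indices, or a free-text string) are assumptions made for illustration; only the key and value rules come from the table above.

```python
def build_answers(questions: list, selections: dict) -> dict:
    """Map each question's text to the chosen label(s) or free text.

    selections maps question text to either a list of 0-based option
    indices or a free-text string (a hypothetical UI output format).
    """
    answers = {}
    for q in questions:
        choice = selections[q["question"]]
        if isinstance(choice, str):
            answers[q["question"]] = choice  # free text is used as-is
        else:
            labels = [q["options"][i]["label"] for i in choice]
            answers[q["question"]] = ", ".join(labels)  # multi-select join
    return answers


questions = [
    {
        "question": "How should I format the output?",
        "options": [{"label": "Summary"}, {"label": "Detailed"}],
        "multiSelect": False,
    },
    {
        "question": "Which sections should I include?",
        "options": [{"label": "Introduction"}, {"label": "Conclusion"}],
        "multiSelect": True,
    },
]
selections = {
    "How should I format the output?": [0],
    "Which sections should I include?": [0, 1],
}
print(build_answers(questions, selections))
```

The resulting dictionary is what goes under the answers key, next to the original questions array.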
The input contains Claude's generated questions in a questions array. Each question has these fields:
| Field | Description |
|---|---|
| question | The full question text to display |
| header | Short label for the question (max 12 characters) |
| options | Array of 2-4 choices, each with label and description. TypeScript: optionally preview (see below) |
| multiSelect | If true, users can select multiple options |
The structure your callback receives:
{
"questions": [
{
"question": "How should I format the output?",
"header": "Format",
"options": [
{ "label": "Summary", "description": "Brief overview of key points" },
{ "label": "Detailed", "description": "Full explanation with examples" }
],
"multiSelect": false
}
]
}
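If your UI depends on the limits described in this guide (1-4 questions per call, 2-4 options per question, headers of at most 12 characters), a defensive check before rendering can catch surprises early. This is a sketch under those stated limits; validate_questions is a hypothetical helper, not something the SDK provides or requires.

```python
def validate_questions(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload fits the documented shape."""
    problems = []
    questions = payload.get("questions", [])
    if not 1 <= len(questions) <= 4:
        problems.append(f"expected 1-4 questions, got {len(questions)}")
    for i, q in enumerate(questions):
        for field in ("question", "header", "options"):
            if field not in q:
                problems.append(f"question {i}: missing {field}")
        if len(q.get("header", "")) > 12:
            problems.append(f"question {i}: header longer than 12 characters")
        options = q.get("options", [])
        if not 2 <= len(options) <= 4:
            problems.append(f"question {i}: expected 2-4 options, got {len(options)}")
        for j, opt in enumerate(options):
            if "label" not in opt or "description" not in opt:
                problems.append(f"question {i}, option {j}: missing label or description")
    return problems


payload = {
    "questions": [
        {
            "question": "How should I format the output?",
            "header": "Format",
            "options": [
                {"label": "Summary", "description": "Brief overview of key points"},
                {"label": "Detailed", "description": "Full explanation with examples"},
            ],
            "multiSelect": False,
        }
    ]
}
print(validate_questions(payload))  # []
```

Logging the returned problems, instead of raising, lets the application still show the question when a limit changes in a later SDK version.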
toolConfig.askUserQuestion.previewFormat adds a preview field to each option so your app can show a visual mockup alongside the label. Without this setting, Claude does not generate previews and the field is absent.
| previewFormat | preview contains |
|---|---|
| unset (default) | Field is absent. Claude does not generate previews. |
| "markdown" | ASCII art and fenced code blocks |
| "html" | A styled <div> fragment (the SDK rejects <script>, <style>, and <!DOCTYPE> before your callback runs) |
The format applies to all questions in the session. Claude includes preview on options where a visual comparison helps (layout choices, color schemes) and omits it where one wouldn't (yes/no confirmations, text-only choices). Check for undefined before rendering.
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Help me choose a card layout",
options: {
toolConfig: {
askUserQuestion: { previewFormat: "html" }
},
canUseTool: async (toolName, input) => {
// input.questions[].options[].preview is an HTML string or undefined
return { behavior: "allow", updatedInput: input };
}
}
})) {
// ...
}
An option with an HTML preview:
{
"label": "Compact",
"description": "Title and metric value only",
"preview": "<div style=\"padding:12px;border:1px solid #ddd;border-radius:8px\"><div style=\"font-size:12px;color:#666\">Active users</div><div style=\"font-size:28px;font-weight:600\">1,284</div></div>"
}
Return an answers object mapping each question's question field to the selected option's label:
| Field | Description |
|---|---|
| questions | Pass through the original questions array (required for tool processing) |
| answers | Object where keys are question text and values are selected labels |
For multi-select questions, join multiple labels with ", ". For free-text input, use the user's custom text directly.
{
"questions": [
// ...
],
"answers": {
"How should I format the output?": "Summary",
"Which sections should I include?": "Introduction, Conclusion"
}
}
Claude's predefined options won't always cover what users want. To let users type their own answer, accept free-text input alongside the listed options and use the typed text as the answer value.
See the complete example below for a full implementation.
Claude asks clarifying questions when it needs user input to proceed. For example, when asked to help decide on a tech stack for a mobile app, Claude might ask about cross-platform vs native, backend preferences, or target platforms. These questions help Claude make decisions that match the user's preferences rather than guessing.
This example handles those questions in a terminal application. Here's what happens at each step:
The canUseTool callback checks if the tool name is "AskUserQuestion" and routes to a dedicated handler
The handler loops through the questions array and prints each question with numbered options
The response includes the original questions array and the answers mapping
import asyncio
from claude_agent_sdk import ClaudeAgentOptions, ResultMessage, query
from claude_agent_sdk.types import HookMatcher, PermissionResultAllow


def parse_response(response: str, options: list) -> str:
    """Parse user input as option number(s) or free text."""
    try:
        indices = [int(s.strip()) - 1 for s in response.split(",")]
        labels = [options[i]["label"] for i in indices if 0 <= i < len(options)]
        return ", ".join(labels) if labels else response
    except ValueError:
        return response


async def handle_ask_user_question(input_data: dict) -> PermissionResultAllow:
    """Display Claude's questions and collect user answers."""
    answers = {}
    for q in input_data.get("questions", []):
        print(f"\n{q['header']}: {q['question']}")
        options = q["options"]
        for i, opt in enumerate(options):
            print(f" {i + 1}. {opt['label']} - {opt['description']}")
        if q.get("multiSelect"):
            print(" (Enter numbers separated by commas, or type your own answer)")
        else:
            print(" (Enter a number, or type your own answer)")
        response = input("Your choice: ").strip()
        answers[q["question"]] = parse_response(response, options)
    return PermissionResultAllow(
        updated_input={
            "questions": input_data.get("questions", []),
            "answers": answers,
        }
    )


async def can_use_tool(
    tool_name: str, input_data: dict, context
) -> PermissionResultAllow:
    # Route AskUserQuestion to our question handler
    if tool_name == "AskUserQuestion":
        return await handle_ask_user_question(input_data)
    # Auto-approve other tools for this example
    return PermissionResultAllow(updated_input=input_data)


async def prompt_stream():
    yield {
        "type": "user",
        "message": {
            "role": "user",
            "content": "Help me decide on the tech stack for a new mobile app",
        },
    }


# Required workaround: dummy hook keeps the stream open for can_use_tool
async def dummy_hook(input_data, tool_use_id, context):
    return {"continue_": True}


async def main():
    async for message in query(
        prompt=prompt_stream(),
        options=ClaudeAgentOptions(
            can_use_tool=can_use_tool,
            hooks={"PreToolUse": [HookMatcher(matcher=None, hooks=[dummy_hook])]},
        ),
    ):
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(message.result)


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
import * as readline from "readline/promises";
// Helper to prompt user for input in the terminal
async function prompt(question: string): Promise<string> {
const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
const answer = await rl.question(question);
rl.close();
return answer;
}
// Parse user input as option number(s) or free text
function parseResponse(response: string, options: any[]): string {
const indices = response.split(",").map((s) => parseInt(s.trim()) - 1);
const labels = indices
.filter((i) => !isNaN(i) && i >= 0 && i < options.length)
.map((i) => options[i].label);
return labels.length > 0 ? labels.join(", ") : response;
}
// Display Claude's questions and collect user answers
async function handleAskUserQuestion(input: any) {
const answers: Record<string, string> = {};
for (const q of input.questions) {
console.log(`\n${q.header}: ${q.question}`);
const options = q.options;
options.forEach((opt: any, i: number) => {
console.log(` ${i + 1}. ${opt.label} - ${opt.description}`);
});
if (q.multiSelect) {
console.log(" (Enter numbers separated by commas, or type your own answer)");
} else {
console.log(" (Enter a number, or type your own answer)");
}
const response = (await prompt("Your choice: ")).trim();
answers[q.question] = parseResponse(response, options);
}
// Return the answers to Claude (must include original questions)
return {
behavior: "allow",
updatedInput: { questions: input.questions, answers }
};
}
async function main() {
for await (const message of query({
prompt: "Help me decide on the tech stack for a new mobile app",
options: {
canUseTool: async (toolName, input) => {
// Route AskUserQuestion to our question handler
if (toolName === "AskUserQuestion") {
return handleAskUserQuestion(input);
}
// Auto-approve other tools for this example
return { behavior: "allow", updatedInput: input };
}
}
})) {
if ("result" in message) console.log(message.result);
}
}
main();
- AskUserQuestion is not currently available in subagents spawned via the Agent tool
- Each AskUserQuestion call supports 1-4 questions with 2-4 options each
The canUseTool callback and AskUserQuestion tool cover most approval and clarification scenarios, but the SDK offers other ways to get input from users:
Use streaming input when you need to:
Streaming input is ideal for conversational UIs where users interact with the agent throughout execution, not just at approval checkpoints.
Use custom tools when you need to:
AskUserQuestion's multiple-choice format
Custom tools give you full control over the interaction, but require more implementation work than using the built-in canUseTool callback.
Get real-time responses from the Agent SDK as text and tool calls stream in
By default, the Agent SDK yields complete AssistantMessage objects after Claude finishes generating each response. To receive incremental updates as text and tool calls are generated, enable partial message streaming by setting include_partial_messages (Python) or includePartialMessages (TypeScript) to true in your options.
Tip
This page covers output streaming (receiving tokens in real-time). For input modes (how you send messages), see Send messages to agents. You can also stream responses using the Agent SDK via the CLI.
To enable streaming, set include_partial_messages (Python) or includePartialMessages (TypeScript) to true in your options. This causes the SDK to yield StreamEvent messages containing raw API events as they arrive, in addition to the usual AssistantMessage and ResultMessage.
Your code then needs to:
- Distinguish StreamEvent from other message types
- For each StreamEvent, extract the event field and check its type
- Look for content_block_delta events where delta.type is text_delta, which contain the actual text chunks
The example below enables streaming and prints text chunks as they arrive. Notice the nested type checks: first for StreamEvent, then for content_block_delta, then for text_delta:
from claude_agent_sdk import query, ClaudeAgentOptions
from claude_agent_sdk.types import StreamEvent
import asyncio
async def stream_response():
    options = ClaudeAgentOptions(
        include_partial_messages=True,
        allowed_tools=["Bash", "Read"],
    )
    async for message in query(prompt="List the files in my project", options=options):
        if isinstance(message, StreamEvent):
            event = message.event
            if event.get("type") == "content_block_delta":
                delta = event.get("delta", {})
                if delta.get("type") == "text_delta":
                    print(delta.get("text", ""), end="", flush=True)
asyncio.run(stream_response())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "List the files in my project",
options: {
includePartialMessages: true,
allowedTools: ["Bash", "Read"]
}
})) {
if (message.type === "stream_event") {
const event = message.event;
if (event.type === "content_block_delta") {
if (event.delta.type === "text_delta") {
process.stdout.write(event.delta.text);
}
}
}
}
When partial messages are enabled, you receive raw Claude API streaming events wrapped in an object. The type has different names in each SDK:
- Python: StreamEvent (import from claude_agent_sdk.types)
- TypeScript: SDKPartialAssistantMessage with type: 'stream_event'
Both contain raw Claude API events, not accumulated text. You need to extract and accumulate text deltas yourself. Here's the structure of each type:
@dataclass
class StreamEvent:
    uuid: str  # Unique identifier for this event
    session_id: str  # Session identifier
    event: dict[str, Any]  # The raw Claude API stream event
    parent_tool_use_id: str | None  # Parent tool ID if from a subagent
type SDKPartialAssistantMessage = {
type: "stream_event";
event: RawMessageStreamEvent; // From Anthropic SDK
parent_tool_use_id: string | null;
uuid: UUID;
session_id: string;
};
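Because each event carries only a fragment, accumulating text is left to your code. The following is a minimal sketch of that accumulation, not part of the SDK: it works on plain dicts shaped like the raw events described here, and the `accumulate_text` helper and the hand-written `sample_events` list are illustrative assumptions rather than real SDK output.

```python
from typing import Any, Iterable


def accumulate_text(events: Iterable[dict[str, Any]]) -> str:
    """Join the text_delta chunks found in a sequence of raw stream events."""
    chunks: list[str] = []
    for event in events:
        if event.get("type") != "content_block_delta":
            continue  # skip message_start, content_block_start, and so on
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            chunks.append(delta.get("text", ""))
    return "".join(chunks)


# Hand-written sample events (assumed shape, for illustration only)
sample_events = [
    {"type": "message_start"},
    {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": ", world"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "message_stop"},
]

print(accumulate_text(sample_events))  # prints "Hello, world"
```

In a real agent you would pass `message.event` from each StreamEvent into the same logic instead of a prepared list.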
The event field contains the raw streaming event from the Claude API. Common event types include:
| Event Type | Description |
|---|---|
| message_start | Start of a new message |
| content_block_start | Start of a new content block (text or tool use) |
| content_block_delta | Incremental update to content |
| content_block_stop | End of a content block |
| message_delta | Message-level updates (stop reason, usage) |
| message_stop | End of the message |
With partial messages enabled, you receive messages in this order:
StreamEvent (message_start)
StreamEvent (content_block_start) - text block
StreamEvent (content_block_delta) - text chunks...
StreamEvent (content_block_stop)
StreamEvent (content_block_start) - tool_use block
StreamEvent (content_block_delta) - tool input chunks...
StreamEvent (content_block_stop)
StreamEvent (message_delta)
StreamEvent (message_stop)
AssistantMessage - complete message with all content
... tool executes ...
... more streaming events for next turn ...
ResultMessage - final result
Without partial messages enabled (include_partial_messages in Python, includePartialMessages in TypeScript), you receive all message types except StreamEvent. Common types include SystemMessage (session initialization), AssistantMessage (complete responses), ResultMessage (final result), and a compact boundary message indicating when conversation history was compacted (SDKCompactBoundaryMessage in TypeScript; SystemMessage with subtype "compact_boundary" in Python).
To display text as it's generated, look for content_block_delta events where delta.type is text_delta. These contain the incremental text chunks. The example below prints each chunk as it arrives:
from claude_agent_sdk import query, ClaudeAgentOptions
from claude_agent_sdk.types import StreamEvent
import asyncio
async def stream_text():
    options = ClaudeAgentOptions(include_partial_messages=True)
    async for message in query(prompt="Explain how databases work", options=options):
        if isinstance(message, StreamEvent):
            event = message.event
            if event.get("type") == "content_block_delta":
                delta = event.get("delta", {})
                if delta.get("type") == "text_delta":
                    # Print each text chunk as it arrives
                    print(delta.get("text", ""), end="", flush=True)
    print()  # Final newline
asyncio.run(stream_text())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Explain how databases work",
options: { includePartialMessages: true }
})) {
if (message.type === "stream_event") {
const event = message.event;
if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
process.stdout.write(event.delta.text);
}
}
}
console.log(); // Final newline
Tool calls also stream incrementally. You can track when tools start, receive their input as it's generated, and see when they complete. The example below tracks the current tool being called and accumulates the JSON input as it streams in. It uses three event types:
- content_block_start: tool begins
- content_block_delta with input_json_delta: input chunks arrive
- content_block_stop: tool call complete

from claude_agent_sdk import query, ClaudeAgentOptions
from claude_agent_sdk.types import StreamEvent
import asyncio
async def stream_tool_calls():
    options = ClaudeAgentOptions(
        include_partial_messages=True,
        allowed_tools=["Read", "Bash"],
    )
    # Track the current tool and accumulate its input JSON
    current_tool = None
    tool_input = ""
    async for message in query(prompt="Read the README.md file", options=options):
        if isinstance(message, StreamEvent):
            event = message.event
            event_type = event.get("type")
            if event_type == "content_block_start":
                # New tool call is starting
                content_block = event.get("content_block", {})
                if content_block.get("type") == "tool_use":
                    current_tool = content_block.get("name")
                    tool_input = ""
                    print(f"Starting tool: {current_tool}")
            elif event_type == "content_block_delta":
                delta = event.get("delta", {})
                if delta.get("type") == "input_json_delta":
                    # Accumulate JSON input as it streams in
                    chunk = delta.get("partial_json", "")
                    tool_input += chunk
                    print(f" Input chunk: {chunk}")
            elif event_type == "content_block_stop":
                # Tool call complete - show final input
                if current_tool:
                    print(f"Tool {current_tool} called with: {tool_input}")
                    current_tool = None
asyncio.run(stream_tool_calls())
import { query } from "@anthropic-ai/claude-agent-sdk";
// Track the current tool and accumulate its input JSON
let currentTool: string | null = null;
let toolInput = "";
for await (const message of query({
prompt: "Read the README.md file",
options: {
includePartialMessages: true,
allowedTools: ["Read", "Bash"]
}
})) {
if (message.type === "stream_event") {
const event = message.event;
if (event.type === "content_block_start") {
// New tool call is starting
if (event.content_block.type === "tool_use") {
currentTool = event.content_block.name;
toolInput = "";
console.log(`Starting tool: ${currentTool}`);
}
} else if (event.type === "content_block_delta") {
if (event.delta.type === "input_json_delta") {
// Accumulate JSON input as it streams in
const chunk = event.delta.partial_json;
toolInput += chunk;
console.log(` Input chunk: ${chunk}`);
}
} else if (event.type === "content_block_stop") {
// Tool call complete - show final input
if (currentTool) {
console.log(`Tool ${currentTool} called with: ${toolInput}`);
currentTool = null;
}
}
}
}
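The accumulated input is a JSON string, and individual chunks are usually not valid JSON on their own, so it only makes sense to parse once the block has stopped. The sketch below illustrates that step on hand-written event dicts. The `collect_tool_calls` helper, the sample tool ID, and the `file_path` argument name are assumptions made for illustration, not SDK output.

```python
import json
from typing import Any, Iterable, Optional


def collect_tool_calls(events: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return one {"name", "input"} dict per completed tool_use block."""
    calls: list[dict[str, Any]] = []
    current_name: Optional[str] = None
    buffer = ""
    for event in events:
        event_type = event.get("type")
        if event_type == "content_block_start":
            block = event.get("content_block", {})
            if block.get("type") == "tool_use":
                current_name = block.get("name")
                buffer = ""
        elif event_type == "content_block_delta":
            delta = event.get("delta", {})
            if current_name is not None and delta.get("type") == "input_json_delta":
                buffer += delta.get("partial_json", "")
        elif event_type == "content_block_stop" and current_name is not None:
            # Parse only the complete string. An empty buffer is treated as
            # "no arguments", which is an assumption of this sketch.
            calls.append({"name": current_name, "input": json.loads(buffer) if buffer else {}})
            current_name = None
    return calls


# Hand-written sample events (assumed shape, for illustration only)
sample_events = [
    {"type": "content_block_start", "index": 1,
     "content_block": {"type": "tool_use", "id": "toolu_example", "name": "Read", "input": {}}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": '{"file_path": "REA'}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": 'DME.md"}'}},
    {"type": "content_block_stop", "index": 1},
]

print(collect_tool_calls(sample_events))
```

Text blocks are ignored here because `current_name` is only set when a tool_use block starts.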
This example combines text and tool streaming into a cohesive UI. It tracks whether the agent is currently executing a tool (using an in_tool flag) to show status indicators like [Using Read...] while tools run. Text streams normally when not in a tool, and tool completion triggers a "done" message. This pattern is useful for chat interfaces that need to show progress during multi-step agent tasks.
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
from claude_agent_sdk.types import StreamEvent
import asyncio
import sys
async def streaming_ui():
    options = ClaudeAgentOptions(
        include_partial_messages=True,
        allowed_tools=["Read", "Bash", "Grep"],
    )
    # Track whether we're currently in a tool call
    in_tool = False
    async for message in query(
        prompt="Find all TODO comments in the codebase", options=options
    ):
        if isinstance(message, StreamEvent):
            event = message.event
            event_type = event.get("type")
            if event_type == "content_block_start":
                content_block = event.get("content_block", {})
                if content_block.get("type") == "tool_use":
                    # Tool call is starting - show status indicator
                    tool_name = content_block.get("name")
                    print(f"\n[Using {tool_name}...]", end="", flush=True)
                    in_tool = True
            elif event_type == "content_block_delta":
                delta = event.get("delta", {})
                # Only stream text when not executing a tool
                if delta.get("type") == "text_delta" and not in_tool:
                    sys.stdout.write(delta.get("text", ""))
                    sys.stdout.flush()
            elif event_type == "content_block_stop":
                if in_tool:
                    # Tool call finished
                    print(" done", flush=True)
                    in_tool = False
        elif isinstance(message, ResultMessage):
            # Agent finished all work
            print("\n\n--- Complete ---")
asyncio.run(streaming_ui())
import { query } from "@anthropic-ai/claude-agent-sdk";
// Track whether we're currently in a tool call
let inTool = false;
for await (const message of query({
prompt: "Find all TODO comments in the codebase",
options: {
includePartialMessages: true,
allowedTools: ["Read", "Bash", "Grep"]
}
})) {
if (message.type === "stream_event") {
const event = message.event;
if (event.type === "content_block_start") {
if (event.content_block.type === "tool_use") {
// Tool call is starting - show status indicator
process.stdout.write(`\n[Using ${event.content_block.name}...]`);
inTool = true;
}
} else if (event.type === "content_block_delta") {
// Only stream text when not executing a tool
if (event.delta.type === "text_delta" && !inTool) {
process.stdout.write(event.delta.text);
}
} else if (event.type === "content_block_stop") {
if (inTool) {
// Tool call finished
console.log(" done");
inTool = false;
}
}
} else if (message.type === "result") {
// Agent finished all work
console.log("\n\n--- Complete ---");
}
}
Some SDK features are incompatible with streaming:
- Extended thinking: when you set max_thinking_tokens (Python) or maxThinkingTokens (TypeScript), StreamEvent messages are not emitted. You'll only receive complete messages after each turn. Note that thinking is disabled by default in the SDK, so streaming works unless you enable it.
- Structured output: the result appears only in the final ResultMessage.structured_output, not as streaming deltas. See structured outputs for details.
Now that you can stream text and tool calls in real time, explore these related topics:
Return validated JSON from agent workflows using JSON Schema, Zod, or Pydantic. Get type-safe, structured data after multi-turn tool use.
Structured outputs let you define the exact shape of data you want back from an agent. The agent can use any tools it needs to complete the task, and you still get validated JSON matching your schema at the end. Define a JSON Schema for the structure you need, and the SDK validates the agent's output against it before returning it.
For full type safety, use Zod (TypeScript) or Pydantic (Python) to define your schema and get strongly-typed objects back.
Agents return free-form text by default, which works for chat but not when you need to use the output programmatically. Structured outputs give you typed data you can pass directly to your application logic, database, or UI components.
Consider a recipe app where an agent searches the web and brings back recipes. Without structured outputs, you get free-form text that you'd need to parse yourself. With structured outputs, you define the shape you want and get typed data you can use directly in your app.
Here's a classic chocolate chip cookie recipe!
**Chocolate Chip Cookies**
Prep time: 15 minutes | Cook time: 10 minutes
Ingredients:
- 2 1/4 cups all-purpose flour
- 1 cup butter, softened
...
To use this in your app, you'd need to parse out the title, convert "15 minutes" to a number, separate ingredients from instructions, and handle inconsistent formatting across responses.
{
"name": "Chocolate Chip Cookies",
"prep_time_minutes": 15,
"cook_time_minutes": 10,
"ingredients": [
{ "item": "all-purpose flour", "amount": 2.25, "unit": "cups" },
{ "item": "butter, softened", "amount": 1, "unit": "cup" }
// ...
],
"steps": ["Preheat oven to 375°F", "Cream butter and sugar" /* ... */]
}
Typed data you can use directly in your UI.
To use structured outputs, define a JSON Schema describing the shape of data you want, then pass it to query via the outputFormat option (TypeScript) or output_format option (Python). When the agent finishes, the result message includes a structured_output field with validated data matching your schema.
The example below asks the agent to research Anthropic and return the company name, year founded, and headquarters as structured output.
import { query } from "@anthropic-ai/claude-agent-sdk";
// Define the shape of data you want back
const schema = {
type: "object",
properties: {
company_name: { type: "string" },
founded_year: { type: "number" },
headquarters: { type: "string" }
},
required: ["company_name"]
};
for await (const message of query({
prompt: "Research Anthropic and provide key company information",
options: {
outputFormat: {
type: "json_schema",
schema: schema
}
}
})) {
// The result message contains structured_output with validated data
if (message.type === "result" && message.subtype === "success" && message.structured_output) {
console.log(message.structured_output);
// { company_name: "Anthropic", founded_year: 2021, headquarters: "San Francisco, CA" }
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
# Define the shape of data you want back
schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "founded_year": {"type": "number"},
        "headquarters": {"type": "string"},
    },
    "required": ["company_name"],
}
async def main():
    async for message in query(
        prompt="Research Anthropic and provide key company information",
        options=ClaudeAgentOptions(
            output_format={"type": "json_schema", "schema": schema}
        ),
    ):
        # The result message contains structured_output with validated data
        if isinstance(message, ResultMessage) and message.structured_output:
            print(message.structured_output)
            # {'company_name': 'Anthropic', 'founded_year': 2021, 'headquarters': 'San Francisco, CA'}
asyncio.run(main())
Instead of writing JSON Schema by hand, you can use Zod (TypeScript) or Pydantic (Python) to define your schema. These libraries generate the JSON Schema for you and let you parse the response into a fully-typed object you can use throughout your codebase with autocomplete and type checking.
The example below defines a schema for a feature implementation plan with a summary, list of steps (each with complexity level), and potential risks. The agent plans the feature and returns a typed FeaturePlan object. You can then access properties like plan.summary and iterate over plan.steps with full type safety.
import { z } from "zod";
import { query } from "@anthropic-ai/claude-agent-sdk";
// Define schema with Zod
const FeaturePlan = z.object({
feature_name: z.string(),
summary: z.string(),
steps: z.array(
z.object({
step_number: z.number(),
description: z.string(),
estimated_complexity: z.enum(["low", "medium", "high"])
})
),
risks: z.array(z.string())
});
type FeaturePlan = z.infer<typeof FeaturePlan>;
// Convert to JSON Schema
const schema = z.toJSONSchema(FeaturePlan);
// Use in query
for await (const message of query({
prompt:
"Plan how to add dark mode support to a React app. Break it into implementation steps.",
options: {
outputFormat: {
type: "json_schema",
schema: schema
}
}
})) {
if (message.type === "result" && message.subtype === "success" && message.structured_output) {
// Validate and get fully typed result
const parsed = FeaturePlan.safeParse(message.structured_output);
if (parsed.success) {
const plan: FeaturePlan = parsed.data;
console.log(`Feature: ${plan.feature_name}`);
console.log(`Summary: ${plan.summary}`);
plan.steps.forEach((step) => {
console.log(`${step.step_number}. [${step.estimated_complexity}] ${step.description}`);
});
}
}
}
import asyncio
from pydantic import BaseModel
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
class Step(BaseModel):
    step_number: int
    description: str
    estimated_complexity: str  # 'low', 'medium', 'high'
class FeaturePlan(BaseModel):
    feature_name: str
    summary: str
    steps: list[Step]
    risks: list[str]
async def main():
    async for message in query(
        prompt="Plan how to add dark mode support to a React app. Break it into implementation steps.",
        options=ClaudeAgentOptions(
            output_format={
                "type": "json_schema",
                "schema": FeaturePlan.model_json_schema(),
            }
        ),
    ):
        if isinstance(message, ResultMessage) and message.structured_output:
            # Validate and get fully typed result
            plan = FeaturePlan.model_validate(message.structured_output)
            print(f"Feature: {plan.feature_name}")
            print(f"Summary: {plan.summary}")
            for step in plan.steps:
                print(
                    f"{step.step_number}. [{step.estimated_complexity}] {step.description}"
                )
asyncio.run(main())
Benefits:
- Validation with safeParse or model_validate
The outputFormat (TypeScript) or output_format (Python) option accepts an object with:
- type: Set to "json_schema" for structured outputs
- schema: A JSON Schema object defining your output structure. You can generate this from a Zod schema with z.toJSONSchema or a Pydantic model with .model_json_schema
The SDK supports standard JSON Schema features including all basic types (object, array, string, number, boolean, null), enum, const, required, nested objects, and $ref definitions. For the full list of supported features and limitations, see JSON Schema limitations.
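As a rough illustration of several of these features together, the sketch below hand-writes a schema that uses enum, a nested object, and a $ref into a $defs section. The field names are hypothetical, and the $defs layout is an assumption chosen because it is the conventional JSON Schema form; check the JSON Schema limitations page before relying on it.

```python
import json

# Hypothetical schema: field names are illustrative, not taken from the SDK
feature_plan_schema = {
    "type": "object",
    "properties": {
        "feature_name": {"type": "string"},
        # Each array item points at the shared "step" definition below
        "steps": {"type": "array", "items": {"$ref": "#/$defs/step"}},
    },
    "required": ["feature_name", "steps"],
    "$defs": {
        "step": {
            "type": "object",
            "properties": {
                "step_number": {"type": "number"},
                "description": {"type": "string"},
                # enum restricts the value to one of three strings
                "estimated_complexity": {"type": "string", "enum": ["low", "medium", "high"]},
            },
            "required": ["step_number", "description", "estimated_complexity"],
        }
    },
}

# The object you would pass as output_format (Python) or outputFormat (TypeScript)
output_format = {"type": "json_schema", "schema": feature_plan_schema}
print(json.dumps(output_format, indent=2))
```

Keeping the step definition under $defs avoids repeating it if more than one property needs the same shape.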
This example demonstrates how structured outputs work with multi-step tool use. The agent needs to find TODO comments in the codebase, then look up git blame information for each one. It autonomously decides which tools to use (Grep to search, Bash to run git commands) and combines the results into a single structured response.
The schema includes optional fields (author and date) since git blame information might not be available for all files. The agent fills in what it can find and omits the rest.
import { query } from "@anthropic-ai/claude-agent-sdk";
// Define structure for TODO extraction
const todoSchema = {
type: "object",
properties: {
todos: {
type: "array",
items: {
type: "object",
properties: {
text: { type: "string" },
file: { type: "string" },
line: { type: "number" },
author: { type: "string" },
date: { type: "string" }
},
required: ["text", "file", "line"]
}
},
total_count: { type: "number" }
},
required: ["todos", "total_count"]
};
// Agent uses Grep to find TODOs, Bash to get git blame info
for await (const message of query({
prompt: "Find all TODO comments in this codebase and identify who added them",
options: {
outputFormat: {
type: "json_schema",
schema: todoSchema
}
}
})) {
if (message.type === "result" && message.subtype === "success" && message.structured_output) {
const data = message.structured_output as { total_count: number; todos: Array<{ file: string; line: number; text: string; author?: string; date?: string }> };
console.log(`Found ${data.total_count} TODOs`);
data.todos.forEach((todo) => {
console.log(`${todo.file}:${todo.line} - ${todo.text}`);
if (todo.author) {
console.log(` Added by ${todo.author} on ${todo.date}`);
}
});
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
# Define structure for TODO extraction
todo_schema = {
    "type": "object",
    "properties": {
        "todos": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "file": {"type": "string"},
                    "line": {"type": "number"},
                    "author": {"type": "string"},
                    "date": {"type": "string"},
                },
                "required": ["text", "file", "line"],
            },
        },
        "total_count": {"type": "number"},
    },
    "required": ["todos", "total_count"],
}
async def main():
    # Agent uses Grep to find TODOs, Bash to get git blame info
    async for message in query(
        prompt="Find all TODO comments in this codebase and identify who added them",
        options=ClaudeAgentOptions(
            output_format={"type": "json_schema", "schema": todo_schema}
        ),
    ):
        if isinstance(message, ResultMessage) and message.structured_output:
            data = message.structured_output
            print(f"Found {data['total_count']} TODOs")
            for todo in data["todos"]:
                print(f"{todo['file']}:{todo['line']} - {todo['text']}")
                if "author" in todo:
                    # date is optional too, so read it with .get
                    print(f" Added by {todo['author']} on {todo.get('date', 'unknown date')}")
asyncio.run(main())
Structured output generation can fail when the agent cannot produce valid JSON matching your schema. This typically happens when the schema is too complex for the task, the task itself is ambiguous, or the agent hits its retry limit trying to fix validation errors.
When an error occurs, the result message has a subtype indicating what went wrong:
| Subtype | Meaning |
|---|---|
| success | Output was generated and validated successfully |
| error_max_structured_output_retries | Agent couldn't produce valid output after multiple attempts |
The example below checks the subtype field to determine whether the output was generated successfully or if you need to handle a failure:
for await (const msg of query({
prompt: "Extract contact info from the document",
options: {
outputFormat: {
type: "json_schema",
schema: contactSchema
}
}
})) {
if (msg.type === "result") {
if (msg.subtype === "success" && msg.structured_output) {
// Use the validated output
console.log(msg.structured_output);
} else if (msg.subtype === "error_max_structured_output_retries") {
// Handle the failure - retry with simpler prompt, fall back to unstructured, etc.
console.error("Could not produce valid output");
}
}
}
async for message in query(
    prompt="Extract contact info from the document",
    options=ClaudeAgentOptions(
        output_format={"type": "json_schema", "schema": contact_schema}
    ),
):
    if isinstance(message, ResultMessage):
        if message.subtype == "success" and message.structured_output:
            # Use the validated output
            print(message.structured_output)
        elif message.subtype == "error_max_structured_output_retries":
            # Handle the failure
            print("Could not produce valid output")
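One way to keep this branching out of your main loop is a small helper that turns a result into either the validated data or a caller-supplied fallback. This is a sketch under stated assumptions: `FakeResult` is a stand-in with only the two fields used here, not the SDK's ResultMessage, and `extract_output` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeResult:
    """Stand-in for a result message, with only the two fields used here."""
    subtype: str
    structured_output: Optional[Any] = None


def extract_output(result: FakeResult, fallback: Any = None) -> Any:
    """Return the validated output, or the fallback when retries were exhausted."""
    if result.subtype == "error_max_structured_output_retries":
        # The agent could not satisfy the schema; the caller decides what to do next
        return fallback
    if result.subtype == "success" and result.structured_output is not None:
        return result.structured_output
    # Anything else is surfaced rather than silently ignored
    raise RuntimeError(f"No structured output (subtype={result.subtype!r})")


ok = FakeResult("success", {"email": "a@example.com"})
failed = FakeResult("error_max_structured_output_retries")

print(extract_output(ok))                   # the validated dict
print(extract_output(failed, fallback={}))  # the fallback value
```

What the fallback should be (an empty record, a retry with a simpler prompt, or unstructured text) depends on your application.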
Tips for avoiding errors:
Define custom tools with the Claude Agent SDK's in-process MCP server so Claude can call your functions, hit your APIs, and perform domain-specific operations.
Custom tools extend the Agent SDK by letting you define your own functions that Claude can call during a conversation. Using the SDK's in-process MCP server, you can give Claude access to databases, external APIs, domain-specific logic, or any other capability your application needs.
This guide covers how to define tools with input schemas and handlers, bundle them into an MCP server, pass them to query, and control which tools Claude can access. It also covers error handling, tool annotations, and returning non-text content like images.
| If you want to... | Do this |
|---|---|
| Define a tool | Use @tool (Python) or tool (TypeScript) with a name, description, schema, and handler. See Create a custom tool. |
| Register a tool with Claude | Wrap in create_sdk_mcp_server / createSdkMcpServer and pass to mcpServers in query. See Call a custom tool. |
| Pre-approve a tool | Add to your allowed tools. See Configure allowed tools. |
| Remove a built-in tool from Claude's context | Pass a tools array listing only the built-ins you want. See Configure allowed tools. |
| Let Claude call tools in parallel | Set readOnlyHint: true on tools with no side effects. See Add tool annotations. |
| Handle errors without stopping the loop | Return isError: true instead of throwing. See Handle errors. |
| Return images or files | Use image or resource blocks in the content array. See Return images and resources. |
| Scale to many tools | Use tool search to load tools on demand. |
A tool is defined by four parts, passed as arguments to the tool helper in TypeScript or the @tool decorator in Python:
args are typed from it automatically. In Python this is a dict mapping names to types, like {"latitude": float}, which the SDK converts to JSON Schema for you. The Python decorator also accepts a full JSON Schema dict directly when you need enums, ranges, optional fields, or nested objects.
- content (required): an array of result blocks, each with a type of "text", "image", or "resource". See Return images and resources for non-text blocks.
- isError (optional): set to true to signal a tool failure so Claude can react to it. See Handle errors.
After defining a tool, wrap it in a server with createSdkMcpServer (TypeScript) or create_sdk_mcp_server (Python). The server runs in-process inside your application, not as a separate process.
This example defines a get_temperature tool and wraps it in an MCP server. It only sets up the tool; to pass it to query and run it, see Call a custom tool below.
from typing import Any
import httpx
from claude_agent_sdk import tool, create_sdk_mcp_server
# Define a tool: name, description, input schema, handler
@tool(
    "get_temperature",
    "Get the current temperature at a location",
    {"latitude": float, "longitude": float},
)
async def get_temperature(args: dict[str, Any]) -> dict[str, Any]:
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.open-meteo.com/v1/forecast",
            params={
                "latitude": args["latitude"],
                "longitude": args["longitude"],
                "current": "temperature_2m",
                "temperature_unit": "fahrenheit",
            },
        )
        data = response.json()
    # Return a content array - Claude sees this as the tool result
    return {
        "content": [
            {
                "type": "text",
                "text": f"Temperature: {data['current']['temperature_2m']}°F",
            }
        ]
    }
# Wrap the tool in an in-process MCP server
weather_server = create_sdk_mcp_server(
    name="weather",
    version="1.0.0",
    tools=[get_temperature],
)
import { tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";
// Define a tool: name, description, input schema, handler
const getTemperature = tool(
"get_temperature",
"Get the current temperature at a location",
{
latitude: z.number().describe("Latitude coordinate"), // .describe() adds a field description Claude sees
longitude: z.number().describe("Longitude coordinate")
},
async (args) => {
// args is typed from the schema: { latitude: number; longitude: number }
const response = await fetch(
`https://api.open-meteo.com/v1/forecast?latitude=${args.latitude}&longitude=${args.longitude}&current=temperature_2m&temperature_unit=fahrenheit`
);
const data: any = await response.json();
// Return a content array - Claude sees this as the tool result
return {
content: [{ type: "text", text: `Temperature: ${data.current.temperature_2m}°F` }]
};
}
);
// Wrap the tool in an in-process MCP server
const weatherServer = createSdkMcpServer({
name: "weather",
version: "1.0.0",
tools: [getTemperature]
});
See the tool TypeScript reference or the @tool Python reference for full parameter details, including JSON Schema input formats and return value structure.
Tip
To make a parameter optional: in TypeScript, add .default to the Zod field. In Python, the dict schema treats every key as required, so leave the parameter out of the schema, mention it in the description string, and read it with args.get in the handler. The get_precipitation_chance tool below shows both patterns.
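A value read with args.get is not described by the dict schema, so it is worth checking it yourself before use. The sketch below shows one hedged way to do that for the `hours` parameter used by the get_precipitation_chance tool later in this guide; the `read_hours` helper is hypothetical, and the 1-24 range is taken from that tool's description.

```python
from typing import Any


def read_hours(args: dict[str, Any], default: int = 12) -> int:
    """Read the optional 'hours' argument and clamp it to the 1-24 range."""
    raw = args.get("hours", default)
    try:
        hours = int(raw)
    except (TypeError, ValueError):
        hours = default  # fall back when the value is not numeric
    return max(1, min(24, hours))


print(read_hours({"latitude": 37.77, "longitude": -122.42}))  # prints 12
print(read_hours({"hours": 48}))   # prints 24
print(read_hours({"hours": "6"}))  # prints 6
```

Converting to int also keeps the value safe to use as a slice bound in the handler.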
Pass the MCP server you created to query via the mcpServers option. The key in mcpServers becomes the {server_name} segment in each tool's fully qualified name: mcp__{server_name}__{tool_name}. List that name in allowedTools so the tool runs without a permission prompt.
These snippets reuse the weatherServer from the example above to ask Claude what the weather is in a specific location.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
async def main():
    options = ClaudeAgentOptions(
        mcp_servers={"weather": weather_server},
        allowed_tools=["mcp__weather__get_temperature"],
    )
    async for message in query(
        prompt="What's the temperature in San Francisco?",
        options=options,
    ):
        # ResultMessage is the final message after all tool calls complete
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(message.result)
asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "What's the temperature in San Francisco?",
options: {
mcpServers: { weather: weatherServer },
allowedTools: ["mcp__weather__get_temperature"]
}
})) {
// "result" is the final message after all tool calls complete
if (message.type === "result" && message.subtype === "success") {
console.log(message.result);
}
}
A server holds as many tools as you list in its tools array. With more than one tool on a server, you can list each one in allowedTools individually or use the wildcard mcp__weather__* to cover every tool the server exposes.
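To make the naming rule concrete, the sketch below builds fully qualified names from a server key and a tool name, then shows which names a pattern like mcp__weather__* would cover. The `qualified_tool_name` helper and the `calendar` server are hypothetical, and `fnmatchcase` is only a stand-in for illustration: this document does not describe how the SDK itself evaluates the wildcard.

```python
from fnmatch import fnmatchcase


def qualified_tool_name(server_name: str, tool_name: str) -> str:
    """Build the mcp__{server_name}__{tool_name} identifier."""
    return f"mcp__{server_name}__{tool_name}"


names = [
    qualified_tool_name("weather", "get_temperature"),
    qualified_tool_name("weather", "get_precipitation_chance"),
    qualified_tool_name("calendar", "list_events"),  # hypothetical second server
]

# fnmatchcase is a stand-in used to illustrate the wildcard's coverage
covered = [name for name in names if fnmatchcase(name, "mcp__weather__*")]
print(covered)
```

Note that the server segment comes from the key you use in mcpServers, so renaming that key changes every qualified name.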
The example below adds a second tool, get_precipitation_chance, to the weatherServer from the weather tool example and rebuilds it with both tools in the array.
# Define a second tool for the same server
@tool(
    "get_precipitation_chance",
    "Get the hourly precipitation probability for a location. "
    "Optionally pass 'hours' (1-24) to control how many hours to return.",
    {"latitude": float, "longitude": float},
)
async def get_precipitation_chance(args: dict[str, Any]) -> dict[str, Any]:
    # 'hours' isn't in the schema - read it with .get() to make it optional
    hours = args.get("hours", 12)
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.open-meteo.com/v1/forecast",
            params={
                "latitude": args["latitude"],
                "longitude": args["longitude"],
                "hourly": "precipitation_probability",
                "forecast_days": 1,
            },
        )
        data = response.json()
    chances = data["hourly"]["precipitation_probability"][:hours]
    return {
        "content": [
            {
                "type": "text",
                "text": f"Next {hours} hours: {'%, '.join(map(str, chances))}%",
            }
        ]
    }
# Rebuild the server with both tools in the array
weather_server = create_sdk_mcp_server(
    name="weather",
    version="1.0.0",
    tools=[get_temperature, get_precipitation_chance],
)
// Define a second tool for the same server
const getPrecipitationChance = tool(
"get_precipitation_chance",
"Get the hourly precipitation probability for a location",
{
latitude: z.number(),
longitude: z.number(),
hours: z
.number()
.int()
.min(1)
.max(24)
.default(12) // .default() makes the parameter optional
.describe("How many hours of forecast to return")
},
async (args) => {
const response = await fetch(
`https://api.open-meteo.com/v1/forecast?latitude=${args.latitude}&longitude=${args.longitude}&hourly=precipitation_probability&forecast_days=1`
);
const data: any = await response.json();
const chances = data.hourly.precipitation_probability.slice(0, args.hours);
return {
content: [{ type: "text", text: `Next ${args.hours} hours: ${chances.join("%, ")}%` }]
};
}
);
// Rebuild the server with both tools in the array
const weatherServer = createSdkMcpServer({
name: "weather",
version: "1.0.0",
tools: [getTemperature, getPrecipitationChance]
});
Every tool in this array consumes context window space on every turn. If you're defining dozens of tools, see tool search to load them on demand instead.
Tool annotations are optional metadata describing how a tool behaves. Pass them as the fifth argument to the tool helper in TypeScript, or via the annotations keyword argument of the @tool decorator in Python. All hint fields are Booleans.
| Field | Default | Meaning |
|---|---|---|
| readOnlyHint | false | Tool does not modify its environment. Controls whether the tool can be called in parallel with other read-only tools. |
| destructiveHint | true | Tool may perform destructive updates. Informational only. |
| idempotentHint | false | Repeated calls with the same arguments have no additional effect. Informational only. |
| openWorldHint | true | Tool reaches systems outside your process. Informational only. |
Annotations are metadata, not enforcement. A tool marked readOnlyHint: true can still write to disk if that's what the handler does. Keep the annotation accurate to the handler.
This example adds readOnlyHint to the get_temperature tool from the weather tool example.
from claude_agent_sdk import tool, ToolAnnotations
@tool(
    "get_temperature",
    "Get the current temperature at a location",
    {"latitude": float, "longitude": float},
    annotations=ToolAnnotations(
        readOnlyHint=True
    ),  # Lets Claude batch this with other read-only calls
)
async def get_temperature(args):
    return {"content": [{"type": "text", "text": "..."}]}
tool(
"get_temperature",
"Get the current temperature at a location",
{ latitude: z.number(), longitude: z.number() },
async (args) => ({ content: [{ type: "text", text: `...` }] }),
{ annotations: { readOnlyHint: true } } // Lets Claude batch this with other read-only calls
);
See ToolAnnotations in the TypeScript or Python reference.
The weather tool example registered a server and listed tools in allowedTools. This section covers how tool names are constructed and how to scope access when you have multiple tools or want to restrict built-ins.
When MCP tools are exposed to Claude, their names follow a specific format:
- Pattern: mcp__{server_name}__{tool_name}
- Example: get_temperature in server weather becomes mcp__weather__get_temperature
The tools option and the allowed/disallowed lists operate on separate layers. tools controls which built-in tools appear in Claude's context. Allowed and disallowed tool lists control whether calls are approved or denied once Claude attempts them.
| Option | Layer | Effect |
|---|---|---|
| tools: ["Read", "Grep"] | Availability | Only the listed built-ins are in Claude's context. Unlisted built-ins are removed. MCP tools are unaffected. |
| tools: [] | Availability | All built-ins are removed. Claude can only use your MCP tools. |
| allowed tools | Permission | Listed tools run without a permission prompt. Unlisted tools remain available; calls go through the permission flow. |
| disallowed tools | Permission | Every call to a listed tool is denied. The tool stays in Claude's context, so Claude may still attempt it before the call is rejected. |
To limit which built-ins Claude can use, prefer tools over disallowed tools. Omitting a tool from tools removes it from context so Claude never attempts it; listing it in disallowedTools (Python: disallowed_tools) blocks the call but leaves the tool visible, so Claude may waste a turn trying it. See Configure permissions for the full evaluation order.
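The two layers in the table above can be modeled in a few lines. This is a simplified sketch, not the SDK's actual logic: classify_call and its return labels are hypothetical names, it assumes any name starting with mcp__ is an MCP tool, it assumes disallowed entries are checked before allowed ones, and it ignores wildcards and permission modes.

```python
from typing import Optional, Sequence


def classify_call(
    tool_name: str,
    tools: Optional[Sequence[str]] = None,
    allowed_tools: Sequence[str] = (),
    disallowed_tools: Sequence[str] = (),
) -> str:
    """Return how a call to tool_name is treated in this simplified model."""
    is_mcp = tool_name.startswith("mcp__")

    # Availability layer: `tools` only filters built-ins; MCP tools pass through.
    if tools is not None and not is_mcp and tool_name not in tools:
        return "not in context"

    # Permission layer: a disallowed tool stays visible, but every call is denied.
    if tool_name in disallowed_tools:
        return "denied"
    if tool_name in allowed_tools:
        return "runs without prompt"
    return "permission flow"


if __name__ == "__main__":
    for name in ["Read", "Bash", "mcp__weather__get_temperature"]:
        outcome = classify_call(name, tools=["Read", "Grep"], allowed_tools=["Read"])
        print(f"{name} -> {outcome}")
```

In this model, a built-in left out of tools never reaches the permission layer, which is why omitting it avoids the wasted turn that a disallowed-but-visible tool can cause.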
How your handler reports errors determines whether the agent loop continues or stops:
| What happens | Result |
|---|---|
| Handler throws an uncaught exception | Agent loop stops. Claude never sees the error, and the query call fails. |
| Handler catches the error and returns isError: true (TS) / "is_error": True (Python) | Agent loop continues. Claude sees the error as data and can retry, try a different tool, or explain the failure. |
The example below catches two kinds of failures inside the handler instead of letting them throw. A non-200 HTTP status is caught from the response and returned as an error result. A network error or invalid JSON is caught by the surrounding try/except (Python) or try/catch (TypeScript) and also returned as an error result. In both cases the handler returns normally and the agent loop continues.
import json
import httpx
from typing import Any
from claude_agent_sdk import tool
@tool(
    "fetch_data",
    "Fetch data from an API",
    {"endpoint": str},  # Simple schema
)
async def fetch_data(args: dict[str, Any]) -> dict[str, Any]:
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(args["endpoint"])
            if response.status_code != 200:
                # Return the failure as a tool result so Claude can react to it.
                # is_error marks this as a failed call rather than odd-looking data.
                return {
                    "content": [
                        {
                            "type": "text",
                            "text": f"API error: {response.status_code} {response.reason_phrase}",
                        }
                    ],
                    "is_error": True,
                }
            data = response.json()
            return {"content": [{"type": "text", "text": json.dumps(data, indent=2)}]}
    except Exception as e:
        # Catching here keeps the agent loop alive. An uncaught exception
        # would end the whole query() call.
        return {
            "content": [{"type": "text", "text": f"Failed to fetch data: {str(e)}"}],
            "is_error": True,
        }
tool(
"fetch_data",
"Fetch data from an API",
{
endpoint: z.string().url().describe("API endpoint URL")
},
async (args) => {
try {
const response = await fetch(args.endpoint);
if (!response.ok) {
// Return the failure as a tool result so Claude can react to it.
// isError marks this as a failed call rather than odd-looking data.
return {
content: [
{
type: "text",
text: `API error: ${response.status} ${response.statusText}`
}
],
isError: true
};
}
const data = await response.json();
return {
content: [
{
type: "text",
text: JSON.stringify(data, null, 2)
}
]
};
} catch (error) {
// Catching here keeps the agent loop alive. An uncaught throw
// would end the whole query() call.
return {
content: [
{
type: "text",
text: `Failed to fetch data: ${error instanceof Error ? error.message : String(error)}`
}
],
isError: true
};
}
}
);
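When several handlers need the same protection, the try/except can be factored into a wrapper. The decorator below is a sketch and not part of the SDK: catch_tool_errors is a hypothetical name, and divide is a stub handler used only for illustration. It relies solely on the is_error result shape shown above.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable

Handler = Callable[[dict[str, Any]], Awaitable[dict[str, Any]]]


def catch_tool_errors(handler: Handler) -> Handler:
    """Wrap an async tool handler so exceptions become is_error results."""

    @functools.wraps(handler)
    async def wrapper(args: dict[str, Any]) -> dict[str, Any]:
        try:
            return await handler(args)
        except Exception as exc:
            # Returning normally keeps the agent loop alive; Claude sees the text.
            return {
                "content": [
                    {"type": "text", "text": f"{handler.__name__} failed: {exc}"}
                ],
                "is_error": True,
            }

    return wrapper


@catch_tool_errors
async def divide(args: dict[str, Any]) -> dict[str, Any]:
    # Stub handler for illustration: raises ZeroDivisionError when b is 0.
    return {"content": [{"type": "text", "text": str(args["a"] / args["b"])}]}


if __name__ == "__main__":
    print(asyncio.run(divide({"a": 1, "b": 0})))
```

With a real tool, the wrapper would presumably sit beneath the @tool decorator so that the wrapped function is what gets registered. Handlers that need distinct messages for different failures, like the non-200 check above, still return their own error results.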
The content array in a tool result accepts text, image, and resource blocks. You can mix them in the same response.
An image block carries the image bytes inline, encoded as base64. There is no URL field. To return an image that lives at a URL, fetch it in the handler, read the response bytes, and base64-encode them before returning. The result is processed as visual input.
| Field | Type | Notes |
|---|---|---|
| type | "image" | |
| data | string | Base64-encoded bytes. Raw base64 only, no data:image/...;base64, prefix |
| mimeType | string | Required. For example image/png, image/jpeg, image/webp, image/gif |
import base64
import httpx
from claude_agent_sdk import tool
# Define a tool that fetches an image from a URL and returns it to Claude
@tool("fetch_image", "Fetch an image from a URL and return it to Claude", {"url": str})
async def fetch_image(args):
    async with httpx.AsyncClient() as client:  # Fetch the image bytes
        response = await client.get(args["url"])
    return {
        "content": [
            {
                "type": "image",
                "data": base64.b64encode(response.content).decode(
                    "ascii"
                ),  # Base64-encode the raw bytes
                "mimeType": response.headers.get(
                    "content-type", "image/png"
                ),  # Read MIME type from the response
            }
        ]
    }
tool(
"fetch_image",
"Fetch an image from a URL and return it to Claude",
{
url: z.string().url()
},
async (args) => {
const response = await fetch(args.url); // Fetch the image bytes
const buffer = Buffer.from(await response.arrayBuffer()); // Read into a Buffer for base64 encoding
const mimeType = response.headers.get("content-type") ?? "image/png";
return {
content: [
{
type: "image",
data: buffer.toString("base64"), // Base64-encode the raw bytes
mimeType
}
]
};
}
);
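The raw-base64 rule from the table above is easy to trip over when an image arrives as a data URI instead of bytes. The helper below is a sketch under that assumption: image_block is a hypothetical name, not an SDK function. It accepts raw bytes, a data URI, or an already-encoded string, and always emits the prefix-free form.

```python
import base64
from typing import Any, Union


def image_block(data: Union[bytes, str], mime_type: str = "image/png") -> dict[str, Any]:
    """Build an image content block with raw base64 data and a MIME type."""
    if isinstance(data, bytes):
        encoded = base64.b64encode(data).decode("ascii")
    elif data.startswith("data:"):
        # "data:image/png;base64,AAAA" -> MIME type "image/png", payload "AAAA"
        header, _, encoded = data.partition(",")
        mime_type = header[len("data:"):].split(";")[0] or mime_type
    else:
        encoded = data  # Assumed to be raw base64 already
    return {"type": "image", "data": encoded, "mimeType": mime_type}


if __name__ == "__main__":
    print(image_block(b"hi"))
    print(image_block("data:image/jpeg;base64,aGk="))
```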
A resource block embeds a piece of content identified by a URI. The URI is a label for Claude to reference; the actual content rides in the block's text or blob field. Use this when your tool produces something that makes sense to address by name later, such as a generated file or a record from an external system.
| Field | Type | Notes |
|---|---|---|
| type | "resource" | |
| resource.uri | string | Identifier for the content. Any URI scheme |
| resource.text | string | The content, if it's text. Provide this or blob, not both |
| resource.blob | string | The content base64-encoded, if it's binary |
| resource.mimeType | string | Optional |
This example shows a resource block returned from inside a tool handler. The URI file:///tmp/report.md is a label that Claude can reference later; the SDK does not read from that path.
return {
content: [
{
type: "resource",
resource: {
uri: "file:///tmp/report.md", // Label for Claude to reference, not a path the SDK reads
mimeType: "text/markdown",
text: "# Report\n..." // The actual content, inline
}
}
]
};
return {
"content": [
{
"type": "resource",
"resource": {
"uri": "file:///tmp/report.md", # Label for Claude to reference, not a path the SDK reads
"mimeType": "text/markdown",
"text": "# Report\n...", # The actual content, inline
},
}
]
}
These block shapes come from the MCP CallToolResult type. See the MCP specification for the full definition.
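The "text or blob, not both" rule for resource blocks can be enforced at construction time. The helper below is a sketch: resource_block is a hypothetical name rather than an SDK function, and it only assembles the dictionary shape described in the table above.

```python
import base64
from typing import Any, Optional


def resource_block(
    uri: str,
    *,
    text: Optional[str] = None,
    blob: Optional[bytes] = None,
    mime_type: Optional[str] = None,
) -> dict[str, Any]:
    """Build a resource content block from text or binary content, never both."""
    if (text is None) == (blob is None):
        raise ValueError("Provide exactly one of text or blob")
    resource: dict[str, Any] = {"uri": uri}
    if text is not None:
        resource["text"] = text
    else:
        # Binary content travels base64-encoded in the blob field
        resource["blob"] = base64.b64encode(blob).decode("ascii")
    if mime_type is not None:
        resource["mimeType"] = mime_type
    return {"type": "resource", "resource": resource}


if __name__ == "__main__":
    print(resource_block("file:///tmp/report.md", text="# Report", mime_type="text/markdown"))
```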
This tool converts values between units of length, temperature, and weight. A user can ask "convert 100 kilometers to miles" or "what is 72°F in Celsius," and Claude picks the right unit type and units from the request.
It demonstrates two patterns:
- unit_type is constrained to a fixed set of values. In TypeScript, use z.enum. In Python, the dict schema doesn't support enums, so the full JSON Schema dict is required.
- When a conversion isn't supported, the handler returns isError: true so Claude can tell the user what went wrong rather than treating a failure as a normal result.
from typing import Any
from claude_agent_sdk import tool, create_sdk_mcp_server
# z.enum() in TypeScript becomes an "enum" constraint in JSON Schema.
# The dict schema has no equivalent, so full JSON Schema is required.
@tool(
    "convert_units",
    "Convert a value from one unit to another",
    {
        "type": "object",
        "properties": {
            "unit_type": {
                "type": "string",
                "enum": ["length", "temperature", "weight"],
                "description": "Category of unit",
            },
            "from_unit": {
                "type": "string",
                "description": "Unit to convert from, e.g. kilometers, fahrenheit, pounds",
            },
            "to_unit": {"type": "string", "description": "Unit to convert to"},
            "value": {"type": "number", "description": "Value to convert"},
        },
        "required": ["unit_type", "from_unit", "to_unit", "value"],
    },
)
async def convert_units(args: dict[str, Any]) -> dict[str, Any]:
    conversions = {
        "length": {
            "kilometers_to_miles": lambda v: v * 0.621371,
            "miles_to_kilometers": lambda v: v * 1.60934,
            "meters_to_feet": lambda v: v * 3.28084,
            "feet_to_meters": lambda v: v * 0.3048,
        },
        "temperature": {
            "celsius_to_fahrenheit": lambda v: (v * 9) / 5 + 32,
            "fahrenheit_to_celsius": lambda v: (v - 32) * 5 / 9,
            "celsius_to_kelvin": lambda v: v + 273.15,
            "kelvin_to_celsius": lambda v: v - 273.15,
        },
        "weight": {
            "kilograms_to_pounds": lambda v: v * 2.20462,
            "pounds_to_kilograms": lambda v: v * 0.453592,
            "grams_to_ounces": lambda v: v * 0.035274,
            "ounces_to_grams": lambda v: v * 28.3495,
        },
    }
    key = f"{args['from_unit']}_to_{args['to_unit']}"
    fn = conversions.get(args["unit_type"], {}).get(key)
    if not fn:
        return {
            "content": [
                {
                    "type": "text",
                    "text": f"Unsupported conversion: {args['from_unit']} to {args['to_unit']}",
                }
            ],
            "is_error": True,
        }
    result = fn(args["value"])
    return {
        "content": [
            {
                "type": "text",
                "text": f"{args['value']} {args['from_unit']} = {result:.4f} {args['to_unit']}",
            }
        ]
    }
converter_server = create_sdk_mcp_server(
    name="converter",
    version="1.0.0",
    tools=[convert_units],
)
import { tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";
const convert = tool(
"convert_units",
"Convert a value from one unit to another",
{
unit_type: z.enum(["length", "temperature", "weight"]).describe("Category of unit"),
from_unit: z
.string()
.describe("Unit to convert from, e.g. kilometers, fahrenheit, pounds"),
to_unit: z.string().describe("Unit to convert to"),
value: z.number().describe("Value to convert")
},
async (args) => {
type Conversions = Record<string, Record<string, (v: number) => number>>;
const conversions: Conversions = {
length: {
kilometers_to_miles: (v) => v * 0.621371,
miles_to_kilometers: (v) => v * 1.60934,
meters_to_feet: (v) => v * 3.28084,
feet_to_meters: (v) => v * 0.3048
},
temperature: {
celsius_to_fahrenheit: (v) => (v * 9) / 5 + 32,
fahrenheit_to_celsius: (v) => ((v - 32) * 5) / 9,
celsius_to_kelvin: (v) => v + 273.15,
kelvin_to_celsius: (v) => v - 273.15
},
weight: {
kilograms_to_pounds: (v) => v * 2.20462,
pounds_to_kilograms: (v) => v * 0.453592,
grams_to_ounces: (v) => v * 0.035274,
ounces_to_grams: (v) => v * 28.3495
}
};
const key = `${args.from_unit}_to_${args.to_unit}`;
const fn = conversions[args.unit_type]?.[key];
if (!fn) {
return {
content: [
{
type: "text",
text: `Unsupported conversion: ${args.from_unit} to ${args.to_unit}`
}
],
isError: true
};
}
const result = fn(args.value);
return {
content: [
{
type: "text",
text: `${args.value} ${args.from_unit} = ${result.toFixed(4)} ${args.to_unit}`
}
]
};
}
);
const converterServer = createSdkMcpServer({
name: "converter",
version: "1.0.0",
tools: [convert]
});
Once the server is defined, pass it to query the same way as the weather example. This example sends three different prompts in a loop to show the same tool handling different unit types. For each response, it inspects AssistantMessage objects (which contain the tool calls Claude made during that turn) and prints each ToolUseBlock before printing the final ResultMessage text. This lets you see when Claude is using the tool versus answering from its own knowledge.
import asyncio
from claude_agent_sdk import (
    query,
    ClaudeAgentOptions,
    ResultMessage,
    AssistantMessage,
    ToolUseBlock,
)
async def main():
    options = ClaudeAgentOptions(
        mcp_servers={"converter": converter_server},
        allowed_tools=["mcp__converter__convert_units"],
    )
    prompts = [
        "Convert 100 kilometers to miles.",
        "What is 72°F in Celsius?",
        "How many pounds is 5 kilograms?",
    ]
    for prompt in prompts:
        async for message in query(prompt=prompt, options=options):
            if isinstance(message, AssistantMessage):
                for block in message.content:
                    if isinstance(block, ToolUseBlock):
                        print(f"[tool call] {block.name}({block.input})")
            elif isinstance(message, ResultMessage) and message.subtype == "success":
                print(f"Q: {prompt}\nA: {message.result}\n")
asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
const prompts = [
"Convert 100 kilometers to miles.",
"What is 72°F in Celsius?",
"How many pounds is 5 kilograms?"
];
for (const prompt of prompts) {
for await (const message of query({
prompt,
options: {
mcpServers: { converter: converterServer },
allowedTools: ["mcp__converter__convert_units"]
}
})) {
if (message.type === "assistant") {
for (const block of message.message.content) {
if (block.type === "tool_use") {
console.log(`[tool call] ${block.name}`, block.input);
}
}
} else if (message.type === "result" && message.subtype === "success") {
console.log(`Q: ${prompt}\nA: ${message.result}\n`);
}
}
}
Custom tools wrap async functions in a standard interface. You can mix the patterns on this page in the same server: a single server can hold a database tool, an API gateway tool, and an image renderer alongside each other.
Configure MCP servers to extend your agent with external tools. Covers transport types, tool search for large tool sets, authentication, and error handling.
The Model Context Protocol (MCP) is an open standard for connecting AI agents to external tools and data sources. With MCP, your agent can query databases, integrate with APIs like Slack and GitHub, and connect to other services without writing custom tool implementations.
MCP servers can run as local processes, connect over HTTP, or execute directly within your SDK application.
This example connects to the Claude Code documentation MCP server using HTTP transport and uses allowedTools with a wildcard to permit all tools from the server.
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Use the docs MCP server to explain what hooks are in Claude Code",
options: {
mcpServers: {
"claude-code-docs": {
type: "http",
url: "https://code.claude.com/docs/mcp"
}
},
allowedTools: ["mcp__claude-code-docs__*"]
}
})) {
if (message.type === "result" && message.subtype === "success") {
console.log(message.result);
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
async def main():
    options = ClaudeAgentOptions(
        mcp_servers={
            "claude-code-docs": {
                "type": "http",
                "url": "https://code.claude.com/docs/mcp",
            }
        },
        allowed_tools=["mcp__claude-code-docs__*"],
    )
    async for message in query(
        prompt="Use the docs MCP server to explain what hooks are in Claude Code",
        options=options,
    ):
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(message.result)
asyncio.run(main())
The agent connects to the documentation server, searches for information about hooks, and returns the results.
You can configure MCP servers in code when calling query, or in a .mcp.json file loaded via settingSources.
Pass MCP servers directly in the mcpServers option:
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "List files in my project",
options: {
mcpServers: {
filesystem: {
command: "npx",
args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"]
}
},
allowedTools: ["mcp__filesystem__*"]
}
})) {
if (message.type === "result" && message.subtype === "success") {
console.log(message.result);
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
async def main():
    options = ClaudeAgentOptions(
        mcp_servers={
            "filesystem": {
                "command": "npx",
                "args": [
                    "-y",
                    "@modelcontextprotocol/server-filesystem",
                    "/Users/me/projects",
                ],
            }
        },
        allowed_tools=["mcp__filesystem__*"],
    )
    async for message in query(prompt="List files in my project", options=options):
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(message.result)
asyncio.run(main())
Create a .mcp.json file at your project root. The SDK does not load filesystem settings by default, so set settingSources: ["project"] (Python: setting_sources=["project"]) in your options for the file to be picked up:
{
"mcpServers": {
"filesystem": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"]
}
}
}
MCP tools require explicit permission before Claude can use them. Without permission, Claude will see that tools are available but won't be able to call them.
MCP tools follow the naming pattern mcp__<server-name>__<tool-name>. For example, a GitHub server named "github" with a list_issues tool becomes mcp__github__list_issues.
Use allowedTools to specify which MCP tools Claude can use:
const _ = {
options: {
mcpServers: {
// your servers
},
allowedTools: [
"mcp__github__*", // All tools from the github server
"mcp__db__query", // Only the query tool from db server
"mcp__slack__send_message" // Only send_message from slack server
]
}
};
Wildcards (*) let you allow all tools from a server without listing each one individually.
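To make the naming pattern and the wildcard concrete, the sketch below composes tool names and checks them against an allow list. Both helper names are hypothetical, and fnmatchcase from the Python standard library is only a stand-in: it is an assumption that the SDK's own matching behaves like a simple trailing * glob, and fnmatch additionally interprets ? and [] characters.

```python
from fnmatch import fnmatchcase


def mcp_tool_name(server: str, tool: str) -> str:
    """Compose the name Claude sees for an MCP tool."""
    return f"mcp__{server}__{tool}"


def matches_any(tool_name: str, patterns: list[str]) -> bool:
    """Return True if tool_name matches any entry, treating * as a wildcard."""
    return any(fnmatchcase(tool_name, pattern) for pattern in patterns)


allowed = ["mcp__github__*", "mcp__db__query", "mcp__slack__send_message"]

# Every github tool is covered by the wildcard; only one db tool is listed.
print(matches_any(mcp_tool_name("github", "list_issues"), allowed))
print(matches_any(mcp_tool_name("db", "drop_table"), allowed))
```

A check like this can be handy in tests to confirm that an allow list covers the tools you expect before running an agent.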
Note
Prefer allowedTools over permission modes for MCP access. permissionMode: "acceptEdits" does not auto-approve MCP tools (only file edits and filesystem Bash commands). permissionMode: "bypassPermissions" does auto-approve MCP tools but also disables all other safety prompts, which is broader than necessary. A wildcard in allowedTools grants exactly the MCP server you want and nothing more. See Permission modes for a full comparison.
To see what tools an MCP server provides, check the server's documentation or connect to the server and inspect the system init message:
for await (const message of query({ prompt: "...", options })) {
if (message.type === "system" && message.subtype === "init") {
console.log("Available MCP tools:", message.mcp_servers);
}
}
MCP servers communicate with your agent using different transport protocols. Check the server's documentation to see which transport it supports:
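If the server's documentation gives you a command to run (for example, npx @modelcontextprotocol/server-github), use stdio.
Stdio servers are local processes that communicate via stdin/stdout. Use this transport for MCP servers you run on the same machine: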
const _ = {
options: {
mcpServers: {
github: {
command: "npx",
args: ["-y", "@modelcontextprotocol/server-github"],
env: {
GITHUB_TOKEN: process.env.GITHUB_TOKEN
}
}
},
allowedTools: ["mcp__github__list_issues", "mcp__github__search_issues"]
}
};
options = ClaudeAgentOptions(
mcp_servers={
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": {"GITHUB_TOKEN": os.environ["GITHUB_TOKEN"]},
}
},
allowed_tools=["mcp__github__list_issues", "mcp__github__search_issues"],
)
{
"mcpServers": {
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": {
"GITHUB_TOKEN": "${GITHUB_TOKEN}"
}
}
}
}
Use HTTP or SSE for cloud-hosted MCP servers and remote APIs:
const _ = {
options: {
mcpServers: {
"remote-api": {
type: "sse",
url: "https://api.example.com/mcp/sse",
headers: {
Authorization: `Bearer ${process.env.API_TOKEN}`
}
}
},
allowedTools: ["mcp__remote-api__*"]
}
};
options = ClaudeAgentOptions(
mcp_servers={
"remote-api": {
"type": "sse",
"url": "https://api.example.com/mcp/sse",
"headers": {"Authorization": f"Bearer {os.environ['API_TOKEN']}"},
}
},
allowed_tools=["mcp__remote-api__*"],
)
{
"mcpServers": {
"remote-api": {
"type": "sse",
"url": "https://api.example.com/mcp/sse",
"headers": {
"Authorization": "Bearer ${API_TOKEN}"
}
}
}
}
For HTTP (non-streaming), use "type": "http" instead.
Define custom tools directly in your application code instead of running a separate server process. See the custom tools guide for implementation details.
When you have many MCP tools configured, tool definitions can consume a significant portion of your context window. Tool search solves this by withholding tool definitions from context and loading only the ones Claude needs for each turn.
Tool search is enabled by default. See Tool search for configuration options and details.
For more detail, including best practices and using tool search with custom SDK tools, see the tool search guide.
Most MCP servers require authentication to access external services. Pass credentials through environment variables in the server configuration.
Use the env field to pass API keys, tokens, and other credentials to the MCP server:
const _ = {
options: {
mcpServers: {
github: {
command: "npx",
args: ["-y", "@modelcontextprotocol/server-github"],
env: {
GITHUB_TOKEN: process.env.GITHUB_TOKEN
}
}
},
allowedTools: ["mcp__github__list_issues"]
}
};
options = ClaudeAgentOptions(
mcp_servers={
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": {"GITHUB_TOKEN": os.environ["GITHUB_TOKEN"]},
}
},
allowed_tools=["mcp__github__list_issues"],
)
{
"mcpServers": {
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": {
"GITHUB_TOKEN": "${GITHUB_TOKEN}"
}
}
}
}
The ${GITHUB_TOKEN} syntax expands environment variables at runtime.
See List issues from a repository for a complete working example with debug logging.
For HTTP and SSE servers, pass authentication headers directly in the server configuration:
const _ = {
options: {
mcpServers: {
"secure-api": {
type: "http",
url: "https://api.example.com/mcp",
headers: {
Authorization: `Bearer ${process.env.API_TOKEN}`
}
}
},
allowedTools: ["mcp__secure-api__*"]
}
};
options = ClaudeAgentOptions(
mcp_servers={
"secure-api": {
"type": "http",
"url": "https://api.example.com/mcp",
"headers": {"Authorization": f"Bearer {os.environ['API_TOKEN']}"},
}
},
allowed_tools=["mcp__secure-api__*"],
)
{
"mcpServers": {
"secure-api": {
"type": "http",
"url": "https://api.example.com/mcp",
"headers": {
"Authorization": "Bearer ${API_TOKEN}"
}
}
}
}
The ${API_TOKEN} syntax expands environment variables at runtime.
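As a rough mental model of what this expansion does, the sketch below replaces ${NAME} placeholders in a string with values from the environment. It is not the SDK's implementation: expand_env is a hypothetical helper, and raising on an unset variable is a choice made here for illustration, since the behavior for missing variables isn't described above.

```python
import os
import re
from typing import Mapping, Optional

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME} in value with env[NAME]; raise if NAME is unset."""
    source = os.environ if env is None else env

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in source:
            raise KeyError(f"Environment variable {name} is not set")
        return source[name]

    return _PLACEHOLDER.sub(substitute, value)


if __name__ == "__main__":
    # Pass an explicit mapping so the example doesn't depend on your shell.
    print(expand_env("Bearer ${API_TOKEN}", {"API_TOKEN": "example-token"}))
```

Running a check like this over a .mcp.json file before launching an agent can surface a missing token early, instead of as a failed server connection.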
The MCP specification supports OAuth 2.1 for authorization. The SDK doesn't handle OAuth flows automatically, but you can pass access tokens via headers after completing the OAuth flow in your application:
// After completing OAuth flow in your app
const accessToken = await getAccessTokenFromOAuthFlow();
const options = {
mcpServers: {
"oauth-api": {
type: "http",
url: "https://api.example.com/mcp",
headers: {
Authorization: `Bearer ${accessToken}`
}
}
},
allowedTools: ["mcp__oauth-api__*"]
};
# After completing OAuth flow in your app
access_token = await get_access_token_from_oauth_flow()
options = ClaudeAgentOptions(
mcp_servers={
"oauth-api": {
"type": "http",
"url": "https://api.example.com/mcp",
"headers": {"Authorization": f"Bearer {access_token}"},
}
},
allowed_tools=["mcp__oauth-api__*"],
)
This example connects to the GitHub MCP server to list recent issues. The example includes debug logging to verify the MCP connection and tool calls.
Before running, create a GitHub personal access token with repo scope and set it as an environment variable:
}
if (message.type === "result" && message.subtype === "error_during_execution") {
console.error("Execution failed");
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage, ResultMessage
async def main():
    options = ClaudeAgentOptions(mcp_servers={"data-processor": data_server})
    async for message in query(prompt="Process data", options=options):
        if isinstance(message, SystemMessage) and message.subtype == "init":
            failed_servers = [
                s
                for s in message.data.get("mcp_servers", [])
                if s.get("status") != "connected"
            ]
            if failed_servers:
                print(f"Failed to connect: {failed_servers}")
        if (
            isinstance(message, ResultMessage)
            and message.subtype == "error_during_execution"
        ):
            print("Execution failed")
asyncio.run(main())
Check the init message to see which servers failed to connect:
if (message.type === "system" && message.subtype === "init") {
for (const server of message.mcp_servers) {
if (server.status === "failed") {
console.error(`Server ${server.name} failed to connect`);
}
}
}
Common causes:
- Check that the env field matches what the server expects.
- For npx commands, verify the package exists and Node.js is in your PATH.
If Claude sees tools but doesn't use them, check that you've granted permission with allowedTools:
const _ = {
options: {
mcpServers: {
// your servers
},
allowedTools: ["mcp__servername__*"] // Required for Claude to use the tools
}
};
The MCP SDK has a default timeout of 60 seconds for server connections. If your server takes longer to start, the connection will fail. For servers that need more startup time, consider:
Scale your agent to thousands of tools by discovering and loading only what's needed, on demand.
Tool search enables your agent to work with hundreds or thousands of tools by dynamically discovering and loading them on demand. Instead of loading all tool definitions into the context window upfront, the agent searches your tool catalog and loads only the tools it needs.
This approach solves two challenges as tool libraries scale:
Tool search is enabled by default. This page covers how it works, how to configure it, and how to optimize tool discovery.
When tool search is active, tool definitions are withheld from the context window. The agent receives a summary of available tools and searches for relevant ones when the task requires a capability not already loaded. The 3-5 most relevant tools are loaded into context, where they stay available for subsequent turns. If the conversation is long enough that the SDK compacts earlier messages to free space, previously discovered tools may be removed, and the agent searches again as needed.
Tool search adds one extra round-trip the first time Claude discovers a tool (the search step), but for large tool sets this is offset by smaller context on every turn. With fewer than ~10 tools, loading everything upfront is typically faster.
For details on the underlying API mechanism, see Tool search in the API.
Note
Tool search requires Claude Sonnet 4 or later, or Claude Opus 4 or later. Haiku models do not support tool search.
By default, tool search is always on. You can change this with the ENABLE_TOOL_SEARCH environment variable:
| Value | Behavior |
|---|---|
| (unset) | Tool search is always on. Tool definitions are never loaded into context. This is the default. |
| true | Same as unset. |
| auto | Checks the combined token count of all tool definitions against the model's context window. If they exceed 10%, tool search activates. If they're under 10%, all tools are loaded into context normally. |
| auto:N | Same as auto with a custom percentage. auto:5 activates when tool definitions exceed 5% of the context window. Lower values activate sooner. |
| false | Tool search is off. All tool definitions are loaded into context on every turn. |
Tool search applies to all registered tools, whether they come from remote MCP servers or custom SDK MCP servers. When using auto, the threshold is based on the combined size of all tool definitions across all servers.
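The table's rules can be written as a small decision function. This is a model of the documented behavior, not the SDK's code: tool_search_active is a hypothetical name, the token counts are supplied by the caller, and the table doesn't say what happens when the definitions sit exactly at the threshold (treated as inactive here).

```python
from typing import Optional


def tool_search_active(
    setting: Optional[str], tool_tokens: int, context_window: int
) -> bool:
    """Model the ENABLE_TOOL_SEARCH table: True means definitions are withheld."""
    if setting is None or setting == "true":
        return True  # Default: always on
    if setting == "false":
        return False  # All definitions are loaded on every turn
    if setting == "auto" or setting.startswith("auto:"):
        # "auto" uses a 10% threshold; "auto:N" uses N percent
        percent = 10.0 if setting == "auto" else float(setting.split(":", 1)[1])
        return tool_tokens > context_window * percent / 100
    raise ValueError(f"Unrecognized ENABLE_TOOL_SEARCH value: {setting!r}")


if __name__ == "__main__":
    # Illustrative numbers only: 15,000 tokens of definitions, 200,000-token window.
    for value in (None, "auto", "auto:5", "false"):
        print(value, tool_search_active(value, 15_000, 200_000))
```

With these illustrative numbers the definitions take 7.5% of the window, so auto leaves them loaded while auto:5 switches tool search on.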
Set the value in the env option on query. This example connects to a remote MCP server that exposes many tools, pre-approves all of them with a wildcard, and uses auto:5 so tool search activates when their definitions exceed 5% of the context window:
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Find and run the appropriate database query",
options: {
mcpServers: {
"enterprise-tools": {
// Connect to a remote MCP server
type: "http",
url: "https://tools.example.com/mcp"
}
},
allowedTools: ["mcp__enterprise-tools__*"], // Wildcard pre-approves all tools from this server
env: {
ENABLE_TOOL_SEARCH: "auto:5" // Activate tool search when tools exceed 5% of context
}
}
})) {
if (message.type === "result" && message.subtype === "success") {
console.log(message.result);
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
async def main():
    options = ClaudeAgentOptions(
        mcp_servers={
            "enterprise-tools": {
                "type": "http",
                "url": "https://tools.example.com/mcp",
            }
        },
        allowed_tools=[
            "mcp__enterprise-tools__*"
        ],  # Wildcard pre-approves all tools from this server
        env={
            "ENABLE_TOOL_SEARCH": "auto:5"  # Activate tool search when tools exceed 5% of context
        },
    )
    async for message in query(
        prompt="Find and run the appropriate database query",
        options=options,
    ):
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(message.result)
asyncio.run(main())
Setting ENABLE_TOOL_SEARCH to "false" disables tool search and loads all tool definitions into context on every turn. This removes the search round-trip, which can be faster when the tool set is small (fewer than ~10 tools) and the definitions fit comfortably in the context window.
The search mechanism matches queries against tool names and descriptions. Names like search_slack_messages surface for a wider range of requests than query_slack. Descriptions with specific keywords ("Search Slack messages by keyword, channel, or date range") match more queries than generic ones ("Query Slack").
You can also add a system prompt section listing available tool categories. This gives the agent context about what kinds of tools are available to search for:
You can search for tools to interact with Slack, GitHub, and Jira.
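If the set of connected services changes at runtime, the sentence above could be generated rather than hard-coded. A minimal sketch, assuming a hypothetical helper named `tool_category_hint`; the wording simply mirrors the example sentence.

```python
def tool_category_hint(categories: list[str]) -> str:
    """Return a one-sentence system prompt hint naming searchable tool categories."""
    if not categories:
        raise ValueError("at least one category is required")
    if len(categories) == 1:
        names = categories[0]
    elif len(categories) == 2:
        names = " and ".join(categories)
    else:
        # Serial comma before the final item, as in the example sentence
        names = ", ".join(categories[:-1]) + f", and {categories[-1]}"
    return f"You can search for tools to interact with {names}."


print(tool_category_hint(["Slack", "GitHub", "Jira"]))
```

The resulting string can be used as a custom system prompt or appended to a preset.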
Define and invoke subagents to isolate context, run tasks in parallel, and apply specialized instructions in your Claude Agent SDK applications.
Subagents are separate agent instances that your main agent can spawn to handle focused subtasks. Use them to isolate context, run multiple analyses in parallel, and apply specialized instructions without bloating the main agent's prompt.
This guide explains how to define and use subagents in the SDK using the agents parameter.
You can create subagents in three ways:
- Programmatically, using the agents parameter in your query options (TypeScript, Python)
- As markdown files in .claude/agents/ directories (see defining subagents as files)
- Built-in: Claude can invoke the built-in general-purpose subagent at any time via the Agent tool without you defining anything
This guide focuses on the programmatic approach, which is recommended for SDK applications.
When you define subagents, Claude determines whether to invoke them based on each subagent's description field. Write clear descriptions that explain when the subagent should be used, and Claude will automatically delegate appropriate tasks. You can also explicitly request a subagent by name in your prompt (for example, "Use the code-reviewer agent to...").
Each subagent runs in its own fresh conversation. Intermediate tool calls and results stay inside the subagent; only its final message returns to the parent. See What subagents inherit for exactly what's in the subagent's context.
Example: a research-assistant subagent can explore dozens of files without any of that content accumulating in the main conversation. The parent receives a concise summary, not every file the subagent read.
Multiple subagents can run concurrently, dramatically speeding up complex workflows.
Example: during a code review, you can run style-checker, security-scanner, and test-coverage subagents simultaneously, so the review takes roughly as long as the slowest check rather than the sum of all three.
Each subagent can have tailored system prompts with specific expertise, best practices, and constraints.
Example: a database-migration subagent can have detailed knowledge about SQL best practices, rollback strategies, and data integrity checks that would be unnecessary noise in the main agent's instructions.
Subagents can be limited to specific tools, reducing the risk of unintended actions.
Example: a doc-reviewer subagent might only have access to Read and Grep tools, ensuring it can analyze but never accidentally modify your documentation files.
Define subagents directly in your code using the agents parameter. This example creates two subagents: a code reviewer with read-only access and a test runner that can execute commands. The Agent tool must be included in allowedTools since Claude invokes subagents through the Agent tool.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

async def main():
    async for message in query(
        prompt="Review the authentication module for security issues",
        options=ClaudeAgentOptions(
            # Agent tool is required for subagent invocation
            allowed_tools=["Read", "Grep", "Glob", "Agent"],
            agents={
                "code-reviewer": AgentDefinition(
                    # description tells Claude when to use this subagent
                    description="Expert code review specialist. Use for quality, security, and maintainability reviews.",
                    # prompt defines the subagent's behavior and expertise
                    prompt="""You are a code review specialist with expertise in security, performance, and best practices.
When reviewing code:
- Identify security vulnerabilities
- Check for performance issues
- Verify adherence to coding standards
- Suggest specific improvements
Be thorough but concise in your feedback.""",
                    # tools restricts what the subagent can do (read-only here)
                    tools=["Read", "Grep", "Glob"],
                    # model overrides the default model for this subagent
                    model="sonnet",
                ),
                "test-runner": AgentDefinition(
                    description="Runs and analyzes test suites. Use for test execution and coverage analysis.",
                    prompt="""You are a test execution specialist. Run tests and provide clear analysis of results.
Focus on:
- Running test commands
- Analyzing test output
- Identifying failing tests
- Suggesting fixes for failures""",
                    # Bash access lets this subagent run test commands
                    tools=["Bash", "Read", "Grep"],
                ),
            },
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)

asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Review the authentication module for security issues",
options: {
// Agent tool is required for subagent invocation
allowedTools: ["Read", "Grep", "Glob", "Agent"],
agents: {
"code-reviewer": {
// description tells Claude when to use this subagent
description:
"Expert code review specialist. Use for quality, security, and maintainability reviews.",
// prompt defines the subagent's behavior and expertise
prompt: `You are a code review specialist with expertise in security, performance, and best practices.
When reviewing code:
- Identify security vulnerabilities
- Check for performance issues
- Verify adherence to coding standards
- Suggest specific improvements
Be thorough but concise in your feedback.`,
// tools restricts what the subagent can do (read-only here)
tools: ["Read", "Grep", "Glob"],
// model overrides the default model for this subagent
model: "sonnet"
},
"test-runner": {
description:
"Runs and analyzes test suites. Use for test execution and coverage analysis.",
prompt: `You are a test execution specialist. Run tests and provide clear analysis of results.
Focus on:
- Running test commands
- Analyzing test output
- Identifying failing tests
- Suggesting fixes for failures`,
// Bash access lets this subagent run test commands
tools: ["Bash", "Read", "Grep"]
}
}
}
})) {
if ("result" in message) console.log(message.result);
}
| Field | Type | Required | Description |
|---|---|---|---|
| description | string | Yes | Natural language description of when to use this agent |
| prompt | string | Yes | The agent's system prompt defining its role and behavior |
| tools | string[] | No | Array of allowed tool names. If omitted, inherits all tools |
| model | 'sonnet' \| 'opus' \| 'haiku' \| 'inherit' | No | Model override for this agent. Defaults to main model if omitted |
| skills | string[] | No | List of skill names available to this agent |
| memory | 'user' \| 'project' \| 'local' | No | Memory source for this agent (Python only) |
| mcpServers | (string \| object)[] | No | MCP servers available to this agent, by name or inline config |
Note
Subagents cannot spawn their own subagents. Don't include Agent in a subagent's tools array.
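The required fields and the no-nesting rule can be checked before a query is sent. The sketch below works on plain dictionaries that mirror the fields in the table above; `validate_agent_definitions` is a hypothetical helper, not an SDK function, and it treats both "Agent" and the older "Task" name as the subagent tool.

```python
# Names under which the subagent tool may appear ("Task" is the pre-rename name)
SUBAGENT_TOOL_NAMES = {"Agent", "Task"}


def validate_agent_definitions(agents: dict[str, dict]) -> list[str]:
    """Return a list of problems found in plain-dict agent definitions."""
    problems = []
    for name, definition in agents.items():
        # description and prompt are the two required fields
        for field in ("description", "prompt"):
            if not definition.get(field):
                problems.append(f"{name}: missing required field '{field}'")
        tools = definition.get("tools")
        if tools is not None:
            # Subagents cannot spawn subagents, so flag the Agent tool
            for tool in sorted(SUBAGENT_TOOL_NAMES.intersection(tools)):
                problems.append(
                    f"{name}: remove '{tool}' from tools; subagents cannot spawn subagents"
                )
    return problems
```

An empty list means the definitions pass these two checks; it does not guarantee that the SDK accepts them.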
You can also define subagents as markdown files in .claude/agents/ directories. See the Claude Code subagents documentation for details on this approach. Programmatically defined agents take precedence over filesystem-based agents with the same name.
Note
Even without defining custom subagents, Claude can spawn the built-in general-purpose subagent when Agent is in your allowedTools. This is useful for delegating research or exploration tasks without creating specialized agents.
A subagent's context window starts fresh (no parent conversation) but isn't empty. The only channel from parent to subagent is the Agent tool's prompt string, so include any file paths, error messages, or decisions the subagent needs directly in that prompt.
| The subagent receives | The subagent does not receive |
|---|---|
| Its own system prompt (AgentDefinition.prompt) and the Agent tool's prompt | The parent's conversation history or tool results |
| Project CLAUDE.md (loaded via settingSources) | Skills (unless listed in AgentDefinition.skills) |
| Tool definitions (inherited from parent, or the subset in tools) | The parent's system prompt |
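Because the Agent tool's prompt string is the only channel into the subagent, it can help to collect the details a subagent will need in one block of text. The parent agent writes the Agent tool's prompt itself, so one possible approach (an assumption, not documented SDK behavior) is to include this block in the prompt you send to the main query and ask Claude to pass it along when delegating. `build_delegation_prompt` is a hypothetical helper.

```python
def build_delegation_prompt(
    task: str,
    file_paths: list[str],
    error_messages: list[str],
    decisions: list[str],
) -> str:
    """Combine a task with the context a subagent cannot see on its own."""
    sections = [task.strip()]
    labelled_items = (
        ("Relevant files", file_paths),
        ("Error messages", error_messages),
        ("Decisions already made", decisions),
    )
    for title, items in labelled_items:
        if items:  # skip empty sections entirely
            bullet_lines = "\n".join(f"- {item}" for item in items)
            sections.append(f"{title}:\n{bullet_lines}")
    return "\n\n".join(sections)


print(
    build_delegation_prompt(
        "Fix the login bug",
        ["src/auth.py"],
        [],
        ["Keep the public API unchanged"],
    )
)
```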
Note
The parent receives the subagent's final message verbatim as the Agent tool result, but may summarize it in its own response. To preserve subagent output verbatim in the user-facing response, include an instruction to do so in the prompt or systemPrompt option you pass to the main query call.
Claude automatically decides when to invoke subagents based on the task and each subagent's description. For example, if you define a performance-optimizer subagent with the description "Performance optimization specialist for query tuning", Claude will invoke it when your prompt mentions optimizing queries.
Write clear, specific descriptions so Claude can match tasks to the right subagent.
To guarantee Claude uses a specific subagent, mention it by name in your prompt:
"Use the code-reviewer agent to check the authentication module"
This bypasses automatic matching and directly invokes the named subagent.
You can create agent definitions dynamically based on runtime conditions. This example creates a security reviewer with different strictness levels, using a more powerful model for strict reviews.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

# Factory function that returns an AgentDefinition
# This pattern lets you customize agents based on runtime conditions
def create_security_agent(security_level: str) -> AgentDefinition:
    is_strict = security_level == "strict"
    return AgentDefinition(
        description="Security code reviewer",
        # Customize the prompt based on strictness level
        prompt=f"You are a {'strict' if is_strict else 'balanced'} security reviewer...",
        tools=["Read", "Grep", "Glob"],
        # Key insight: use a more capable model for high-stakes reviews
        model="opus" if is_strict else "sonnet",
    )

async def main():
    # The agent is created at query time, so each request can use different settings
    async for message in query(
        prompt="Review this PR for security issues",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Grep", "Glob", "Agent"],
            agents={
                # Call the factory with your desired configuration
                "security-reviewer": create_security_agent("strict")
            },
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)

asyncio.run(main())
import { query, type AgentDefinition } from "@anthropic-ai/claude-agent-sdk";
// Factory function that returns an AgentDefinition
// This pattern lets you customize agents based on runtime conditions
function createSecurityAgent(securityLevel: "basic" | "strict"): AgentDefinition {
const isStrict = securityLevel === "strict";
return {
description: "Security code reviewer",
// Customize the prompt based on strictness level
prompt: `You are a ${isStrict ? "strict" : "balanced"} security reviewer...`,
tools: ["Read", "Grep", "Glob"],
// Key insight: use a more capable model for high-stakes reviews
model: isStrict ? "opus" : "sonnet"
};
}
// The agent is created at query time, so each request can use different settings
for await (const message of query({
prompt: "Review this PR for security issues",
options: {
allowedTools: ["Read", "Grep", "Glob", "Agent"],
agents: {
// Call the factory with your desired configuration
"security-reviewer": createSecurityAgent("strict")
}
}
})) {
if ("result" in message) console.log(message.result);
}
Subagents are invoked via the Agent tool. To detect when a subagent is invoked, check for tool_use blocks where name is "Agent". Messages from within a subagent's context include a parent_tool_use_id field.
Note
The tool name was renamed from "Task" to "Agent" in Claude Code v2.1.63. Current SDK releases emit "Agent" in tool_use blocks but still use "Task" in the system:init tools list and in result.permission_denials[].tool_name. Checking both values in block.name ensures compatibility across SDK versions.
This example iterates through streamed messages, logging when a subagent is invoked and when subsequent messages originate from within that subagent's execution context.
Note
The message structure differs between SDKs. In Python, content blocks are accessed directly via message.content. In TypeScript, SDKAssistantMessage wraps the Claude API message, so content is accessed via message.message.content.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition, ToolUseBlock

async def main():
    async for message in query(
        prompt="Use the code-reviewer agent to review this codebase",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep", "Agent"],
            agents={
                "code-reviewer": AgentDefinition(
                    description="Expert code reviewer.",
                    prompt="Analyze code quality and suggest improvements.",
                    tools=["Read", "Glob", "Grep"],
                )
            },
        ),
    ):
        # Check for subagent invocation. Match both names: older SDK
        # versions emitted "Task", current versions emit "Agent".
        if hasattr(message, "content") and message.content:
            for block in message.content:
                # Python content blocks are typed objects, so match on the class
                if isinstance(block, ToolUseBlock) and block.name in ("Task", "Agent"):
                    print(f"Subagent invoked: {block.input.get('subagent_type')}")
        # Check if this message is from within a subagent's context
        if hasattr(message, "parent_tool_use_id") and message.parent_tool_use_id:
            print(" (running inside subagent)")
        if hasattr(message, "result"):
            print(message.result)

asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Use the code-reviewer agent to review this codebase",
options: {
allowedTools: ["Read", "Glob", "Grep", "Agent"],
agents: {
"code-reviewer": {
description: "Expert code reviewer.",
prompt: "Analyze code quality and suggest improvements.",
tools: ["Read", "Glob", "Grep"]
}
}
}
})) {
const msg = message as any;
// Check for subagent invocation. Match both names: older SDK versions
// emitted "Task", current versions emit "Agent".
for (const block of msg.message?.content ?? []) {
if (block.type === "tool_use" && (block.name === "Task" || block.name === "Agent")) {
console.log(`Subagent invoked: ${block.input.subagent_type}`);
}
}
// Check if this message is from within a subagent's context
if (msg.parent_tool_use_id) {
console.log(" (running inside subagent)");
}
if ("result" in message) {
console.log(message.result);
}
}
Subagents can be resumed to continue where they left off. Resumed subagents retain their full conversation history, including all previous tool calls, results, and reasoning. The subagent picks up exactly where it stopped rather than starting fresh.
When a subagent completes, Claude receives its agent ID in the Agent tool result. To resume a subagent programmatically:
1. Capture the session_id from messages during the first query
2. Extract the agentId from the message content
3. Pass resume: sessionId in the second query's options, and include the agent ID in your prompt
Note
You must resume the same session to access the subagent's transcript. Each query call starts a new session by default, so pass resume: sessionId to continue in the same session.
If you're using a custom agent (not a built-in one), you also need to pass the same agent definition in the agents parameter for both queries.
The example below demonstrates this flow: the first query runs a subagent and captures the session ID and agent ID, then the second query resumes the session to ask a follow-up question that requires context from the first analysis.
import { query, type SDKMessage } from "@anthropic-ai/claude-agent-sdk";
// Helper to extract agentId from message content
// Stringify to avoid traversing different block types (TextBlock, ToolResultBlock, etc.)
function extractAgentId(message: SDKMessage): string | undefined {
if (!("message" in message)) return undefined;
// Stringify the content so we can search it without traversing nested blocks
const content = JSON.stringify(message.message.content);
const match = content.match(/agentId:\s*([a-f0-9-]+)/);
return match?.[1];
}
let agentId: string | undefined;
let sessionId: string | undefined;
// First invocation - use the Explore agent to find API endpoints
for await (const message of query({
prompt: "Use the Explore agent to find all API endpoints in this codebase",
options: { allowedTools: ["Read", "Grep", "Glob", "Agent"] }
})) {
// Capture session_id from ResultMessage (needed to resume this session)
if ("session_id" in message) sessionId = message.session_id;
// Search message content for the agentId (appears in Agent tool results)
const extractedId = extractAgentId(message);
if (extractedId) agentId = extractedId;
// Print the final result
if ("result" in message) console.log(message.result);
}
// Second invocation - resume and ask follow-up
if (agentId && sessionId) {
for await (const message of query({
prompt: `Resume agent ${agentId} and list the top 3 most complex endpoints`,
options: { allowedTools: ["Read", "Grep", "Glob", "Agent"], resume: sessionId }
})) {
if ("result" in message) console.log(message.result);
}
}
import asyncio
import json
import re
from claude_agent_sdk import query, ClaudeAgentOptions

def extract_agent_id(text: str) -> str | None:
    """Extract agentId from Agent tool result text."""
    match = re.search(r"agentId:\s*([a-f0-9-]+)", text)
    return match.group(1) if match else None

async def main():
    agent_id = None
    session_id = None
    # First invocation - use the Explore agent to find API endpoints
    async for message in query(
        prompt="Use the Explore agent to find all API endpoints in this codebase",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Grep", "Glob", "Agent"]),
    ):
        # Capture session_id from ResultMessage (needed to resume this session)
        if hasattr(message, "session_id"):
            session_id = message.session_id
        # Search message content for the agentId (appears in Agent tool results)
        if hasattr(message, "content"):
            # Stringify the content so we can search it without traversing nested blocks
            content_str = json.dumps(message.content, default=str)
            extracted = extract_agent_id(content_str)
            if extracted:
                agent_id = extracted
        # Print the final result
        if hasattr(message, "result"):
            print(message.result)
    # Second invocation - resume and ask follow-up
    if agent_id and session_id:
        async for message in query(
            prompt=f"Resume agent {agent_id} and list the top 3 most complex endpoints",
            options=ClaudeAgentOptions(
                allowed_tools=["Read", "Grep", "Glob", "Agent"], resume=session_id
            ),
        ):
            if hasattr(message, "result"):
                print(message.result)

asyncio.run(main())
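To see how the agentId pattern used above behaves, it can be run against sample text in isolation. The sample strings below are hypothetical; the actual wording of an Agent tool result may differ.

```python
import re
from typing import Optional

# Same pattern as in the resume example: lowercase hex digits and hyphens
AGENT_ID_PATTERN = re.compile(r"agentId:\s*([a-f0-9-]+)")


def extract_agent_id(text: str) -> Optional[str]:
    """Return the first agentId found in text, or None."""
    match = AGENT_ID_PATTERN.search(text)
    return match.group(1) if match else None


# Hypothetical tool result text, for illustration only
sample = "Exploration complete. agentId: 3f9a1c2e-77b0-4d1a-9c5e-0a1b2c3d4e5f"
print(extract_agent_id(sample))
print(extract_agent_id("no id in this text"))
```

The character class stops at the first character that is not a lowercase hex digit or a hyphen, so an ID containing other characters would be truncated. If the ID format differs in your SDK version, the pattern would need adjusting.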
Subagent transcripts persist independently of the main conversation:
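- Transcripts are cleaned up according to the cleanupPeriodDays setting (default: 30 days).
Subagents can have restricted tool access via the tools field. If you omit the field, the subagent inherits all available tools; if you specify it, the subagent can use only the listed tools.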
This example creates a read-only analysis agent that can examine code but cannot modify files or run commands.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

async def main():
    async for message in query(
        prompt="Analyze the architecture of this codebase",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Grep", "Glob", "Agent"],
            agents={
                "code-analyzer": AgentDefinition(
                    description="Static code analysis and architecture review",
                    prompt="""You are a code architecture analyst. Analyze code structure,
identify patterns, and suggest improvements without making changes.""",
                    # Read-only tools: no Edit, Write, or Bash access
                    tools=["Read", "Grep", "Glob"],
                )
            },
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)

asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Analyze the architecture of this codebase",
options: {
allowedTools: ["Read", "Grep", "Glob", "Agent"],
agents: {
"code-analyzer": {
description: "Static code analysis and architecture review",
prompt: `You are a code architecture analyst. Analyze code structure,
identify patterns, and suggest improvements without making changes.`,
// Read-only tools: no Edit, Write, or Bash access
tools: ["Read", "Grep", "Glob"]
}
}
}
})) {
if ("result" in message) console.log(message.result);
}
| Use case | Tools | Description |
|---|---|---|
| Read-only analysis | Read, Grep, Glob | Can examine code but not modify or execute |
| Test execution | Bash, Read, Grep | Can run commands and analyze output |
| Code modification | Read, Edit, Write, Grep, Glob | Full read/write access without command execution |
| Full access | All tools | Inherits all tools from parent (omit tools field) |
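The combinations in the table can be kept as named presets so that agent definitions stay consistent across a codebase. This is a sketch under assumptions: the preset names and the `agent_fields` helper are hypothetical, and only the tool lists come from the table.

```python
from typing import Optional

# Tool lists copied from the table; None means "omit the tools field"
TOOL_PRESETS: dict[str, Optional[list[str]]] = {
    "read-only": ["Read", "Grep", "Glob"],
    "test-execution": ["Bash", "Read", "Grep"],
    "code-modification": ["Read", "Edit", "Write", "Grep", "Glob"],
    "full-access": None,
}


def agent_fields(description: str, prompt: str, preset: str) -> dict:
    """Return keyword arguments for an agent definition using a named preset."""
    tools = TOOL_PRESETS[preset]  # raises KeyError for an unknown preset
    fields = {"description": description, "prompt": prompt}
    if tools is not None:
        fields["tools"] = list(tools)  # copy so callers cannot mutate the preset
    return fields
```

In Python the result could be unpacked into a definition, for example `AgentDefinition(**agent_fields("Static analysis", "You analyze code.", "read-only"))`.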
If Claude completes tasks directly instead of delegating to your subagent:
- Make sure the Agent tool is included in allowedTools
Agents defined in .claude/agents/ are loaded at startup only. If you create a new agent file while Claude Code is running, restart the session to load it.
On Windows, subagents with very long prompts may fail due to command line length limits (8191 chars). Keep prompts concise or use filesystem-based agents for complex instructions.
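A simple length check can flag prompts that risk hitting this limit before a query runs. The sketch below is a rough guard only: `prompt_fits_windows_limit` is a hypothetical helper, and `reserved_chars` is an assumed allowance for the rest of the command line, whose real size is not stated here.

```python
WINDOWS_COMMAND_LINE_LIMIT = 8191  # characters, the limit cited above


def prompt_fits_windows_limit(prompt: str, reserved_chars: int = 1000) -> bool:
    """Return True if the prompt leaves reserved_chars of headroom under the limit.

    reserved_chars is an assumption; tune it for your own setup.
    """
    return len(prompt) + reserved_chars <= WINDOWS_COMMAND_LINE_LIMIT


long_prompt = "Review every module carefully. " * 300
if not prompt_fits_windows_limit(long_prompt):
    print("Prompt may be too long for Windows; consider a filesystem-based agent")
```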
Learn how to customize Claude's behavior by modifying system prompts using three approaches - output styles, systemPrompt with append, and custom system prompts.
System prompts define Claude's behavior, capabilities, and response style. The Claude Agent SDK provides three ways to customize system prompts: using output styles (persistent, file-based configurations), appending to Claude Code's prompt, or using a fully custom prompt.
A system prompt is the initial instruction set that shapes how Claude behaves throughout a conversation.
Note
Default behavior: The Agent SDK uses a minimal system prompt by default. It contains only essential tool instructions but omits Claude Code's coding guidelines, response style, and project context. To include the full Claude Code system prompt, specify systemPrompt: { type: "preset", preset: "claude_code" } in TypeScript or system_prompt={"type": "preset", "preset": "claude_code"} in Python.
Claude Code's system prompt includes:
CLAUDE.md files provide project-specific context and instructions that the Agent SDK reads when it runs in a directory, once the matching setting source is configured. They serve as persistent "memory" for your project.
Location and discovery:
- Project-level: CLAUDE.md or .claude/CLAUDE.md in your working directory
- User-level: ~/.claude/CLAUDE.md for global instructions across all projects
IMPORTANT: The SDK only reads CLAUDE.md files when you explicitly configure settingSources (TypeScript) or setting_sources (Python):
- Include 'project' to load project-level CLAUDE.md
- Include 'user' to load user-level CLAUDE.md (~/.claude/CLAUDE.md)
The claude_code system prompt preset does NOT automatically load CLAUDE.md; you must also specify setting sources.
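One way to decide which setting sources to pass is to check which CLAUDE.md files actually exist. The sketch below is illustrative only: `detect_setting_sources` is a hypothetical helper that looks at the file locations listed above and nothing else. Setting sources may also affect other configuration, so treat the result as a starting point.

```python
from pathlib import Path


def detect_setting_sources(project_dir: Path, home_dir: Path) -> list[str]:
    """Return the setting sources whose CLAUDE.md file is present on disk."""
    sources = []
    project_candidates = [
        project_dir / "CLAUDE.md",
        project_dir / ".claude" / "CLAUDE.md",
    ]
    if any(path.is_file() for path in project_candidates):
        sources.append("project")
    if (home_dir / ".claude" / "CLAUDE.md").is_file():
        sources.append("user")
    return sources


print(detect_setting_sources(Path.cwd(), Path.home()))
```

The returned list could then be passed as setting_sources in ClaudeAgentOptions.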
Content format: CLAUDE.md files use plain markdown and can contain:
# Project Guidelines
## Code Style
- Use TypeScript strict mode
- Prefer functional components in React
- Always include JSDoc comments for public APIs
## Testing
- Run `npm test` before committing
- Maintain >80% code coverage
- Use jest for unit tests, playwright for E2E
## Commands
- Build: `npm run build`
- Dev server: `npm run dev`
- Type check: `npm run typecheck`
import { query } from "@anthropic-ai/claude-agent-sdk";
// IMPORTANT: You must specify settingSources to load CLAUDE.md
// The claude_code preset alone does NOT load CLAUDE.md files
const messages = [];
for await (const message of query({
prompt: "Add a new React component for user profiles",
options: {
systemPrompt: {
type: "preset",
preset: "claude_code" // Use Claude Code's system prompt
},
settingSources: ["project"] // Required to load CLAUDE.md from project
}
})) {
messages.push(message);
}
// Now Claude has access to your project guidelines from CLAUDE.md
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

async def main():
    # IMPORTANT: You must specify setting_sources to load CLAUDE.md
    # The claude_code preset alone does NOT load CLAUDE.md files
    messages = []
    async for message in query(
        prompt="Add a new React component for user profiles",
        options=ClaudeAgentOptions(
            system_prompt={
                "type": "preset",
                "preset": "claude_code",  # Use Claude Code's system prompt
            },
            setting_sources=["project"],  # Required to load CLAUDE.md from project
        ),
    ):
        messages.append(message)
    # Now Claude has access to your project guidelines from CLAUDE.md

asyncio.run(main())
Best for:
Key characteristics:
- Loaded only when settingSources is configured
Output styles are saved configurations that modify Claude's system prompt. They're stored as markdown files and can be reused across sessions and projects.
import { writeFile, mkdir } from "fs/promises";
import { join } from "path";
import { homedir } from "os";
async function createOutputStyle(name: string, description: string, prompt: string) {
// User-level: ~/.claude/output-styles
// Project-level: .claude/output-styles
const outputStylesDir = join(homedir(), ".claude", "output-styles");
await mkdir(outputStylesDir, { recursive: true });
const content = `---
name: ${name}
description: ${description}
---
${prompt}`;
const filePath = join(outputStylesDir, `${name.toLowerCase().replace(/\s+/g, "-")}.md`);
await writeFile(filePath, content, "utf-8");
}
// Example: Create a code review specialist
await createOutputStyle(
"Code Reviewer",
"Thorough code review assistant",
`You are an expert code reviewer.
For every code submission:
1. Check for bugs and security issues
2. Evaluate performance
3. Suggest improvements
4. Rate code quality (1-10)`
);
import re
from pathlib import Path

# A regular function: nothing here needs to be awaited
def create_output_style(name: str, description: str, prompt: str) -> Path:
    # User-level: ~/.claude/output-styles
    # Project-level: .claude/output-styles
    output_styles_dir = Path.home() / ".claude" / "output-styles"
    output_styles_dir.mkdir(parents=True, exist_ok=True)
    content = f"""---
name: {name}
description: {description}
---
{prompt}"""
    # Collapse whitespace runs into hyphens, matching the TypeScript version
    file_name = re.sub(r"\s+", "-", name.lower()) + ".md"
    file_path = output_styles_dir / file_name
    file_path.write_text(content, encoding="utf-8")
    return file_path

# Example: Create a code review specialist
create_output_style(
    "Code Reviewer",
    "Thorough code review assistant",
    """You are an expert code reviewer.
For every code submission:
1. Check for bugs and security issues
2. Evaluate performance
3. Suggest improvements
4. Rate code quality (1-10)""",
)
Once created, activate output styles via:
- /output-style [style-name]
- .claude/settings.local.json
- /output-style:new [description]
Note for SDK users: Output styles are loaded when you include settingSources: ['user'] or settingSources: ['project'] (TypeScript) / setting_sources=["user"] or setting_sources=["project"] (Python) in your options.
systemPrompt with append
You can use the Claude Code preset with an append property to add your custom instructions while preserving all built-in functionality.
import { query } from "@anthropic-ai/claude-agent-sdk";
const messages = [];
for await (const message of query({
prompt: "Help me write a Python function to calculate fibonacci numbers",
options: {
systemPrompt: {
type: "preset",
preset: "claude_code",
append: "Always include detailed docstrings and type hints in Python code."
}
}
})) {
messages.push(message);
if (message.type === "assistant") {
console.log(message.message.content);
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage

async def main():
    messages = []
    async for message in query(
        prompt="Help me write a Python function to calculate fibonacci numbers",
        options=ClaudeAgentOptions(
            system_prompt={
                "type": "preset",
                "preset": "claude_code",
                "append": "Always include detailed docstrings and type hints in Python code.",
            }
        ),
    ):
        messages.append(message)
        # In Python, content blocks are accessed directly via message.content
        if isinstance(message, AssistantMessage):
            print(message.content)

asyncio.run(main())
You can provide a custom string as systemPrompt to replace the default entirely with your own instructions.
import { query } from "@anthropic-ai/claude-agent-sdk";
const customPrompt = `You are a Python coding specialist.
Follow these guidelines:
- Write clean, well-documented code
- Use type hints for all functions
- Include comprehensive docstrings
- Prefer functional programming patterns when appropriate
- Always explain your code choices`;
const messages = [];
for await (const message of query({
prompt: "Create a data processing pipeline",
options: {
systemPrompt: customPrompt
}
})) {
messages.push(message);
if (message.type === "assistant") {
console.log(message.message.content);
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage

custom_prompt = """You are a Python coding specialist.
Follow these guidelines:
- Write clean, well-documented code
- Use type hints for all functions
- Include comprehensive docstrings
- Prefer functional programming patterns when appropriate
- Always explain your code choices"""

async def main():
    messages = []
    async for message in query(
        prompt="Create a data processing pipeline",
        options=ClaudeAgentOptions(system_prompt=custom_prompt),
    ):
        messages.append(message)
        # In Python, content blocks are accessed directly via message.content
        if isinstance(message, AssistantMessage):
            print(message.content)

asyncio.run(main())
| Feature | CLAUDE.md | Output Styles | systemPrompt with append | Custom systemPrompt |
|---|---|---|---|---|
| Persistence | Per-project file | Saved as files | Session only | Session only |
| Reusability | Per-project | Across projects | Code duplication | Code duplication |
| Management | On filesystem | CLI + files | In code | In code |
| Default tools | Preserved | Preserved | Preserved | Lost (unless included) |
| Built-in safety | Maintained | Maintained | Maintained | Must be added |
| Environment context | Automatic | Automatic | Automatic | Must be provided |
| Customization level | Additions only | Replace default | Additions only | Complete control |
| Version control | With project | Yes | With code | With code |
| Scope | Project-specific | User or project | Code session | Code session |
Note: "With append" means using systemPrompt: { type: "preset", preset: "claude_code", append: "..." } in TypeScript or system_prompt={"type": "preset", "preset": "claude_code", "append": "..."} in Python.
Best for:
Examples:
- npm run lint:fix before committing"
- migrations/ directory"
Important: To load CLAUDE.md files, you must explicitly set settingSources: ['project'] (TypeScript) or setting_sources=["project"] (Python). The claude_code system prompt preset does NOT automatically load CLAUDE.md without this setting.
Best for:
Examples:
systemPrompt with append
Best for:
Custom systemPrompt
Best for:
You can combine these methods for maximum flexibility:
import { query } from "@anthropic-ai/claude-agent-sdk";
// Assuming "Code Reviewer" output style is active (via /output-style)
// Add session-specific focus areas
const messages = [];
for await (const message of query({
prompt: "Review this authentication module",
options: {
systemPrompt: {
type: "preset",
preset: "claude_code",
append: `
For this review, prioritize:
- OAuth 2.0 compliance
- Token storage security
- Session management
`
}
}
})) {
messages.push(message);
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

async def main():
    # Assuming "Code Reviewer" output style is active (via /output-style)
    # Add session-specific focus areas
    messages = []
    async for message in query(
        prompt="Review this authentication module",
        options=ClaudeAgentOptions(
            system_prompt={
                "type": "preset",
                "preset": "claude_code",
                "append": """
For this review, prioritize:
- OAuth 2.0 compliance
- Token storage security
- Session management
""",
            }
        ),
    ):
        messages.append(message)

asyncio.run(main())
Learn how to use slash commands to control Claude Code sessions through the SDK.
Slash commands provide a way to control Claude Code sessions with special commands that start with /. These commands can be sent through the SDK to perform actions like clearing conversation history, compacting messages, or getting help.
The Claude Agent SDK provides information about available slash commands in the system initialization message. Access this information when your session starts:
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Hello Claude",
options: { maxTurns: 1 }
})) {
if (message.type === "system" && message.subtype === "init") {
console.log("Available slash commands:", message.slash_commands);
// Example output: ["/compact", "/clear", "/help"]
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage

async def main():
    async for message in query(prompt="Hello Claude", options=ClaudeAgentOptions(max_turns=1)):
        if isinstance(message, SystemMessage) and message.subtype == "init":
            print("Available slash commands:", message.data["slash_commands"])
            # Example output: ["/compact", "/clear", "/help"]

asyncio.run(main())
Send slash commands by including them in your prompt string, just like regular text:
import { query } from "@anthropic-ai/claude-agent-sdk";
// Send a slash command
for await (const message of query({
prompt: "/compact",
options: { maxTurns: 1 }
})) {
if (message.type === "result") {
console.log("Command executed:", message.result);
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
async def main():
# Send a slash command
async for message in query(prompt="/compact", options=ClaudeAgentOptions(max_turns=1)):
if isinstance(message, ResultMessage):
print("Command executed:", message.result)
asyncio.run(main())
/compact - Compact Conversation History
The /compact command reduces the size of your conversation history by summarizing older messages while preserving important context:
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "/compact",
options: { maxTurns: 1 }
})) {
if (message.type === "system" && message.subtype === "compact_boundary") {
console.log("Compaction completed");
console.log("Pre-compaction tokens:", message.compact_metadata.pre_tokens);
console.log("Trigger:", message.compact_metadata.trigger);
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage
async def main():
async for message in query(prompt="/compact", options=ClaudeAgentOptions(max_turns=1)):
if isinstance(message, SystemMessage) and message.subtype == "compact_boundary":
print("Compaction completed")
print("Pre-compaction tokens:", message.data["compact_metadata"]["pre_tokens"])
print("Trigger:", message.data["compact_metadata"]["trigger"])
asyncio.run(main())
/clear - Clear Conversation
The /clear command starts a fresh conversation by clearing all previous history:
import { query } from "@anthropic-ai/claude-agent-sdk";
// Clear conversation and start fresh
for await (const message of query({
prompt: "/clear",
options: { maxTurns: 1 }
})) {
if (message.type === "system" && message.subtype === "init") {
console.log("Conversation cleared, new session started");
console.log("Session ID:", message.session_id);
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage
async def main():
# Clear conversation and start fresh
async for message in query(prompt="/clear", options=ClaudeAgentOptions(max_turns=1)):
if isinstance(message, SystemMessage) and message.subtype == "init":
print("Conversation cleared, new session started")
print("Session ID:", message.data["session_id"])
asyncio.run(main())
In addition to using built-in slash commands, you can create your own custom commands that are available through the SDK. Custom commands are defined as markdown files in specific directories, similar to how subagents are configured.
Note
The .claude/commands/ directory is the legacy format. The recommended format is .claude/skills/<name>/SKILL.md, which supports the same slash-command invocation (/name) plus autonomous invocation by Claude. See Skills for the current format. The CLI continues to support both formats, and the examples below remain accurate for .claude/commands/.
Custom slash commands are stored in designated directories based on their scope:
- .claude/commands/ - Available only in the current project (legacy; prefer .claude/skills/)
- ~/.claude/commands/ - Available across all your projects (legacy; prefer ~/.claude/skills/)
Each custom command is a markdown file where the filename (without the .md extension) becomes the command name.
Create .claude/commands/refactor.md:
Refactor the selected code to improve readability and maintainability.
Focus on clean code principles and best practices.
This creates the /refactor command that you can use through the SDK.
Create .claude/commands/security-check.md:
---
allowed-tools: Read, Grep, Glob
description: Run security vulnerability scan
model: claude-opus-4-6
---
Analyze the codebase for security vulnerabilities including:
- SQL injection risks
- XSS vulnerabilities
- Exposed credentials
- Insecure configurations
Once defined in the filesystem, custom commands are automatically available through the SDK:
import { query } from "@anthropic-ai/claude-agent-sdk";
// Use a custom command
for await (const message of query({
prompt: "/refactor src/auth/login.ts",
options: { maxTurns: 3 }
})) {
if (message.type === "assistant") {
console.log("Refactoring suggestions:", message.message);
}
}
// Custom commands appear in the slash_commands list
for await (const message of query({
prompt: "Hello",
options: { maxTurns: 1 }
})) {
if (message.type === "system" && message.subtype === "init") {
// Will include both built-in and custom commands
console.log("Available commands:", message.slash_commands);
// Example: ["/compact", "/clear", "/help", "/refactor", "/security-check"]
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, SystemMessage
async def main():
# Use a custom command
async for message in query(
prompt="/refactor src/auth/login.py", options=ClaudeAgentOptions(max_turns=3)
):
if isinstance(message, AssistantMessage):
for block in message.content:
if hasattr(block, "text"):
print("Refactoring suggestions:", block.text)
# Custom commands appear in the slash_commands list
async for message in query(prompt="Hello", options=ClaudeAgentOptions(max_turns=1)):
if isinstance(message, SystemMessage) and message.subtype == "init":
# Will include both built-in and custom commands
print("Available commands:", message.data["slash_commands"])
# Example: ["/compact", "/clear", "/help", "/refactor", "/security-check"]
asyncio.run(main())
Custom commands support dynamic arguments using placeholders:
Create .claude/commands/fix-issue.md:
---
argument-hint: [issue-number] [priority]
description: Fix a GitHub issue
---
Fix issue #$1 with priority $2.
Check the issue description and implement the necessary changes.
Use in SDK:
import { query } from "@anthropic-ai/claude-agent-sdk";
// Pass arguments to custom command
for await (const message of query({
prompt: "/fix-issue 123 high",
options: { maxTurns: 5 }
})) {
// Command will process with $1="123" and $2="high"
if (message.type === "result") {
console.log("Issue fixed:", message.result);
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
async def main():
# Pass arguments to custom command
async for message in query(prompt="/fix-issue 123 high", options=ClaudeAgentOptions(max_turns=5)):
# Command will process with $1="123" and $2="high"
if isinstance(message, ResultMessage):
print("Issue fixed:", message.result)
asyncio.run(main())
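To make the substitution concrete, the sketch below mimics how the placeholders could be expanded. `expand_placeholders` is a hypothetical helper written for illustration, not an SDK function: the real expansion happens inside Claude Code and may differ in edge cases such as quoting. The sketch assumes arguments are split on whitespace and that a missing position expands to an empty string.

```python
import re


def expand_placeholders(template: str, argument_string: str) -> str:
    """Illustrative only: substitute $ARGUMENTS and $1, $2, ... in a command body."""
    args = argument_string.split()

    def positional(match: re.Match) -> str:
        index = int(match.group(1)) - 1
        # Assumption of this sketch: a missing position becomes an empty string.
        return args[index] if 0 <= index < len(args) else ""

    # $ARGUMENTS stands for the whole argument string.
    expanded = template.replace("$ARGUMENTS", argument_string)
    # $1, $2, ... stand for individual whitespace-separated arguments.
    return re.sub(r"\$(\d+)", positional, expanded)


body = "Fix issue #$1 with priority $2."
print(expand_placeholders(body, "123 high"))  # Fix issue #123 with priority high.
```

A helper like this can be handy for previewing what prompt text a command file would produce before sending the command through the SDK.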
Custom commands can execute bash commands and include their output:
Create .claude/commands/git-commit.md:
---
allowed-tools: Bash(git add:*), Bash(git status:*), Bash(git commit:*)
description: Create a git commit
---
## Context
- Current status: !`git status`
- Current diff: !`git diff HEAD`
## Task
Create a git commit with appropriate message based on the changes.
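Because these embedded commands run when the custom command is invoked, it can be useful to see which ones a command file contains before granting tools in allowed-tools. The following is a minimal sketch: `embedded_commands` is a hypothetical helper, and the regular expression only covers the !`command` form shown above.

```python
import re

# Matches the !`command` form used in the command file above.
INLINE_COMMAND = re.compile(r"!`([^`]+)`")


def embedded_commands(command_body: str) -> list[str]:
    """Return the shell commands embedded in a command file body, in order."""
    return INLINE_COMMAND.findall(command_body)


body = """## Context
- Current status: !`git status`
- Current diff: !`git diff HEAD`
"""
print(embedded_commands(body))  # ['git status', 'git diff HEAD']
```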
Include file contents using the @ prefix:
Create .claude/commands/review-config.md:
---
description: Review configuration files
---
Review the following configuration files for issues:
- Package config: @package.json
- TypeScript config: @tsconfig.json
- Environment config: @.env
Check for security issues, outdated dependencies, and misconfigurations.
Organize commands in subdirectories for better structure:
.claude/commands/
├── frontend/
│ ├── component.md # Creates /component (project:frontend)
│ └── style-check.md # Creates /style-check (project:frontend)
├── backend/
│ ├── api-test.md # Creates /api-test (project:backend)
│ └── db-migrate.md # Creates /db-migrate (project:backend)
└── review.md # Creates /review (project)
The subdirectory appears in the command description but doesn't affect the command name itself.
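The mapping in the tree above can be sketched as a small function. `describe_command` is a hypothetical helper for illustration, not part of the SDK; joining nested subdirectories with ":" is an assumption, since the example only shows one level of nesting.

```python
from pathlib import PurePosixPath


def describe_command(relative_path: str, scope: str = "project") -> tuple[str, str]:
    """Map a path under .claude/commands/ to (command name, scope label)."""
    path = PurePosixPath(relative_path)
    # The command name comes from the filename only, without the .md extension.
    name = "/" + path.stem
    # The subdirectory only shows up in the label, never in the name.
    namespace = ":".join(path.parent.parts)
    label = f"{scope}:{namespace}" if namespace else scope
    return name, label


print(describe_command("frontend/component.md"))  # ('/component', 'project:frontend')
print(describe_command("review.md"))              # ('/review', 'project')
```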
Create .claude/commands/code-review.md:
---
allowed-tools: Read, Grep, Glob, Bash(git diff:*)
description: Comprehensive code review
---
## Changed Files
!`git diff --name-only HEAD~1`
## Detailed Changes
!`git diff HEAD~1`
## Review Checklist
Review the above changes for:
1. Code quality and readability
2. Security vulnerabilities
3. Performance implications
4. Test coverage
5. Documentation completeness
Provide specific, actionable feedback organized by priority.
Create .claude/commands/test.md:
---
allowed-tools: Bash, Read, Edit
argument-hint: [test-pattern]
description: Run tests with optional pattern
---
Run tests matching pattern: $ARGUMENTS
1. Detect the test framework (Jest, pytest, etc.)
2. Run tests with the provided pattern
3. If tests fail, analyze and fix them
4. Re-run to verify fixes
Use these commands through the SDK:
import { query } from "@anthropic-ai/claude-agent-sdk";
// Run code review
for await (const message of query({
prompt: "/code-review",
options: { maxTurns: 3 }
})) {
// Process review feedback
}
// Run specific tests
for await (const message of query({
prompt: "/test auth",
options: { maxTurns: 5 }
})) {
// Handle test results
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions
async def main():
# Run code review
async for message in query(prompt="/code-review", options=ClaudeAgentOptions(max_turns=3)):
# Process review feedback
pass
# Run specific tests
async for message in query(prompt="/test auth", options=ClaudeAgentOptions(max_turns=5)):
# Handle test results
pass
asyncio.run(main())
Extend Claude with specialized capabilities using Agent Skills in the Claude Agent SDK
Agent Skills extend Claude with specialized capabilities that Claude autonomously invokes when relevant. Skills are packaged as SKILL.md files containing instructions, descriptions, and optional supporting resources.
For comprehensive information about Skills, including benefits, architecture, and authoring guidelines, see the Agent Skills overview.
When using the Claude Agent SDK, Skills are:
- Created as SKILL.md files in specific directories (.claude/skills/)
- Loaded only when you set settingSources (TypeScript) or setting_sources (Python) to load Skills from the filesystem
- Enabled by adding "Skill" to your allowed_tools
Unlike subagents (which can be defined programmatically), Skills must be created as filesystem artifacts. The SDK does not provide a programmatic API for registering Skills.
Note
Default behavior: By default, the SDK does not load any filesystem settings. To use Skills, you must explicitly configure settingSources: ['user', 'project'] (TypeScript) or setting_sources=["user", "project"] (Python) in your options.
To use Skills with the SDK, you need to:
- Include "Skill" in your allowed_tools configuration
- Configure settingSources/setting_sources to load Skills from the filesystem
Once configured, Claude automatically discovers Skills from the specified directories and invokes them when relevant to the user's request.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions
async def main():
options = ClaudeAgentOptions(
cwd="/path/to/project", # Project with .claude/skills/
setting_sources=["user", "project"], # Load Skills from filesystem
allowed_tools=["Skill", "Read", "Write", "Bash"], # Enable Skill tool
)
async for message in query(
prompt="Help me process this PDF document", options=options
):
print(message)
asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Help me process this PDF document",
options: {
cwd: "/path/to/project", // Project with .claude/skills/
settingSources: ["user", "project"], // Load Skills from filesystem
allowedTools: ["Skill", "Read", "Write", "Bash"] // Enable Skill tool
}
})) {
console.log(message);
}
Skills are loaded from filesystem directories based on your settingSources/setting_sources configuration:
- Project Skills (.claude/skills/): Shared with your team via git - loaded when setting_sources includes "project"
- User Skills (~/.claude/skills/): Personal Skills across all projects - loaded when setting_sources includes "user"
Skills are defined as directories containing a SKILL.md file with YAML frontmatter and Markdown content. The description field determines when Claude invokes your Skill.
Example directory structure:
.claude/skills/processing-pdfs/
└── SKILL.md
For complete guidance on creating Skills, including SKILL.md structure, multi-file Skills, and examples, see the Agent Skills overview.
Note
The allowed-tools frontmatter field in SKILL.md is only supported when using Claude Code CLI directly. It does not apply when using Skills through the SDK.
When using the SDK, control tool access through the main allowedTools option in your query configuration.
To control tool access for Skills in SDK applications, use allowedTools to pre-approve specific tools. Without a canUseTool callback, anything not in the list is denied:
Note
Import statements from the first example are assumed in the following code snippets.
options = ClaudeAgentOptions(
setting_sources=["user", "project"], # Load Skills from filesystem
allowed_tools=["Skill", "Read", "Grep", "Glob"],
)
async for message in query(prompt="Analyze the codebase structure", options=options):
print(message)
for await (const message of query({
prompt: "Analyze the codebase structure",
options: {
settingSources: ["user", "project"], // Load Skills from filesystem
allowedTools: ["Skill", "Read", "Grep", "Glob"],
permissionMode: "dontAsk" // Deny anything not in allowedTools
}
})) {
console.log(message);
}
To see which Skills are available in your SDK application, simply ask Claude:
options = ClaudeAgentOptions(
setting_sources=["user", "project"], # Load Skills from filesystem
allowed_tools=["Skill"],
)
async for message in query(prompt="What Skills are available?", options=options):
print(message)
for await (const message of query({
prompt: "What Skills are available?",
options: {
settingSources: ["user", "project"], // Load Skills from filesystem
allowedTools: ["Skill"]
}
})) {
console.log(message);
}
Claude will list the available Skills based on your current working directory and installed plugins.
Test Skills by asking questions that match their descriptions:
options = ClaudeAgentOptions(
cwd="/path/to/project",
setting_sources=["user", "project"], # Load Skills from filesystem
allowed_tools=["Skill", "Read", "Bash"],
)
async for message in query(prompt="Extract text from invoice.pdf", options=options):
print(message)
for await (const message of query({
prompt: "Extract text from invoice.pdf",
options: {
cwd: "/path/to/project",
settingSources: ["user", "project"], // Load Skills from filesystem
allowedTools: ["Skill", "Read", "Bash"]
}
})) {
console.log(message);
}
Claude automatically invokes the relevant Skill if the description matches your request.
Check settingSources configuration: Skills are only loaded when you explicitly configure settingSources/setting_sources. This is the most common issue:
# Wrong - Skills won't be loaded
options = ClaudeAgentOptions(allowed_tools=["Skill"])
# Correct - Skills will be loaded
options = ClaudeAgentOptions(
setting_sources=["user", "project"], # Required to load Skills
allowed_tools=["Skill"],
)
// Wrong - Skills won't be loaded
const options = {
allowedTools: ["Skill"]
};
// Correct - Skills will be loaded
const options = {
settingSources: ["user", "project"], // Required to load Skills
allowedTools: ["Skill"]
};
For more details on settingSources/setting_sources, see the TypeScript SDK reference or Python SDK reference.
Check working directory: The SDK loads Skills relative to the cwd option. Ensure it points to a directory containing .claude/skills/:
# Ensure your cwd points to the directory containing .claude/skills/
options = ClaudeAgentOptions(
cwd="/path/to/project", # Must contain .claude/skills/
setting_sources=["user", "project"], # Required to load Skills
allowed_tools=["Skill"],
)
// Ensure your cwd points to the directory containing .claude/skills/
const options = {
cwd: "/path/to/project", // Must contain .claude/skills/
settingSources: ["user", "project"], // Required to load Skills
allowedTools: ["Skill"]
};
See the "Using Skills with the SDK" section above for the complete pattern.
Verify filesystem location:
# Check project Skills
ls .claude/skills/*/SKILL.md
# Check personal Skills
ls ~/.claude/skills/*/SKILL.md
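The same check can be run from Python, for example at agent startup. This is a minimal sketch: `find_skill_files` is a hypothetical helper that only mirrors the two ls commands above and says nothing about whether the SDK will actually load what it finds.

```python
from pathlib import Path


def find_skill_files(base: Path) -> list[Path]:
    """Return every <base>/.claude/skills/<name>/SKILL.md, sorted by path."""
    # glob() yields nothing if the skills directory does not exist.
    return sorted((base / ".claude" / "skills").glob("*/SKILL.md"))


if __name__ == "__main__":
    # Project Skills live under the working directory, personal Skills under the home directory.
    for label, base in [("project", Path.cwd()), ("personal", Path.home())]:
        found = find_skill_files(base)
        print(f"{label}: {[p.parent.name for p in found] or 'no Skills found'}")
```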
Check the Skill tool is enabled: Confirm "Skill" is in your allowedTools.
Check the description: Ensure it's specific and includes relevant keywords. See Agent Skills Best Practices for guidance on writing effective descriptions.
For general Skills troubleshooting (YAML syntax, debugging, etc.), see the Claude Code Skills troubleshooting section.
Load custom plugins to extend Claude Code with commands, agents, skills, and hooks through the Agent SDK
Plugins allow you to extend Claude Code with custom functionality that can be shared across projects. Through the Agent SDK, you can programmatically load plugins from local directories to add custom slash commands, agents, skills, hooks, and MCP servers to your agent sessions.
Plugins are packages of Claude Code extensions that can include:
- Skills (invoked autonomously by Claude or via /skill-name)
- Agents
- Hooks
- MCP servers
Note
The commands/ directory is a legacy format. Use skills/ for new plugins. Claude Code continues to support both formats for backward compatibility.
For complete information on plugin structure and how to create plugins, see Plugins.
Load plugins by providing their local file system paths in your options configuration. The SDK supports loading multiple plugins from different locations.
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Hello",
options: {
plugins: [
{ type: "local", path: "./my-plugin" },
{ type: "local", path: "/absolute/path/to/another-plugin" }
]
}
})) {
// Plugin commands, agents, and other features are now available
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions
async def main():
    # Pass plugins through ClaudeAgentOptions, as in the full example later in this section
    options = ClaudeAgentOptions(
        plugins=[
            {"type": "local", "path": "./my-plugin"},
            {"type": "local", "path": "/absolute/path/to/another-plugin"},
        ]
    )
    async for message in query(prompt="Hello", options=options):
        # Plugin commands, agents, and other features are now available
        pass
asyncio.run(main())
Plugin paths can be:
- Relative paths (for example, "./plugins/my-plugin")
- Absolute paths (for example, "/home/user/plugins/my-plugin")
Note
The path should point to the plugin's root directory (the directory containing .claude-plugin/plugin.json).
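Before passing a path to the SDK, you can check that it really is a plugin root. The sketch below is an illustration under stated assumptions: `plugin_manifest_path` and `is_plugin_root` are hypothetical helpers, and resolving relative paths against `base_dir` (defaulting to the current directory) is this sketch's choice rather than documented SDK behavior.

```python
from pathlib import Path


def plugin_manifest_path(plugin_root: str, base_dir: str = ".") -> Path:
    """Return where the manifest is expected for a given plugin root."""
    root = Path(plugin_root).expanduser()
    if not root.is_absolute():
        # Assumption: relative plugin paths are interpreted relative to base_dir.
        root = Path(base_dir) / root
    return root / ".claude-plugin" / "plugin.json"


def is_plugin_root(plugin_root: str, base_dir: str = ".") -> bool:
    """True if the directory contains .claude-plugin/plugin.json."""
    return plugin_manifest_path(plugin_root, base_dir).is_file()


print(is_plugin_root("./my-plugin"))
```

Running a check like this first gives a clearer error than discovering a missing plugin from the init message.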
When plugins load successfully, they appear in the system initialization message. You can verify that your plugins are available:
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
prompt: "Hello",
options: {
plugins: [{ type: "local", path: "./my-plugin" }]
}
})) {
if (message.type === "system" && message.subtype === "init") {
// Check loaded plugins
console.log("Plugins:", message.plugins);
// Example: [{ name: "my-plugin", path: "./my-plugin" }]
// Check available commands from plugins
console.log("Commands:", message.slash_commands);
// Example: ["/help", "/compact", "my-plugin:custom-command"]
}
}
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage
async def main():
    options = ClaudeAgentOptions(plugins=[{"type": "local", "path": "./my-plugin"}])
    async for message in query(prompt="Hello", options=options):
        # Python messages are typed objects, so check the class rather than a type field
        if isinstance(message, SystemMessage) and message.subtype == "init":
            # Check loaded plugins
            print("Plugins:", message.data.get("plugins"))
            # Example: [{"name": "my-plugin", "path": "./my-plugin"}]
            # Check available commands from plugins
            print("Commands:", message.data.get("slash_commands"))
            # Example: ["/help", "/compact", "my-plugin:custom-command"]
asyncio.run(main())
Skills from plugins are automatically namespaced with the plugin name to avoid conflicts. When invoked as slash commands, the format is plugin-name:skill-name.
import { query } from "@anthropic-ai/claude-agent-sdk";
// Load a plugin with a custom /greet skill
for await (const message of query({
prompt: "/my-plugin:greet", // Use plugin skill with namespace
options: {
plugins: [{ type: "local", path: "./my-plugin" }]
}
})) {
// Claude executes the custom greeting skill from the plugin
if (message.type === "assistant") {
console.log(message.message.content);
}
}
import asyncio
from claude_agent_sdk import query, AssistantMessage, ClaudeAgentOptions, TextBlock
async def main():
    # Load a plugin with a custom /greet skill
    options = ClaudeAgentOptions(
        plugins=[{"type": "local", "path": "./plugins/demo-plugin"}]
    )
    async for message in query(
        prompt="/demo-plugin:greet",  # Use plugin skill with namespace
        options=options,
    ):
        # Claude executes the custom greeting skill from the plugin
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if isinstance(block, TextBlock):
                    print(f"Claude: {block.text}")
asyncio.run(main())
Note
If you installed a plugin via the CLI (for example, /plugin install my-plugin@marketplace), you can still use it in the SDK by providing its installation path. Check ~/.claude/plugins/ for CLI-installed plugins.
Here's a full example demonstrating plugin loading and usage:
import { query } from "@anthropic-ai/claude-agent-sdk";
import * as path from "path";
async function runWithPlugin() {
const pluginPath = path.join(__dirname, "plugins", "my-plugin");
console.log("Loading plugin from:", pluginPath);
for await (const message of query({
prompt: "What custom commands do you have available?",
options: {
plugins: [{ type: "local", path: pluginPath }],
maxTurns: 3
}
})) {
if (message.type === "system" && message.subtype === "init") {
console.log("Loaded plugins:", message.plugins);
console.log("Available commands:", message.slash_commands);
}
if (message.type === "assistant") {
console.log("Assistant:", message.message.content);
}
}
}
runWithPlugin().catch(console.error);
#!/usr/bin/env python3
"""Example demonstrating how to use plugins with the Agent SDK."""
from pathlib import Path
import anyio
from claude_agent_sdk import (
    AssistantMessage,
    ClaudeAgentOptions,
    SystemMessage,
    TextBlock,
    query,
)
async def run_with_plugin():
    """Example using a custom plugin."""
    plugin_path = Path(__file__).parent / "plugins" / "demo-plugin"
    print(f"Loading plugin from: {plugin_path}")
    options = ClaudeAgentOptions(
        plugins=[{"type": "local", "path": str(plugin_path)}],
        max_turns=3,
    )
    async for message in query(
        prompt="What custom commands do you have available?", options=options
    ):
        # The init system message reports loaded plugins and available commands
        if isinstance(message, SystemMessage) and message.subtype == "init":
            print(f"Loaded plugins: {message.data.get('plugins')}")
            print(f"Available commands: {message.data.get('slash_commands')}")
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if isinstance(block, TextBlock):
                    print(f"Assistant: {block.text}")
if __name__ == "__main__":
    anyio.run(run_with_plugin)
A plugin directory must contain a .claude-plugin/plugin.json manifest file. It can optionally include:
my-plugin/
├── .claude-plugin/
│ └── plugin.json # Required: plugin manifest
├── skills/ # Agent Skills (invoked autonomously or via /skill-name)
│ └── my-skill/
│ └── SKILL.md
├── commands/ # Legacy: use skills/ instead
│ └── custom-cmd.md
├── agents/ # Custom agents
│ └── specialist.md
├── hooks/ # Event handlers
│ └── hooks.json
└── .mcp.json # MCP server definitions
For detailed information on creating plugins, see Plugins.
Load plugins during development without installing them globally:
plugins: [{ type: "local", path: "./dev-plugins/my-plugin" }];
Include plugins in your project repository for team-wide consistency:
plugins: [{ type: "local", path: "./project-plugins/team-workflows" }];
Combine plugins from different locations:
plugins: [
{ type: "local", path: "./local-plugin" },
{ type: "local", path: "~/.claude/custom-plugins/shared-plugin" }
];
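If your plugin doesn't appear in the init message:
- Check that the path points to the plugin's root directory (the directory containing .claude-plugin/)
If plugin skills don't work:
- Use the plugin-name:skill-name format when invoked as slash commands
- Check that the skill appears in slash_commands with the correct namespace
- Ensure each skill has a SKILL.md file in its own subdirectory under skills/ (for example, skills/my-skill/SKILL.md)
If relative paths don't work, try an absolute path to the plugin's root directory.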
Control how your agent uses tools with permission modes, hooks, and declarative allow/deny rules.
The Claude Agent SDK provides permission controls to manage how Claude uses tools. Use permission modes and rules to define what's allowed automatically, and the canUseTool callback to handle everything else at runtime.
Note
This page covers permission modes and rules. To build interactive approval flows where users approve or deny tool requests at runtime, see Handle approvals and user input.
When Claude requests a tool, the SDK checks permissions in this order:
1. Run hooks first, which can allow, deny, or continue to the next step.
2. Check deny rules (from disallowed_tools and settings.json). If a deny rule matches, the tool is blocked, even in bypassPermissions mode.
3. Apply the active permission mode. bypassPermissions approves everything that reaches this step. acceptEdits approves file operations. Other modes fall through.
4. Check allow rules (from allowed_tools and settings.json). If a rule matches, the tool is approved.
5. If not resolved by any of the above, call your canUseTool callback for a decision. In dontAsk mode, this step is skipped and the tool is denied.
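The order above can be modeled as a small pure function, which makes it easier to reason about how modes and rules interact. This is a simplified sketch, not the SDK's implementation: `evaluate_permission` and its parameters are hypothetical, rules are matched by exact tool name rather than the full rule syntax, `file_tools` is an assumed stand-in for "file operations", and the plan and auto modes are not modeled.

```python
from typing import Callable, Optional

Decision = str  # "allow" or "deny"


def evaluate_permission(
    tool: str,
    *,
    mode: str = "default",
    allowed_tools: frozenset[str] = frozenset(),
    disallowed_tools: frozenset[str] = frozenset(),
    hook: Optional[Callable[[str], Optional[Decision]]] = None,
    can_use_tool: Optional[Callable[[str], Decision]] = None,
    file_tools: frozenset[str] = frozenset({"Edit", "Write"}),  # assumed set
) -> Decision:
    # 1. Hooks run first; returning None means "continue to the next step".
    if hook is not None:
        decision = hook(tool)
        if decision is not None:
            return decision
    # 2. Deny rules block the tool in every mode, including bypassPermissions.
    if tool in disallowed_tools:
        return "deny"
    # 3. The permission mode may approve the call outright.
    if mode == "bypassPermissions":
        return "allow"
    if mode == "acceptEdits" and tool in file_tools:
        return "allow"
    # 4. Allow rules pre-approve listed tools.
    if tool in allowed_tools:
        return "allow"
    # 5. Otherwise ask the callback; dontAsk (or no callback) means deny.
    if mode == "dontAsk" or can_use_tool is None:
        return "deny"
    return can_use_tool(tool)


# allowed_tools does not constrain bypassPermissions:
print(evaluate_permission("Bash", mode="bypassPermissions", allowed_tools=frozenset({"Read"})))  # allow
# ...but a deny rule still wins:
print(evaluate_permission("Bash", mode="bypassPermissions", disallowed_tools=frozenset({"Bash"})))  # deny
# dontAsk denies anything that is not pre-approved:
print(evaluate_permission("Bash", mode="dontAsk", allowed_tools=frozenset({"Read"})))  # deny
```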
This page focuses on allow and deny rules and permission modes. For the other steps, see Hooks and Handle approvals and user input.
allowed_tools and disallowed_tools (TypeScript: allowedTools / disallowedTools) add entries to the allow and deny rule lists in the evaluation flow above. They control whether a tool call is approved, not whether the tool is available to Claude.
| Option | Effect |
|---|---|
| allowed_tools=["Read", "Grep"] | Read and Grep are auto-approved. Tools not listed here still exist and fall through to the permission mode and canUseTool. |
| disallowed_tools=["Bash"] | Bash is always denied. Deny rules are checked before the permission mode and hold in every mode, including bypassPermissions. |
For a locked-down agent, pair allowedTools with permissionMode: "dontAsk". Listed tools are approved; anything else is denied outright instead of prompting:
const options = {
allowedTools: ["Read", "Glob", "Grep"],
permissionMode: "dontAsk"
};
Warning
allowed_tools does not constrain bypassPermissions. allowed_tools only pre-approves the tools you list. Unlisted tools are not matched by any allow rule and fall through to the permission mode, where bypassPermissions approves them. Setting allowed_tools=["Read"] alongside permission_mode="bypassPermissions" still approves every tool, including Bash, Write, and Edit. If you need bypassPermissions but want specific tools blocked, use disallowed_tools.
You can also configure allow, deny, and ask rules declaratively in .claude/settings.json. The SDK does not load filesystem settings by default, so you must set setting_sources=["project"] (TypeScript: settingSources: ["project"]) in your options for these rules to apply. See Permission settings for the rule syntax.
Permission modes provide global control over how Claude uses tools. You can set the permission mode when calling query or change it dynamically during streaming sessions.
The SDK supports these permission modes:
| Mode | Description | Tool behavior |
|---|---|---|
| default | Standard permission behavior | No auto-approvals; unmatched tools trigger your canUseTool callback |
| dontAsk | Deny instead of prompting | Anything not pre-approved by allowed_tools or rules is denied; canUseTool is never called |
| acceptEdits | Auto-accept file edits | File edits and filesystem operations (mkdir, rm, mv, etc.) are automatically approved |
| bypassPermissions | Bypass all permission checks | All tools run without permission prompts (use with caution) |
| plan | Planning mode | No tool execution; Claude plans without making changes |
| auto (TypeScript only) | Model-classified approvals | A model classifier approves or denies each tool call. See Auto mode for availability |
Warning
Subagent inheritance: When using bypassPermissions, all subagents inherit this mode and it cannot be overridden. Subagents may have different system prompts and less constrained behavior than your main agent. Enabling bypassPermissions grants them full, autonomous system access without any approval prompts.
You can set the permission mode once when starting a query, or change it dynamically while the session is active.
Pass permission_mode (Python) or permissionMode (TypeScript) when creating a query. This mode applies for the entire session unless changed dynamically.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions
async def main():
    async for message in query(
        prompt="Help me refactor this code",
        options=ClaudeAgentOptions(
            permission_mode="default",  # Set the mode here
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)
asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
async function main() {
for await (const message of query({
prompt: "Help me refactor this code",
options: {
permissionMode: "default" // Set the mode here
}
})) {
if ("result" in message) {
console.log(message.result);
}
}
}
main();
Call set_permission_mode (Python) or setPermissionMode (TypeScript) to change the mode mid-session. The new mode takes effect immediately for all subsequent tool requests. This lets you start restrictive and loosen permissions as trust builds, for example switching to acceptEdits after reviewing Claude's initial approach.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions
async def main():
    q = query(
        prompt="Help me refactor this code",
        options=ClaudeAgentOptions(
            permission_mode="default",  # Start in default mode
        ),
    )
    # Change mode dynamically mid-session
    await q.set_permission_mode("acceptEdits")
    # Process messages with the new permission mode
    async for message in q:
        if hasattr(message, "result"):
            print(message.result)
asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
async function main() {
const q = query({
prompt: "Help me refactor this code",
options: {
permissionMode: "default" // Start in default mode
}
});
// Change mode dynamically mid-session
await q.setPermissionMode("acceptEdits");
// Process messages with the new permission mode
for await (const message of q) {
if ("result" in message) {
console.log(message.result);
}
}
}
main();
Accept edits mode (acceptEdits)
Auto-approves file operations so Claude can edit code without prompting. Other tools (like Bash commands that aren't filesystem operations) still require normal permissions.
Auto-approved operations:
- File edits
- Filesystem commands: mkdir, touch, rm, rmdir, mv, cp, sed
Both apply only to paths inside the working directory or additionalDirectories. Paths outside that scope and writes to protected paths still prompt.
Use when: you trust Claude's edits and want faster iteration, such as during prototyping or when working in an isolated directory.
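The scope rule for acceptEdits (inside the working directory or an additional directory) can be pictured with a short path check. This is only an illustration: `in_edit_scope` is a hypothetical helper, it does not model protected paths, and the SDK's own check may treat symlinks and edge cases differently.

```python
from pathlib import Path
from typing import Iterable


def in_edit_scope(target: str, cwd: str, additional_directories: Iterable[str] = ()) -> bool:
    """True if target resolves inside cwd or one of the additional directories."""
    # Relative targets are interpreted against cwd; ".." segments are normalized.
    resolved = (Path(cwd) / target).resolve()
    roots = [Path(cwd).resolve(), *(Path(d).resolve() for d in additional_directories)]
    return any(resolved == root or root in resolved.parents for root in roots)


print(in_edit_scope("src/app.py", "/work/project"))              # True
print(in_edit_scope("../other/file.py", "/work/project"))        # False
print(in_edit_scope("/data/x.csv", "/work/project", ["/data"]))  # True
```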
Don't ask mode (dontAsk)
Converts any permission prompt into a denial. Tools pre-approved by allowed_tools, settings.json allow rules, or a hook run as normal. Everything else is denied without calling canUseTool.
Use when: you want a fixed, explicit tool surface for a headless agent and prefer a hard deny over silent reliance on canUseTool being absent.
Bypass permissions mode (bypassPermissions)
Auto-approves all tool uses without prompts. Hooks still execute and can block operations if needed.
Warning
Use with extreme caution. Claude has full system access in this mode. Only use in controlled environments where you trust all possible operations.
allowed_tools does not constrain this mode. Every tool is approved, not just the ones you listed. Deny rules (disallowed_tools), explicit ask rules, and hooks are evaluated before the mode check and can still block a tool.
Plan mode (plan)
Prevents tool execution entirely. Claude can analyze code and create plans but cannot make changes. Claude may use AskUserQuestion to clarify requirements before finalizing the plan. See Handle approvals and user input for handling these prompts.
Use when: you want Claude to propose changes without executing them, such as during code review or when you need to approve changes before they're made.
For the other steps in the permission evaluation flow, see Hooks, Handle approvals and user input, and the permission rules in settings.json.
Intercept and customize agent behavior at key execution points with hooks
Hooks are callback functions that run your code in response to agent events, like a tool being called, a session starting, or execution stopping. With hooks, you can:
This guide covers how hooks work, how to configure them, and provides examples for common patterns like blocking tools, modifying inputs, and forwarding notifications.
1. Something happens during agent execution and the SDK fires an event: a tool is about to be called (PreToolUse), a tool returned a result (PostToolUse), a subagent started or stopped, the agent is idle, or execution finished. See the full list of events.
2. The SDK checks for hooks registered for that event type. This includes callback hooks you pass in options.hooks and shell command hooks from settings files, but only if you explicitly load them with settingSources or setting_sources.
3. If a hook has a matcher pattern (like "Write|Edit"), the SDK tests it against the event's target (for example, the tool name). Hooks without a matcher run for every event of that type.
4. Each matching hook's callback function receives input about what's happening: the tool name, its arguments, the session ID, and other event-specific details.
5. After performing any operations (logging, API calls, validation), your callback returns an output object that tells the agent what to do: allow the operation, block it, modify the input, or inject context into the conversation.
The following example puts these steps together. It registers a PreToolUse hook (step 1) with a "Write|Edit" matcher (step 3) so the callback only fires for file-writing tools. When triggered, the callback receives the tool's input (step 4), checks if the file path targets a .env file, and returns permissionDecision: "deny" to block the operation (step 5):
import asyncio

from claude_agent_sdk import (
    AssistantMessage,
    ClaudeSDKClient,
    ClaudeAgentOptions,
    HookMatcher,
    ResultMessage,
)


# Define a hook callback that receives tool call details
async def protect_env_files(input_data, tool_use_id, context):
    # Extract the file path from the tool's input arguments
    file_path = input_data["tool_input"].get("file_path", "")
    file_name = file_path.split("/")[-1]

    # Block the operation if targeting a .env file
    if file_name == ".env":
        return {
            "hookSpecificOutput": {
                "hookEventName": input_data["hook_event_name"],
                "permissionDecision": "deny",
                "permissionDecisionReason": "Cannot modify .env files",
            }
        }

    # Return empty object to allow the operation
    return {}


async def main():
    options = ClaudeAgentOptions(
        hooks={
            # Register the hook for PreToolUse events
            # The matcher filters to only Write and Edit tool calls
            "PreToolUse": [HookMatcher(matcher="Write|Edit", hooks=[protect_env_files])]
        }
    )

    async with ClaudeSDKClient(options=options) as client:
        await client.query("Update the database configuration")
        async for message in client.receive_response():
            # Filter for assistant and result messages
            if isinstance(message, (AssistantMessage, ResultMessage)):
                print(message)


asyncio.run(main())
import { query, HookCallback, PreToolUseHookInput } from "@anthropic-ai/claude-agent-sdk";

// Define a hook callback with the HookCallback type
const protectEnvFiles: HookCallback = async (input, toolUseID, { signal }) => {
  // Cast input to the specific hook type for type safety
  const preInput = input as PreToolUseHookInput;
  // Cast tool_input to access its properties (typed as unknown in the SDK)
  const toolInput = preInput.tool_input as Record<string, unknown>;
  const filePath = toolInput?.file_path as string;
  const fileName = filePath?.split("/").pop();

  // Block the operation if targeting a .env file
  if (fileName === ".env") {
    return {
      hookSpecificOutput: {
        hookEventName: preInput.hook_event_name,
        permissionDecision: "deny",
        permissionDecisionReason: "Cannot modify .env files"
      }
    };
  }

  // Return empty object to allow the operation
  return {};
};

for await (const message of query({
  prompt: "Update the database configuration",
  options: {
    hooks: {
      // Register the hook for PreToolUse events
      // The matcher filters to only Write and Edit tool calls
      PreToolUse: [{ matcher: "Write|Edit", hooks: [protectEnvFiles] }]
    }
  }
})) {
  // Filter for assistant and result messages
  if (message.type === "assistant" || message.type === "result") {
    console.log(message);
  }
}
The SDK provides hooks for different stages of agent execution. Some hooks are available in both SDKs, while others are TypeScript-only.
| Hook Event | Python SDK | TypeScript SDK | What triggers it | Example use case |
|---|---|---|---|---|
| PreToolUse | Yes | Yes | Tool call request (can block or modify) | Block dangerous shell commands |
| PostToolUse | Yes | Yes | Tool execution result | Log all file changes to audit trail |
| PostToolUseFailure | Yes | Yes | Tool execution failure | Handle or log tool errors |
| UserPromptSubmit | Yes | Yes | User prompt submission | Inject additional context into prompts |
| Stop | Yes | Yes | Agent execution stop | Save session state before exit |
| SubagentStart | Yes | Yes | Subagent initialization | Track parallel task spawning |
| SubagentStop | Yes | Yes | Subagent completion | Aggregate results from parallel tasks |
| PreCompact | Yes | Yes | Conversation compaction request | Archive full transcript before summarizing |
| PermissionRequest | Yes | Yes | Permission dialog would be displayed | Custom permission handling |
| SessionStart | No | Yes | Session initialization | Initialize logging and telemetry |
| SessionEnd | No | Yes | Session termination | Clean up temporary resources |
| Notification | Yes | Yes | Agent status messages | Send agent status updates to Slack or PagerDuty |
| Setup | No | Yes | Session setup/maintenance | Run initialization tasks |
| TeammateIdle | No | Yes | Teammate becomes idle | Reassign work or notify |
| TaskCompleted | No | Yes | Background task completes | Aggregate results from parallel tasks |
| ConfigChange | No | Yes | Configuration file changes | Reload settings dynamically |
| WorktreeCreate | No | Yes | Git worktree created | Track isolated workspaces |
| WorktreeRemove | No | Yes | Git worktree removed | Clean up workspace resources |
To configure a hook, pass it in the hooks field of your agent options (ClaudeAgentOptions in Python, the options object in TypeScript):
options = ClaudeAgentOptions(
    hooks={"PreToolUse": [HookMatcher(matcher="Bash", hooks=[my_callback])]}
)

async with ClaudeSDKClient(options=options) as client:
    await client.query("Your prompt")
    async for message in client.receive_response():
        print(message)
for await (const message of query({
  prompt: "Your prompt",
  options: {
    hooks: {
      PreToolUse: [{ matcher: "Bash", hooks: [myCallback] }]
    }
  }
})) {
  console.log(message);
}
The hooks option is a dictionary (Python) or object (TypeScript) where:
- Keys are hook event names (for example, 'PreToolUse', 'PostToolUse', 'Stop').
- Values are lists of matchers, each with an optional matcher pattern and the callbacks to run.
Use matchers to filter when your callbacks fire. The matcher field is a regex string that matches against a different value depending on the hook event type. For example, tool-based hooks match against the tool name, while Notification hooks match against the notification type. See the Claude Code hooks reference for the full list of matcher values for each event type.
| Option | Type | Default | Description |
|---|---|---|---|
| matcher | string | undefined | Regex pattern matched against the event's filter field. For tool hooks, this is the tool name. Built-in tools include Bash, Read, Write, Edit, Glob, Grep, WebFetch, Agent, and others (see Tool Input Types for the full list). MCP tools use the pattern mcp__<server>__<action>. |
| hooks | HookCallback[] | - | Required. Array of callback functions to execute when the pattern matches |
| timeout | number | 60 | Timeout in seconds |
Use the matcher pattern to target specific tools whenever possible. A matcher with 'Bash' only runs for Bash commands, while omitting the pattern runs your callbacks for every occurrence of the event. Note that for tool-based hooks, matchers only filter by tool name, not by file paths or other arguments. To filter by file path, check tool_input.file_path inside your callback.
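To get a feel for which tool names a given matcher selects, the filtering step can be approximated outside the SDK. The sketch below is illustrative only: matcher_applies is a hypothetical helper, and it assumes the pattern is applied with re.search, which is an assumption rather than documented SDK behavior.

```python
import re
from typing import Optional


def matcher_applies(matcher: Optional[str], tool_name: str) -> bool:
    """Return True if a hook with this matcher would run for tool_name.

    Assumption: the pattern is applied with re.search. The SDK's exact
    matching semantics are not specified in this guide.
    """
    if matcher is None:
        # No matcher: the hook runs for every event of that type
        return True
    return re.search(matcher, tool_name) is not None


tools = ["Bash", "Write", "Edit", "Read", "mcp__playwright__browser_click"]
for pattern in ("Write|Edit", "^mcp__", None):
    selected = [name for name in tools if matcher_applies(pattern, name)]
    print(pattern, "->", selected)
```

Running a candidate pattern against the tool names you expect (and the ones you don't) is a cheap way to catch a matcher that is broader or narrower than intended before registering the hook.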
Tip
Discovering tool names: See Tool Input Types for the full list of built-in tool names, or add a hook without a matcher to log all tool calls your session makes.
MCP tool naming: MCP tools always start with mcp__ followed by the server name and action: mcp__<server>__<action>. For example, if you configure a server named playwright, its tools will be named mcp__playwright__browser_screenshot, mcp__playwright__browser_click, etc. The server name comes from the key you use in the mcpServers configuration.
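Inside a callback it can be handy to split an MCP tool name back into its server and action, for example to audit calls per server. A minimal sketch follows; parse_mcp_tool_name is a hypothetical helper, and it assumes the server name itself contains no double underscore.

```python
from typing import Optional, Tuple


def parse_mcp_tool_name(tool_name: str) -> Optional[Tuple[str, str]]:
    """Split 'mcp__<server>__<action>' into (server, action).

    Returns None for names that do not follow the MCP pattern.
    Assumption: the server name contains no double underscore.
    """
    prefix = "mcp__"
    if not tool_name.startswith(prefix):
        return None
    server, sep, action = tool_name[len(prefix):].partition("__")
    if not sep or not server or not action:
        return None
    return server, action


print(parse_mcp_tool_name("mcp__playwright__browser_click"))
print(parse_mcp_tool_name("Bash"))
```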
Every hook callback receives three arguments:
1. Input data: an object with the event details. Its shape depends on the hook type (for example, PreToolUseHookInput includes tool_name and tool_input, while NotificationHookInput includes message). See the full type definitions in the TypeScript and Python SDK references.
   - All hook inputs include session_id, cwd, and hook_event_name.
   - agent_id and agent_type are populated when the hook fires inside a subagent. In TypeScript, these are on the base hook input and available to all hook types. In Python, they are on PreToolUse, PostToolUse, and PostToolUseFailure only.
2. Tool use ID (str | None / string | undefined): correlates PreToolUse and PostToolUse events for the same tool call.
3. Context: in TypeScript, an object with a signal property (AbortSignal) for cancellation. In Python, this argument is reserved for future use.
Your callback returns an object with two categories of fields:
- Top-level fields control the conversation: systemMessage injects a message into the conversation visible to the model, and continue (continue_ in Python) determines whether the agent keeps running after this hook.
- hookSpecificOutput controls the current operation. The fields inside depend on the hook event type. For PreToolUse hooks, this is where you set permissionDecision ("allow", "deny", or "ask"), permissionDecisionReason, and updatedInput. For PostToolUse hooks, you can set additionalContext to append information to the tool result.
Return {} to allow the operation without changes. SDK callback hooks use the same JSON output format as Claude Code shell command hooks; the Claude Code hooks reference documents every field and event-specific option. For the SDK type definitions, see the TypeScript and Python SDK references.
Note
When multiple hooks or permission rules apply, deny takes priority over ask, which takes priority over allow. If any hook returns deny, the operation is blocked regardless of other hooks.
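The precedence rule can be pictured as taking the most restrictive decision among everything that voiced one. The following sketch models only that ordering; resolve_decision is a hypothetical helper, not an SDK function, and the real evaluation flow involves more steps than this.

```python
from typing import Iterable, Optional

# Higher number wins: deny > ask > allow
_PRIORITY = {"allow": 1, "ask": 2, "deny": 3}


def resolve_decision(decisions: Iterable[Optional[str]]) -> Optional[str]:
    """Combine permission decisions; None means a hook expressed no opinion."""
    voiced = [d for d in decisions if d is not None]
    if not voiced:
        return None
    return max(voiced, key=_PRIORITY.__getitem__)


print(resolve_decision(["allow", "deny", "ask"]))
print(resolve_decision(["allow", "ask"]))
```

The practical consequence is that an auto-approve hook cannot override a deny returned by another hook, so deny-style guards stay effective no matter how many permissive hooks are registered alongside them.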
By default, the agent waits for your hook to return before proceeding. If your hook performs a side effect (logging, sending a webhook) and doesn't need to influence the agent's behavior, you can return an async output instead. This tells the agent to continue immediately without waiting for the hook to finish:
async def async_hook(input_data, tool_use_id, context):
    # Start a background task, then return immediately
    asyncio.create_task(send_to_logging_service(input_data))
    return {"async_": True, "asyncTimeout": 30000}
const asyncHook: HookCallback = async (input, toolUseID, { signal }) => {
  // Start a background task, then return immediately
  sendToLoggingService(input).catch(console.error);
  return { async: true, asyncTimeout: 30000 };
};
| Field | Type | Description |
|---|---|---|
| async | true | Signals async mode. The agent proceeds without waiting. In Python, use async_ to avoid the reserved keyword. |
| asyncTimeout | number | Optional timeout in milliseconds for the background operation |
Note
Async outputs cannot block, modify, or inject context into the operation since the agent has already moved on. Use them only for side effects like logging, metrics, or notifications.
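One Python detail is worth keeping in mind when a hook starts background work with asyncio.create_task: the event loop holds only weak references to tasks, so the asyncio documentation recommends saving a reference until the task finishes. The sketch below shows one way to do that. send_to_logging_service is a hypothetical stand-in for a real logging client, and demo exists only to exercise the hook without the SDK.

```python
import asyncio

# Strong references keep fire-and-forget tasks alive until they finish
_background_tasks = set()
delivered = []


async def send_to_logging_service(payload):
    """Hypothetical stand-in for a real logging client."""
    await asyncio.sleep(0)
    delivered.append(payload["hook_event_name"])


async def async_logging_hook(input_data, tool_use_id, context):
    task = asyncio.create_task(send_to_logging_service(input_data))
    _background_tasks.add(task)
    # Drop the reference once the task completes
    task.add_done_callback(_background_tasks.discard)
    return {"async_": True, "asyncTimeout": 30000}


async def demo():
    output = await async_logging_hook({"hook_event_name": "PostToolUse"}, None, None)
    # Wait for outstanding background work before the event loop shuts down
    await asyncio.gather(*list(_background_tasks))
    return output
```

The hook itself still returns immediately; only the bookkeeping around the task changes.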
This example intercepts Write tool calls and rewrites the file_path argument to prepend /sandbox, redirecting all file writes to a sandboxed directory. The callback returns updatedInput with the modified path and permissionDecision: 'allow' to auto-approve the rewritten operation:
async def redirect_to_sandbox(input_data, tool_use_id, context):
    if input_data["hook_event_name"] != "PreToolUse":
        return {}

    if input_data["tool_name"] == "Write":
        original_path = input_data["tool_input"].get("file_path", "")
        return {
            "hookSpecificOutput": {
                "hookEventName": input_data["hook_event_name"],
                "permissionDecision": "allow",
                "updatedInput": {
                    **input_data["tool_input"],
                    "file_path": f"/sandbox{original_path}",
                },
            }
        }
    return {}
const redirectToSandbox: HookCallback = async (input, toolUseID, { signal }) => {
  if (input.hook_event_name !== "PreToolUse") return {};

  const preInput = input as PreToolUseHookInput;
  const toolInput = preInput.tool_input as Record<string, unknown>;
  if (preInput.tool_name === "Write") {
    const originalPath = toolInput.file_path as string;
    return {
      hookSpecificOutput: {
        hookEventName: preInput.hook_event_name,
        permissionDecision: "allow",
        updatedInput: {
          ...toolInput,
          file_path: `/sandbox${originalPath}`
        }
      }
    };
  }
  return {};
};
Note
When using updatedInput, you must also include permissionDecision: 'allow'. Always return a new object rather than mutating the original tool_input.
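Both requirements in the note can be folded into a small helper so individual hooks cannot forget them. This is a sketch under the output shape described in this guide; allow_with_updated_input is a hypothetical name, not part of the SDK.

```python
from typing import Any, Dict


def allow_with_updated_input(input_data: Dict[str, Any], **changes: Any) -> Dict[str, Any]:
    """Build a PreToolUse output that rewrites tool_input without mutating it."""
    # Shallow copy of the original arguments plus the overrides
    updated = {**input_data["tool_input"], **changes}
    return {
        "hookSpecificOutput": {
            "hookEventName": input_data["hook_event_name"],
            "permissionDecision": "allow",
            "updatedInput": updated,
        }
    }


event = {
    "hook_event_name": "PreToolUse",
    "tool_name": "Write",
    "tool_input": {"file_path": "/app/a.txt", "content": "hi"},
}
output = allow_with_updated_input(event, file_path="/sandbox/app/a.txt")
print(output["hookSpecificOutput"]["updatedInput"])
```

Because the helper copies tool_input before applying overrides, the original event stays untouched for any other hook that inspects it.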
This example blocks any attempt to write to the /etc directory and uses two output fields together: permissionDecision: 'deny' stops the tool call, while systemMessage injects a reminder into the conversation so the agent receives context about why the operation was blocked and avoids retrying it:
async def block_etc_writes(input_data, tool_use_id, context):
    file_path = input_data["tool_input"].get("file_path", "")

    if file_path.startswith("/etc"):
        return {
            # Top-level field: inject guidance into the conversation
            "systemMessage": "Remember: system directories like /etc are protected.",
            # hookSpecificOutput: block the operation
            "hookSpecificOutput": {
                "hookEventName": input_data["hook_event_name"],
                "permissionDecision": "deny",
                "permissionDecisionReason": "Writing to /etc is not allowed",
            },
        }
    return {}
const blockEtcWrites: HookCallback = async (input, toolUseID, { signal }) => {
  const preInput = input as PreToolUseHookInput;
  const toolInput = preInput.tool_input as Record<string, unknown>;
  const filePath = toolInput?.file_path as string;

  if (filePath?.startsWith("/etc")) {
    return {
      // Top-level field: inject guidance into the conversation
      systemMessage: "Remember: system directories like /etc are protected.",
      // hookSpecificOutput: block the operation
      hookSpecificOutput: {
        hookEventName: preInput.hook_event_name,
        permissionDecision: "deny",
        permissionDecisionReason: "Writing to /etc is not allowed"
      }
    };
  }
  return {};
};
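A plain prefix check keeps the example short, but it also matches sibling paths such as /etcetera and misses paths that reach /etc through .. segments. If that matters for your agent, a stricter check could look like the sketch below. is_inside is a hypothetical helper; it assumes absolute POSIX-style paths and is purely lexical, so it does not follow symlinks.

```python
import posixpath


def is_inside(path: str, directory: str) -> bool:
    """Return True if path resolves to directory or something beneath it.

    Purely lexical: '..' segments are collapsed, symlinks are not followed.
    """
    normalized = posixpath.normpath(path)
    base = posixpath.normpath(directory)
    return normalized == base or normalized.startswith(base + "/")


for candidate in ("/etc/passwd", "/etcetera/notes.txt", "/tmp/../etc/hosts"):
    print(candidate, is_inside(candidate, "/etc"))
```

A hook would call is_inside(file_path, "/etc") in place of the startswith test and return the same deny output.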
By default, the agent may prompt for permission before using certain tools. This example auto-approves read-only filesystem tools (Read, Glob, Grep) by returning permissionDecision: 'allow', letting them run without user confirmation while leaving all other tools subject to normal permission checks:
async def auto_approve_read_only(input_data, tool_use_id, context):
    if input_data["hook_event_name"] != "PreToolUse":
        return {}

    read_only_tools = ["Read", "Glob", "Grep"]
    if input_data["tool_name"] in read_only_tools:
        return {
            "hookSpecificOutput": {
                "hookEventName": input_data["hook_event_name"],
                "permissionDecision": "allow",
                "permissionDecisionReason": "Read-only tool auto-approved",
            }
        }
    return {}
const autoApproveReadOnly: HookCallback = async (input, toolUseID, { signal }) => {
  if (input.hook_event_name !== "PreToolUse") return {};

  const preInput = input as PreToolUseHookInput;
  const readOnlyTools = ["Read", "Glob", "Grep"];
  if (readOnlyTools.includes(preInput.tool_name)) {
    return {
      hookSpecificOutput: {
        hookEventName: preInput.hook_event_name,
        permissionDecision: "allow",
        permissionDecisionReason: "Read-only tool auto-approved"
      }
    };
  }
  return {};
};
Hooks execute in the order they appear in the array. Keep each hook focused on a single responsibility and chain multiple hooks for complex logic:
options = ClaudeAgentOptions(
    hooks={
        "PreToolUse": [
            HookMatcher(hooks=[rate_limiter]),  # First: check rate limits
            HookMatcher(hooks=[authorization_check]),  # Second: verify permissions
            HookMatcher(hooks=[input_sanitizer]),  # Third: sanitize inputs
            HookMatcher(hooks=[audit_logger]),  # Last: log the action
        ]
    }
)
const options = {
  hooks: {
    PreToolUse: [
      { hooks: [rateLimiter] }, // First: check rate limits
      { hooks: [authorizationCheck] }, // Second: verify permissions
      { hooks: [inputSanitizer] }, // Third: sanitize inputs
      { hooks: [auditLogger] } // Last: log the action
    ]
  }
};
Use regex patterns to match multiple tools. This example registers three matchers with different scopes: the first triggers file_security_hook only for file modification tools, the second triggers mcp_audit_hook for any MCP tool (tools whose names start with mcp__), and the third triggers global_logger for every tool call regardless of name:
options = ClaudeAgentOptions(
    hooks={
        "PreToolUse": [
            # Match file modification tools
            HookMatcher(matcher="Write|Edit|Delete", hooks=[file_security_hook]),
            # Match all MCP tools
            HookMatcher(matcher="^mcp__", hooks=[mcp_audit_hook]),
            # Match everything (no matcher)
            HookMatcher(hooks=[global_logger]),
        ]
    }
)
const options = {
  hooks: {
    PreToolUse: [
      // Match file modification tools
      { matcher: "Write|Edit|Delete", hooks: [fileSecurityHook] },
      // Match all MCP tools
      { matcher: "^mcp__", hooks: [mcpAuditHook] },
      // Match everything (no matcher)
      { hooks: [globalLogger] }
    ]
  }
};
Use SubagentStop hooks to monitor when subagents finish their work. See the full input type in the TypeScript and Python SDK references. This example logs a summary each time a subagent completes:
from claude_agent_sdk import ClaudeAgentOptions, HookMatcher


async def subagent_tracker(input_data, tool_use_id, context):
    # Log subagent details when it finishes
    print(f"[SUBAGENT] Completed: {input_data['agent_id']}")
    print(f"  Transcript: {input_data['agent_transcript_path']}")
    print(f"  Tool use ID: {tool_use_id}")
    print(f"  Stop hook active: {input_data.get('stop_hook_active')}")
    return {}


options = ClaudeAgentOptions(
    hooks={"SubagentStop": [HookMatcher(hooks=[subagent_tracker])]}
)
import { HookCallback, SubagentStopHookInput } from "@anthropic-ai/claude-agent-sdk";

const subagentTracker: HookCallback = async (input, toolUseID, { signal }) => {
  // Cast to SubagentStopHookInput to access subagent-specific fields
  const subInput = input as SubagentStopHookInput;
  // Log subagent details when it finishes
  console.log(`[SUBAGENT] Completed: ${subInput.agent_id}`);
  console.log(`  Transcript: ${subInput.agent_transcript_path}`);
  console.log(`  Tool use ID: ${toolUseID}`);
  console.log(`  Stop hook active: ${subInput.stop_hook_active}`);
  return {};
};

const options = {
  hooks: {
    SubagentStop: [{ hooks: [subagentTracker] }]
  }
};
Hooks can perform asynchronous operations like HTTP requests. Catch errors inside your hook instead of letting them propagate, since an unhandled exception can interrupt the agent.
This example sends a webhook after each tool completes, logging which tool ran and when. The hook catches errors so a failed webhook doesn't interrupt the agent:
import asyncio
import json
import urllib.request
from datetime import datetime


def _send_webhook(tool_name):
    """Synchronous helper that POSTs tool usage data to an external webhook."""
    data = json.dumps(
        {
            "tool": tool_name,
            "timestamp": datetime.now().isoformat(),
        }
    ).encode()
    req = urllib.request.Request(
        "https://api.example.com/webhook",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Close the response when done; the timeout keeps a stalled endpoint
    # from tying up the worker thread indefinitely
    with urllib.request.urlopen(req, timeout=10):
        pass


async def webhook_notifier(input_data, tool_use_id, context):
    # Only fire after a tool completes (PostToolUse), not before
    if input_data["hook_event_name"] != "PostToolUse":
        return {}

    try:
        # Run the blocking HTTP call in a thread to avoid blocking the event loop
        await asyncio.to_thread(_send_webhook, input_data["tool_name"])
    except Exception as e:
        # Log the error but don't raise. A failed webhook shouldn't stop the agent
        print(f"Webhook request failed: {e}")
    return {}
import { query, HookCallback, PostToolUseHookInput } from "@anthropic-ai/claude-agent-sdk";

const webhookNotifier: HookCallback = async (input, toolUseID, { signal }) => {
  // Only fire after a tool completes (PostToolUse), not before
  if (input.hook_event_name !== "PostToolUse") return {};

  try {
    await fetch("https://api.example.com/webhook", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        tool: (input as PostToolUseHookInput).tool_name,
        timestamp: new Date().toISOString()
      }),
      // Pass signal so the request cancels if the hook times out
      signal
    });
  } catch (error) {
    // Handle cancellation separately from other errors
    if (error instanceof Error && error.name === "AbortError") {
      console.log("Webhook request cancelled");
    }
    // Don't re-throw. A failed webhook shouldn't stop the agent
  }
  return {};
};

// Register as a PostToolUse hook
for await (const message of query({
  prompt: "Refactor the auth module",
  options: {
    hooks: {
      PostToolUse: [{ hooks: [webhookNotifier] }]
    }
  }
})) {
  console.log(message);
}
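If several hooks need the same error handling, the try/except can be factored into a decorator. The sketch below is one way to do that in Python. safe_hook and flaky_notifier are hypothetical names, and returning {} on failure means the operation proceeds, which suits side-effect hooks like the webhook notifier.

```python
import asyncio
import functools
import logging

logger = logging.getLogger("hooks")


def safe_hook(callback):
    """Wrap an async hook so exceptions are logged and the operation proceeds."""

    @functools.wraps(callback)
    async def wrapper(input_data, tool_use_id, context):
        try:
            return await callback(input_data, tool_use_id, context)
        except Exception:
            logger.exception("Hook %s failed", callback.__name__)
            # Empty output: the hook expresses no opinion on the operation
            return {}

    return wrapper


@safe_hook
async def flaky_notifier(input_data, tool_use_id, context):
    raise ConnectionError("webhook unreachable")  # simulated failure
```

For hooks that enforce policy rather than report on activity, failing open may be the wrong default; in that case the wrapper could return a deny decision instead of an empty object.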
Use Notification hooks to receive system notifications from the agent and forward them to external services. Notifications fire for specific event types: permission_prompt (Claude needs permission), idle_prompt (Claude is waiting for input), auth_success (authentication completed), and elicitation_dialog (Claude is prompting the user). Each notification includes a message field with a human-readable description and optionally a title.
This example forwards every notification to a Slack channel. It requires a Slack incoming webhook URL, which you create by adding an app to your Slack workspace and enabling incoming webhooks:
import asyncio
import json
import urllib.request

from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions, HookMatcher


def _send_slack_notification(message):
    """Synchronous helper that sends a message to Slack via incoming webhook."""
    data = json.dumps({"text": f"Agent status: {message}"}).encode()
    req = urllib.request.Request(
        "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Close the response when done; the timeout keeps a stalled endpoint
    # from tying up the worker thread indefinitely
    with urllib.request.urlopen(req, timeout=10):
        pass


async def notification_handler(input_data, tool_use_id, context):
    try:
        # Run the blocking HTTP call in a thread to avoid blocking the event loop
        await asyncio.to_thread(_send_slack_notification, input_data.get("message", ""))
    except Exception as e:
        print(f"Failed to send notification: {e}")

    # Return empty object. Notification hooks don't modify agent behavior
    return {}


async def main():
    options = ClaudeAgentOptions(
        hooks={
            # Register the hook for Notification events (no matcher needed)
            "Notification": [HookMatcher(hooks=[notification_handler])],
        },
    )

    async with ClaudeSDKClient(options=options) as client:
        await client.query("Analyze this codebase")
        async for message in client.receive_response():
            print(message)


asyncio.run(main())
import { query, HookCallback, NotificationHookInput } from "@anthropic-ai/claude-agent-sdk";

// Define a hook callback that sends notifications to Slack
const notificationHandler: HookCallback = async (input, toolUseID, { signal }) => {
  // Cast to NotificationHookInput to access the message field
  const notification = input as NotificationHookInput;

  try {
    // POST the notification message to a Slack incoming webhook
    await fetch("https://hooks.slack.com/services/YOUR/WEBHOOK/URL", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        text: `Agent status: ${notification.message}`
      }),
      // Pass signal so the request cancels if the hook times out
      signal
    });
  } catch (error) {
    if (error instanceof Error && error.name === "AbortError") {
      console.log("Notification cancelled");
    } else {
      console.error("Failed to send notification:", error);
    }
  }

  // Return empty object. Notification hooks don't modify agent behavior
  return {};
};

// Register the hook for Notification events (no matcher needed)
for await (const message of query({
  prompt: "Analyze this codebase",
  options: {
    hooks: {
      Notification: [{ hooks: [notificationHandler] }]
    }
  }
})) {
  console.log(message);
}
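If a hook does not fire:
- Check the event name, which is case-sensitive (PreToolUse, not preToolUse).
- Check that the hook is registered under the correct event type in options.hooks.
- For non-tool events such as Stop and SubagentStop, matchers match against different fields (see matcher patterns).
- Hooks may not fire when the agent hits the max_turns limit because the session ends before hooks can execute.
For tool-based hooks, matchers only match tool names, not file paths or other arguments. To filter by file path, check tool_input.file_path inside your hook: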
const myHook: HookCallback = async (input, toolUseID, { signal }) => {
  const preInput = input as PreToolUseHookInput;
  const toolInput = preInput.tool_input as Record<string, unknown>;
  const filePath = toolInput?.file_path as string;
  if (!filePath?.endsWith(".md")) return {}; // Skip non-markdown files
  // Process markdown files...
  return {};
};
If a hook times out:
- Increase the timeout value in the HookMatcher configuration.
- Use the AbortSignal from the third callback argument to handle cancellation gracefully in TypeScript.
If a tool is blocked unexpectedly:
- Check your PreToolUse hooks for permissionDecision: 'deny' returns.
- Log the permissionDecisionReason they're returning.
If modified input is not applied, ensure updatedInput is inside hookSpecificOutput, not at the top level:
return {
  hookSpecificOutput: {
    hookEventName: "PreToolUse",
    permissionDecision: "allow",
    updatedInput: { command: "new command" }
  }
};
You must also return permissionDecision: 'allow' for the input modification to take effect
Include hookEventName in hookSpecificOutput to identify which hook type the output is for
SessionStart and SessionEnd can be registered as SDK callback hooks in TypeScript, but are not available in the Python SDK (HookEvent omits them). In Python, they are only available as shell command hooks defined in settings files (for example, .claude/settings.json). To load shell command hooks from your SDK application, include the appropriate setting source with setting_sources or settingSources:
options = ClaudeAgentOptions(
    setting_sources=["project"],  # Loads .claude/settings.json including hooks
)
const options = {
  settingSources: ["project"] // Loads .claude/settings.json including hooks
};
To run initialization logic as a Python SDK callback instead, use the first message from client.receive_response as your trigger.
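One way to structure that trigger is a small wrapper around the message stream that runs your initialization once, just before the first message is handed to your code. The sketch below uses a stand-in async generator instead of a real client so it runs on its own; with_start_callback, fake_stream, and on_start are hypothetical names.

```python
import asyncio


async def with_start_callback(messages, on_start):
    """Yield messages unchanged, awaiting on_start once before the first one."""
    started = False
    async for message in messages:
        if not started:
            started = True
            await on_start(message)
        yield message


async def demo():
    events = []

    async def fake_stream():  # stands in for client.receive_response()
        for item in ("first", "second"):
            yield item

    async def on_start(message):
        events.append(f"init on {message}")

    async for message in with_start_callback(fake_stream(), on_start):
        events.append(message)
    return events
```

In an application you would pass client.receive_response() in place of fake_stream() and put logging or telemetry setup in on_start.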
When spawning multiple subagents, each one may request permissions separately. Subagents do not automatically inherit parent agent permissions. To avoid repeated prompts, use PreToolUse hooks to auto-approve specific tools, or configure permission rules that apply to subagent sessions.
A UserPromptSubmit hook that spawns subagents can create infinite loops if those subagents trigger the same hook. To prevent this:
The systemMessage field adds context to the conversation that the model sees, but it may not appear in all SDK output modes. If you need to surface hook decisions to your application, log them separately or use a dedicated output channel.
Track file changes during agent sessions and restore files to any previous state
File checkpointing tracks file modifications made through the Write, Edit, and NotebookEdit tools during an agent session, allowing you to rewind files to any previous state. Want to try it out? Jump to the interactive example.
With checkpointing, you can:
Warning
Only changes made through the Write, Edit, and NotebookEdit tools are tracked. Changes made through Bash commands (like echo > file.txt or sed -i) are not captured by the checkpoint system.
When you enable file checkpointing, the SDK creates backups of files before modifying them through the Write, Edit, or NotebookEdit tools. User messages in the response stream include a checkpoint UUID that you can use as a restore point.
Checkpointing works with these built-in tools, which the agent uses to modify files:
| Tool | Description |
|---|---|
| Write | Creates a new file or overwrites an existing file with new content |
| Edit | Makes targeted edits to specific parts of an existing file |
| NotebookEdit | Modifies cells in Jupyter notebooks (.ipynb files) |
Note
File rewinding restores files on disk to a previous state. It does not rewind the conversation itself. The conversation history and context remain intact after calling rewindFiles (TypeScript) or rewind_files (Python).
The checkpoint system tracks:
When you rewind to a checkpoint, created files are deleted and modified files are restored to their content at that point.
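The rewind semantics can be illustrated with a toy in-memory model. This is not how the SDK stores checkpoints; ToyCheckpointStore is a hypothetical class that only mirrors the behavior described above, with a dictionary standing in for the files on disk.

```python
class ToyCheckpointStore:
    """In-memory model of file checkpointing; not the SDK's implementation."""

    def __init__(self):
        self.files = {}       # path -> content, stands in for the disk
        self._snapshots = {}  # checkpoint id -> copy of files at that point

    def checkpoint(self, checkpoint_id):
        self._snapshots[checkpoint_id] = dict(self.files)

    def write(self, path, content):
        self.files[path] = content

    def rewind(self, checkpoint_id):
        # Files created later disappear; modified files get their old content
        self.files = dict(self._snapshots[checkpoint_id])


store = ToyCheckpointStore()
store.write("auth.py", "v1")
store.checkpoint("cp1")
store.write("auth.py", "v2")       # modified after the checkpoint
store.write("session.py", "new")   # created after the checkpoint
store.rewind("cp1")
print(store.files)
```

After the rewind, session.py is gone and auth.py is back to its checkpointed content, matching the two cases in the sentence above.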
To use file checkpointing, enable it in your options, capture checkpoint UUIDs from the response stream, then call rewindFiles (TypeScript) or rewind_files (Python) when you need to restore.
The following example shows the complete flow: enable checkpointing, capture the checkpoint UUID and session ID from the response stream, then resume the session later to rewind files. Each step is explained in detail below.
import asyncio

from claude_agent_sdk import (
    ClaudeSDKClient,
    ClaudeAgentOptions,
    UserMessage,
    ResultMessage,
)


async def main():
    # Step 1: Enable checkpointing
    options = ClaudeAgentOptions(
        enable_file_checkpointing=True,
        permission_mode="acceptEdits",  # Auto-accept file edits without prompting
        extra_args={
            "replay-user-messages": None
        },  # Required to receive checkpoint UUIDs in the response stream
    )

    checkpoint_id = None
    session_id = None

    # Run the query and capture checkpoint UUID and session ID
    async with ClaudeSDKClient(options) as client:
        await client.query("Refactor the authentication module")

        # Step 2: Capture checkpoint UUID from the first user message
        async for message in client.receive_response():
            if isinstance(message, UserMessage) and message.uuid and not checkpoint_id:
                checkpoint_id = message.uuid
            if isinstance(message, ResultMessage) and not session_id:
                session_id = message.session_id

    # Step 3: Later, rewind by resuming the session with an empty prompt
    if checkpoint_id and session_id:
        async with ClaudeSDKClient(
            ClaudeAgentOptions(enable_file_checkpointing=True, resume=session_id)
        ) as client:
            await client.query("")  # Empty prompt to open the connection
            async for message in client.receive_response():
                await client.rewind_files(checkpoint_id)
                break
        print(f"Rewound to checkpoint: {checkpoint_id}")


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";

async function main() {
  // Step 1: Enable checkpointing
  const opts = {
    enableFileCheckpointing: true,
    permissionMode: "acceptEdits" as const, // Auto-accept file edits without prompting
    extraArgs: { "replay-user-messages": null } // Required to receive checkpoint UUIDs in the response stream
  };

  const response = query({
    prompt: "Refactor the authentication module",
    options: opts
  });

  let checkpointId: string | undefined;
  let sessionId: string | undefined;

  // Step 2: Capture checkpoint UUID from the first user message
  for await (const message of response) {
    if (message.type === "user" && message.uuid && !checkpointId) {
      checkpointId = message.uuid;
    }
    if ("session_id" in message && !sessionId) {
      sessionId = message.session_id;
    }
  }

  // Step 3: Later, rewind by resuming the session with an empty prompt
  if (checkpointId && sessionId) {
    const rewindQuery = query({
      prompt: "", // Empty prompt to open the connection
      options: { ...opts, resume: sessionId }
    });

    for await (const msg of rewindQuery) {
      await rewindQuery.rewindFiles(checkpointId);
      break;
    }
    console.log(`Rewound to checkpoint: ${checkpointId}`);
  }
}

main();
Configure your SDK options to enable checkpointing and receive checkpoint UUIDs:
| Option | Python | TypeScript | Description |
|---|---|---|---|
| Enable checkpointing | enable_file_checkpointing=True | enableFileCheckpointing: true | Tracks file changes for rewinding |
| Receive checkpoint UUIDs | extra_args={"replay-user-messages": None} | extraArgs: { 'replay-user-messages': null } | Required to get user message UUIDs in the stream |
options = ClaudeAgentOptions(
    enable_file_checkpointing=True,
    permission_mode="acceptEdits",
    extra_args={"replay-user-messages": None},
)

async with ClaudeSDKClient(options) as client:
    await client.query("Refactor the authentication module")
const response = query({
  prompt: "Refactor the authentication module",
  options: {
    enableFileCheckpointing: true,
    permissionMode: "acceptEdits" as const,
    extraArgs: { "replay-user-messages": null }
  }
});
With the replay-user-messages option set (shown above), each user message in the response stream has a UUID that serves as a checkpoint.
For most use cases, capture the first user message UUID (message.uuid); rewinding to it restores all files to their original state. To store multiple checkpoints and rewind to intermediate states, see Multiple restore points.
Capturing the session ID (message.session_id) is optional; you only need it if you want to rewind later, after the stream completes. If you're calling rewindFiles immediately while still processing messages (as the example in Checkpoint before risky operations does), you can skip capturing the session ID.
checkpoint_id = None
session_id = None
async for message in client.receive_response():
    # Capture the first user message UUID; rewinding to it restores the original files
    if isinstance(message, UserMessage) and message.uuid and not checkpoint_id:
        checkpoint_id = message.uuid
    # Capture session ID from the result message
    if isinstance(message, ResultMessage):
        session_id = message.session_id
let checkpointId: string | undefined;
let sessionId: string | undefined;
for await (const message of response) {
  // Capture the first user message UUID; rewinding to it restores the original files
  if (message.type === "user" && message.uuid && !checkpointId) {
    checkpointId = message.uuid;
  }
  // Capture session ID from any message that has it
  if ("session_id" in message) {
    sessionId = message.session_id;
  }
}
To rewind after the stream completes, resume the session with an empty prompt and call rewind_files (Python) or rewindFiles (TypeScript) with your checkpoint UUID. You can also rewind during the stream; see Checkpoint before risky operations for that pattern.
async with ClaudeSDKClient(
    ClaudeAgentOptions(enable_file_checkpointing=True, resume=session_id)
) as client:
    await client.query("")  # Empty prompt to open the connection
    async for message in client.receive_response():
        await client.rewind_files(checkpoint_id)
        break
const rewindQuery = query({
prompt: "", // Empty prompt to open the connection
options: { ...opts, resume: sessionId }
});
for await (const msg of rewindQuery) {
await rewindQuery.rewindFiles(checkpointId);
break;
}
If you capture the session ID and checkpoint ID, you can also rewind from the CLI:
claude -p --resume <session-id> --rewind-files <checkpoint-uuid>
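If you drive the CLI from a script, you can assemble the command above programmatically. The following is a minimal sketch: build_rewind_command is a hypothetical helper name, not part of the SDK, and it uses only the flags shown above. It builds the argument list and does not run anything.

```python
import shlex


def build_rewind_command(session_id: str, checkpoint_id: str) -> list:
    """Build the argument list for the CLI rewind command shown above.

    Hypothetical helper: it only assembles arguments and does not execute them.
    """
    if not session_id or not checkpoint_id:
        raise ValueError("Both a session ID and a checkpoint UUID are required")
    return ["claude", "-p", "--resume", session_id, "--rewind-files", checkpoint_id]


if __name__ == "__main__":
    # Placeholder values; substitute the IDs captured from your own stream.
    cmd = build_rewind_command("example-session-id", "example-checkpoint-uuid")
    print(shlex.join(cmd))
```

Passing the list to subprocess.run avoids shell quoting issues if an ID ever contains unexpected characters.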
These patterns show different ways to capture and use checkpoint UUIDs depending on your use case.
This pattern keeps only the most recent checkpoint UUID, updating it before each agent turn. If something goes wrong during processing, you can immediately rewind to the last safe state and break out of the loop.
import asyncio
from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions, UserMessage


async def main():
    options = ClaudeAgentOptions(
        enable_file_checkpointing=True,
        permission_mode="acceptEdits",
        extra_args={"replay-user-messages": None},
    )
    safe_checkpoint = None
    async with ClaudeSDKClient(options) as client:
        await client.query("Refactor the authentication module")
        async for message in client.receive_response():
            # Update checkpoint before each agent turn starts
            # This overwrites the previous checkpoint. Only keep the latest
            if isinstance(message, UserMessage) and message.uuid:
                safe_checkpoint = message.uuid
            # Decide when to revert based on your own logic
            # For example: error detection, validation failure, or user input
            if your_revert_condition and safe_checkpoint:
                await client.rewind_files(safe_checkpoint)
                # Exit the loop after rewinding, files are restored
                break


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
async function main() {
const response = query({
prompt: "Refactor the authentication module",
options: {
enableFileCheckpointing: true,
permissionMode: "acceptEdits" as const,
extraArgs: { "replay-user-messages": null }
}
});
let safeCheckpoint: string | undefined;
for await (const message of response) {
// Update checkpoint before each agent turn starts
// This overwrites the previous checkpoint. Only keep the latest
if (message.type === "user" && message.uuid) {
safeCheckpoint = message.uuid;
}
// Decide when to revert based on your own logic
// For example: error detection, validation failure, or user input
if (yourRevertCondition && safeCheckpoint) {
await response.rewindFiles(safeCheckpoint);
// Exit the loop after rewinding, files are restored
break;
}
}
}
main();
If Claude makes changes across multiple turns, you might want to rewind to a specific point rather than all the way back. For example, if Claude refactors a file in turn one and adds tests in turn two, you might want to keep the refactor but undo the tests.
This pattern stores all checkpoint UUIDs in an array with metadata. After the session completes, you can rewind to any previous checkpoint:
import asyncio
from dataclasses import dataclass
from datetime import datetime
from claude_agent_sdk import (
    ClaudeSDKClient,
    ClaudeAgentOptions,
    UserMessage,
    ResultMessage,
)


# Store checkpoint metadata for better tracking
@dataclass
class Checkpoint:
    id: str
    description: str
    timestamp: datetime


async def main():
    options = ClaudeAgentOptions(
        enable_file_checkpointing=True,
        permission_mode="acceptEdits",
        extra_args={"replay-user-messages": None},
    )
    checkpoints = []
    session_id = None
    async with ClaudeSDKClient(options) as client:
        await client.query("Refactor the authentication module")
        async for message in client.receive_response():
            if isinstance(message, UserMessage) and message.uuid:
                checkpoints.append(
                    Checkpoint(
                        id=message.uuid,
                        description=f"After turn {len(checkpoints) + 1}",
                        timestamp=datetime.now(),
                    )
                )
            if isinstance(message, ResultMessage) and not session_id:
                session_id = message.session_id
    # Later: rewind to any checkpoint by resuming the session
    if checkpoints and session_id:
        target = checkpoints[0]  # Pick any checkpoint
        async with ClaudeSDKClient(
            ClaudeAgentOptions(enable_file_checkpointing=True, resume=session_id)
        ) as client:
            await client.query("")  # Empty prompt to open the connection
            async for message in client.receive_response():
                await client.rewind_files(target.id)
                break
        print(f"Rewound to: {target.description}")


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
// Store checkpoint metadata for better tracking
interface Checkpoint {
id: string;
description: string;
timestamp: Date;
}
async function main() {
const opts = {
enableFileCheckpointing: true,
permissionMode: "acceptEdits" as const,
extraArgs: { "replay-user-messages": null }
};
const response = query({
prompt: "Refactor the authentication module",
options: opts
});
const checkpoints: Checkpoint[] = [];
let sessionId: string | undefined;
for await (const message of response) {
if (message.type === "user" && message.uuid) {
checkpoints.push({
id: message.uuid,
description: `After turn ${checkpoints.length + 1}`,
timestamp: new Date()
});
}
if ("session_id" in message && !sessionId) {
sessionId = message.session_id;
}
}
// Later: rewind to any checkpoint by resuming the session
if (checkpoints.length > 0 && sessionId) {
const target = checkpoints[0]; // Pick any checkpoint
const rewindQuery = query({
prompt: "", // Empty prompt to open the connection
options: { ...opts, resume: sessionId }
});
for await (const msg of rewindQuery) {
await rewindQuery.rewindFiles(target.id);
break;
}
console.log(`Rewound to: ${target.description}`);
}
}
main();
This complete example creates a small utility file, has the agent add documentation comments, shows you the changes, then asks if you want to rewind.
Before you begin, make sure you have the Claude Agent SDK installed.
Create a new file called utils.py (Python) or utils.ts (TypeScript) and paste the following code:
def add(a, b):
    return a + b


def subtract(a, b):
    return a - b


def multiply(a, b):
    return a * b


def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b
export function add(a: number, b: number): number {
return a + b;
}
export function subtract(a: number, b: number): number {
return a - b;
}
export function multiply(a: number, b: number): number {
return a * b;
}
export function divide(a: number, b: number): number {
if (b === 0) {
throw new Error("Cannot divide by zero");
}
return a / b;
}
Create a new file called try_checkpointing.py (Python) or try_checkpointing.ts (TypeScript) in the same directory as your utility file, and paste the following code.
This script asks Claude to add doc comments to your utility file, then gives you the option to rewind and restore the original.
import asyncio
from claude_agent_sdk import (
    ClaudeSDKClient,
    ClaudeAgentOptions,
    UserMessage,
    ResultMessage,
)


async def main():
    # Configure the SDK with checkpointing enabled
    # - enable_file_checkpointing: Track file changes for rewinding
    # - permission_mode: Auto-accept file edits without prompting
    # - extra_args: Required to receive user message UUIDs in the stream
    options = ClaudeAgentOptions(
        enable_file_checkpointing=True,
        permission_mode="acceptEdits",
        extra_args={"replay-user-messages": None},
    )
    checkpoint_id = None  # Store the user message UUID for rewinding
    session_id = None  # Store the session ID for resuming
    print("Running agent to add doc comments to utils.py...\n")
    # Run the agent and capture checkpoint data from the response stream
    async with ClaudeSDKClient(options) as client:
        await client.query("Add doc comments to utils.py")
        async for message in client.receive_response():
            # Capture the first user message UUID - this is our restore point
            if isinstance(message, UserMessage) and message.uuid and not checkpoint_id:
                checkpoint_id = message.uuid
            # Capture the session ID so we can resume later
            if isinstance(message, ResultMessage):
                session_id = message.session_id
    print("Done! Open utils.py to see the added doc comments.\n")
    # Ask the user if they want to rewind the changes
    if checkpoint_id and session_id:
        response = input("Rewind to remove the doc comments? (y/n): ")
        if response.lower() == "y":
            # Resume the session with an empty prompt, then rewind
            async with ClaudeSDKClient(
                ClaudeAgentOptions(enable_file_checkpointing=True, resume=session_id)
            ) as client:
                await client.query("")  # Empty prompt opens the connection
                async for message in client.receive_response():
                    await client.rewind_files(checkpoint_id)  # Restore files
                    break
            print(
                "\n✓ File restored! Open utils.py to verify the doc comments are gone."
            )
        else:
            print("\nKept the modified file.")


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
import * as readline from "readline";
async function main() {
// Configure the SDK with checkpointing enabled
// - enableFileCheckpointing: Track file changes for rewinding
// - permissionMode: Auto-accept file edits without prompting
// - extraArgs: Required to receive user message UUIDs in the stream
const opts = {
enableFileCheckpointing: true,
permissionMode: "acceptEdits" as const,
extraArgs: { "replay-user-messages": null }
};
let sessionId: string | undefined; // Store the session ID for resuming
let checkpointId: string | undefined; // Store the user message UUID for rewinding
console.log("Running agent to add doc comments to utils.ts...\n");
// Run the agent and capture checkpoint data from the response stream
const response = query({
prompt: "Add doc comments to utils.ts",
options: opts
});
for await (const message of response) {
// Capture the first user message UUID - this is our restore point
if (message.type === "user" && message.uuid && !checkpointId) {
checkpointId = message.uuid;
}
// Capture the session ID so we can resume later
if ("session_id" in message) {
sessionId = message.session_id;
}
}
console.log("Done! Open utils.ts to see the added doc comments.\n");
// Ask the user if they want to rewind the changes
if (checkpointId && sessionId) {
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout
});
const answer = await new Promise<string>((resolve) => {
rl.question("Rewind to remove the doc comments? (y/n): ", resolve);
});
rl.close();
if (answer.toLowerCase() === "y") {
// Resume the session with an empty prompt, then rewind
const rewindQuery = query({
prompt: "", // Empty prompt opens the connection
options: { ...opts, resume: sessionId }
});
for await (const msg of rewindQuery) {
await rewindQuery.rewindFiles(checkpointId); // Restore files
break;
}
console.log("\n✓ File restored! Open utils.ts to verify the doc comments are gone.");
} else {
console.log("\nKept the modified file.");
}
}
}
main();
This example demonstrates the complete checkpointing workflow:
- Set enable_file_checkpointing=True and permission_mode="acceptEdits" to auto-approve file edits
- Call rewind_files to restore the original file
Run the script from the same directory as your utility file.
Tip
Open your utility file (utils.py or utils.ts) in your IDE or editor before running the script. You'll see the file update in real-time as the agent adds doc comments, then revert back to the original when you choose to rewind.
python try_checkpointing.py
npx tsx try_checkpointing.ts
You'll see the agent add doc comments, then a prompt asking if you want to rewind. If you choose yes, the file is restored to its original state.
File checkpointing has the following limitations:
| Limitation | Description |
|---|---|
| Write/Edit/NotebookEdit tools only | Changes made through Bash commands are not tracked |
| Same session | Checkpoints are tied to the session that created them |
| File content only | Creating, moving, or deleting directories is not undone by rewinding |
| Local files | Remote or network files are not tracked |
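Because only the file-editing tools are tracked, it can help to flag runs in which a rewind will not fully restore your files. The following is a minimal sketch, assuming you collect tool names from the tool-use blocks in the message stream. The function and constant names are illustrative and are not SDK APIs; the tool names come from the table above.

```python
# Tools whose file changes are tracked, per the limitations table above.
CHECKPOINTED_TOOLS = frozenset({"Write", "Edit", "NotebookEdit"})
# Tools that can change files without being tracked (only Bash is named above).
UNTRACKED_MUTATING_TOOLS = frozenset({"Bash"})


def rewind_coverage(tool_names: list) -> dict:
    """Split observed tool calls into those a rewind covers and those it does not."""
    tracked = [name for name in tool_names if name in CHECKPOINTED_TOOLS]
    untracked = [name for name in tool_names if name in UNTRACKED_MUTATING_TOOLS]
    return {"tracked": tracked, "untracked": untracked}


if __name__ == "__main__":
    coverage = rewind_coverage(["Read", "Edit", "Bash", "Write"])
    if coverage["untracked"]:
        print("Rewinding will not undo changes made by:", coverage["untracked"])
```

If the untracked list is non-empty, consider a separate safeguard, such as a version-control commit, before the agent runs.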
If enableFileCheckpointing or rewindFiles isn't available, you may be on an older SDK version.
Solution: Update to the latest SDK version:
pip install --upgrade claude-agent-sdk
npm install @anthropic-ai/claude-agent-sdk@latest
If message.uuid is undefined or missing, you're not receiving checkpoint UUIDs.
Cause: The replay-user-messages option isn't set.
Solution: Add extra_args={"replay-user-messages": None} (Python) or extraArgs: { 'replay-user-messages': null } (TypeScript) to your options.
This error occurs when the checkpoint data doesn't exist for the specified user message UUID.
Common causes:
- Checkpointing was not enabled on the original session (enable_file_checkpointing or enableFileCheckpointing was not set to true)
Solution: Ensure enable_file_checkpointing=True (Python) or enableFileCheckpointing: true (TypeScript) was set on the original session, then use the pattern shown in the examples: capture the first user message UUID, complete the session fully, then resume with an empty prompt and call rewindFiles once.
This error occurs when you call rewindFiles or rewind_files after you've finished iterating through the response. The connection to the CLI process closes when the loop completes.
Solution: Resume the session with an empty prompt, then call rewind on the new query:
# Resume session with empty prompt, then rewind
async with ClaudeSDKClient(
    ClaudeAgentOptions(enable_file_checkpointing=True, resume=session_id)
) as client:
    await client.query("")
    async for message in client.receive_response():
        await client.rewind_files(checkpoint_id)
        break
// Resume session with empty prompt, then rewind
const rewindQuery = query({
prompt: "",
options: { ...opts, resume: sessionId }
});
for await (const msg of rewindQuery) {
await rewindQuery.rewindFiles(checkpointId);
break;
}
- TypeScript SDK reference: query and the rewindFiles method.
- Python SDK reference: ClaudeAgentOptions and the rewind_files method.
Learn how to track token usage, deduplicate parallel tool calls, and calculate costs with the Claude Agent SDK.
The Claude Agent SDK provides detailed token usage information for each interaction with Claude. This guide explains how to properly track costs and understand usage reporting, especially when dealing with parallel tool uses and multi-step conversations.
For complete API documentation, see the TypeScript SDK reference and Python SDK reference.
The TypeScript and Python SDKs expose the same usage data with different field names:
- TypeScript provides per-step token usage on each assistant message (message.message.id, message.message.usage), per-model cost via modelUsage on the result message, and a cumulative total on the result message.
- Python provides per-step token usage on each assistant message (message.usage, message.message_id), per-model cost via model_usage on the result message, and the accumulated total on the result message (total_cost_usd and usage dict).
Both SDKs use the same underlying cost model and expose the same granularity. The difference is in field naming and where per-step usage is nested.
Cost tracking depends on understanding how the SDK scopes usage data:
- query call: one invocation of the SDK's query function. A single call can involve multiple steps (Claude responds, uses tools, gets results, responds again). Each call produces one result message at the end.
- Step: a single request/response cycle within a query call. Each step produces assistant messages with token usage.
- Session: a series of query calls linked by a session ID (using the resume option). Each query call within a session reports its own cost independently.
The following diagram shows the message stream from a single query call, with token usage reported at each step and the authoritative total at the end:
When Claude responds, it sends one or more assistant messages. In TypeScript, each assistant message contains a nested BetaMessage (accessed via message.message) with an id and a usage object with token counts (input_tokens, output_tokens). In Python, the AssistantMessage dataclass exposes the same data directly via message.usage and message.message_id. When Claude uses multiple tools in one turn, all messages in that turn share the same ID, so deduplicate by ID to avoid double-counting.
When the query call completes, the SDK emits a result message with total_cost_usd and cumulative usage. This is available in both TypeScript (SDKResultMessage) and Python (ResultMessage). If you make multiple query calls (for example, in a multi-turn session), each result only reflects the cost of that individual call. If you only need the total cost, you can ignore the per-step usage and read this single value.
The result message (TypeScript, Python) is the last message in every query call. It includes total_cost_usd, the cumulative cost across all steps in that call. This works for both success and error results. If you use sessions to make multiple query calls, each result only reflects the cost of that individual call.
The following examples iterate over the message stream from a query call and print the total cost when the result message arrives:
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({ prompt: "Summarize this project" })) {
if (message.type === "result") {
console.log(`Total cost: $${message.total_cost_usd}`);
}
}
from claude_agent_sdk import query, ResultMessage
import asyncio


async def main():
    async for message in query(prompt="Summarize this project"):
        if isinstance(message, ResultMessage):
            print(f"Total cost: ${message.total_cost_usd or 0}")


asyncio.run(main())
The examples in this section use TypeScript field names. In Python, the equivalent fields are AssistantMessage.usage and AssistantMessage.message_id for per-step usage, and ResultMessage.model_usage for per-model breakdowns.
Each assistant message contains a nested BetaMessage (accessed via message.message) with an id and usage object with token counts. When Claude uses tools in parallel, multiple messages share the same id with identical usage data. Track which IDs you've already counted and skip duplicates to avoid inflated totals.
Warning
Parallel tool calls produce multiple assistant messages whose nested BetaMessage shares the same id and identical usage. Always deduplicate by ID to get accurate per-step token counts.
The following example accumulates input and output tokens across all steps, counting each unique message ID only once:
import { query } from "@anthropic-ai/claude-agent-sdk";
const seenIds = new Set<string>();
let totalInputTokens = 0;
let totalOutputTokens = 0;
for await (const message of query({ prompt: "Summarize this project" })) {
if (message.type === "assistant") {
const msgId = message.message.id;
// Parallel tool calls share the same ID, only count once
if (!seenIds.has(msgId)) {
seenIds.add(msgId);
totalInputTokens += message.message.usage.input_tokens;
totalOutputTokens += message.message.usage.output_tokens;
}
}
}
console.log(`Steps: ${seenIds.size}`);
console.log(`Input tokens: ${totalInputTokens}`);
console.log(`Output tokens: ${totalOutputTokens}`);
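The same deduplication can be written in Python using the AssistantMessage.message_id and AssistantMessage.usage fields mentioned above. The sketch below uses a stand-in dataclass (StepUsage, a hypothetical name) in place of real SDK messages so that it runs on its own, and it assumes usage is a dict with input_tokens and output_tokens keys.

```python
from dataclasses import dataclass


@dataclass
class StepUsage:
    """Stand-in for the fields read from an assistant message (hypothetical)."""

    message_id: str
    usage: dict


def sum_unique_usage(steps):
    """Sum input and output tokens, counting each message ID only once."""
    seen_ids = set()
    totals = {"input_tokens": 0, "output_tokens": 0}
    for step in steps:
        if step.message_id in seen_ids:
            continue  # Parallel tool calls repeat the same ID and usage
        seen_ids.add(step.message_id)
        totals["input_tokens"] += step.usage.get("input_tokens", 0)
        totals["output_tokens"] += step.usage.get("output_tokens", 0)
    return len(seen_ids), totals


if __name__ == "__main__":
    # Illustrative numbers only; the second entry duplicates the first.
    stream = [
        StepUsage("msg_1", {"input_tokens": 100, "output_tokens": 20}),
        StepUsage("msg_1", {"input_tokens": 100, "output_tokens": 20}),
        StepUsage("msg_2", {"input_tokens": 150, "output_tokens": 30}),
    ]
    step_count, totals = sum_unique_usage(stream)
    print(f"Steps: {step_count}")
    print(f"Input tokens: {totals['input_tokens']}")
    print(f"Output tokens: {totals['output_tokens']}")
```

With real SDK messages, you would build the same totals inside the stream loop by checking isinstance(message, AssistantMessage) before reading the two fields.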
The result message includes modelUsage, a map of model name to per-model token counts and cost. This is useful when you run multiple models (for example, Haiku for subagents and Opus for the main agent) and want to see where tokens are going.
The following example runs a query and prints the cost and token breakdown for each model used:
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({ prompt: "Summarize this project" })) {
if (message.type !== "result") continue;
for (const [modelName, usage] of Object.entries(message.modelUsage)) {
console.log(`${modelName}: $${usage.costUSD.toFixed(4)}`);
console.log(` Input tokens: ${usage.inputTokens}`);
console.log(` Output tokens: ${usage.outputTokens}`);
console.log(` Cache read: ${usage.cacheReadInputTokens}`);
console.log(` Cache creation: ${usage.cacheCreationInputTokens}`);
}
}
Each query call returns its own total_cost_usd. The SDK does not provide a session-level total, so if your application makes multiple query calls (for example, in a multi-turn session or across different users), accumulate the totals yourself.
The following examples run two query calls sequentially, add each call's total_cost_usd to a running total, and print both the per-call and combined cost:
import { query } from "@anthropic-ai/claude-agent-sdk";
// Track cumulative cost across multiple query() calls
let totalSpend = 0;
const prompts = [
"Read the files in src/ and summarize the architecture",
"List all exported functions in src/auth.ts"
];
for (const prompt of prompts) {
for await (const message of query({ prompt })) {
if (message.type === "result") {
totalSpend += message.total_cost_usd;
console.log(`This call: $${message.total_cost_usd}`);
}
}
}
console.log(`Total spend: $${totalSpend.toFixed(4)}`);
from claude_agent_sdk import query, ResultMessage
import asyncio


async def main():
    # Track cumulative cost across multiple query() calls
    total_spend = 0.0
    prompts = [
        "Read the files in src/ and summarize the architecture",
        "List all exported functions in src/auth.ts",
    ]
    for prompt in prompts:
        async for message in query(prompt=prompt):
            if isinstance(message, ResultMessage):
                cost = message.total_cost_usd or 0
                total_spend += cost
                print(f"This call: ${cost}")
    print(f"Total spend: ${total_spend:.4f}")


asyncio.run(main())
For accurate cost tracking, account for failed conversations, cache token pricing, and occasional reporting inconsistencies.
In rare cases, you might observe different output_tokens values for messages with the same ID. When this occurs:
- The total_cost_usd in the result message is authoritative.
Both success and error result messages include usage and total_cost_usd. If a conversation fails mid-way, you still consumed tokens up to the point of failure. Always read cost data from the result message regardless of its subtype.
The Agent SDK automatically uses prompt caching to reduce costs on repeated content. You do not need to configure caching yourself. The usage object includes two additional fields for cache tracking:
- cache_creation_input_tokens: tokens used to create new cache entries (charged at a higher rate than standard input tokens).
- cache_read_input_tokens: tokens read from existing cache entries (charged at a reduced rate).
Track these separately from input_tokens to understand caching savings. In TypeScript, these fields are typed on the Usage object. In Python, they appear as keys in the ResultMessage.usage dict (for example, message.usage.get("cache_read_input_tokens", 0)).
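To see how much of a call's input was served from cache, you can summarize the three input-side counters from the usage dict. The following is a minimal sketch: summarize_cache_usage is a hypothetical helper, and it assumes that input_tokens excludes cached tokens, so the three counters add up to the total input. Verify that assumption against your own usage data before relying on the ratio.

```python
def summarize_cache_usage(usage: dict) -> dict:
    """Break input-side tokens into uncached, cache-write, and cache-read counts."""
    uncached = usage.get("input_tokens", 0)
    cache_writes = usage.get("cache_creation_input_tokens", 0)
    cache_reads = usage.get("cache_read_input_tokens", 0)
    total = uncached + cache_writes + cache_reads
    return {
        "uncached": uncached,
        "cache_writes": cache_writes,
        "cache_reads": cache_reads,
        "total_input": total,
        # Share of input tokens read from cache; 0.0 when there is no input
        "cache_read_share": cache_reads / total if total else 0.0,
    }


if __name__ == "__main__":
    # Illustrative numbers only, shaped like the ResultMessage.usage dict.
    sample = {
        "input_tokens": 200,
        "cache_creation_input_tokens": 300,
        "cache_read_input_tokens": 1500,
    }
    summary = summarize_cache_usage(sample)
    print(f"Cache read share: {summary['cache_read_share']:.0%}")
```

A rising cache read share across calls in the same session suggests that caching is reducing the input you pay full price for.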
Export traces, metrics, and events from the Agent SDK to your observability backend using OpenTelemetry.
When you run agents in production, you need visibility into what they did:
The Agent SDK can export this data as OpenTelemetry traces, metrics, and log events to any backend that accepts the OpenTelemetry Protocol (OTLP), such as Honeycomb, Datadog, Grafana, Langfuse, or a self-hosted collector.
This guide explains how the SDK emits telemetry, how to configure the export, and how to tag and filter the data once it reaches your backend. To read token usage and cost directly from the SDK response stream instead of exporting to a backend, see Track cost and usage.
The Agent SDK runs the Claude Code CLI as a child process and communicates with it over a local pipe. The CLI has OpenTelemetry instrumentation built in: it records spans around each model request and tool execution, emits metrics for token and cost counters, and emits structured log events for prompts and tool results. The SDK does not produce telemetry of its own. Instead, it passes configuration through to the CLI process, and the CLI exports directly to your collector.
Configuration is passed as environment variables. By default, the child process inherits your application's environment, so you can configure telemetry in either of two places:
- In the process environment. Every query call picks them up automatically with no code change. This is the recommended approach for production deployments.
- In ClaudeAgentOptions.env (Python) or options.env (TypeScript). Use this when different agents in the same process need different telemetry settings. In Python, env is merged on top of the inherited environment. In TypeScript, env replaces the inherited environment entirely, so include ...process.env in the object you pass.
The CLI exports three independent OpenTelemetry signals. Each has its own enable switch and its own exporter, so you can turn on only the ones you need.
| Signal | What it contains | Enable with |
|---|---|---|
| Metrics | Counters for tokens, cost, sessions, lines of code, and tool decisions | OTEL_METRICS_EXPORTER |
| Log events | Structured records for each prompt, API request, API error, and tool result | OTEL_LOGS_EXPORTER |
| Traces | Spans for each interaction, model request, tool call, and hook (beta) | OTEL_TRACES_EXPORTER plus CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1 |
For the complete list of metric names, event names, and attributes, see the Claude Code Monitoring reference. The Agent SDK emits the same data because it runs the same CLI. Span names are listed in Read agent traces below.
Telemetry is off until you set CLAUDE_CODE_ENABLE_TELEMETRY=1 and choose at least one exporter. The most common configuration sends all three signals over OTLP HTTP to a collector.
The following example sets the variables in a dictionary and passes them through options.env. The agent runs a single task, and the CLI exports spans, metrics, and events to the collector at collector.example.com while the loop consumes the response stream:
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

OTEL_ENV = {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    # Required for traces, which are in beta. Metrics and log events do not need this.
    "CLAUDE_CODE_ENHANCED_TELEMETRY_BETA": "1",
    # Choose an exporter per signal. Use otlp for the SDK; see the Note below.
    "OTEL_TRACES_EXPORTER": "otlp",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_LOGS_EXPORTER": "otlp",
    # Standard OTLP transport configuration.
    "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://collector.example.com:4318",
    "OTEL_EXPORTER_OTLP_HEADERS": "Authorization=Bearer your-token",
}


async def main():
    options = ClaudeAgentOptions(env=OTEL_ENV)
    async for message in query(
        prompt="List the files in this directory", options=options
    ):
        print(message)


asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
const otelEnv = {
CLAUDE_CODE_ENABLE_TELEMETRY: "1",
// Required for traces, which are in beta. Metrics and log events do not need this.
CLAUDE_CODE_ENHANCED_TELEMETRY_BETA: "1",
// Choose an exporter per signal. Use otlp for the SDK; see the Note below.
OTEL_TRACES_EXPORTER: "otlp",
OTEL_METRICS_EXPORTER: "otlp",
OTEL_LOGS_EXPORTER: "otlp",
// Standard OTLP transport configuration.
OTEL_EXPORTER_OTLP_PROTOCOL: "http/protobuf",
OTEL_EXPORTER_OTLP_ENDPOINT: "http://collector.example.com:4318",
OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Bearer your-token",
};
for await (const message of query({
prompt: "List the files in this directory",
// env replaces the inherited environment in TypeScript, so spread
// process.env first to keep PATH, ANTHROPIC_API_KEY, and other variables.
options: { env: { ...process.env, ...otelEnv } },
})) {
console.log(message);
}
Because the child process inherits your application's environment by default, you can achieve the same result by exporting these variables in a Dockerfile, Kubernetes manifest, or shell profile and omitting options.env entirely.
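The difference between merging and replacing the inherited environment is easy to miss, so here is a small model of the two behaviors using plain dictionaries. This is a sketch of the semantics described earlier, not the SDK's internal code; merged_env and replaced_env are hypothetical names.

```python
def merged_env(inherited: dict, overrides: dict) -> dict:
    """Model of the Python behavior: overrides are layered on the inherited env."""
    return {**inherited, **overrides}


def replaced_env(inherited: dict, overrides: dict) -> dict:
    """Model of the TypeScript behavior: only the overrides reach the child."""
    return dict(overrides)


if __name__ == "__main__":
    inherited = {"PATH": "/usr/bin", "OTEL_SERVICE_NAME": "claude-code"}
    overrides = {"OTEL_SERVICE_NAME": "support-triage-agent"}
    # Merging keeps PATH; replacing drops it unless you spread it in yourself.
    print("PATH" in merged_env(inherited, overrides))
    print("PATH" in replaced_env(inherited, overrides))
```

This is why the TypeScript example spreads process.env before the telemetry variables: without it, variables such as PATH and ANTHROPIC_API_KEY would not reach the CLI.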
Note
The console exporter writes telemetry to standard output, which the SDK uses
as its message channel. Do not set console as an exporter value when running
through the SDK. To inspect telemetry locally, point
OTEL_EXPORTER_OTLP_ENDPOINT at a local collector or an all-in-one Jaeger
container instead.
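The rules in this section (the enable switch, at least one exporter, the beta flag for traces, and no console exporter) can be checked before you start an agent. The following is a minimal sketch of such a pre-flight check. find_telemetry_problems is a hypothetical helper, and it assumes an exporter variable may hold a comma-separated list of exporter names.

```python
SIGNAL_EXPORTERS = (
    "OTEL_METRICS_EXPORTER",
    "OTEL_LOGS_EXPORTER",
    "OTEL_TRACES_EXPORTER",
)


def find_telemetry_problems(env: dict) -> list:
    """Return human-readable problems with a telemetry environment mapping."""
    problems = []
    if env.get("CLAUDE_CODE_ENABLE_TELEMETRY") != "1":
        problems.append("CLAUDE_CODE_ENABLE_TELEMETRY is not set to 1")
    configured = [name for name in SIGNAL_EXPORTERS if env.get(name)]
    if not configured:
        problems.append("no exporter is configured for any signal")
    for name in configured:
        # Assumption: the value may list several exporters separated by commas
        values = [value.strip() for value in env[name].split(",")]
        if "console" in values:
            problems.append(f"{name} uses console, which conflicts with the SDK message channel")
    traces_on = bool(env.get("OTEL_TRACES_EXPORTER"))
    if traces_on and env.get("CLAUDE_CODE_ENHANCED_TELEMETRY_BETA") != "1":
        problems.append("traces require CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1")
    return problems


if __name__ == "__main__":
    candidate = {"CLAUDE_CODE_ENABLE_TELEMETRY": "1", "OTEL_TRACES_EXPORTER": "console"}
    for problem in find_telemetry_problems(candidate):
        print("-", problem)
```

An empty list means the mapping satisfies the checks described here; it does not confirm that the collector is reachable.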
The CLI batches telemetry and exports on an interval. It flushes any pending data when the process exits cleanly, so a query call that completes normally does not lose spans. However, if your process is killed before the CLI shuts down, anything still in the batch buffer is lost. Lowering the export intervals reduces that window.
By default, metrics export every 60 seconds and traces and logs export every 5 seconds. The following example shortens all three intervals so that data reaches the collector while a short task is still running:
OTEL_ENV = {
    # ... exporter configuration from the previous example ...
    "OTEL_METRIC_EXPORT_INTERVAL": "1000",
    "OTEL_LOGS_EXPORT_INTERVAL": "1000",
    "OTEL_TRACES_EXPORT_INTERVAL": "1000",
}
const otelEnv = {
// ... exporter configuration from the previous example ...
OTEL_METRIC_EXPORT_INTERVAL: "1000",
OTEL_LOGS_EXPORT_INTERVAL: "1000",
OTEL_TRACES_EXPORT_INTERVAL: "1000",
};
Traces give you the most detailed view of an agent run. With CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1 set, each step of the agent loop becomes a span you can inspect in your tracing backend:
- claude_code.interaction: wraps a single turn of the agent loop, from receiving a prompt to producing a response.
- claude_code.llm_request: wraps each call to the Claude API, with model name, latency, and token counts as attributes.
- claude_code.tool: wraps each tool invocation, with child spans for the permission wait (claude_code.tool.blocked_on_user) and the execution itself (claude_code.tool.execution).
- claude_code.hook: wraps each hook execution.
Every span carries a session.id attribute. When you make several query calls against the same session, filter on session.id in your backend to see them as one timeline.
Note
Tracing is in beta. Span names and attributes may change between releases. See Traces (beta) in the Monitoring reference for the trace exporter configuration variables.
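If you post-process exported spans yourself rather than filtering in a backend UI, grouping by session.id might look like the sketch below. The span shape used here (a dict with name, start_time, and attributes keys) is an assumption for illustration; the real shape depends on your collector or backend export format.

```python
from collections import defaultdict


def group_spans_by_session(spans: list) -> dict:
    """Group spans by their session.id attribute, ordered by start time."""
    grouped = defaultdict(list)
    for span in spans:
        session_id = span.get("attributes", {}).get("session.id")
        if session_id is not None:
            grouped[session_id].append(span)
    for session_spans in grouped.values():
        session_spans.sort(key=lambda span: span["start_time"])
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical exported spans; the timestamps are arbitrary integers.
    spans = [
        {"name": "claude_code.tool", "start_time": 20, "attributes": {"session.id": "s1"}},
        {"name": "claude_code.interaction", "start_time": 10, "attributes": {"session.id": "s1"}},
        {"name": "claude_code.llm_request", "start_time": 15, "attributes": {"session.id": "s2"}},
    ]
    for session_id, timeline in group_spans_by_session(spans).items():
        print(session_id, [span["name"] for span in timeline])
```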
By default, the CLI reports service.name as claude-code. If you run several agents, or run the SDK alongside other services that export to the same collector, override the service name and add resource attributes so you can filter by agent in your backend.
The following example renames the service and attaches deployment metadata. These values are applied as OpenTelemetry resource attributes on every span, metric, and event the agent emits:
options = ClaudeAgentOptions(
    env={
        # ... exporter configuration ...
        "OTEL_SERVICE_NAME": "support-triage-agent",
        "OTEL_RESOURCE_ATTRIBUTES": "service.version=1.4.0,deployment.environment=production",
    },
)
const options = {
env: {
...process.env,
// ... exporter configuration ...
OTEL_SERVICE_NAME: "support-triage-agent",
OTEL_RESOURCE_ATTRIBUTES:
"service.version=1.4.0,deployment.environment=production",
},
};
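OTEL_RESOURCE_ATTRIBUTES is a comma-separated list of key=value pairs, so building it from a dictionary avoids formatting mistakes. The following is a minimal sketch with a hypothetical helper name. It rejects keys and values that contain the separator characters rather than attempting to escape them.

```python
def format_resource_attributes(attributes: dict) -> str:
    """Join attributes into the key=value,key=value form used above."""
    parts = []
    for key, value in attributes.items():
        text = str(value)
        # Commas separate pairs and "=" separates key from value, so reject
        # anything that would make the result ambiguous.
        if "," in key or "=" in key or "," in text:
            raise ValueError(f"Unsupported character in resource attribute: {key}")
        parts.append(f"{key}={text}")
    return ",".join(parts)


if __name__ == "__main__":
    value = format_resource_attributes(
        {"service.version": "1.4.0", "deployment.environment": "production"}
    )
    print(value)
```

You can then assign the returned string to the OTEL_RESOURCE_ATTRIBUTES key in the env mapping you pass to the SDK.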
Telemetry is structural by default. Token counts, durations, model names, and tool names are always recorded, but the content your agent reads and writes is not. Three opt-in variables add content to the exported data:
| Variable | Adds |
|---|---|
| OTEL_LOG_USER_PROMPTS=1 | Prompt text on claude_code.user_prompt events and on the claude_code.interaction span |
| OTEL_LOG_TOOL_DETAILS=1 | Tool input arguments (file paths, shell commands, search patterns) on claude_code.tool_result events |
| OTEL_LOG_TOOL_CONTENT=1 | Full tool input and output bodies as span events on claude_code.tool, truncated at 60 KB. Requires tracing to be enabled |
Leave these unset unless your observability pipeline is approved to store the data your agent handles. See Security and privacy in the Monitoring reference for the full list of attributes and redaction behavior.
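One way to keep these opt-ins explicit is to derive them from named flags instead of scattering environment strings through your code. This is a minimal sketch: content_logging_env is a hypothetical helper, not part of the SDK, and only the three variable names come from the table above.

```python
def content_logging_env(
    *,
    prompts: bool = False,
    tool_details: bool = False,
    tool_content: bool = False,
) -> dict[str, str]:
    """Return only the opt-in variables that were explicitly enabled."""
    flags = {
        "OTEL_LOG_USER_PROMPTS": prompts,
        "OTEL_LOG_TOOL_DETAILS": tool_details,
        "OTEL_LOG_TOOL_CONTENT": tool_content,
    }
    # Disabled flags are omitted entirely, so the default stays structural-only.
    return {name: "1" for name, enabled in flags.items() if enabled}


# Example: a staging pipeline approved to store tool arguments but not prompts.
staging_env = content_logging_env(tool_details=True)
print(staging_env)
```

Because every flag defaults to False, calling the helper with no arguments leaves telemetry structural, which matches the default described above.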
These guides cover adjacent topics for monitoring and deploying agents:
Track and display todos using the Claude Agent SDK for organized task management
Todo tracking provides a structured way to manage tasks and display progress to users. The Claude Agent SDK includes built-in todo functionality that helps organize complex workflows and keep users informed about task progression.
Todos follow a predictable lifecycle:
pending when tasks are identified
in_progress when work begins
The SDK automatically creates todos for:
import { query } from "@anthropic-ai/claude-agent-sdk";
for await (const message of query({
  prompt: "Optimize my React app performance and track progress with todos",
  options: { maxTurns: 15 }
})) {
  // Todo updates are reflected in the message stream
  if (message.type === "assistant") {
    for (const block of message.message.content) {
      if (block.type === "tool_use" && block.name === "TodoWrite") {
        const todos = block.input.todos;
        console.log("Todo Status Update:");
        todos.forEach((todo, index) => {
          const status =
            todo.status === "completed" ? "✅" : todo.status === "in_progress" ? "🔧" : "❌";
          console.log(`${index + 1}. ${status} ${todo.content}`);
        });
      }
    }
  }
}
import asyncio
from claude_agent_sdk import query, AssistantMessage, ClaudeAgentOptions, ToolUseBlock
async def main():
    async for message in query(
        prompt="Optimize my React app performance and track progress with todos",
        options=ClaudeAgentOptions(max_turns=15),
    ):
        # Todo updates are reflected in the message stream
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if isinstance(block, ToolUseBlock) and block.name == "TodoWrite":
                    todos = block.input["todos"]
                    print("Todo Status Update:")
                    for i, todo in enumerate(todos):
                        # Any status other than completed or in_progress is shown as pending
                        status = (
                            "✅"
                            if todo["status"] == "completed"
                            else "🔧"
                            if todo["status"] == "in_progress"
                            else "❌"
                        )
                        print(f"{i + 1}. {status} {todo['content']}")
asyncio.run(main())
import { query } from "@anthropic-ai/claude-agent-sdk";
class TodoTracker {
  private todos: any[] = [];
  displayProgress() {
    if (this.todos.length === 0) return;
    const completed = this.todos.filter((t) => t.status === "completed").length;
    const inProgress = this.todos.filter((t) => t.status === "in_progress").length;
    const total = this.todos.length;
    console.log(`\nProgress: ${completed}/${total} completed`);
    console.log(`Currently working on: ${inProgress} task(s)\n`);
    this.todos.forEach((todo, index) => {
      const icon =
        todo.status === "completed" ? "✅" : todo.status === "in_progress" ? "🔧" : "❌";
      const text = todo.status === "in_progress" ? todo.activeForm : todo.content;
      console.log(`${index + 1}. ${icon} ${text}`);
    });
  }
  async trackQuery(prompt: string) {
    for await (const message of query({
      prompt,
      options: { maxTurns: 20 }
    })) {
      if (message.type === "assistant") {
        for (const block of message.message.content) {
          if (block.type === "tool_use" && block.name === "TodoWrite") {
            this.todos = block.input.todos;
            this.displayProgress();
          }
        }
      }
    }
  }
}
// Usage
const tracker = new TodoTracker();
await tracker.trackQuery("Build a complete authentication system with todos");
import asyncio
from claude_agent_sdk import query, AssistantMessage, ClaudeAgentOptions, ToolUseBlock
from typing import List, Dict
class TodoTracker:
    def __init__(self):
        self.todos: List[Dict] = []
    def display_progress(self):
        if not self.todos:
            return
        completed = len([t for t in self.todos if t["status"] == "completed"])
        in_progress = len([t for t in self.todos if t["status"] == "in_progress"])
        total = len(self.todos)
        print(f"\nProgress: {completed}/{total} completed")
        print(f"Currently working on: {in_progress} task(s)\n")
        for i, todo in enumerate(self.todos):
            icon = (
                "✅"
                if todo["status"] == "completed"
                else "🔧"
                if todo["status"] == "in_progress"
                else "❌"
            )
            # Show the active phrasing while a task is being worked on
            text = (
                todo["activeForm"]
                if todo["status"] == "in_progress"
                else todo["content"]
            )
            print(f"{i + 1}. {icon} {text}")
    async def track_query(self, prompt: str):
        async for message in query(prompt=prompt, options=ClaudeAgentOptions(max_turns=20)):
            if isinstance(message, AssistantMessage):
                for block in message.content:
                    if isinstance(block, ToolUseBlock) and block.name == "TodoWrite":
                        # Each TodoWrite call carries the full, current todo list
                        self.todos = block.input["todos"]
                        self.display_progress()
# Usage
async def main():
    tracker = TodoTracker()
    await tracker.track_query("Build a complete authentication system with todos")
asyncio.run(main())
Deploy and host Claude Agent SDK in production environments
The Claude Agent SDK differs from traditional stateless LLM APIs in that it maintains conversational state and executes commands in a persistent environment. This guide covers the architecture, hosting considerations, and best practices for deploying SDK-based agents in production.
Info
For security hardening beyond basic sandboxing (including network controls, credential management, and isolation options), see Secure Deployment.
For security and isolation, the SDK should run inside a sandboxed container environment. This provides process isolation, resource limits, network control, and ephemeral filesystems.
The SDK also supports programmatic sandbox configuration for command execution.
Each SDK instance requires:
Runtime dependencies
Resource allocation
Network access
api.anthropic.com
Unlike stateless API calls, the Claude Agent SDK operates as a long-running process that:
Several providers specialize in secure container environments for AI code execution:
For self-hosted options (Docker, gVisor, Firecracker) and detailed isolation configuration, see Isolation Technologies.
Create a new container for each user task, then destroy it when complete.
Best for one-off tasks. The user may still interact with the AI while the task is running, but the container is destroyed once the task completes.
Examples:
Maintain persistent container instances for long-running tasks, often running multiple Claude Agent processes inside the container based on demand.
Best for proactive agents that take action without the user's input, agents that serve content, or agents that process high volumes of messages.
Examples:
Ephemeral containers that are hydrated with history and state, possibly from a database or from the SDK's session resumption features.
Best for containers with intermittent user interaction: the user kicks off work, the container spins down when the work is completed, and the work can be continued later.
Examples:
Run multiple Claude Agent SDK processes in one global container.
Best for agents that must collaborate closely. This is likely the least popular pattern because you have to prevent agents from overwriting each other's work.
Examples:
When hosting in containers, expose ports to communicate with your SDK instances. Your application can expose HTTP/WebSocket endpoints for external clients while the SDK runs internally within the container.
The dominant cost of serving agents is tokens. Container costs vary with what you provision, but the minimum is roughly 5 cents per hour of running time.
This is likely provider-dependent: different sandbox providers let you set different criteria for idle timeouts, after which a sandbox might spin down. Tune this timeout based on how frequently you expect users to respond.
The Claude Code CLI is versioned with semver, so any breaking change is signaled by a major version bump.
Since containers are just servers, the same logging infrastructure you use for your backend will work for containers.
An agent session will not time out, but consider setting the maxTurns option to prevent Claude from getting stuck in a loop.
A guide to securing Claude Code and Agent SDK deployments with isolation, credential management, and network controls
Claude Code and the Agent SDK are powerful tools that can execute code, access files, and interact with external services on your behalf. Like any tool with these capabilities, deploying them thoughtfully ensures you get the benefits while maintaining appropriate controls.
Unlike traditional software that follows predetermined code paths, these tools generate their actions dynamically based on context and goals. This flexibility is what makes them useful, but it also means their behavior can be influenced by the content they process: files, webpages, or user input. This is sometimes called prompt injection. For example, if a repository's README contains unusual instructions, Claude Code might incorporate those into its actions in ways the operator didn't anticipate. This guide covers practical ways to reduce this risk.
The good news is that securing an agent deployment doesn't require exotic infrastructure. The same principles that apply to running any semi-trusted code apply here: isolation, least privilege, and defense in depth. Claude Code includes several security features that help with common concerns, and this guide walks through these along with additional hardening options for those who need them.
Not every deployment needs maximum security. A developer running Claude Code on their laptop has different requirements than a company processing customer data in a multi-tenant environment. This guide presents options ranging from Claude Code's built-in security features to hardened production architectures, so you can choose what fits your situation.
Agents can take unintended actions due to prompt injection (instructions embedded in content they process) or model error. Claude models are designed to resist this, and as analyzed in the model card, Claude Opus 4.6 is the most robust frontier model available.
Defense in depth is still good practice though. For example, if an agent processes a malicious file that instructs it to send customer data to an external server, network controls can block that request entirely.
Claude Code includes several security features that address common concerns. See the security documentation for full details.
For deployments that require additional hardening beyond Claude Code's defaults, these principles guide the available options.
A security boundary separates components with different trust levels. For high-security deployments, you can place sensitive resources (like credentials) outside the boundary containing the agent. If something goes wrong in the agent's environment, resources outside that boundary remain protected.
For example, rather than giving an agent direct access to an API key, you could run a proxy outside the agent's environment that injects the key into requests. The agent can make API calls, but it never sees the credential itself. This pattern is useful for multi-tenant deployments or when processing untrusted content.
When needed, you can restrict the agent to only the capabilities required for its specific task:
| Resource | Restriction options |
|---|---|
| Filesystem | Mount only needed directories, prefer read-only |
| Network | Restrict to specific endpoints via proxy |
| Credentials | Inject via proxy rather than exposing directly |
| System capabilities | Drop Linux capabilities in containers |
For high-security environments, layering multiple controls provides additional protection. Options include:
The right combination depends on your threat model and operational requirements.
Different isolation technologies offer different tradeoffs between security strength, performance, and operational complexity.
Info
In all of these configurations, Claude Code (or your Agent SDK application) runs inside the isolation boundary (the sandbox, container, or VM). The security controls described below restrict what the agent can access from within that boundary.
| Technology | Isolation strength | Performance overhead | Complexity |
|---|---|---|---|
| Sandbox runtime | Good (secure defaults) | Very low | Low |
| Containers (Docker) | Setup dependent | Low | Medium |
| gVisor | Excellent (with correct setup) | Medium/High | Medium |
| VMs (Firecracker, QEMU) | Excellent (with correct setup) | High | Medium/High |
For lightweight isolation without containers, sandbox-runtime enforces filesystem and network restrictions at the OS level.
The main advantage is simplicity: no Docker configuration, container images, or networking setup required. The proxy and filesystem restrictions are built in. You provide a settings file specifying allowed domains and paths.
How it works:
bubblewrap on Linux, sandbox-exec on macOS) to restrict read/write access to configured paths
Setup:
npm install @anthropic-ai/sandbox-runtime
Then create a configuration file specifying allowed paths and domains.
Security considerations:
Same-host kernel: Unlike VMs, sandboxed processes share the host kernel. A kernel vulnerability could theoretically enable escape. For some threat models this is acceptable, but if you need kernel-level isolation, use gVisor or a separate VM.
No TLS inspection: The proxy allowlists domains but doesn't inspect encrypted traffic. If the agent has permissive credentials for an allowed domain, ensure it isn't possible to use that domain to trigger other network requests or to exfiltrate data.
For many single-developer and CI/CD use cases, sandbox-runtime raises the bar significantly with minimal setup. The sections below cover containers and VMs for deployments requiring stronger isolation.
Containers provide isolation through Linux namespaces. Each container has its own view of the filesystem, process tree, and network stack, while sharing the host kernel.
A security-hardened container configuration might look like this:
docker run \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --security-opt seccomp=/path/to/seccomp-profile.json \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=100m \
  --tmpfs /home/agent:rw,noexec,nosuid,size=500m \
  --network none \
  --memory 2g \
  --cpus 2 \
  --pids-limit 100 \
  --user 1000:1000 \
  -v /path/to/code:/workspace:ro \
  -v /var/run/proxy.sock:/var/run/proxy.sock:ro \
  agent-image
Here's what each option does:
| Option | Purpose |
|---|---|
| --cap-drop ALL | Removes Linux capabilities like NET_ADMIN and SYS_ADMIN that could enable privilege escalation |
| --security-opt no-new-privileges | Prevents processes from gaining privileges through setuid binaries |
| --security-opt seccomp=... | Restricts available syscalls; Docker's default blocks ~44, custom profiles can block more |
| --read-only | Makes the container's root filesystem immutable, preventing the agent from persisting changes |
| --tmpfs /tmp:... | Provides a writable temporary directory that's cleared when the container stops |
| --network none | Removes all network interfaces; the agent communicates through the mounted Unix socket below |
| --memory 2g | Limits memory usage to prevent resource exhaustion |
| --pids-limit 100 | Limits process count to prevent fork bombs |
| --user 1000:1000 | Runs as a non-root user |
| -v ...:/workspace:ro | Mounts code read-only so the agent can analyze but not modify it. Avoid mounting sensitive host directories like ~/.ssh, ~/.aws, or ~/.config |
| -v .../proxy.sock:... | Mounts a Unix socket connected to a proxy running outside the container (see below) |
Unix socket architecture:
With --network none, the container has no network interfaces at all. The only way for the agent to reach the outside world is through the mounted Unix socket, which connects to a proxy running on the host. This proxy can enforce domain allowlists, inject credentials, and log all traffic.
This is the same architecture used by sandbox-runtime. Even if the agent is compromised via prompt injection, it cannot exfiltrate data to arbitrary servers. It can only communicate through the proxy, which controls what domains are reachable. For more details, see the Claude Code sandboxing blog post.
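The allowlist check at the heart of such a proxy can be very small. The sketch below is illustrative only: is_allowed and ALLOWED_DOMAINS are hypothetical names, the domain list is an example, and treating subdomains of an allowed domain as allowed is an assumption you may want to tighten.

```python
ALLOWED_DOMAINS = {"api.anthropic.com", "github.com"}  # example values only


def is_allowed(host: str, allowlist: set[str] = ALLOWED_DOMAINS) -> bool:
    """Return True if host equals an allowed domain or is a subdomain of one."""
    # Normalize case and a trailing dot so "GitHub.com." matches "github.com".
    host = host.lower().rstrip(".")
    # Match on a dot boundary so "notgithub.com" does not match "github.com".
    return any(host == domain or host.endswith("." + domain) for domain in allowlist)


for candidate in ("api.anthropic.com", "codeload.github.com", "github.com.attacker.net"):
    print(candidate, is_allowed(candidate))
```

Matching on a dot boundary, rather than with a plain suffix or substring test, is what keeps look-alike hostnames from slipping through the allowlist.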
Additional hardening options:
| Option | Purpose |
|---|---|
| --userns-remap | Maps container root to unprivileged host user; requires daemon configuration but limits damage from container escape |
| --ipc private | Isolates inter-process communication to prevent cross-container attacks |
Standard containers share the host kernel: when code inside a container makes a system call, it goes directly to the same kernel that runs the host. This means a kernel vulnerability could allow container escape. gVisor addresses this by intercepting system calls in userspace before they reach the host kernel, implementing its own compatibility layer that handles most syscalls without involving the real kernel.
If an agent runs malicious code (perhaps due to prompt injection), that code runs in the container and could attempt kernel exploits. With gVisor, the attack surface is much smaller: the malicious code would need to exploit gVisor's userspace implementation first and would have limited access to the real kernel.
To use gVisor with Docker, install the runsc runtime and configure the daemon:
// /etc/docker/daemon.json
{
  "runtimes": {
    "runsc": {
      "path": "/usr/local/bin/runsc"
    }
  }
}
Then run containers with:
docker run --runtime=runsc agent-image
Performance considerations:
| Workload | Overhead |
|---|---|
| CPU-bound computation | ~0% (no syscall interception) |
| Simple syscalls | ~2× slower |
| File I/O intensive | 10-200× slower for heavy open/close patterns |
For multi-tenant environments or when processing untrusted content, the additional isolation is often worth the overhead.
VMs provide hardware-level isolation through CPU virtualization extensions. Each VM runs its own kernel, creating a strong boundary. A vulnerability in the guest kernel doesn't directly compromise the host. However, VMs aren't automatically "more secure" than alternatives like gVisor. VM security depends heavily on the hypervisor and device emulation code.
Firecracker is designed for lightweight microVM isolation. It can boot VMs in under 125ms with less than 5 MiB memory overhead, stripping away unnecessary device emulation to reduce attack surface.
With this approach, the agent VM has no external network interface. Instead, it communicates through vsock (virtual sockets). All traffic routes through vsock to a proxy on the host, which enforces allowlists and injects credentials before forwarding requests.
For cloud deployments, you can combine any of the above isolation technologies with cloud-native network controls:
credential_injector filter) that validates requests, enforces domain allowlists, injects credentials, and forwards to external APIs
Agents often need credentials to call APIs, access repositories, or interact with cloud services. The challenge is providing this access without exposing the credentials themselves.
The recommended approach is to run a proxy outside the agent's security boundary that injects credentials into outgoing requests. The agent sends requests without credentials, the proxy adds them, and forwards the request to its destination.
This pattern has several benefits:
Claude Code supports two methods for routing sampling requests through a proxy:
Option 1: ANTHROPIC_BASE_URL (simple but only for sampling API requests)
export ANTHROPIC_BASE_URL="http://localhost:8080"
This tells Claude Code and the Agent SDK to send sampling requests to your proxy instead of the Claude API directly. Your proxy receives plaintext HTTP requests, can inspect and modify them (including injecting credentials), then forwards to the real API.
Option 2: HTTP_PROXY / HTTPS_PROXY (system-wide)
export HTTP_PROXY="http://localhost:8080"
export HTTPS_PROXY="http://localhost:8080"
Claude Code and the Agent SDK respect these standard environment variables, routing all HTTP traffic through the proxy. For HTTPS, the proxy creates an encrypted CONNECT tunnel: it cannot see or modify request contents without TLS interception.
You can build your own proxy or use an existing one:
credential_injector filter for adding auth headers
Beyond sampling from the Claude API, agents often need authenticated access to other services, such as git repositories, databases, and internal APIs. There are two main approaches:
Provide access through an MCP server or custom tool that routes requests to a service running outside the agent's security boundary. The agent calls the tool, but the actual authenticated request happens outside: the tool calls a proxy, which injects the credentials.
For example, a git MCP server could accept commands from the agent but forward them to a git proxy running on the host, which adds authentication before contacting the remote repository. The agent never sees the credentials.
Advantages:
For Claude API calls, ANTHROPIC_BASE_URL lets you route requests to a proxy that can inspect and modify them in plaintext. But for other HTTPS services (GitHub, npm registries, internal APIs), the traffic is often encrypted end-to-end. Even if you route it through a proxy via HTTP_PROXY, the proxy only sees an opaque TLS tunnel and can't inject credentials.
To modify HTTPS traffic to arbitrary services, without using a custom tool, you need a TLS-terminating proxy that decrypts traffic, inspects or modifies it, then re-encrypts it before forwarding. This requires:
HTTP_PROXY/HTTPS_PROXY to route traffic through the proxy
This approach handles any HTTP-based service without writing custom tools, but adds complexity around certificate management.
Note that not all programs respect HTTP_PROXY/HTTPS_PROXY. Most tools (curl, pip, npm, git) do, but some may bypass these variables and connect directly. For example, Node.js fetch ignores these variables by default; in Node 24+ you can set NODE_USE_ENV_PROXY=1 to enable support. For comprehensive coverage, you can use proxychains to intercept network calls, or configure iptables to redirect outbound traffic to a transparent proxy.
Info
A transparent proxy intercepts traffic at the network level, so the client doesn't need to be configured to use it. Regular proxies require clients to explicitly connect and speak HTTP CONNECT or SOCKS. Transparent proxies (like Squid or mitmproxy in transparent mode) can handle raw redirected TCP connections.
Both approaches still require the TLS-terminating proxy and trusted CA certificate. They just ensure traffic actually reaches the proxy.
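Before relying on the proxy variables for a given tool, it helps to confirm that the tool's runtime actually reads them. As one illustration, Python's standard library exposes the proxy settings it picks up from the environment. The sketch below uses the real urllib.request.getproxies_environment function; configured_proxy is a hypothetical wrapper name.

```python
import os
import urllib.request
from typing import Optional


def configured_proxy(scheme: str) -> Optional[str]:
    """Return the proxy URL urllib reads from the environment for a scheme."""
    # getproxies_environment() scans variables named <scheme>_proxy.
    return urllib.request.getproxies_environment().get(scheme)


for scheme in ("http", "https"):
    proxy = configured_proxy(scheme)
    print(f"{scheme}: {proxy or 'no proxy configured, traffic would go direct'}")
```

A check like this only covers Python's urllib. Other runtimes inside the same container need their own verification, which is why transparent interception is the more comprehensive option.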
Filesystem controls determine what files the agent can read and write.
When the agent needs to analyze code but not modify it, mount the directory read-only:
docker run -v /path/to/code:/workspace:ro agent-image
Warning
Even read-only access to a code directory can expose credentials. Common files to exclude or sanitize before mounting:
| File | Risk |
|---|---|
| .env, .env.local | API keys, database passwords, secrets |
| ~/.git-credentials | Git passwords/tokens in plaintext |
| ~/.aws/credentials | AWS access keys |
| ~/.config/gcloud/application_default_credentials.json | Google Cloud ADC tokens |
| ~/.azure/ | Azure CLI credentials |
| ~/.docker/config.json | Docker registry auth tokens |
| ~/.kube/config | Kubernetes cluster credentials |
| .npmrc, .pypirc | Package registry tokens |
| *-service-account.json | GCP service account keys |
| *.pem, *.key | Private keys |
Consider copying only the source files needed, or using .dockerignore-style filtering.
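One way to apply this filtering is to copy the code into a staging directory while skipping risky filenames, then mount the staging directory instead of the original. The sketch below is illustrative: copy_sanitized and SENSITIVE_PATTERNS are hypothetical names, and the pattern list covers only the in-repository entries from the table above, so extend it for your own projects.

```python
import shutil

# Glob patterns for files that commonly hold secrets inside a code directory.
SENSITIVE_PATTERNS = [
    ".env",
    ".env.*",
    ".npmrc",
    ".pypirc",
    "*-service-account.json",
    "*.pem",
    "*.key",
]


def copy_sanitized(src: str, dst: str) -> None:
    """Copy src to dst, skipping any file or directory matching a pattern.

    dst must not already exist; shutil.copytree creates it.
    """
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*SENSITIVE_PATTERNS))
```

You could then mount the sanitized copy read-only in place of the original directory. Pattern matching is a safety net rather than a guarantee, since secrets can also live in files with ordinary names.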
If the agent needs to write files, you have a few options depending on whether you want changes to persist:
For ephemeral workspaces in containers, use tmpfs mounts that exist only in memory and are cleared when the container stops:
docker run \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=100m \
  --tmpfs /workspace:rw,noexec,size=500m \
  agent-image
If you want to review changes before persisting them, an overlay filesystem lets the agent write without modifying underlying files. Changes are stored in a separate layer you can inspect, apply, or discard. For fully persistent output, mount a dedicated volume but keep it separate from sensitive directories.
Complete API reference for the TypeScript Agent SDK, including all functions, types, and interfaces.