The Tools System: Teaching Claude How to Touch the World
If the query engine is Claude Code's brain, tools are its hands. An LLM on its own is just a text generator confined to a vacuum. To interact with your file system, run scripts, or connect to the internet, it needs tools. Let's dive into exactly how Anthropic engineered an incredibly robust, deeply typed, and dynamically loaded tool ecosystem for Claude.
The Root: Tool.ts Interface
If you have ever built a simple GPT wrapper, you might have just defined tools as a massive JSON array of schemas. Anthropic took a much more sophisticated, object-oriented approach. In claude-code-src/Tool.ts, every single tool implements a generic interface: Tool<Input, Output, Progress>.
Why this complexity? Because in a terminal application, a tool is not just a schema. It needs to know how to validate inputs, check user permissions before doing something dangerous, render progress spinners using React (Ink), and serialize its output cleanly back to the API. Tying all these concerns perfectly into one single object eliminates an enormous amount of boilerplate and hard-to-track UI bugs.
```typescript
// Tool.ts - The architecture of a tool (from the actual source)
export type Tool<
  Input extends AnyObject = AnyObject,
  Output = unknown,
  P extends ToolProgressData = ToolProgressData,
> = {
  readonly name: string
  readonly inputSchema: Input

  // The actual action takes place here:
  call(args, context, canUseTool, parentMessage, onProgress?): Promise<ToolResult<Output>>

  // Crucial Security Flags:
  isConcurrencySafe(input): boolean // Can multiple instances run at once?
  isReadOnly(input): boolean // Does it change anything on disk?
  isDestructive?(input): boolean // Could it cause irreversible damage?

  // Terminal React UI (Ink):
  renderToolUseMessage(input, options): React.ReactNode
  renderToolResultMessage?(output, progressMessages, options): React.ReactNode

  // Memory bounds:
  maxResultSizeChars: number // When to spill output to a tmp file
  readonly shouldDefer?: boolean // Hide from initial prompt to save tokens
}
```
Why does every tool have render methods? Claude Code is not a web app. Its UI runs entirely inside your terminal using Ink (React for CLIs). When Claude calls BashTool, you see a nicely formatted command block with a spinner. That spinner, the progress bar, the diff preview for file edits: all of it is rendered by the tool's own renderToolUseMessage and renderToolResultMessage methods. The tool owns its entire visual lifecycle.
The Safety Net: buildTool() Factory
When you have dozens of tools, forgetting to implement a security check in just one of them can be disastrous. Imagine shipping a tool that accidentally defaults to "no permission needed" and lets Claude run destructive commands silently. To prevent this, Anthropic routes every single tool through the buildTool() factory function.
If a developer creates a new tool but forgets to specify whether it writes to disk, buildTool steps in and applies fail-closed defaults for the two flags that matter most. It assumes the tool is NOT concurrency-safe and NOT read-only, so the permission system treats it as a potentially mutating operation rather than waving it through as a harmless read. You have to explicitly opt out of these cautious assumptions, not opt into them. That is a smart design decision.
```typescript
// These are the actual defaults from Tool.ts (line 757)
const TOOL_DEFAULTS = {
  isEnabled: () => true,
  isConcurrencySafe: (_input?) => false, // Assume NOT safe (fail-closed)
  isReadOnly: (_input?) => false, // Assume it writes (fail-closed)
  isDestructive: (_input?) => false,
  checkPermissions: (input, _ctx?) =>
    Promise.resolve({ behavior: 'allow', updatedInput: input }),
  toAutoClassifierInput: (_input?) => '', // Skip classifier for most tools
  userFacingName: (_input?) => '',
}

// Every tool in the codebase uses this pattern:
export const GrepTool = buildTool({
  name: 'Grep',
  inputSchema: z.object({ pattern: z.string(), path: z.string() }),
  maxResultSizeChars: 50_000,
  isReadOnly: () => true, // Override: grep is safe
  isConcurrencySafe: () => true, // Override: multiple greps are fine
  async call(args) { /* ... */ },
  renderToolUseMessage(input) { /* ... */ },
})
```
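The excerpt above shows the defaults and a call site, but not the factory itself. The following is a minimal sketch of how such a factory could work, assuming it simply layers the tool's own definition over the defaults. The names `SafetyFlags`, `ToolDef`, `BuiltTool`, `SKETCH_DEFAULTS`, and `buildToolSketch` are hypothetical and heavily simplified; the real `buildTool()` handles many more members.

```typescript
// Hypothetical, simplified shapes; the real Tool type has many more members.
type SafetyFlags = {
  isEnabled: () => boolean
  isConcurrencySafe: (input?: unknown) => boolean
  isReadOnly: (input?: unknown) => boolean
  isDestructive: (input?: unknown) => boolean
}

type ToolDef = { name: string } & Partial<SafetyFlags>
type BuiltTool = { name: string } & SafetyFlags

// Defaults mirroring the values quoted in TOOL_DEFAULTS above.
const SKETCH_DEFAULTS: SafetyFlags = {
  isEnabled: () => true,
  isConcurrencySafe: () => false, // fail-closed
  isReadOnly: () => false, // fail-closed
  isDestructive: () => false,
}

// Assumed behavior: defaults first, then the tool's explicit overrides win.
function buildToolSketch(def: ToolDef): BuiltTool {
  return { ...SKETCH_DEFAULTS, ...def }
}

// A tool whose author forgot every safety flag.
const forgetful = buildToolSketch({ name: 'Forgetful' })

// A tool that explicitly opts out of the cautious defaults.
const grepLike = buildToolSketch({
  name: 'GrepLike',
  isReadOnly: () => true,
  isConcurrencySafe: () => true,
})

console.log(forgetful.isReadOnly(), grepLike.isReadOnly())
```

The point of the pattern is that omission is never interpreted as "safe": a tool only becomes read-only or concurrency-safe when its author writes that down.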
The Roster: How tools.ts Assembles Everything
All 40+ tool implementations live in their own directories under tools/. The file tools.ts is the central registry that pulls them all together into a single ordered list. But it is far from a simple import-and-export. This file contains some of the most interesting engineering decisions in the entire codebase.
Dead-Code Elimination with Feature Flags
Not every tool ships in every build. Anthropic uses Bun's compile-time feature() macro to strip entire tools from the final binary. If a feature flag is off, the tool's code literally does not exist in the compiled output. Zero bytes. Zero overhead.
```typescript
// tools.ts - conditional tool loading (from the actual source)
import { feature } from 'bun:bundle'

// These tools only exist when their feature flag is enabled at BUILD time:
const SleepTool = feature('PROACTIVE') || feature('KAIROS')
  ? require('./tools/SleepTool/SleepTool.js').SleepTool
  : null

const cronTools = feature('AGENT_TRIGGERS')
  ? [
      require('./tools/ScheduleCronTool/CronCreateTool.js').CronCreateTool,
      require('./tools/ScheduleCronTool/CronDeleteTool.js').CronDeleteTool,
      require('./tools/ScheduleCronTool/CronListTool.js').CronListTool,
    ]
  : []

// Internal-only tools gated by environment variable:
const REPLTool = process.env.USER_TYPE === 'ant'
  ? require('./tools/REPLTool/REPLTool.js').REPLTool
  : null
```
Takeaway for your own projects: If you maintain a product with internal-only features or experimental capabilities, compile-time dead-code elimination is far more secure than runtime checks. A runtime if statement can be patched out by a determined user. Code that was never compiled into the binary simply cannot be accessed.
Why Tool Order Matters
The order of tools in the getAllBaseTools() array is not random. It directly affects the system prompt sent to the Claude API: tools listed first appear earlier in the prompt, which plausibly makes Claude more likely to reach for them. That is why AgentTool, BashTool, FileReadTool, and FileEditTool all sit near the top of the list.
```typescript
// tools.ts - the master tool list (simplified from source)
export function getAllBaseTools(): Tools {
  return [
    AgentTool, // First: orchestrate subtasks
    TaskOutputTool, // Monitor background work
    BashTool, // Execute shell commands
    GlobTool, GrepTool, // Search and discover
    FileReadTool, // Read files
    FileEditTool, // Surgical edits
    FileWriteTool, // Create new files
    // ... 30+ more tools, ordered by importance
    BriefTool, // Last: session handoff (rarely needed)
  ]
}
```
Token Budgeting: Deferred Tools
Claude Code ships with more than 40 built-in tools. Sending every one of those API schemas in the system prompt for every single conversation turn would consume thousands of tokens and drastically increase API costs and latency. To solve this, Anthropic implemented Deferred Tools.
By defining readonly shouldDefer: true, a tool is temporarily hidden from Claude's initial prompt. The model only knows about the core alwaysLoad tools directly (like Bash, File Edit, File Read). If Claude encounters a task needing a niche tool (like creating a Scheduled Cron job), it first calls the ToolSearchTool with a keyword query. The system returns the hidden schema, and then Claude can use it.
Each deferred tool includes a searchHint property (a short phrase like "create recurring cron scheduled agent task") that the ToolSearch algorithm matches against. Think of it as the tool advertising its capabilities to an internal search engine.
Design lesson: If you are building LLM applications with more than 10-15 tools, implement a tool-search discovery mechanism like this. It keeps your system prompt slim, reduces token costs, and paradoxically improves tool selection accuracy because the model is not overwhelmed by irrelevant options.
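To make the discovery flow concrete, here is a minimal sketch of a deferred-tool search. It assumes a naive keyword-overlap score, which is an illustration and not the real ToolSearch algorithm. The `ToolInfo` shape, both function names, and the NotebookEdit hint are hypothetical; only `shouldDefer`, `searchHint`, and the cron hint phrase come from the text above.

```typescript
// Hypothetical minimal shape: only the fields this sketch needs.
type ToolInfo = {
  name: string
  shouldDefer?: boolean
  searchHint?: string
}

// Tools whose schemas would go into the initial prompt.
function initialTools(all: ToolInfo[]): ToolInfo[] {
  return all.filter(t => !t.shouldDefer)
}

// Naive scoring (an assumption): count how many query words appear
// in the deferred tool's name or searchHint, then rank by that count.
function searchDeferredTools(all: ToolInfo[], query: string): ToolInfo[] {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean)
  return all
    .filter(t => t.shouldDefer)
    .map(t => {
      const haystack = `${t.name} ${t.searchHint ?? ''}`.toLowerCase()
      const score = words.filter(w => haystack.includes(w)).length
      return { tool: t, score }
    })
    .filter(r => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .map(r => r.tool)
}

const catalog: ToolInfo[] = [
  { name: 'Bash' },
  { name: 'FileRead' },
  {
    name: 'CronCreate',
    shouldDefer: true,
    searchHint: 'create recurring cron scheduled agent task',
  },
  // Made-up hint for illustration.
  { name: 'NotebookEdit', shouldDefer: true, searchHint: 'edit jupyter notebook cells' },
]

console.log(initialTools(catalog).map(t => t.name))
console.log(searchDeferredTools(catalog, 'cron schedule').map(t => t.name))
```

Only the two core tools reach the initial prompt here; the cron tool surfaces when the query mentions scheduling, and the notebook tool stays hidden because nothing in the query matches its hint.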
Managing Memory: maxResultSizeChars
What happens if Claude asks to run ls -laR / and the bash tool outputs 50 megabytes of text? Feeding that back into the LLM context window would instantly blow up the token limit and crash the session.
Anthropic enforces a strict maximum cap via the maxResultSizeChars property on every tool. If a tool's output string exceeds this limit, the system traps the output, saves it to a temporary file on the user's disk, and returns a short notice to Claude instead: "Output was too large. I saved the full result to /tmp/claude-xyz.txt. You can read specific sections with FileReadTool."
The one exception is FileReadTool itself, which sets maxResultSizeChars: Infinity. Think about why: if a file read produces output that gets persisted to another file, Claude would need to use FileReadTool to read that file, which could again be too large, creating an infinite loop. So FileReadTool manages its own output size internally with line-range parameters.
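The spill-to-disk behavior can be sketched in a few lines. This is a hedged illustration, not the actual implementation: `capToolResult`, `CappedResult`, the temp-directory prefix, and the notice wording are all hypothetical. The only assumption carried over from the text is that output longer than maxResultSizeChars is persisted and replaced with a pointer to the file.

```typescript
import * as fs from 'node:fs'
import * as os from 'node:os'
import * as path from 'node:path'

// Hypothetical result shape: what goes back to the model, plus where
// the full output was saved (if it was saved at all).
type CappedResult = { content: string; persistedPath?: string }

function capToolResult(output: string, maxResultSizeChars: number): CappedResult {
  // Small enough: pass through unchanged. With Infinity as the cap,
  // this branch is always taken, so nothing is ever spilled.
  if (output.length <= maxResultSizeChars) {
    return { content: output }
  }
  // mkdtempSync creates a uniquely named directory, avoiding collisions.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'tool-output-'))
  const persistedPath = path.join(dir, 'result.txt')
  fs.writeFileSync(persistedPath, output, 'utf8')
  return {
    content:
      `Output was too large (${output.length} characters). ` +
      `The full result was saved to ${persistedPath}.`,
    persistedPath,
  }
}

const huge = 'x'.repeat(200_000)
console.log(capToolResult(huge, 50_000).content)
```

Because the comparison is `output.length <= maxResultSizeChars`, a cap of Infinity disables spilling entirely, which matches the FileReadTool exception described above.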
The Full Tool Catalog
Here is a breakdown of every tool category in the codebase, with explanations of why each tool exists and how it fits into the larger architecture.
Filesystem Tools
The bread and butter for any coding agent. These let Claude navigate and manipulate your repository.
| Tool | What it does and why it matters |
|---|---|
| FileReadTool | Reads file contents with optional line ranges. Handles images, PDFs, and binary files automatically. Sets maxResultSizeChars: Infinity to avoid the circular persist-to-file problem described above. |
| FileEditTool | The surgical knife. Instead of replacing an entire file (which is slow and token-expensive), it uses precise string-match replacement to patch specific sections. Shows a diff preview in the terminal before applying. |
| FileWriteTool | Creates new files or completely overwrites existing ones. The permission system enforces that Claude must read a file before overwriting it, preventing accidental data loss. |
| GlobTool | Fast file-path matching. Returns results sorted by modification time, feeding the model the most recently touched (and therefore most relevant) files first. |
| GrepTool | Regex search across file contents using ripgrep under the hood. The backbone of code discovery: Claude uses this to find function definitions and variable usages before editing. |
| LSPTool | Language Server Protocol integration: hover info, go-to-definition, find-references. Gives Claude IDE-level understanding of your code's type system. |
Execution Tools
When the model writes code, it needs to test it. This is what sets Claude Code apart from basic web-based chatbots.
| Tool | What it does and why it matters |
|---|---|
| BashTool | Spawns a persistent shell session (PTY), so environment variables and working directories persist across calls. Features strict execution timeouts to prevent infinite loops, and integrates with the security classifier to flag dangerous commands before they run. |
| PowerShellTool | The Windows equivalent. Mirrors BashTool's API but targets PowerShell. Only loaded when running on Windows (dead-code eliminated on other platforms). |
| NotebookEditTool | Edits Jupyter Notebook cells without corrupting the underlying JSON structure. Supports adding, modifying, and deleting cells while preserving execution outputs. |
Agent & Task Tools
This is where Claude Code gets truly powerful: the ability to spawn copies of itself for parallel work.
| Tool | What it does and why it matters |
|---|---|
| AgentTool | Spawns a child agent with its own isolated tool context and message history. The parent agent can run multiple subagents concurrently, breaking massive refactors into digestible, parallel chunks. |
| TaskCreateTool | Creates background tasks (long-running subagents) that execute independently. Returns a Task ID so the main thread can poll progress asynchronously, just like async programming patterns. |
| TaskGetTool | Retrieves the status and output of a background task by its ID. Read-only. |
| TaskStopTool | Terminates a running background task. Think of it as kill for agent processes. |
| SendMessageTool | Sends a message to another agent in a swarm/team configuration. Enables multi-agent collaboration on complex problems. |
| TeamCreateTool | Creates a team of coordinated agents. Only available when the AGENT_SWARMS feature is enabled. |
Web & Network Tools
| Tool | What it does and why it matters |
|---|---|
| WebFetchTool | Fetches web pages and converts HTML to clean markdown for Claude to consume. Handles redirects, authentication, and even PDF rendering. |
| WebSearchTool | Performs web searches and returns ranked results with snippets. Backed by search APIs. Read-only. |
MCP & Integration Tools
Anthropic knew they could not build every tool imaginable. Instead of a messy plugin system, they implemented the Model Context Protocol (MCP) as the universal extension mechanism.
| Tool | What it does and why it matters |
|---|---|
| MCPTool | A dynamic proxy. At runtime, it constructs a full Tool interface for any external server that speaks MCP. Claude Code can instantly "learn" to use your Postgres database or Stripe API without Anthropic changing a single line of core code. |
| McpAuthTool | Handles mid-session authentication flows (OAuth redirects, token exchange) when an MCP server requires credentials. |
| ListMcpResourcesTool | Enumerates available resources from connected MCP servers so Claude knows what data sources are accessible. |
| ReadMcpResourceTool | Reads a specific resource from an MCP server by URI. |
Utility & Meta Tools
| Tool | What it does and why it matters |
|---|---|
| ToolSearchTool | The discovery engine for deferred tools. Claude searches by keyword, and the system returns matching tool schemas on demand. |
| SkillTool | Executes a "skill" (a bundled script or slash-command workflow) programmatically from within a tool call. |
| TodoWriteTool | Manages a live todo list in the REPL sidebar. Interestingly, it renders nothing in the chat transcript because its output goes to the sidebar panel instead. |
| AskUserQuestionTool | Pauses execution to ask the user a clarifying question and wait for a response. Only works in interactive mode. |
| BriefTool | Creates a compressed summary of the current session for handoff to another agent. Think of it as "context serialization" for agent-to-agent communication. |
| EnterPlanModeTool | Switches Claude into "plan mode" where every tool call requires explicit user approval. The model can enter this mode voluntarily when it detects a high-risk situation. |
| ConfigTool | Reads and writes Claude Code configuration (user settings, project settings). Only available in internal builds. |
Source Deep Dive: files.ts — Binary File Detection
Before any filesystem tool reads a file, Claude Code needs to know whether it's text or binary. Feeding raw binary data to an LLM as if it were source code wastes tokens and produces garbled output. The logic lives in src/utils/files.ts, which exports a two-stage detection system: a fast extension blocklist and a content-heuristic fallback.
Stage 1: The Extension Blocklist
BINARY_EXTENSIONS is a JavaScript Set containing roughly 80 file extensions across every category of binary content. It's organized by type, giving a clear picture of what Claude Code considers "untouchable as text":
| Category | Extensions included |
|---|---|
| Images | .png, .jpg, .jpeg, .gif, .bmp, .ico, .webp, .tiff, .tif |
| Video | .mp4, .mov, .avi, .mkv, .webm, .wmv, .flv, .m4v, .mpeg, .mpg |
| Audio | .mp3, .wav, .ogg, .flac, .aac, .m4a, .wma, .aiff, .opus |
| Archives | .zip, .tar, .gz, .bz2, .7z, .rar, .xz, .z, .tgz, .iso |
| Executables | .exe, .dll, .so, .dylib, .bin, .o, .a, .obj, .lib, .app, .msi, .deb, .rpm |
| Documents | .pdf, .doc, .docx, .xls, .xlsx, .ppt, .pptx, .odt, .ods, .odp |
| Fonts | .ttf, .otf, .woff, .woff2, .eot |
| Bytecode / VM | .pyc, .pyo, .class, .jar, .war, .ear, .node, .wasm, .rlib |
| Databases | .sqlite, .sqlite3, .db, .mdb, .idx |
| Design / 3D | .psd, .ai, .eps, .sketch, .fig, .xd, .blend, .3ds, .max |
| Flash | .swf, .fla |
| Lock / profiling | .lockb, .dat, .data |
The lookup function is only two lines long:
```typescript
export function hasBinaryExtension(filePath: string): boolean {
  const ext = filePath.slice(filePath.lastIndexOf('.')).toLowerCase()
  return BINARY_EXTENSIONS.has(ext)
}
```
This runs in O(1) time because it's a Set lookup, making the extension check essentially free on every file open. Notice that .pdf is included in the binary list, which would seem to prevent Claude from reading PDFs. The source comment explains the exception: "PDF is here; FileReadTool excludes it at the call site." FileReadTool has special-cased PDF to invoke a PDF extraction pipeline rather than raw file reading, so the extension check is bypassed for PDFs specifically.
Stage 2: The Content Heuristic
Extension matching isn't enough — files can be misnamed, and new binary formats appear regularly. The second check, isBinaryContent(), inspects the actual file bytes:
```typescript
const BINARY_CHECK_SIZE = 8192 // Only check first 8KB

export function isBinaryContent(buffer: Buffer): boolean {
  const checkSize = Math.min(buffer.length, BINARY_CHECK_SIZE)
  let nonPrintable = 0
  for (let i = 0; i < checkSize; i++) {
    const byte = buffer[i]!
    // Null byte is a strong indicator of binary
    if (byte === 0) {
      return true // Early exit: single null byte = binary
    }
    // Count non-printable, non-whitespace bytes
    // Printable ASCII is 32-126, plus tab(9), newline(10), CR(13)
    if (byte < 32 && byte !== 9 && byte !== 10 && byte !== 13) {
      nonPrintable++
    }
  }
  // If more than 10% non-printable, likely binary
  return nonPrintable / checkSize > 0.1
}
```
The heuristic has two tiers. The first is an immediate early exit: a single null byte (\0) anywhere in the first 8 KB is treated as a definitive binary indicator. Nearly all binary formats (ELF executables, PE headers, compiled JARs, ZIP archives) have null bytes in their headers, while valid UTF-8 text essentially never does. The second tier counts all bytes below ASCII 32 that aren't standard whitespace (tab, newline, carriage return) — if more than 10% of the sampled bytes are non-printable, the file is classified as binary.
Why only 8 KB? Reading the full file just to determine if it's readable would be expensive for large files. The 8 KB sample is a well-established heuristic: if a file is text, its first 8 KB will reflect that reliably. Tools like git, file, and most IDEs use a similar approach. The cost is a small false-negative rate for pathological cases (e.g., a file whose first 8 KB is clean text but which contains binary data further in, and is therefore misclassified as text), which is acceptable given how rarely it occurs.
Where Binary Detection Runs
These two functions are called in sequence in FileReadTool before any file content is returned to the model. If hasBinaryExtension() returns true (and the file isn't PDF), the tool returns an informative error message like "This file has a binary extension and cannot be read as text." If the extension passes but isBinaryContent() returns true on the first 8 KB, the same error is returned. The model never sees the raw bytes, preventing both garbage output and context window contamination from binary data.
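The call sequence described above can be sketched as a single decision function. This is an illustration under stated assumptions rather than the FileReadTool code: `decideRead`, `ReadDecision`, `hasBinaryExt`, and `looksBinary` are hypothetical names, the extension set is a tiny subset of the real blocklist, and the sketch takes the first bytes of the file as a `Uint8Array` instead of reading from disk.

```typescript
// A small subset of the extension blocklist, for illustration only.
const BINARY_EXTS = new Set(['.png', '.zip', '.exe', '.pdf'])
const CHECK_SIZE = 8192

// Stage 1: extension lookup. Unlike the original, this sketch returns
// false outright when the path contains no dot.
function hasBinaryExt(filePath: string): boolean {
  const dot = filePath.lastIndexOf('.')
  return dot !== -1 && BINARY_EXTS.has(filePath.slice(dot).toLowerCase())
}

// Stage 2: content heuristic over at most the first 8 KB.
function looksBinary(bytes: Uint8Array): boolean {
  const n = Math.min(bytes.length, CHECK_SIZE)
  let nonPrintable = 0
  for (let i = 0; i < n; i++) {
    const b = bytes[i]
    if (b === undefined) continue
    if (b === 0) return true // a single null byte settles it
    if (b < 32 && b !== 9 && b !== 10 && b !== 13) nonPrintable++
  }
  return n > 0 && nonPrintable / n > 0.1
}

type ReadDecision = 'text' | 'pdf-pipeline' | 'reject-binary'

// PDFs are special-cased before the blocklist; everything else passes
// through the extension check and then the content check.
function decideRead(filePath: string, head: Uint8Array): ReadDecision {
  if (filePath.toLowerCase().endsWith('.pdf')) return 'pdf-pipeline'
  if (hasBinaryExt(filePath)) return 'reject-binary'
  if (looksBinary(head)) return 'reject-binary'
  return 'text'
}

const plainText = Uint8Array.from([104, 105, 10]) // "hi\n"
const withNullByte = Uint8Array.from([77, 90, 0, 3]) // contains a null byte

console.log(decideRead('notes.txt', plainText))
console.log(decideRead('renamed.txt', withNullByte))
```

The second call shows why the content check matters: the extension looks harmless, but the null byte in the sample still routes the file to rejection.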
The Safety Protocol
Of course, arming an AI with BashTool without safeguards would be reckless. Every single tool.call() invocation passes through a multi-layered permissions interceptor before anything actually executes.
validateInput() → checkPermissions() → tool.call()
First, validateInput() checks that the arguments are structurally valid. Then checkPermissions() runs tool-specific security logic. The result is one of three outcomes:
{ behavior: 'allow' } - Proceed immediately. Used for read-only tools like GrepTool.
{ behavior: 'ask' } - Freeze execution and show a permission dialog in the terminal. The model waits until you explicitly approve.
{ behavior: 'deny' } - Reject the tool call entirely and tell Claude why it was denied.
Tools that declare isDestructive: () => true trigger an even more prominent confirmation dialog. This is how Claude Code ensures you always stay in control, even when the agent is running autonomously.
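The three-outcome gate can be sketched as follows. To keep the example short it is synchronous, whereas the defaults quoted earlier show checkPermissions returning a Promise. The `GatedTool` and `Outcome` types, the `reason` field, the `askUser` callback, and the `DeleteSketch` tool are all hypothetical stand-ins for the real permission dialog and tool interface.

```typescript
type PermissionResult =
  | { behavior: 'allow' }
  | { behavior: 'ask' }
  | { behavior: 'deny'; reason: string } // `reason` is a hypothetical field

type GatedTool<I, O> = {
  name: string
  validateInput: (input: I) => string | null // null means valid
  checkPermissions: (input: I) => PermissionResult
  call: (input: I) => O
}

type Outcome<O> = { ok: true; output: O } | { ok: false; error: string }

// validateInput -> checkPermissions -> call, stopping at the first failure.
function runTool<I, O>(
  tool: GatedTool<I, O>,
  input: I,
  askUser: (toolName: string) => boolean, // stand-in for the terminal dialog
): Outcome<O> {
  const invalid = tool.validateInput(input)
  if (invalid !== null) return { ok: false, error: `Invalid input: ${invalid}` }

  const decision = tool.checkPermissions(input)
  if (decision.behavior === 'deny') {
    return { ok: false, error: `Denied: ${decision.reason}` }
  }
  if (decision.behavior === 'ask' && !askUser(tool.name)) {
    return { ok: false, error: 'User declined' }
  }
  return { ok: true, output: tool.call(input) }
}

// A made-up destructive tool used only to exercise the gate.
const deleted: string[] = []
const deleteTool: GatedTool<{ path: string }, string> = {
  name: 'DeleteSketch',
  validateInput: input => (input.path === '' ? 'path must not be empty' : null),
  checkPermissions: input =>
    input.path === '/'
      ? { behavior: 'deny', reason: 'refusing to delete root' }
      : { behavior: 'ask' },
  call: input => {
    deleted.push(input.path)
    return `deleted ${input.path}`
  },
}

console.log(runTool(deleteTool, { path: '/' }, () => true))
console.log(runTool(deleteTool, { path: 'tmp.txt' }, () => true))
```

Note that `call` is only reachable from the last line of `runTool`, so a tool cannot execute unless validation passed and the permission decision was either an allow or an approved ask.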
For a complete deep dive into the permission modes (Interactive, Auto, Bypass) and the security classifier, see the Security & Permissions page.