Claude Code hides a Tamagotchi pet system with 18 species and a rarity system straight out of Pokémon. It has a dreaming agent that consolidates memories while you sleep. It runs an undercover stealth mode that scrubs every trace of AI from commits. And buried in the source code, an obfuscated codename reveals an unreleased model family. This article exposes all of it, plus the full architecture behind Anthropic's 1,900-file AI coding agent. No code is reproduced. Only architecture, flows, and design decisions.
The codebase is massive: roughly 1,900 TypeScript files totalling approximately 45 MB of source. It spans 35+ subsystems, 30+ built-in tools, around 80 slash commands, and about 15 bundled skills. But before we dive into the core architecture, let's start with the features that nobody expected to find.
Hidden in src/buddy/ lies a complete virtual pet system. Every Claude Code user gets a unique companion, generated deterministically from their user ID. It was designed as an April 1, 2026 easter egg, and it is remarkably deep.
The companion uses a two-level data model. The CompanionBones (species, rarity, eye, hat, shiny status, stats) are deterministic, recalculated from your user ID every session. The CompanionSoul (name, personality, hatch date) is persisted and generated only once, the first time you "hatch" your pet. Because Bones are always recomputed from a hash, editing config files cannot grant you a Legendary.
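The determinism guarantee can be sketched in a few lines. Everything here is illustrative: `hash32` and `bonesFor` are invented names, and the species list and bit-slicing are stand-ins for whatever the real derivation does.

```typescript
// Illustrative sketch: Bones recomputed deterministically from the user ID.
const SPECIES = ["capybara", "axolotl", "tardigrade"]; // 18 species in the real system
const RARITIES = ["Common", "Uncommon", "Rare", "Epic", "Legendary"];

// A tiny stable string hash (FNV-1a); any deterministic hash works here.
function hash32(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

interface CompanionBones {
  species: string;
  rarity: string;
  shiny: boolean;
}

// Same userId in, same Bones out — which is why editing config files
// can never upgrade your pet to Legendary.
function bonesFor(userId: string): CompanionBones {
  const h = hash32(userId);
  return {
    species: SPECIES[h % SPECIES.length],
    rarity: RARITIES[(h >>> 8) % RARITIES.length],
    shiny: (h >>> 16) % 128 === 0, // rare, but still deterministic
  };
}
```

Because the Bones are a pure function of the user ID, there is nothing to persist and nothing to tamper with.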
Stats are distributed based on rarity. Each companion gets a peak stat (floor + 50 + random, capped at 100), a dump stat (floor - 10 + random, minimum 1), and three scattered stats (floor + random). The base floor ranges from 5 (Common) to 50 (Legendary).
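The stat rules translate directly into arithmetic. This sketch follows the formulas above; the function name, the RNG parameter, and the size of the random bonus are assumptions.

```typescript
// Sketch of the rarity-based stat distribution described above.
// `rand` is injected so the arithmetic stays testable; bonus range is assumed.
function rollStats(floor: number, rand: () => number) {
  const r = () => Math.floor(rand() * 10); // small random bonus
  return {
    peak: Math.min(floor + 50 + r(), 100),              // peak stat, capped at 100
    dump: Math.max(floor - 10 + r(), 1),                // dump stat, minimum 1
    scattered: [floor + r(), floor + r(), floor + r()], // three scattered stats
  };
}
```

With a floor of 50 (Legendary), the peak stat saturates at the 100 cap; with a floor of 5 (Common), the dump stat bottoms out at the minimum of 1.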
Sprites are ASCII art: 5 lines tall, 12 characters wide, with multiple idle animation frames per species. Line 0 is reserved for hat placement. Common pets get no hat; all higher rarities receive a random one (crown \^^^/, tophat [___], wizard /^\, halo, propeller, beanie, or the delightful ,> tinyduck).
The launch timeline is precise: April 1–7, 2026 is the teaser window with a rainbow notification prompting /buddy. From April 8 onward, the command becomes permanently available. Anthropic employees (USER_TYPE === 'ant') have permanent access regardless of date.
KAIROS transforms Claude Code from a session-based tool into a persistent assistant. It maintains daily logs, runs continuously between sessions, and, most remarkably, dreams. During periods of inactivity, a background process called AutoDream automatically consolidates your accumulated daily logs into structured, thematic memory files.
AutoDream runs in four phases. Orient: scan the memory directory, read the index and existing topic files. Gather: collect signal from daily logs, detect drifted memories (facts contradicted by current code), and grep transcripts for relevant patterns. Consolidate: merge new signal into topic files, convert relative dates to absolute, and remove contradicted facts. Prune & Index: update the MEMORY.md index while respecting the 25KB cap.
The consolidation runs as a forked sub-agent with read-only Bash (only ls, find, grep, cat, stat, wc, head, tail are allowed). Progress is tracked via a DreamTask state machine visible in the terminal footer, transitioning through pending → running → completed|failed|killed.
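The DreamTask lifecycle is a small state machine. A minimal sketch, assuming the transition table below (the states come from the article; the type and guard names are invented):

```typescript
// Sketch of the pending → running → completed|failed|killed lifecycle.
type DreamState = "pending" | "running" | "completed" | "failed" | "killed";

const TRANSITIONS: Record<DreamState, DreamState[]> = {
  pending: ["running", "killed"],
  running: ["completed", "failed", "killed"],
  completed: [], // terminal
  failed: [],    // terminal
  killed: [],    // terminal
};

function transition(from: DreamState, to: DreamState): DreamState {
  if (!TRANSITIONS[from].includes(to)) {
    throw new Error(`illegal transition ${from} -> ${to}`);
  }
  return to;
}
```

Encoding the legal transitions in a table keeps the footer display honest: a task can never appear to resurrect from a terminal state.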
When KAIROS is active, Bash and PowerShell are forced into run_in_background=true on the main thread, and scheduled background tasks are enabled. The team memory sync mode (TEAMMEM) is incompatible; KAIROS takes precedence.
When Claude Code operates in a public or open-source repository, an internal security mechanism called Undercover Mode silently activates. Its purpose: scrub every trace of AI from commits, PRs, and prompts. The model literally receives an instruction saying: "You are operating UNDERCOVER in a PUBLIC/OPEN-SOURCE repository. Do not blow your cover."
The mode activates automatically unless the repository's remote URL matches an internal allowlist of approximately 20 Anthropic repos. There is no force-OFF mechanism, a deliberate design choice to prevent accidental codename leaks. The only way to override is to set the environment variable CLAUDE_CODE_UNDERCOVER=1, which forces the mode on, not off.
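The fail-closed shape of that decision is worth spelling out. In this sketch the allowlist entry and function names are hypothetical; only the "force on, never off" semantics come from the article.

```typescript
// Sketch: undercover mode is ON unless the remote is allowlisted,
// and the env var can only force it ON — there is no force-OFF path.
const INTERNAL_ALLOWLIST = ["github.com/anthropics/internal-example"]; // hypothetical entry

function undercoverActive(remoteUrl: string, env: Record<string, string | undefined>): boolean {
  if (env.CLAUDE_CODE_UNDERCOVER === "1") return true; // override forces ON
  // No override in the other direction: an allowlisted remote is the
  // only way the mode stays off.
  return !INTERNAL_ALLOWLIST.some((repo) => remoteUrl.includes(repo));
}
```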
When active, commit attribution becomes an empty string: no Co-Authored-By, no "Generated with Claude Code", nothing. The entire dead code elimination pipeline ensures that external builds of Claude Code contain none of this undercover logic; it only exists in internal Anthropic builds.
Scattered across the Claude Code source are references to "Capybara", an internal codename for a new model family. The name is protected by multiple layers of obfuscation to prevent leaks in public builds.
In the BUDDY pet system, the species name "capybara" collides with the codename's canary entry in excluded-strings.txt. To avoid triggering it, the name is encoded as String.fromCharCode(0x63, 0x61, 0x70, 0x79, 0x62, 0x61, 0x72, 0x61).
Code comments across query.ts, permissionSetup.ts, messages.ts, and toolResultStorage.ts reveal several technical characteristics. Capybara uses protected thinking (replaying its thinking blocks to a standard Opus model causes 400 errors). It has specific sampling behaviour where ~10% of responses trigger the stop sequence. The model ID format encodes speed: capybara-v2-fast[1m] indicates a 1M-context fast variant.
For Anthropic employees, the model name is masked in the UI using maskModelCodename(): the first 3 characters remain visible, the rest are replaced with asterisks. capybara-v2-fast becomes cap*****-v2-fast.
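A sketch of that masking, reverse-engineered from the single example above (the function name comes from the article; the implementation, including the decision to mask only the first hyphen-separated segment, is inferred):

```typescript
// Sketch: keep the first three characters of the codename segment,
// asterisk out the rest, and leave any suffix intact.
function maskModelCodename(modelId: string): string {
  const [codename, ...rest] = modelId.split("-");
  const masked = codename.slice(0, 3) + "*".repeat(Math.max(codename.length - 3, 0));
  return [masked, ...rest].join("-");
}
```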
ULTRAPLAN offloads complex planning to a remote multi-agent session running on "Claude Code on the web" (CCR). Your local terminal stays free while the cloud session runs an Opus 4.6 agent for up to 30 minutes.
Invocation is simple: type /ultraplan <prompt> or just include the word "ultraplan" in any message; keyword detection handles the rest. The system teleports the session to a remote CCR instance, where multi-agent planning executes with a 30-minute timeout. An ExitPlanModeScanner polls the remote session for a result, supports iterative rejection (you can request modifications), and detects terminal conditions.
When the plan is ready, you choose: execute remotely (outputs a pull request) or send it back locally to continue in your terminal.
Claude Code uses two complementary feature flag systems. Compile-time flags via import { feature } from 'bun:bundle' enable dead code elimination: disabled features are completely removed from the build. Runtime flags via GrowthBook (prefixed tengu_) allow dynamic activation and A/B testing without rebuilds.
Notable runtime flags include tengu_ultraplan_model (controls ULTRAPLAN's model), tengu_onyx_plover (gates AutoDream), tengu_ant_model_override (model override for Anthropic employees), and tengu_streaming_tool_execution2 (streaming tool execution). The compile-time system is critical for security: code gated behind USER_TYPE === 'ant' is entirely eliminated from public builds via Bun's constant-folding, ensuring internal features never leak.
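The compile-time pattern is easy to illustrate generically. The real code imports `feature` from 'bun:bundle'; here a plain constant stands in so the constant-folding effect is visible without a Bun build step — everything below is a stand-in, not the actual flag machinery.

```typescript
// Build-time flag simulated with a constant; in the real system the
// bundler substitutes the value, then constant-folds the branch away.
const flags = { antBuild: false };

function internalOnlyBanner(): string | null {
  if (flags.antBuild) {
    // With the flag false at build time, this branch is dead code:
    // the string never ships in public builds.
    return "internal feature enabled";
  }
  return null;
}
```

The security property follows from the folding, not from runtime checks: there is nothing to discover in the shipped bundle because the branch no longer exists.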
That covers the hidden gems. Now let's walk through the full architecture that powers all of this, from the eight-layer stack to the triple memory system.
Claude Code is organised in a clean layered architecture. Every user message flows down through these layers, and every tool result flows back up. Understanding these layers is the key to understanding the entire system.
Every interaction starts at the top (the UI), descends through the conversation loop where Claude's API is called, triggers tools that go through security checks, leverages services, persists memory, and can be accessed remotely through the bridge. Let's walk through each layer in detail.
The first surprise for most developers: Claude Code doesn't run on Node.js. It runs on Bun, a modern JavaScript/TypeScript runtime written in Zig that offers significantly faster startup times and native TypeScript support.
The stack looks like this:
A notable design choice: the code uses import { feature } from 'bun:bundle' for compile-time feature flags. This enables dead code elimination at build time, allowing features like coordinator mode, fork sub-agents, workflow scripts, and cached microcompact to be toggled on or off without any runtime overhead.
Claude Code's entry point (cli.tsx) is obsessed with startup speed. The architecture uses fast paths: for flags like --version or --dump-system-prompt, the process exits before loading any heavy dependency.
The main bootstrap sequence in main.tsx (roughly 500 lines) does something clever: it launches multiple expensive operations in parallel before the imports even finish. Keychain reads on macOS (~65ms saved), MDM subprocess spawning, and credential prefetching all start as side-effects at the very top of the file, before any other module is loaded.
The full sequence: parse CLI arguments, load OAuth config, initialise GrowthBook feature flags, prefetch credentials (AWS, GCP, OAuth), check fast mode status, load policy limits, and then finally launch the REPL, which is the main interactive screen powered by Ink/React.
This is the core of the entire system, orchestrated by three files: QueryEngine.ts, query.ts, and claude.ts.
The QueryEngine is the session manager. It maintains the full message history (mutableMessages[]), token counters per message and cumulative usage, and handles user input processing: detecting slash commands, managing attachments, and formatting text.
When a user submits a message, the QueryEngine delegates to the query() function, which is essentially a while(true) loop. Each iteration follows a strict five-phase cycle:
The API client (claude.ts, ~3,400 lines, the largest file in the entire project) handles SSE streaming event by event. It implements a watchdog timer (30-second warning, 60-second hard timeout before abort), automatic retry with backoff on rate limits and overload errors, and a non-streaming fallback mode (10-minute timeout, 64K max tokens) if streaming fails.
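The watchdog thresholds reduce to a small decision function. A sketch under the assumption that the timer tracks milliseconds since the last SSE event (the threshold values come from the article; the function shape is invented):

```typescript
// Sketch of the streaming watchdog: 30 s soft warning, 60 s hard abort.
type WatchdogAction = "ok" | "warn" | "abort";

function watchdogAction(msSinceLastEvent: number): WatchdogAction {
  if (msSinceLastEvent >= 60_000) return "abort"; // hard timeout: abort the stream
  if (msSinceLastEvent >= 30_000) return "warn";  // soft timeout: surface a warning
  return "ok";
}
```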
Every tool in Claude Code is a TypeScript object that conforms to a rich interface with roughly 50 properties. Each tool declares its identity (name, aliases, search hints), execution logic (call(), input/output Zod schemas), capabilities (concurrency safety, read-only, destructive), permission hooks, and UI rendering methods.
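A heavily trimmed sketch of that shape — the real interface has roughly 50 properties and uses Zod schemas, whereas this stand-in uses a plain validate function to stay self-contained. All names here are illustrative:

```typescript
// Trimmed-down sketch of the tool interface described above.
interface ToolSketch<In, Out> {
  name: string;
  aliases?: string[];
  isReadOnly: boolean;          // capability declarations
  isConcurrencySafe: boolean;
  validate(input: unknown): In; // stands in for the Zod input schema
  call(input: In): Promise<Out>;
  renderResult(output: Out): string; // UI rendering hook
}

const echoTool: ToolSketch<{ text: string }, string> = {
  name: "Echo",
  isReadOnly: true,
  isConcurrencySafe: true,
  validate(input) {
    if (typeof input !== "object" || input === null || typeof (input as any).text !== "string") {
      throw new Error("invalid input");
    }
    return input as { text: string };
  },
  async call(input) {
    return input.text;
  },
  renderResult(output) {
    return `→ ${output}`;
  },
};
```

Bundling validation, execution, capabilities, and rendering into one object is what lets the orchestration layer treat all 30+ tools uniformly.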
Here is the complete tool inventory:
An important mechanism is deferred tools. Not all tools are loaded into the initial prompt. Tools marked with shouldDefer are only fetched when the model requests them via ToolSearch, keeping the prompt size manageable while still making every tool available on demand.
Tool discovery follows a pipeline: getAllBaseTools() is the single source of truth, respecting feature flags and conditional imports. Then getTools() filters by deny rules, REPL mode, and simple mode. Finally, assembleToolPool() combines built-in tools with MCP tools, sorts by name for prompt cache stability, and deduplicates (built-in tools win on conflict).
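The final assembly step can be sketched in a few lines, assuming the semantics described above (sort for cache stability, built-ins win on conflict); the names are illustrative:

```typescript
// Sketch of pool assembly: dedupe with built-ins winning, then sort
// by name so the prompt stays byte-stable across sessions (cache hits).
interface NamedTool { name: string; source: "builtin" | "mcp" }

function assemblePool(builtins: NamedTool[], mcp: NamedTool[]): NamedTool[] {
  const byName = new Map<string, NamedTool>();
  for (const t of mcp) byName.set(t.name, t);
  for (const t of builtins) byName.set(t.name, t); // built-ins overwrite on conflict
  return [...byName.values()].sort((a, b) => a.name.localeCompare(b.name));
}
```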
When Claude's response contains multiple tool calls, the orchestration layer partitions them into batches based on safety:
Each individual tool call goes through a six-step pipeline: resolve the tool by name or alias, check permissions (validateInput then checkPermissions then the general permission system), validate inputs against the Zod schema, invoke tool.call(), enforce a size budget on the result (persisting to disk if too large), and finally yield the result as a UserMessage.
Security is not an afterthought in Claude Code. It is the architectural theme. The permission system defines seven modes, from fully interactive to fully autonomous:
The permission decision flow is a three-phase pipeline. First, the tool's own validateInput() can reject a call before permissions are even checked. Then checkPermissions() returns allow, deny, ask, or passthrough. Finally, the general permission system checks rules in priority order (policy > user > project > local > CLI > session) and applies the current mode.
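The priority-ordered lookup in the final phase can be sketched directly; the rule shape and function names here are assumptions, but the source ordering is the one listed above:

```typescript
// Sketch: highest-priority matching rule wins; no match falls through
// to the current permission mode.
type Source = "policy" | "user" | "project" | "local" | "cli" | "session";
type Behavior = "allow" | "deny" | "ask";

const PRIORITY: Source[] = ["policy", "user", "project", "local", "cli", "session"];

interface Rule { source: Source; pattern: string; behavior: Behavior }

function resolve(rules: Rule[], toolCall: string): Behavior | undefined {
  for (const source of PRIORITY) {
    const hit = rules.find((r) => r.source === source && r.pattern === toolCall);
    if (hit) return hit.behavior;
  }
  return undefined; // defer to the active permission mode
}
```

The practical consequence: a managed policy deny can never be shadowed by a permissive session rule.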
Three specialised permission handlers exist depending on context. The Interactive Handler (for the main CLI) races hooks, a classifier, user interaction, and bridge responses; the first to resolve wins via an atomic claim() guard, with a 200ms grace period to prevent accidental keystrokes. The Coordinator Handler runs hooks first (fast, local), then the classifier (slower, inference), with a fallback to dialogue. The Swarm Worker Handler tries the classifier, then transfers the request to the swarm leader via mailbox.
Permission rules use a compact format: Bash(git status) for exact match, Bash(npm run:*) for prefix match, Bash(docker *) for wildcard, Bash for the entire tool, and FileEdit(/path/**) for path-based matching.
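A sketch of how the Bash variants might match, inferred purely from the examples above rather than taken from the implementation (the FileEdit path-glob case is omitted for brevity):

```typescript
// Inferred matching semantics for the compact Bash rule formats.
function matchesBashRule(rule: string, command: string): boolean {
  const m = rule.match(/^Bash\((.*)\)$/);
  if (!m) return rule === "Bash"; // bare tool name matches every command
  const pattern = m[1];
  if (pattern.endsWith(":*")) return command.startsWith(pattern.slice(0, -2)); // prefix match
  if (pattern.endsWith(" *")) return command.startsWith(pattern.slice(0, -1)); // wildcard
  return command === pattern; // exact match
}
```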
The BashTool is by far the most complex tool in terms of security. It implements a defence-in-depth strategy with 23 individual security checks, powered by two cascading parsers.
The primary parser is Tree-Sitter via WASM, which performs full AST tokenisation including quote parsing. It detects malformed tokens, quoting bugs, and shell metacharacters. When Tree-Sitter is unavailable, a legacy shell-quote parser handles fallback detection.
The permission rule matching for Bash is itself a pipeline: strip safe redirections (>/dev/null, 2>&1), strip safe wrappers (timeout, time, nice), strip safe environment variables, then iterate to a fixed point for nested wrappers. For deny/ask rules, all environment variables are stripped as a defence-in-depth measure. Compound commands are blocked against prefix rules to prevent cd /path && evil from matching a safe prefix.
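The strip-then-iterate idea is worth seeing concretely. This sketch uses a partial, illustrative list of safe wrappers and redirections; the real pipeline is considerably more thorough.

```typescript
// Sketch of normalisation before rule matching: strip safe redirections
// and safe wrappers, then iterate to a fixed point for nested wrappers.
const SAFE_WRAPPERS = ["timeout", "time", "nice"];

function stripOnce(cmd: string): string {
  let out = cmd
    .replace(/\s*>\s*\/dev\/null/g, "") // safe redirection
    .replace(/\s*2>&1/g, "")
    .trim();
  const words = out.split(/\s+/);
  if (SAFE_WRAPPERS.includes(words[0])) {
    // drop the wrapper (plus a numeric duration argument for `timeout`)
    const skip = words[0] === "timeout" && /^\d+$/.test(words[1] ?? "") ? 2 : 1;
    out = words.slice(skip).join(" ");
  }
  return out;
}

// Iterate until nothing changes, so `nice timeout 5 git status` fully
// unwraps to `git status` before the rules see it.
function normalise(cmd: string): string {
  let prev = cmd;
  for (;;) {
    const next = stripOnce(prev);
    if (next === prev) return next;
    prev = next;
  }
}
```

Without the fixed-point loop, a single pass would leave one wrapper behind and a permissive prefix rule could be dodged by simply nesting wrappers.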
Claude Code's sandbox uses a dedicated library (@anthropic-ai/sandbox-runtime) with platform-specific backends:
The sandbox configuration supports fine-grained filesystem access (allowRead, allowWrite, denyRead, denyWrite), network domain filtering (allowedDomains, denyDomains), and command exclusions. Critical paths are always protected: settings files, the skills directory, git hooks, git config. The current directory and Claude's temp directory are auto-allowed.
Managing the context window is one of the hardest problems in agentic AI. Claude Code solves it with a three-tier compaction pipeline that runs before every API call.
The effective context window is calculated as contextWindow - min(maxOutputTokens, 20,000). The auto-compact buffer is 13,000 tokens. Only specific tool outputs get microcompacted: Read, Shell, Grep, Glob, WebSearch, WebFetch, Edit, and Write.
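Expressed directly in code (the constant names are assumptions, and exactly how the 13,000-token buffer combines with the effective window is inferred from the description):

```typescript
// The context-budget arithmetic described above.
const MAX_OUTPUT_CLAMP = 20_000;
const AUTO_COMPACT_BUFFER = 13_000;

function effectiveContextWindow(contextWindow: number, maxOutputTokens: number): number {
  return contextWindow - Math.min(maxOutputTokens, MAX_OUTPUT_CLAMP);
}

// Assumed relation: compaction should trigger before usage reaches the
// effective window minus the buffer.
function autoCompactThreshold(contextWindow: number, maxOutputTokens: number): number {
  return effectiveContextWindow(contextWindow, maxOutputTokens) - AUTO_COMPACT_BUFFER;
}
```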
Claude Code doesn't just compress old context; it remembers. Three complementary memory systems work together, each with a different scope and trigger.
Auto Memory lives at ~/.claude/projects/{project}/memory/. Each memory is a YAML-frontmattered Markdown file with a type (user, feedback, project, reference), indexed by MEMORY.md (capped at 200 lines, ~25KB). This index is loaded into the system prompt at every conversation start, giving Claude persistent knowledge across sessions.
Session Memory is automatic, intra-session note-taking. It writes structured markdown (title, current state, task spec, files, workflow, errors, learnings, key results, worklog) via a forked agent. Updates are throttled by token growth and tool call count.
Extract Memories is triggered when the main agent produces a final text response. A background forked agent (max 5 turns) reads the conversation, checks the existing memory manifest, and writes new long-term memories to the auto-memory directory. Only the main agent triggers this; sub-agents don't.
The AgentTool is Claude Code's central mechanism for multi-agent execution. It supports three modes:
When a sub-agent is created, it clones the parent's file state cache (LRU), initialises MCP servers from its definition's frontmatter, and registers with session hooks. The permission context is configured so that the sub-agent's allowed tools become session-scoped allow rules, while deny/ask rules are inherited from the parent.
The fork mechanism is particularly clever. It creates a copy of the agent with the parent's full context and identical tools, running with bubble permission mode (prompts escalate to the parent terminal). Forks are limited to 200 turns and include anti-recursion protection (detecting the <fork_boilerplate> tag). The fork messages are carefully constructed to maximise prompt cache hits between siblings — only the final text block differs.
MCP is the protocol that allows Claude Code to connect to external tool servers. Configuration comes from six sources: project (.mcp.json), user (~/.claude/mcp.json), enterprise (managed config), Claude.ai proxy, plugin manifests, and marketplace configs.
Seven transport types are supported: stdio (local subprocess), sse (Server-Sent Events), http (streamable HTTP), ws (WebSocket), sse-ide/ws-ide (IDE integrations), sdk (in-process), and claudeai-proxy.
When an MCP tool is invoked, the model generates a tool_use block with the prefixed name. The MCPTool wrapper validates inputs against the server's schema, calls tools/call on the MCP server, and transforms the result: downsampling images, truncating large content to 100KB, persisting binaries to disk, and handling auth errors (session expired, retry with URL elicitation).
Claude Code supports four scopes for plugin installation: user (~/.claude/plugins/), project (.claude/plugins/), local (temp directory), and managed (policy, read-only). The installation process resolves the plugin from the marketplace or a git URL, downloads and extracts to a versioned cache, loads the manifest (commands, skills, MCP servers, hooks), activates in settings.json, and creates a symlink.
Skills come in four flavours:
Disk-based skills use a rich YAML frontmatter: name, description, whenToUse, argument hints, target model, hooks (pre/post-command), and effort level. The markdown body contains the actual instructions for the model.
The bridge is the layer that connects the CLI engine to desktop, web, and mobile clients. The same conversation engine powers every surface; the terminal is just one of them.
Two bridge modes exist. The Environment-Based Bridge requires OAuth authentication and uses GrowthBook gating, polling the Environments API. It can spawn N parallel sessions with exponential backoff (max 2 minutes). The Env-Less Bridge uses JWT tokens refreshed proactively, with SSE for server-to-client and HTTP for client-to-server, with automatic JWT rebuild on 401 errors.
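The exponential backoff with its two-minute cap reduces to one line; the base delay and the absence of jitter in this sketch are assumptions:

```typescript
// Sketch of the polling backoff: exponential growth capped at 2 minutes.
const MAX_BACKOFF_MS = 120_000;

function backoffDelay(attempt: number, baseMs = 1_000): number {
  return Math.min(baseMs * 2 ** attempt, MAX_BACKOFF_MS);
}
```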
The UI is built with React rendered in the terminal via Ink. Three main screens exist: the REPL (main interactive screen), Doctor (system diagnostic), and ResumeConversation (session recovery). About 100 React components power the interface, from chat messages and diff viewers to permission dialogs and agent progress lines.
The ~80 slash commands come in three types: PromptCommand (markdown, invocable by the model), LocalJSXCommand (UI component), and LocalJSCommand (JS function). They cover git (/commit, /diff, /branch), sessions (/resume, /compact, /clear), configuration (/config, /permissions, /model), plugins (/plugin, /mcp, /skills), and more.
Commands are loaded from five sources: hardcoded builtins, disk-based skills, plugin skills, bundled skills, and MCP. They're filtered by availability requirements (auth, provider), enable/disable status, and deduplicated against base commands.
Two notable features live in the UI layer. A full Vim mode is implemented with normal, insert, visual, and command modes, complete with motions, operators, registers, marks, and search/replace. And voice support provides streaming speech-to-text with keyword detection, integrated directly into the REPL.
After analysing every major subsystem, from the BUDDY pet system to the conversation loop, from undercover mode to the triple memory system, seven architectural choices stand out as particularly distinctive:
The overarching design principles are fail-closed security (when in doubt, ask the user), defence in depth (multiple validation layers stack from exact rules to path matching to classifier), streaming-first (SSE with non-streaming fallback), cache-optimised (the entire architecture is designed to maximise prompt caching, from tool sort order to fork message construction), and resumable (transcripts for recovery, session memory for context continuity).
This analysis is educational and does not reproduce source code. The codebase analysed is the property of Anthropic. What's described here is the architecture and the design decisions: the what and the why, not the how.
We help engineering teams integrate Claude Code into their development workflows.