Claude Code Doesn't Index Your Codebase—It Walks Through It

Claude Code is running in production across multi-million-line monorepos, decades-old legacy systems, and distributed architectures spanning dozens of repositories. It doesn't build an index. Instead, it traverses the file system, reads files, uses grep, and follows references—just like a human engineer. This agentic approach avoids the staleness problem of RAG-based tools, where an embedding pipeline might return a function that was renamed two weeks ago.
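
To make the contrast with an index concrete, here is a minimal sketch of searching the live file tree. It illustrates the general idea only and is not Claude Code's implementation; the function name `grep_tree`, the skipped directory names, and the default `.py` suffix are all assumptions made for the example.

```python
import os
import re
from pathlib import Path

# Hypothetical exclusion list: directories a human engineer would also skip.
SKIP_DIRS = {".git", "node_modules", "build"}


def grep_tree(root: Path, pattern: str, suffixes=(".py",)):
    """Walk the live file tree and return (path, line_no, line) per match."""
    regex = re.compile(pattern)
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            if not name.endswith(suffixes):
                continue
            path = Path(dirpath) / name
            with open(path, encoding="utf-8", errors="replace") as fh:
                for line_no, line in enumerate(fh, start=1):
                    if regex.search(line):
                        rel = path.relative_to(root).as_posix()
                        hits.append((rel, line_no, line.strip()))
    return hits
```

Because every call reads the files as they are on disk right now, a function renamed a minute ago is found under its new name; there is no stored snapshot that can go stale.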

But there's a tradeoff: Claude needs enough starting context to know where to look. Without it, asking it to find a vague pattern across a billion-line codebase will hit context-window limits before any work begins.
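
A back-of-the-envelope calculation shows why reading everything is not an option. The numbers below are assumptions chosen for illustration: roughly 4 characters per token is a common rule of thumb rather than a real tokenizer, and the 200,000-token window and 40 characters per line are hypothetical values.

```python
# Rough rule of thumb (an assumption, not a tokenizer): ~4 characters per token.
CHARS_PER_TOKEN = 4
# Hypothetical context window size, used only for illustration.
CONTEXT_WINDOW_TOKENS = 200_000


def estimated_tokens(total_lines: int, avg_chars_per_line: int = 40) -> int:
    """Estimate how many tokens a codebase of the given size would occupy."""
    return total_lines * avg_chars_per_line // CHARS_PER_TOKEN


def fraction_that_fits(total_lines: int) -> float:
    """Return the share of the codebase that fits in one context window."""
    tokens = estimated_tokens(total_lines)
    if tokens == 0:
        return 1.0
    return min(1.0, CONTEXT_WINDOW_TOKENS / tokens)
```

Under these assumed numbers, a 5-million-line codebase comes to about 50 million tokens, so well under 1% of it fits in the window at once. Starting context exists to tell Claude which fraction is worth reading.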

The Harness Matters More Than the Model

A common misconception is that Claude Code's capabilities are defined by the model alone. In practice, the harness (the ecosystem of configuration and tooling built around the model) shapes results more than the model itself. The harness has five extension points:

  • CLAUDE.md files – context files read automatically at session start. Root file for big picture, subdirectory files for local conventions.
  • Hooks – scripts triggered by events. Stop hooks can reflect on a session and propose CLAUDE.md updates. Start hooks load team-specific context dynamically.
  • Skills – packaged instructions for specific task types, loaded on demand. A security review skill loads when assessing vulnerabilities; a document processing skill loads when documentation needs updating. Skills can be scoped to paths.
  • Plugins – bundles of skills, hooks, and MCP configurations distributed across the org. Example: a retail org built a skill connecting Claude to their internal analytics platform and distributed it as a plugin.
  • MCP servers – connections to external tools and data. Teams built MCP servers exposing structured search as a tool Claude can call directly.

Two additional capabilities round out the setup:

  • LSP integrations – give Claude symbol-level navigation: "go to definition", "find all references". One enterprise company deployed LSP org-wide before their Claude Code rollout, specifically to make C and C++ navigation reliable.
  • Subagents – isolated Claude instances with their own context window. Some teams spin up a read-only subagent to map a subsystem, then have the main agent edit with the full picture.

Three Configuration Patterns from Successful Deployments

Making the Codebase Navigable at Scale

  • Keep CLAUDE.md files lean and layered. Root file should be pointers and critical gotchas only. Claude loads them additively as it moves through subdirectories.
  • Initialize in subdirectories, not the repo root. Claude automatically walks up the directory tree and loads every CLAUDE.md it finds, so root-level context is never lost.
  • Scope test and lint commands per subdirectory. Running the full suite when Claude changed one service causes timeouts. CLAUDE.md files at subdirectory level should specify commands that apply there.
  • Use .ignore files to exclude generated files, build artifacts, and third-party code. Commit permissions.deny rules in .claude/settings.json so exclusions are version-controlled.
  • Build codebase maps. For organizations where code isn't consolidated in a conventional directory structure, a lightweight markdown file at the repo root listing each top-level folder with a one-line description gives Claude a table of contents.
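
The walk-up behaviour behind the "initialize in subdirectories" advice can be sketched as follows. This is an illustration of the idea, not Claude Code's actual loader; the function name `collect_context_files` and the broadest-first ordering of the result are assumptions.

```python
from pathlib import Path


def collect_context_files(start_dir: Path, repo_root: Path, name: str = "CLAUDE.md"):
    """Return context files from repo_root down to start_dir, broadest first."""
    start_dir = start_dir.resolve()
    repo_root = repo_root.resolve()
    if start_dir != repo_root and repo_root not in start_dir.parents:
        raise ValueError("start_dir must be inside repo_root")

    found = []
    current = start_dir
    while True:
        candidate = current / name
        if candidate.is_file():
            found.append(candidate)
        if current == repo_root:
            break
        current = current.parent  # step one level up toward the root

    # Reverse so that root-level context comes before local conventions.
    return list(reversed(found))
```

Starting in a service directory such as the hypothetical `services/payments` still picks up the root file, which is why root-level context is never lost, while sibling services' files are never visited.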

Setting Up a Service-Oriented Codebase

In a service-oriented monorepo where each service has its own conventions and tooling:

  • Root CLAUDE.md contains conventions shared across services (coding style, PR guidelines).
  • Each service directory has its own CLAUDE.md with local build/test commands and domain-specific knowledge.
  • Hooks enforce linting and formatting deterministically. A stop hook captures session learnings and proposes updates to the relevant CLAUDE.md.
  • A deployment skill is bound to the payments service directory so it never auto-loads when working elsewhere.
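
The effect of per-service commands can be sketched as a lookup from changed files to the narrowest applicable test command. The directory names, the `make` commands, and the function `commands_for_changes` are hypothetical; in a real setup the commands would live in each service's CLAUDE.md rather than in code.

```python
from pathlib import PurePosixPath

# Hypothetical per-service commands, keyed by service directory.
SCOPED_TEST_COMMANDS = {
    "services/payments": "make -C services/payments test",
    "services/search": "make -C services/search test",
}


def commands_for_changes(changed_files):
    """Map changed files to the narrowest matching scoped test command."""
    commands = []
    for changed in changed_files:
        path = PurePosixPath(changed)
        matches = [
            scope
            for scope in SCOPED_TEST_COMMANDS
            if PurePosixPath(scope) in path.parents
        ]
        if not matches:
            continue  # no scoped command applies to this file
        # The longest matching scope wins, so nested scopes override parents.
        command = SCOPED_TEST_COMMANDS[max(matches, key=len)]
        if command not in commands:
            commands.append(command)
    return commands
```

A change confined to the payments service then triggers only the payments tests instead of the full suite.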

Setting Up a Legacy Monolith

For a decades-old legacy system with no clear directory structure:

  • Start with a codebase map at the root, listing every top-level folder with a one-line description.
  • Add a root CLAUDE.md with critical gotchas: "Never edit files in /legacy/v1 directly; use the adapter pattern."
  • Use .ignore files to exclude generated code and third-party libraries.
  • Enable LSP for symbol-level navigation, especially for C/C++ codebases where pattern-matching can land on the wrong symbol.
  • Create a skill for the build system (e.g., a custom Makefile structure) so Claude knows how to compile and test.
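
A first draft of the codebase map can be generated and then filled in by hand. The sketch below assumes a markdown bullet per top-level folder; the function name `build_codebase_map`, the heading text, and the TODO placeholder are illustrative choices, not a prescribed format.

```python
from pathlib import Path


def build_codebase_map(repo_root: Path, descriptions: dict) -> str:
    """Render a markdown table of contents for the top-level folders."""
    folders = sorted(
        p.name
        for p in repo_root.iterdir()
        if p.is_dir() and not p.name.startswith(".")  # skip .git and similar
    )
    lines = ["# Codebase map", ""]
    for folder in folders:
        # Folders without a known description get a visible placeholder.
        summary = descriptions.get(folder, "TODO: describe this folder")
        lines.append(f"- `{folder}/`: {summary}")
    return "\n".join(lines) + "\n"
```

The placeholders make undocumented folders easy to spot, which matters most in a legacy system where the names alone say little.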

Common Mistakes

The source article summarizes the most common points of confusion in a table. In list form:

  • Using CLAUDE.md for reusable expertise that belongs in a skill.
  • Using prompts for things that should run automatically via hooks.
  • Loading everything into CLAUDE.md instead of using skills.
  • Letting good setups stay tribal instead of distributing via plugins.
  • Assuming LSP is automatic (it must be configured).
  • Building MCP connections before the basics (CLAUDE.md, hooks) are working.
  • Running exploration and editing in the same session instead of using subagents.

Why This Matters

Claude Code's agentic search is a fundamentally different approach from RAG-based tools. For teams working in large, actively developed codebases, this means there is no embedding pipeline to maintain and no stale index, because Claude works from the live codebase. The cost is an upfront investment in making the codebase legible through CLAUDE.md files, hooks, skills, and plugins. The patterns above come from teams that made that investment.

Next Steps

If you're adopting Claude Code in a large codebase, start with a root CLAUDE.md file and a codebase map. Then add per-subdirectory CLAUDE.md files with local commands. Configure hooks for linting and formatting. Build one skill for a common task (e.g., running tests for a specific service). Distribute the setup as a plugin. Only then consider MCP servers or LSP integrations.