v0.9.4 · re-validated 2026-05-24 · MIT · 100% local

Pre-indexed code knowledge graph
for your AI coding agent.

Stop paying for agents to re-discover your codebase with grep and Read. CodeGraph parses every supported file into a symbol+edge graph, then exposes it through MCP tools that Claude Code, Cursor, Codex, opencode, and Hermes Agent can query directly.

~35% cheaper
~57% fewer tokens
~46% faster
~71% fewer tool calls
$ curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh | sh

Self-contained — Node runtime bundled. No native build, no API keys, no telemetry.

Configures automatically for
Claude Code · Cursor · Codex CLI · opencode · Hermes Agent

The problem

Agents waste tokens re-discovering your codebase.

When Claude Code answers "how does X work?" it spawns Explore sub-agents that fan out across grep, glob, and Read. Every file scanned costs tokens. The same discovery happens again next session. CodeGraph pre-builds that map once and lets agents query it directly.

Without CodeGraph

  1. Spawn an Explore sub-agent
  2. Glob across the repo
  3. Grep for likely symbol names
  4. Read 20+ files to find the right one
  5. Recurse into call sites
  6. Synthesize an answer
Dozens of tool calls · millions of tokens · 2-3 minutes

With CodeGraph

  1. codegraph_context("how requests reach the DB")
  2. codegraph_explore([handler, router, repo])
  3. Synthesize an answer
A handful of calls · zero file reads · seconds

Benchmark: 7 real-world repos · 4 runs/arm · median reported

Codebase     Language          Cost   Tokens   Time   Tool calls
VS Code      TS · ~10k         26%    78%      52%    85%
Excalidraw   TS · ~640         52%    90%      73%    96%
Django       Python · ~3k      12%    36%      19%    53%
Tokio        Rust · ~790       82%    86%      71%    92%
OkHttp       Java · ~645       2%     13%      31%    45%
Gin          Go · ~110         21%    34%      27%    40%
Alamofire    Swift · ~110      47%    64%      48%    83%

Each cell = savings vs no-CodeGraph baseline at the median of 4 runs. Bigger codebases benefit most — agents answer from the index with zero file reads, while the baseline thrashes through grep/find/Read.
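The headline figures (~35% cheaper, ~57% fewer tokens, ~46% faster, ~71% fewer tool calls) are consistent with a plain unweighted average of the seven rows in the table. The page does not state how they were aggregated, so the averaging below is an assumption that merely reproduces the published numbers.

```typescript
// Per-repo median savings in percent, copied from the benchmark table:
// [codebase, cost, tokens, time, tool calls]
const rows: Array<[string, number, number, number, number]> = [
  ["VS Code", 26, 78, 52, 85],
  ["Excalidraw", 52, 90, 73, 96],
  ["Django", 12, 36, 19, 53],
  ["Tokio", 82, 86, 71, 92],
  ["OkHttp", 2, 13, 31, 45],
  ["Gin", 21, 34, 27, 40],
  ["Alamofire", 47, 64, 48, 83],
];

// Unweighted mean of one metric column, rounded to a whole percent.
// Assumption: every repo counts equally, regardless of size.
function meanSavings(column: 1 | 2 | 3 | 4): number {
  const total = rows.reduce((sum, row) => sum + row[column], 0);
  return Math.round(total / rows.length);
}

const headline = {
  cost: meanSavings(1),
  tokens: meanSavings(2),
  time: meanSavings(3),
  toolCalls: meanSavings(4),
};

console.log(headline); // { cost: 35, tokens: 57, time: 46, toolCalls: 71 }
```

Because the mean is unweighted, a single low row such as OkHttp pulls the cost figure down noticeably; per-repo numbers are the better guide for a codebase that resembles one of the seven.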

How it works

Four stages: extract → store → resolve → auto-sync.

Your agent
Claude Code · Cursor · Codex · opencode · Hermes
"How does a request reach the database?"
Calls CodeGraph tools directly — no Explore sub-agent fan-out.
CodeGraph MCP server
codegraph serve --mcp
context · trace · explore · search · callers · callees · impact · node · files · status
SQLite knowledge graph
.codegraph/codegraph.db · WAL · FTS5
nodes: function · class · method · route · file
edges: calls · imports · extends · implements · references

01 · Extraction

tree-sitter parses every source file into an AST. Per-language queries lift out nodes (functions, classes, methods, types) and edges (calls, imports, extends, implements). 19+ languages share the same pipeline.

; per-language query (TypeScript)
(call_expression
  function: (identifier) @callee)            ; emits a call edge

(class_declaration
  name: (type_identifier) @name
  body: (class_body) @body)                  ; emits a class node
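One way to picture what happens to those captures: each @callee capture becomes a call edge whose source is the innermost definition that encloses it. The sketch below is an illustration only. The Definition, CalleeCapture, and CallEdge shapes and the "<module>" pseudo-symbol are hypothetical, not CodeGraph's actual types, and the edge targets are still plain names that a later resolution pass would link to definitions.

```typescript
// Hypothetical shapes for illustration; not CodeGraph's real types.
interface Definition {
  kind: "class" | "function" | "method";
  name: string;
  startLine: number;
  endLine: number;
}
interface CalleeCapture { text: string; line: number }
interface CallEdge { from: string; to: string; rel: "calls" }

// Pick the smallest definition whose line range contains the call site.
function innermostEnclosing(defs: Definition[], line: number): Definition | undefined {
  return defs
    .filter(d => d.startLine <= line && line <= d.endLine)
    .sort((a, b) => (a.endLine - a.startLine) - (b.endLine - b.startLine))[0];
}

// Turn @callee captures into unresolved call edges.
function emitCallEdges(defs: Definition[], callees: CalleeCapture[]): CallEdge[] {
  return callees.map(c => {
    const owner = innermostEnclosing(defs, c.line);
    // Calls outside any definition are attributed to a pseudo-symbol.
    return { from: owner ? owner.name : "<module>", to: c.text, rel: "calls" as const };
  });
}

const defs: Definition[] = [
  { kind: "class", name: "UserService", startLine: 1, endLine: 20 },
  { kind: "method", name: "login", startLine: 3, endLine: 9 },
];

const callEdges = emitCallEdges(defs, [
  { text: "hashPassword", line: 5 },  // inside login, which is inside UserService
  { text: "bootstrap", line: 25 },    // top-level call
]);

console.log(callEdges);
```

The call on line 5 is attributed to login rather than UserService because login has the narrower range.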

02 · Storage: SQLite + FTS5

Everything goes into a single file at .codegraph/codegraph.db. Symbol names are mirrored into an FTS5 virtual table, so name lookups stay fast even across millions of symbols. In WAL journal mode, readers do not block on the writer, so queries keep working while the index updates.

CREATE TABLE nodes (
  id        INTEGER PRIMARY KEY,
  kind      TEXT,        -- function | class | method | route ...
  name      TEXT,
  file_id   INTEGER,
  range     TEXT         -- start/end line+col
);

CREATE TABLE edges (
  src       INTEGER,
  dst       INTEGER,
  rel       TEXT         -- calls | imports | extends | references ...
);

CREATE VIRTUAL TABLE nodes_fts
  USING fts5(name, content='nodes');

03 · Resolution

After extraction, references are resolved across files: foo() → its definition; import { X } → source module; class inheritance chains; framework-specific routing (Django path(), Express app.get(), NestJS @Controller + @Get, Rails routes, Spring @GetMapping, …).

The result is a traversable graph: callers(X) returns the set of functions that actually call X, including hops through dynamic dispatch, interfaces, callbacks, and re-render boundaries that grep misses.
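A minimal sketch of why a resolved graph finds callers that a text search would miss: a call site that names an interface method counts as a caller of every implementation of that method. The edge shape and symbol names below are hypothetical, and the real resolver presumably handles far more cases (callbacks, inheritance chains, framework routes).

```typescript
type Rel = "calls" | "implements";
interface Edge { src: string; dst: string; rel: Rel }

// Hypothetical mini-graph: handler calls the interface method Repo.save,
// SqlRepo.save implements it, and migrate calls the implementation directly.
const graph: Edge[] = [
  { src: "handler", dst: "Repo.save", rel: "calls" },
  { src: "SqlRepo.save", dst: "Repo.save", rel: "implements" },
  { src: "migrate", dst: "SqlRepo.save", rel: "calls" },
];

function callers(symbol: string, edges: Edge[]): string[] {
  // A call to an interface method may dispatch to any implementation, so the
  // declarations this symbol implements count as additional call targets.
  const targets = new Set<string>([symbol]);
  edges.forEach(e => {
    if (e.rel === "implements" && e.src === symbol) targets.add(e.dst);
  });

  const found = new Set<string>();
  edges.forEach(e => {
    if (e.rel === "calls" && targets.has(e.dst)) found.add(e.src);
  });
  return Array.from(found).sort();
}

const saveCallers = callers("SqlRepo.save", graph);
console.log(saveCallers); // [ 'handler', 'migrate' ]
```

A grep for "SqlRepo.save" would find migrate but not handler, since handler only ever mentions Repo.save.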

04 · Auto-sync

The MCP server subscribes to native OS file events (FSEvents on macOS, inotify on Linux, ReadDirectoryChangesW on Windows). Changes are debounced with a 2-second quiet window, filtered to source extensions, and incrementally re-indexed — only touched files re-parse. The graph stays current as you code, zero config.

watcher.on('change', debounce(2000, files => {
  const sources = files.filter(isSupportedExt);
  cg.syncFiles(sources);   // re-parse + re-link only the dirty set
}));
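The snippet above leaves debounce and the watcher undefined. As a self-contained illustration of the quiet-window idea, the sketch below collects changed paths, drops unsupported extensions, de-duplicates repeats, and flushes once no event has arrived for the configured delay. All names (createBatcher, Scheduler, the extension filter) are hypothetical, and the scheduler is injectable purely so the behavior can be shown without waiting on real timers.

```typescript
interface Scheduler {
  set(fn: () => void, ms: number): number;
  clear(handle: number): void;
}

// Returns an onChange handler that batches paths until the quiet window elapses.
function createBatcher(
  quietMs: number,
  flush: (files: string[]) => void,
  isSupported: (file: string) => boolean,
  scheduler: Scheduler,
): (file: string) => void {
  const dirty = new Set<string>();
  let handle: number | undefined;

  return function onChange(file: string): void {
    if (!isSupported(file)) return;                         // ignore non-source files
    dirty.add(file);                                        // repeated saves collapse to one entry
    if (handle !== undefined) scheduler.clear(handle);      // restart the quiet window
    handle = scheduler.set(() => {
      const batch = Array.from(dirty).sort();
      dirty.clear();
      handle = undefined;
      flush(batch);
    }, quietMs);
  };
}

// Manual scheduler for the demo: timers fire only when runPending() is called.
const pending = new Map<number, () => void>();
let nextHandle = 1;
const manual: Scheduler = {
  set(fn) { const h = nextHandle++; pending.set(h, fn); return h; },
  clear(h) { pending.delete(h); },
};
function runPending(): void {
  const fns = Array.from(pending.values());
  pending.clear();
  fns.forEach(fn => fn());
}

const batches: string[][] = [];
const onChange = createBatcher(
  2000,
  files => { batches.push(files); },
  f => /\.(ts|py|go)$/.test(f),
  manual,
);

onChange("src/a.ts");
onChange("README.md");   // filtered out
onChange("src/b.ts");
onChange("src/a.ts");    // duplicate
runPending();            // simulate the quiet window elapsing

console.log(batches); // [ [ 'src/a.ts', 'src/b.ts' ] ]
```

Four events produce a single flush with two files, which is the property that keeps a burst of saves from triggering a burst of re-parses.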

What an agent actually does, end-to-end

  1. turn 1
    Agent reads its system prompt and sees: "if .codegraph/ exists, answer from the graph — don't delegate to a file-reading sub-agent."
  2. turn 1
    Calls codegraph_context("how requests reach the database"). The server runs FTS5 search → resolves top candidates → walks callers + callees 1-2 hops → returns a packed relationship map with the relevant source inline.
  3. turn 2
    If a specific hop needs more depth, agent calls codegraph_trace(handler, db.query) — the server returns each hop's body inline, following dynamic-dispatch edges (interface → impl, callback wiring) that grep can't traverse.
  4. turn 3
    Synthesize the answer. Typical total: 3-8 tool calls, zero Read calls, and roughly 57% fewer tokens than the no-CodeGraph baseline on average (up to ~90% on the most favorable repo in the benchmark above).
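The "walks callers + callees 1-2 hops" step in the flow above can be sketched as a hop-limited walk in both directions from the symbols the search step returned. This is an assumption about the general shape, not CodeGraph's implementation; the graph and symbol names are made up.

```typescript
interface CallEdge { src: string; dst: string }

// Hypothetical call graph for "how requests reach the database".
const calls: CallEdge[] = [
  { src: "router", dst: "handleRequest" },
  { src: "handleRequest", dst: "UserRepo.find" },
  { src: "UserRepo.find", dst: "db.query" },
  { src: "db.query", dst: "pool.acquire" },
];

// Walk outward from the seeds along call edges in both directions
// (callers and callees), recording the hop count at which each symbol appears.
function neighborhood(seeds: string[], edges: CallEdge[], maxHops: number): Map<string, number> {
  const hops = new Map<string, number>();
  seeds.forEach(s => hops.set(s, 0));
  let frontier = seeds.slice();

  for (let hop = 1; hop <= maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const e of edges) {
      for (const [from, to] of [[e.src, e.dst], [e.dst, e.src]]) {
        if (frontier.indexOf(from) !== -1 && !hops.has(to)) {
          hops.set(to, hop);
          next.push(to);
        }
      }
    }
    frontier = next;
  }
  return hops;
}

// Seed with the symbol a name search would plausibly return for the question.
const related = neighborhood(["handleRequest"], calls, 2);
related.forEach((hop, name) => console.log(`${hop} hop(s): ${name}`));
```

With a two-hop limit the walk reaches router, UserRepo.find, and db.query but stops before pool.acquire, which is how a hop limit keeps the returned context bounded.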

The interface

Ten MCP tools — one for each kind of question.

CodeGraph deliberately exposes a small, intent-shaped surface. Each tool maps to a question agents actually ask, so the model picks the right one without re-deriving what the graph already knows.

codegraph_context
"Map this task / feature / area first"

Composes search + node + callers + callees in one call. The default first step.

codegraph_trace
"How does X reach Y?"

Returns the full call path with each hop's body inline — follows dynamic dispatch grep can't.

codegraph_explore
"Survey several related symbols at once"

One budget-capped call returns source for N related symbols grouped by file + a relationship map.

codegraph_search
"Find a symbol by name"

FTS5-backed name search across the whole graph. Returns ranked candidates with kind + file.

codegraph_callers
"What calls this?"

Walk inbound call edges one hop at a time. Filter by file, kind, depth.

codegraph_callees
"What does this call?"

Walk outbound call edges. The mirror of callers.

codegraph_impact
"What breaks if I change this?"

Transitive call closure with depth control. Surfaces every site touched by a change.

codegraph_node
"Show me this exact symbol"

Single symbol fetch — signature, source, range, edges.

codegraph_files
"What's in this project?"

Indexed file structure. Faster than filesystem scanning and respects .gitignore.

codegraph_status
"Is the index healthy?"

Counts, last-sync timestamp, journal mode, watcher state.

Agents are told to answer directly. The installer writes instructions into each agent's rules file (CLAUDE.md, .cursor/rules/codegraph.mdc, ~/.codex/AGENTS.md, …) that steer the model away from delegating exploration to a file-reading sub-agent — because if it does, the sub-agent will grep + Read anyway and the index becomes overhead.
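For readers wiring up their own client: MCP tools are invoked with a JSON-RPC 2.0 request whose method is tools/call and whose params carry the tool name and an arguments object. The sketch below builds such a request for codegraph_context. The argument key "task" is a guess for illustration; the actual input schema is whatever the server reports when the client lists its tools.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a tools/call request for the given tool and arguments.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// "task" is a hypothetical argument name, not confirmed by the source.
const request = toolCall("codegraph_context", {
  task: "how requests reach the database",
});

console.log(JSON.stringify(request, null, 2));
```

Agents such as Claude Code or Cursor construct these requests themselves, so this is only relevant when talking to the server from custom tooling.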

What you get

Built for the way agents actually work.

Smart Context Building

One tool call returns entry points, related symbols, and code snippets — no exploration agents.

🔎

Full-Text Search

FTS5-powered symbol search across the whole codebase. Instant, even on huge monorepos.

🧭

Impact Analysis

Trace callers, callees, and the full impact radius of any symbol before making changes.

🔁

Always Fresh

Native OS file watchers (FSEvents / inotify / ReadDirectoryChangesW) with debounced auto-sync.

🌐

19+ Languages

TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Swift, Kotlin, Scala, Dart, Vue, Svelte, Liquid, Pascal, Lua, Luau.

🛣️

Framework-aware Routes

14 frameworks recognized — links URL patterns to handlers so "callers of view X" surfaces the route binding it.

🔒

100% Local

No data leaves your machine. No API keys. No external services. SQLite database only.

📦

Zero-config

Bundled Node runtime. Respects .gitignore automatically. Nothing to wire up per language.

Coverage

Languages & frameworks.

Languages

TypeScript · JavaScript · Python · Go · Rust · Java · C# · PHP · Ruby · C · C++ · Swift · Kotlin · Scala · Dart · Svelte · Vue / Nuxt · Liquid · Pascal / Delphi · Lua · Luau

Framework-aware routes

Django · Flask · FastAPI · Express · NestJS · Laravel · Drupal · Rails · Spring · Gin / chi / mux · Axum / actix / Rocket · ASP.NET · Vapor · React Router · SvelteKit

CLI

One binary, everything you need.

codegraph                      Run interactive installer
codegraph install              Configure agents (--target, --yes, --location)
codegraph uninstall            Reverse the installer for every agent it touched
codegraph init [path]          Initialize a project (--index to also index)
codegraph index [path]         Full index (--force to re-index)
codegraph sync [path]          Incremental update
codegraph status [path]        Counts + journal mode + watcher state
codegraph query <search>       Search symbols (--kind, --limit, --json)
codegraph context <task>       Build context for an AI
codegraph callers <sym>        Inbound calls (--limit, --json)
codegraph callees <sym>        Outbound calls
codegraph impact <sym>         Transitive impact (--depth)
codegraph affected [files…]    Tests affected by changed files (pipe from git diff)
codegraph serve --mcp          Start the MCP server

CI hook: run only the tests your diff actually touches

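#!/usr/bin/env bash
set -euo pipefail

# Map the files changed relative to HEAD to the tests that cover them.
AFFECTED=$(git diff --name-only HEAD | codegraph affected --stdin --quiet)

if [ -n "$AFFECTED" ]; then
  # Left unquoted on purpose so each path becomes its own argument
  # (assumes test paths contain no spaces).
  npx vitest run $AFFECTED
fi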
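The page does not describe how codegraph affected picks tests. One plausible approach, sketched here purely as an assumption, is to walk import edges backwards from each changed file and keep every test file reached. The ImportEdge shape, file names, and the .test.ts convention are all hypothetical.

```typescript
interface ImportEdge { importer: string; imported: string }

// Walk import edges in reverse from the changed files and return the
// test files that depend on them, directly or transitively.
function affectedTests(
  changed: string[],
  imports: ImportEdge[],
  isTest: (file: string) => boolean,
): string[] {
  const seen = new Set<string>(changed);
  const queue = changed.slice();

  while (queue.length > 0) {
    const file = queue.shift() as string;
    for (const e of imports) {
      if (e.imported === file && !seen.has(e.importer)) {
        seen.add(e.importer);
        queue.push(e.importer);
      }
    }
  }
  return Array.from(seen).filter(isTest).sort();
}

// Hypothetical project: auth.test.ts -> auth.ts -> hash.ts, cart.test.ts -> cart.ts
const importEdges: ImportEdge[] = [
  { importer: "src/auth.test.ts", imported: "src/auth.ts" },
  { importer: "src/auth.ts", imported: "src/hash.ts" },
  { importer: "src/cart.test.ts", imported: "src/cart.ts" },
];

const toRun = affectedTests(["src/hash.ts"], importEdges, f => /\.test\.ts$/.test(f));
console.log(toRun); // [ 'src/auth.test.ts' ]
```

Changing hash.ts selects auth.test.ts through the intermediate auth.ts, while cart.test.ts is skipped because nothing it imports was touched.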

Programmatic use

Embed CodeGraph in your own tools.

The same engine ships as a TypeScript library. Index a project, query it, watch for changes — all without spinning up the MCP server.

import CodeGraph from '@colbymchenry/codegraph';

const cg = await CodeGraph.init('/path/to/project');
await cg.indexAll({
  onProgress: p => console.log(`${p.phase}: ${p.current}/${p.total}`)
});

const hits      = cg.searchNodes('UserService');
const callers   = cg.getCallers(hits[0].node.id);
const context   = await cg.buildContext('fix login bug', { maxNodes: 20, includeCode: true, format: 'markdown' });
const impact    = cg.getImpactRadius(hits[0].node.id, 2);

cg.watch();    // auto-sync on file changes
cg.unwatch();  // stop watching
cg.close();

Install

Three commands. One project. Done.

  1. Install the binary

    # macOS / Linux
    curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh | sh
    
    # Windows (PowerShell)
    irm https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.ps1 | iex
    
    # or via npm
    npx @colbymchenry/codegraph

    The interactive installer auto-detects Claude Code, Cursor, Codex CLI, opencode, and Hermes Agent, then writes the MCP config and instructions file for each one you pick. Use --yes --target=auto in CI.

  2. Restart your agent

    Restart whichever agent you use (Claude Code, Cursor, Codex CLI, opencode, or Hermes Agent) so that it loads the MCP server.

  3. Initialize your project

    cd your-project
    codegraph init -i

    Builds the per-project knowledge graph at .codegraph/codegraph.db and wires up any project-local agent surfaces (e.g. Cursor's .cursor/rules/codegraph.mdc). Once .codegraph/ exists in a project, your agent uses CodeGraph automatically.