Skip to content

CLI vs. MCP for a coding agent

A coding agent (Claude Code, Cursor, an in-house agent over an LLM) has two ways to read and write a Hadron memory:

  1. The hadron CLI — the agent shells out: hadron node get <urn>, hadron node add, hadron edge ls <urn>, …
  2. The h-* MCP tools — the agent calls h-read-node, h-find-nodes, h-add-node, … over the Model Context Protocol.

Both speak to the same Hadron server over the same GraphQL API and return the same content. A node read through hadron node get and the same node read through h-read-node carry the same bytes of payload. So the choice is rarely about capability — it's about what the access costs the agent, and the currency that matters most is context-window tokens.

This page explains the trade-off and gives a decision rule. For the command surface and the tool surface themselves, see hadron CLI and MCP tools; for which credential each path uses, see Authentication.

The short answer

Your session looks like… Reach for
A one-off read — pull one node, check one fact, then move on CLI, plain text. Lowest token footprint, zero setup.
Many reads/writes across the Hadron graph in one session — traverse edges, add nodes, validate MCP. The native, structured path; its fixed cost amortizes.
You need to parse fields programmatically (loc, tags, edge targets) Either — CLI --json to pipe through jq, or MCP for already-structured results.
You're in a token-constrained context and the host loads every MCP tool up front CLI — it adds no resident tool-schema tax (see below).

The rest of this page is why.

The two paths, briefly

CLI. The agent runs a shell command through its Bash tool. Authentication is a one-time hadron auth login; the token lives in the OS keychain and every later invocation is already signed in. Output is human-readable text by default, or machine-readable JSON with --json. It is one tool (Bash) the agent already has, driving a self-documenting binary (hadron agentic-usage, hadron <cmd> --help).

MCP. The agent calls a Hadron tool directly. Authentication rides the MCP connection (OAuth or an API key configured once in the host). Results come back in a fixed, agent-oriented format. There are around thirty h-* tools covering memory ops, node/edge CRUD, sessions, validation, actions, and chat.

The important thing both share: the payload is identical. What differs is everything around the payload — and that overhead is where the tokens go.

Token usage — the part that usually decides it

There are three distinct token costs, and conflating them is the most common mistake. Pull them apart and the decision gets easy.

How these numbers were measured

Figures below come from reading one real node — the mmdata::preflight router, a deliberately large node with ~30 outgoing edges — through each path. Character and word counts are measured exactly. Token figures are estimates (no local tokenizer; ≈ 4 chars/token), and run a little higher in practice because Hadron content is dense with URNs and hyphenated identifiers, which tokenize into many pieces. The absolute numbers scale with node size; the ratios below hold regardless.

1. The payload dominates — and it's the same either way

Reading the preflight node returns the node's meta block (name, URN, abstract, type, tags, edges, referenced-by) plus its body:

Path Measured output Est. tokens
hadron node get <urn> (plain text) 7,687 chars / 982 words ~2,000–2,400
h-read-node (default text result) renders the same meta-block + body ~2,000–2,400
hadron node get <urn> --json 17,185 chars ~4,500–5,500

The CLI's plain text and the MCP result are content-equivalent — both render the same node, so their size tracks each other, not the JSON. For the actual content of a read, CLI-vs-MCP is a wash: you're paying for the node either way.

2. Output format moves more tokens than transport

The sharpest lever in that table isn't CLI vs. MCP — it's plain text vs. JSON: --json was 2.2× larger (17,185 vs. 7,687 chars) for the same node, because it serializes every field, timestamps, and both edge directions as structured keys.

Don't reach for --json unless you'll parse it

If the agent is going to read the node, take plain text (the CLI default, and what h-read-node returns). Use --json only when a script or jq pipeline consumes specific fields — then the structure earns its tokens. The MCP path sidesteps this by returning a compact text rendering by default and reserving structure for where it's needed.

A second format lever: when an agent only needs the gist of a node, h-read-node accepts contentScope: 'abstract' to return the abstract instead of the full body — a much smaller payload for a "what is this node about?" check.

3. The standing tool-schema tax — the real, often-hidden MCP cost

This is the cost people forget, and it's the one that actually separates the two paths.

Every MCP tool a host exposes carries a JSON schema (name, description, parameter shape). Those schemas sit in the model's context as resident tool definitions. How much that costs depends on how the host loads tools:

  • Eager-loading hosts (Claude Desktop, Cursor, and most MCP clients) inject the server's entire tool list into context at session start, and it stays resident on every turn for the whole conversation. Hadron exposes ~30 h-* tools; at a few hundred tokens of schema each, that's a fixed cost on the order of 10,000+ tokens paid before the agent reads a single node — and paid again, as resident context, on every subsequent turn.
  • Lazy / deferred-loading harnesses (e.g. Claude Code with tool search) load a tool's schema only when the agent first needs it. You pay per tool actually used, once each — not for the whole surface.

The CLI adds none of this. It's one tool (Bash) the agent already has. Subcommand discovery (hadron node get --help, hadron agentic-usage) is the rough analog of a schema, but it's optional, one-off, and only as large as what you actually ask for.

In a lazy-loading harness, the MCP tax is per tool type, not global

Loading h-read-node does not load h-find-nodes or h-add-node — each new tool type costs its own one-time schema load. So "I'll do more MCP calls this session" only amortizes if those calls reuse tools you've already loaded, or if you'll use enough distinct tools that the native integration pays for itself. Reaching for one new MCP tool to make one more read can cost more tokens than just shelling out to the CLI you already have.

4. Per-call invocation overhead is negligible

The call itself — a command string (hadron node get <urn>) versus a tool call with a tiny argument ({"urn": "…"}) — is a handful of tokens either way. It never drives the decision.

Putting the three costs together

                    Fixed (resident all session)        Per read
CLI                 ~0 (Bash already present)            payload (plain text)  \ same
MCP (lazy host)     per tool-type, once each             payload (text result) / payload
MCP (eager host)    ~10k+ (whole ~30-tool surface)       payload (text result)
  • One-off read, eager host: the CLI can be dramatically cheaper — you avoid paying for ~30 tool schemas to read one node.
  • Many graph operations in a session: the MCP fixed cost amortizes across calls, and its native structuring and traversal (next section) start to pay off.
  • Either way, per-read tokens are governed by node size and output format, not by the transport.

Beyond tokens

Tokens usually decide it, but a few non-token factors can tip a close call:

Factor CLI MCP
Auth One-time hadron auth login; token in keychain, every call pre-authenticated Configured once in the host (OAuth / API key); rides the connection
Output Human text by default; --json (stable field contract) for machines Agent-oriented format; structured where it counts
Graph traversal hadron edge ls <urn>, then follow up with more gets h-read-next-node / h-find-nodes are built for walking edges and semantic search — more natural for multi-hop traversal
Trimming output Pipe through head, jq, grep to cut a huge node down before it hits context Fixed result shape (mitigated by contentScope: 'abstract')
Scriptability Composes in a shell pipeline; exit codes are a stable contract Call-by-call; no shell composition
Discoverability Self-documenting: hadron agentic-usage, --help Tool schemas describe each call inline (that's also the tax above)
Active-memory ergonomics Pass the memory/URN per call h-set-active-memory binds it once for the session

A decision rule for agents

  1. Default to the CLI for reads, in plain text. It's the lowest-token, zero-resident-cost path, and the agent is already authenticated via the keychain.
  2. Switch to MCP when the session is graph-heavy — many reads, edge traversal, semantic search (h-find-nodes), or writes — and the host keeps those tool schemas loaded anyway. Then the fixed cost is already sunk and the structured, traversal-native tools win.
  3. Use --json only to parse, never just to read. It roughly doubles the payload.
  4. In a token-constrained or eager-loading host, prefer the CLI unless you'll genuinely exercise enough of the MCP surface to amortize the ~10k-token resident tax.

The underlying principle: the node content costs the same through either door — optimize the overhead, not the payload. For a single read that means the CLI; for a session that lives in the graph it means MCP.

See also