Claude Code MCP-CLI Mode for Token Optimization

Claude Code MCP-CLI Mode for Token Optimization
Dec 25, 2025
- ai
- claude-code
If you use multiple MCP servers with Claude Code, you probably noticed your context window filling up fast. Like, 40-50k tokens gone before you even ask a question. That’s because every MCP tool schema gets loaded into the system prompt at startup.

The Problem

MCP servers are great for extending Claude’s capabilities, but the default behavior dumps all tool definitions upfront. Got five servers with 20 tools each? That’s a lot of JSON schemas eating your context. Some users reported starting sessions at 63% context usage—leaving barely any room for actual work.

The Fix

There’s an experimental feature that switches to on-demand tool loading. Instead of preloading everything, Claude uses bash commands to fetch tool info only when needed.

Enable it by adding this to your shell config:
```
export ENABLE_EXPERIMENTAL_MCP_CLI=true
```
Restart your terminal and Claude Code. Run /context to verify—your MCP token usage should drop significantly.

How It Works

Instead of having all tool schemas in memory, Claude now uses mcp-cli commands:
```
# List available tools
mcp-cli tools

# Get schema for a specific tool
mcp-cli info server/tool

# Call the tool
mcp-cli call server/tool '{"param": "value"}'
```
The tradeoff is an extra step before using any MCP tool, but the context savings are substantial—going from 63% to 11% at session start in some cases. That’s roughly 100k tokens reclaimed for actual work.

When to Use This

Enable MCP-CLI mode if:
- You have 3+ MCP servers configured
- Your sessions start at >30% context usage
- You use MCP tools occasionally, not constantly
Stick with default mode if:
- You have 1-2 small MCP servers
- You use MCP tools heavily every session
- The extra mcp-cli info step bothers you
Caveats
1. Extra latency — Each tool use requires an info lookup first. For heavy MCP users, this adds up.
2. Experimental status — The feature may change or break. Check Claude Code release notes.
3. Schema discovery — Claude has to “learn” tools exist via mcp-cli tools before using them. Sometimes it forgets.
Checking Your Savings

Before and after enabling, run /context to see token breakdown:
```
Context: 11% (22k/200k tokens)
├── System prompt: 8k
├── Conversation: 12k
└── MCP tools: 2k  ← was 50k+
```
TL;DR

Set ENABLE_EXPERIMENTAL_MCP_CLI=true in your shell config to load MCP tools on-demand instead of upfront. Saves a ton of context if you’re using multiple MCP servers.

Unlike DDR5 prices in 2025, this memory is actually getting cheaper.

Claude Code MCP-CLI Mode for Token Optimization

The Problem

The Fix

How It Works

When to Use This

Caveats

Checking Your Savings

TL;DR