Budget
Save money. Same agent.
Eight practical ways to drop agent cost or context waste. Five are installable tools; three are pure technique.
1. caveman (general, tool)
Use when you want Claude Code (or Codex/Gemini/Cursor/30+ agents) to cut ~75% of output tokens by replying in fragment/telegraphic style: full technical accuracy, smaller mouth.
Source: github.com
Proof: JuliusBrussee/caveman, ★ 59k

2. free-claude-code (general, tool)
Use when you want Claude Code to route its Anthropic Messages API traffic through a local proxy to free or cheap providers: NVIDIA NIM, Kimi, Wafer, OpenRouter, DeepSeek, LM Studio, llama.cpp, or Ollama.
Source: github.com
Proof: Alishahryar1/free-claude-code, ★ 24k

3. claude-code-router (general, tool)
Put OpenRouter in front of Claude Code: one API key reaches 300+ models, so the easy turns route to a backend at 2–5% of Sonnet's price while the hard ones stay on a frontier model, and spend is visible per request from one dashboard.
Source: github.com/musistudio
Proof: musistudio/claude-code-router, ★ 35k

4. headroom (general, tool)
Compress tool outputs, logs, files, and RAG chunks before they reach the LLM: 60–95% fewer tokens with the same answers. Ships as a library, a proxy, and an MCP server.
Source: github.com/zereight
Proof: zereight/headroom, ★ 12k

5. code mode (general, tool)
Cloudflare's Code Mode: instead of handing MCP tools to the model directly, it converts them into a TypeScript API the model writes code against, and that code runs in a Workers sandbox. Tool schemas and intermediate results stay out of the context window, so token use drops sharply.
Source: github.com/cloudflare
Proof: cloudflare/agents, ★ 5.1k
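
The routing idea behind free-claude-code and claude-code-router (easy turns go to a cheap backend, hard turns stay on a frontier model) can be sketched in a few lines. This is only an illustration, not how either tool actually decides: the model names, the `Turn` type, the keyword list, and the thresholds in `is_hard` are all hypothetical placeholders.

```python
from dataclasses import dataclass

# Hypothetical model identifiers; substitute whatever your router exposes.
CHEAP_MODEL = "cheap-model"
FRONTIER_MODEL = "frontier-model"

# Illustrative keywords that suggest a turn needs deeper reasoning.
HARD_HINTS = ("architecture", "race condition", "security", "migrate", "design")


@dataclass
class Turn:
    prompt: str
    files_touched: int = 0


def is_hard(turn: Turn, max_easy_chars: int = 2000) -> bool:
    """Crude difficulty guess: a long prompt, many files, or a hard keyword."""
    text = turn.prompt.lower()
    return (
        len(turn.prompt) > max_easy_chars
        or turn.files_touched > 3
        or any(hint in text for hint in HARD_HINTS)
    )


def pick_model(turn: Turn) -> str:
    """Send hard turns to the frontier model and everything else to the cheap one."""
    return FRONTIER_MODEL if is_hard(turn) else CHEAP_MODEL


if __name__ == "__main__":
    for prompt in ("rename variable foo to bar",
                   "fix the race condition in the scheduler"):
        print(prompt, "->", pick_model(Turn(prompt)))
```

A keyword heuristic like this will misroute some turns, so a real setup would likely want a fallback that retries on the frontier model when the cheap answer fails.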
How they compose
caveman cuts output tokens; free-claude-code and claude-code-router change the model bill, the latter by routing through OpenRouter to the cheapest capable model; headroom strips 60–95% of the tokens out of context before they reach the model; code mode keeps MCP tool schemas and intermediate results in a sandbox instead of the context window; fan out subagents keeps the parent context smaller; /goal keeps the work from drifting; and ask-expert-mcp reserves frontier-model spend for the hard 5% of decisions.
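
To get a feel for how these levers stack, here is a rough back-of-the-envelope model. It is a sketch under loud assumptions: the token counts and per-million-token prices are hypothetical, the reductions are treated as independent, and the default percentages are taken from the claims above (75% fewer output tokens, the low end of 60–95% fewer context tokens, a cheap backend at 5% of the frontier price, and 5% of turns staying on the frontier model).

```python
def turn_cost(input_tokens, output_tokens, in_price, out_price):
    """Cost of one turn in dollars; prices are dollars per million tokens."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def blended_cost(input_tokens, output_tokens, in_price, out_price,
                 output_cut=0.75, context_cut=0.60,
                 cheap_price_ratio=0.05, hard_share=0.05):
    """Average cost per turn with all levers applied (assumed independent)."""
    # Token levers: compress what goes in, shorten what comes out.
    slim_in = input_tokens * (1 - context_cut)
    slim_out = output_tokens * (1 - output_cut)
    # Price lever: the same slimmed turn on the frontier vs. a cheap backend.
    frontier = turn_cost(slim_in, slim_out, in_price, out_price)
    cheap = frontier * cheap_price_ratio
    # Routing lever: only the hard share of turns pays the frontier price.
    return hard_share * frontier + (1 - hard_share) * cheap


if __name__ == "__main__":
    # Hypothetical turn: 100k input tokens, 4k output tokens, $3 / $15 per million.
    baseline = turn_cost(100_000, 4_000, 3.0, 15.0)
    blended = blended_cost(100_000, 4_000, 3.0, 15.0)
    print(f"baseline ${baseline:.4f}  blended ${blended:.4f}")
```

With these made-up inputs the average turn drops from about $0.36 to about $0.013. Real savings depend on how much of the context is actually compressible and how many turns a cheap model can handle, so treat the output as an upper-bound intuition rather than a forecast.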