TokenPak
A local proxy that compresses your LLM context before it hits the API — fewer tokens, lower cost, same results.
TokenPak sits between your AI tools and the upstream LLM provider, running entirely on 127.0.0.1. It deterministically packages context (Prompt Packing), routes requests, blocks runaway spend before it hits the wire (Spend Guard), and logs every saving locally. No cloud, no credentials stored, no code changes.
OSS beta
These docs describe the OSS beta of TokenPak (pip install tokenpak, currently v1.9.3, Apache 2.0). Anything not listed here is not part of the beta surface. See Known Issues for current limitations.
What ships in the OSS beta
- Prompt Packing pipeline — deterministic context reduction on real agent workloads; reduction pinned to an agent-style CI fixture (reproduce with `make benchmark-headline`); provider-cached flows show lower incremental gains. Measure your own with `tokenpak savings`.
- Local proxy on 127.0.0.1 — byte-preserved passthrough; your prompts and credentials never leave your machine.
- Spend Guard — pre-send circuit breaker with rolling caps; blocks runaway requests before they reach the provider and returns a clear release directive.
- Nine client integrations — Claude Code, Cursor, Cline, Continue, Aider, Codex CLI, Gemini CLI, OpenAI SDK, Anthropic SDK.
- Savings Ledger + local dashboard — every request logged to a local SQLite store with causal attribution; TUI + web dashboard.
- Vault indexing + semantic search — index your codebase, search without an LLM call.
- TIP-1.0 protocol contracts — canonical headers, metadata fields, capability labels, manifest schemas. Conformance gate runnable via `tokenpak doctor --conformance`.
- Pak recall (read-only) — storage, FTS, `tokenpak pak inspect`. Scoring and assembly are not part of the OSS beta.
- 50 built-in compression profiles — YAML, customizable.
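To make the Spend Guard idea above concrete, here is a minimal sketch of a pre-send circuit breaker with a rolling cap. It is not TokenPak's implementation: the class name `RollingSpendCap`, its parameters, and the sliding-window policy are assumptions chosen for illustration. The point it shows is that the check happens before a request is forwarded, so a blocked request costs nothing upstream.

```python
import time
from collections import deque


class RollingSpendCap:
    """Hypothetical pre-send guard (illustrative only, not TokenPak's code).

    Refuses a request when its estimated cost would push total spend
    inside the trailing window above the cap.
    """

    def __init__(self, cap_usd, window_s, clock=time.monotonic):
        self.cap_usd = cap_usd
        self.window_s = window_s
        self._clock = clock          # injectable so the guard can be tested
        self._events = deque()       # (timestamp, cost_usd), oldest first
        self._total = 0.0            # spend currently inside the window

    def _evict_expired(self, now):
        # Drop spend records that have aged out of the rolling window.
        while self._events and now - self._events[0][0] >= self.window_s:
            _, cost = self._events.popleft()
            self._total -= cost

    def allow(self, estimated_cost_usd):
        """Return True and record the spend, or False to block the request."""
        now = self._clock()
        self._evict_expired(now)
        if self._total + estimated_cost_usd > self.cap_usd:
            return False             # blocked before anything is sent upstream
        self._events.append((now, estimated_cost_usd))
        self._total += estimated_cost_usd
        return True
```

With a 1.00 USD cap over 60 seconds, two 0.50 requests pass, a third request is blocked, and the same request passes again once the earlier spend has aged out of the window. A real guard would also need a way to release a blocked request, which this sketch leaves out.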
Quick start
pip install tokenpak
tokenpak setup
# Then point your client at http://127.0.0.1:8766
→ Full installation guide
→ 5-minute Quick Start
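Before pointing a client at the proxy address from the Quick start, it can help to confirm that something is actually listening there. The following is a small standard-library sketch, not part of TokenPak: the helper name `proxy_is_listening` is hypothetical, and it only tests that a TCP connection to the port succeeds, not that the listener is TokenPak.

```python
import socket

PROXY_HOST = "127.0.0.1"
PROXY_PORT = 8766  # port shown in the Quick start


def proxy_is_listening(host=PROXY_HOST, port=PROXY_PORT, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: it checks reachability only and does not
    verify which program is bound to the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    state = "reachable" if proxy_is_listening() else "not reachable"
    print(f"http://{PROXY_HOST}:{PROXY_PORT} is {state}")
```

If the check reports the address as not reachable, rerunning `tokenpak setup` or consulting Troubleshooting is a reasonable next step.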
Documentation map
| Section | What it covers |
|---|---|
| Installation | pip install, system requirements, first run |
| Quick Start | Setup wizard, client integration, first savings in 5 minutes |
| Configuration | How configuration works (env vars + YAML, precedence) |
| Environment Variables | Complete TOKENPAK_* reference |
| CLI Reference | Every verb, flag, and exit code (auto-generated) |
| Architecture | Three planes, modular subsystems, proxy-centered design |
| Savings | How TokenPak attributes savings causally |
| Security | Auth tokens, TLS, audit logging, data privacy |
| Troubleshooting | Common symptoms and fixes that work |
| Known Issues | Current OSS-beta limitations |
| FAQ | General questions |
| Recall overview | Paks, reason codes, risk flags — the OSS data plane |
| Client Guides | Per-client integration walkthroughs (Claude Code, Cursor, Cline, Continue, Aider, Codex CLI, Gemini CLI, OpenAI/Anthropic SDK) |
Source + package
- GitHub: github.com/tokenpak/tokenpak
- PyPI: pypi.org/project/tokenpak
- License: Apache 2.0