# TokenPak
The intelligent LLM proxy for context compression and vault injection.
TokenPak sits between your AI tools and the upstream LLM API, automatically compressing context, injecting vault knowledge, and caching tokens to slash costs and latency.
## ✨ Key Features
- Context compression — Up to 40% token reduction on large payloads
- Vault injection — Automatically enrich requests with relevant knowledge
- Token caching — Reduce repeat API costs with smart cache hits
- Drop-in proxy — Works with any OpenAI-compatible client
- Multi-provider — Routes to Anthropic, OpenAI, Gemini, and more
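The token-caching feature above boils down to a simple idea: identical requests should never pay for upstream tokens twice. Here is a minimal, illustrative sketch of that pattern in Python. This is not TokenPak's internal implementation; the function names and cache structure are hypothetical, shown only to make the concept concrete.

```python
import hashlib
import json

# Hypothetical in-memory cache keyed by a hash of the request payload.
# TokenPak's real cache is internal to the proxy; this only sketches the idea.
_cache: dict[str, str] = {}

def cache_key(payload: dict) -> str:
    # Canonicalise the payload (sorted keys, compact separators) so that
    # logically identical requests produce the same hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def get_or_call(payload: dict, call_upstream) -> str:
    key = cache_key(payload)
    if key in _cache:
        return _cache[key]           # cache hit: no upstream tokens spent
    result = call_upstream(payload)  # cache miss: pay for the call once
    _cache[key] = result
    return result
```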
## 🚀 Quick Start
```shell
pip install tokenpak
tokenpak start --port 8766
# Point your client to http://localhost:8766
```
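Because the proxy speaks the OpenAI-compatible wire format, "pointing your client" at it usually means swapping the base URL. The sketch below builds a standard chat-completions request against the local proxy using only the Python standard library; the `/v1/chat/completions` path is an assumption about TokenPak's routing, and the model name is a placeholder.

```python
import json
import urllib.request

# Port 8766 matches the quick start above; the /v1 path is an assumption
# about how TokenPak routes OpenAI-compatible requests.
PROXY_URL = "http://localhost:8766/v1/chat/completions"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    # Standard OpenAI-style chat-completions payload.
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        PROXY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# req = build_request("gpt-4o-mini", "Hello")
# with urllib.request.urlopen(req) as resp:  # requires a running proxy
#     print(json.load(resp))
```

Any SDK that lets you override its base URL (most OpenAI-compatible clients do) can be aimed at `http://localhost:8766` the same way.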
→ Full installation guide
→ Quick start guide
## 📚 Documentation
| Section | Description |
|---|---|
| Getting Started | Install and configure TokenPak |
| Configuration | All configuration options |
| CLI Reference | Command-line interface |
| Architecture | How TokenPak works |
| FAQ | Common questions |