Getting Started¶
Get TokenPak running in under 5 minutes.
Requirements¶
- Python 3.11+
- An existing LLM client (Claude Code, the OpenAI SDK, etc.)
- Your provider API key (Anthropic, OpenAI, etc.)
Install¶
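The install command block is missing from this section. A minimal sketch, assuming TokenPak ships as a PyPI package named `tokenpak` (an assumption, not confirmed by this page):

```bash
# Hypothetical package name; adjust to the actual distribution name.
pip install tokenpak
```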
Start the Proxy¶
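The start command itself is elided here; a sketch, assuming a hypothetical `tokenpak serve` subcommand that binds the default port mentioned below:

```bash
# Hypothetical subcommand; the real invocation may differ.
tokenpak serve
```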
The proxy starts on http://localhost:8766 and is ready to accept requests immediately.
Connect Your LLM Client¶
Point your existing client at the TokenPak proxy instead of at the provider's endpoint directly.
Your credentials pass through unchanged. TokenPak never stores them.
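One concrete way to do this, assuming your client honors the standard base-URL environment variables (Claude Code reads `ANTHROPIC_BASE_URL`; the official OpenAI SDKs read `OPENAI_BASE_URL`):

```bash
# Route requests through the local TokenPak proxy instead of the provider.
export ANTHROPIC_BASE_URL="http://localhost:8766"    # Claude Code / Anthropic SDK
export OPENAI_BASE_URL="http://localhost:8766/v1"    # OpenAI SDK

# Your provider API key stays in its usual variable (e.g. ANTHROPIC_API_KEY)
# and passes through the proxy unchanged.
```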
Verify It's Working¶
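The verification command is missing here; a sketch, assuming a hypothetical `tokenpak status` subcommand:

```bash
# Hypothetical subcommand; the real check may be named differently.
tokenpak status
```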
Expected output:
```text
✓ Proxy: running on :8766
✓ Compression: enabled (balanced mode)
✓ Cost tracking: active
✓ Session: 0 requests
```
Make a test request through your client, then:
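The follow-up command is elided; presumably you re-run the status check (hypothetical subcommand, as above), and the session counter should now be non-zero:

```bash
# Hypothetical; after a test request, "Session" should report 1+ requests.
tokenpak status
```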
Index Your Vault (Optional, Zero Tokens)¶
If you work with a large codebase or notes vault, index it for instant semantic search:
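The indexing command is missing from this section; a sketch, assuming a hypothetical `tokenpak index` subcommand that takes a directory path:

```bash
# Hypothetical subcommand and argument; point it at your codebase or vault.
tokenpak index ~/my-vault
```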
This uses a local SQLite registry — no LLM calls, no cost.
Auto-Calibration (Recommended)¶
Let TokenPak calibrate optimal parallelism for your hardware:
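The command block is elided here; a sketch, assuming a hypothetical `tokenpak calibrate` subcommand:

```bash
# Hypothetical subcommand; per the text, the result is saved to
# ~/.tokenpak/calibration.json and reused by later indexing runs.
tokenpak calibrate
```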
This runs once and saves a profile to ~/.tokenpak/calibration.json. Future indexing runs use it automatically.
Set a Budget (Optional)¶
Protect yourself from runaway costs:
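The budget command is missing here; a sketch, assuming a hypothetical `tokenpak budget` subcommand with a monthly cap flag (both names are assumptions):

```bash
# Hypothetical subcommand and flag; sets a monthly spend cap in USD.
tokenpak budget set --monthly 50
```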
Next Steps¶
- Proxy Setup — advanced proxy configuration, SSL, multi-provider
- CLI Reference — full command reference
- Recipe Development — custom compression recipes